EU-ETS Linked Data

Overview

This page describes a Linked Data version of the EU Emissions Trading System (EU ETS) data set. It is based on the XML files published on the EU ETS site, which we have converted to RDF so that queries can run across interlinked data.

The rest of this page is quite detailed and assumes that you are familiar with the tools in use. See Using SPARQL with Enipedia for a more general overview and links to tutorials.

Note: Another RDF version of the EU ETS data exists, in the citl.rdf.gz file mentioned here. According to the people involved, that RDF data is a stripped-down version of what is in the European Union Transaction Log, and it feeds some of the data visualization and exploration tools they develop. As a result, it contains the CO2 emissions and allowances surrendered, but no metadata about the facilities (name, location, owner, etc.). As an example, compare the Enipedia version with the Eionet version for the Amercentrale power station in the Netherlands.

Having an official version of the EU ETS data in RDF would be valuable, as getting data out of the European Union Transaction Log is currently not easy. There is no bulk download option, and the site is generally tailored to viewing data for one installation at a time per commitment period. Additionally, we're currently involved in efforts to link entities describing installations in both the E-PRTR and EU ETS data. The accuracy of this process improves considerably when facility addresses are available, and this data is not included in the Excel files that can be downloaded. From what we can tell, for the EU ETS the addresses are only available through the operator holding accounts, which can only be downloaded one operator at a time (there are roughly 6500 of them).

EU ETS Documentation

These describe what is in the data and how it is structured.

Development Ideas

The Ownership Links and Enhanced EUTL Dataset Project tracks down the parent companies for firms involved in the EU ETS. This can be linked to the existing work we've done. The data can be integrated via RDF, and could benefit from the work that we've done on developing better search interfaces.

Rough ideas for a search interface:

  • Search for installation, company name, etc, bring up spark lines of emissions, allocations, etc per installation, company, parent company.
  • EU ETS data doesn't contain much in the way of coordinates - augment with Enipedia, E-PRTR?
  • Possibilities to include NACE codes? Check meaning of main_activity_code column
  • installation_ID & permit_ID need to be checked for overlap: are they 1:1 matches, or can one value of either be paired with several values of the other?
  • The ownership structure is (likely) a flat hierarchy with the ultimate parent company on top, and the hierarchy of subsidiaries represented as a single level below it.
  • Include transaction data. We previously ran into an issue with account identifiers in the XML files being specific to the country (which was not listed in the XML files).
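
The overlap question in the list above could be checked directly against the EU-ETS graph. This is a rough sketch. It assumes that euets:installationIdentifier and euets:permitIdentifier (documented further down this page) correspond to the installation_ID and permit_ID columns, and that both are only meaningful within a country:

```sparql
PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
# Permits that are attached to more than one installation within a country.
# An empty result would suggest that each permit maps to a single installation.
SELECT ?countryCode ?permit (COUNT(DISTINCT ?installationId) AS ?installations)
WHERE {
    ?installation euets:countryCode ?countryCode .
    ?installation euets:permitIdentifier ?permit .
    ?installation euets:installationIdentifier ?installationId .
}
GROUP BY ?countryCode ?permit
HAVING (COUNT(DISTINCT ?installationId) > 1)
ORDER BY DESC(?installations)
```

Swapping the roles of the two identifiers gives the check in the other direction.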

Details

Example entry: http://enipedia.tudelft.nl/data/page/EU-ETS/country/NL/installation/172

SPARQL Queries can be run at: http://enipedia.tudelft.nl/sparql

See Category: Country EU ETS Profile for pages providing different views over the EU ETS data.

Installation data structure

These are the properties available for all the installations. The euets prefix needs to be defined when performing queries:

PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>

Direct property values

By "direct property values", we mean those values that can be retrieved using queries in the form of:
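
A minimal sketch of such a query, assuming the intended form is a single triple pattern from the installation to the value (euets:city stands in for any property in the table below; the variable names are illustrative):

```sparql
PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
# One triple pattern leads straight from the installation to the value.
SELECT ?installation ?value WHERE {
    ?installation euets:city ?value .
} LIMIT 10
```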

Property usage in SPARQL | Notes
rdfs:label | same as euets:installationName
euets:address1 |
euets:address2 |
euets:city |
euets:countryCode |
euets:eperIdentification | ID for the EPER, and possibly the E-PRTR. TODO: check whether this allows matching with the E-PRTR dataset in RDF.
euets:euetsID | The XML documents from the EU-ETS website give a unique identifier for an installation within a country (see euets:installationIdentifier), but that identifier is not unique across all of Europe. The value of euets:euetsID is unique for the entire registry and can be used in a URL to get the page for that specific installation on the EU-ETS website.
euets:installationIdentifier | Identifier for the installation. It is only unique among the installations of one country; installations in other countries may have exactly the same identifier. See euets:euetsID for an identifier that is unique across all of Europe.
euets:installationName |
euets:latitude | Rare in the original data. We have tried to fill these in, but some coordinates are very far off the actual location.
euets:longitude | Rare in the original data. We have tried to fill these in, but some coordinates are very far off the actual location.
euets:mainActivityTypeCode |
euets:mainActivityTypeCodeLookup |
euets:name |
euets:parentCompany |
euets:permitIdentifier |
euets:subsidiaryCompany |
rdf:type | always euets:Installation
euets:zipCode |
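
The difference between euets:installationIdentifier and euets:euetsID can be checked with a query along the following lines. It is a sketch that relies only on the properties listed above, and it lists identifiers that occur in more than one country:

```sparql
PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
# Installation identifiers that are reused in several countries.
SELECT ?installationId (COUNT(DISTINCT ?countryCode) AS ?countries)
WHERE {
    ?installation a euets:Installation .
    ?installation euets:installationIdentifier ?installationId .
    ?installation euets:countryCode ?countryCode .
}
GROUP BY ?installationId
HAVING (COUNT(DISTINCT ?countryCode) > 1)
ORDER BY DESC(?countries)
LIMIT 20
```

Matching on the pair of euets:countryCode and euets:installationIdentifier, as the linking queries further down this page do, avoids the ambiguity.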

Property values connected to intermediate data objects

This refers to data that is accessible by first querying an intermediate object.

Property usage in SPARQL | Notes
euets:account | object with more information about the account for this installation. This lists the owner of the account, account status, etc.
euets:napInfo | object containing yearly information pertaining to installation data related to the National Allocation Plan
euets:contactPerson |

National Allocation Plan data structure

This refers to National Allocation Plan data about an installation. This information can be found for a facility by querying the value of its euets:napInfo property.

Property usage in SPARQL | Notes
euets:complianceCode |
euets:accountStatus |
euets:permitDate |
euets:allowanceAllocation |
euets:totalOfAllowancesSurrendered | Cumulative allowances surrendered from the beginning of the reporting period (indicated by euets:periodCode) up to the given year (euets:periodYear). See euets:calculatedAllowancesSurrendered for values calculated on a yearly basis.
euets:calculatedAllowancesSurrendered | Allowances surrendered in this year only (i.e. the value of euets:periodYear). Note: this is not in the original data set; it is calculated from euets:totalOfAllowancesSurrendered, which is cumulative over the period given by euets:periodCode. See details below for how the calculations were done.
euets:totalVerifiedEmissions | Cumulative verified emissions from the beginning of the reporting period (indicated by euets:periodCode) up to the given year (euets:periodYear). See euets:calculatedEmissions for values calculated on a yearly basis.
euets:calculatedEmissions | Emissions occurring in this year only (i.e. the value of euets:periodYear). Note: this is not in the original data set; it is calculated from euets:totalVerifiedEmissions, which is cumulative over the period given by euets:periodCode. See details below for how the calculations were done.
euets:periodYear |
euets:periodCode |
rdf:type |
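
As an illustration, the yearly records of one installation can be listed as follows. This is a sketch: the country code "NL" and the identifier "172" are taken from the example entry URL earlier on this page, and it is an assumption that "172" is the value of euets:installationIdentifier for that entry:

```sparql
PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
SELECT ?year ?period ?allocation ?emissions ?surrendered WHERE {
    ?installation euets:countryCode "NL" .
    ?installation euets:installationIdentifier ?id .
    # Compare on the string form, since the datatype of the identifier is not known here.
    FILTER (STR(?id) = "172")
    # Step through the intermediate National Allocation Plan object.
    ?installation euets:napInfo ?napInfo .
    ?napInfo euets:periodYear ?year .
    ?napInfo euets:periodCode ?period .
    OPTIONAL { ?napInfo euets:allowanceAllocation ?allocation . }
    OPTIONAL { ?napInfo euets:calculatedEmissions ?emissions . }
    OPTIONAL { ?napInfo euets:calculatedAllowancesSurrendered ?surrendered . }
} ORDER BY ?year
```

The OPTIONAL blocks keep a year in the result even when one of the values is missing for it.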

As mentioned above, euets:calculatedEmissions and euets:calculatedAllowancesSurrendered are not in the original data; we calculate them ourselves. The reason is that euets:totalOfAllowancesSurrendered and euets:totalVerifiedEmissions are both cumulative over a reporting period, whereas we want values for a specific year. Both calculated properties are derived as shown below. The lists use calculatedEmissions; calculatedAllowancesSurrendered is derived in the same way from totalOfAllowancesSurrendered:

Period from 2005 - 2007:

  • calculatedEmissions in 2005 = totalVerifiedEmissions in 2005
  • calculatedEmissions in 2006 = totalVerifiedEmissions in 2006 - totalVerifiedEmissions in 2005
  • calculatedEmissions in 2007 = totalVerifiedEmissions in 2007 - totalVerifiedEmissions in 2006

Period from 2008 - 2012:

  • calculatedEmissions in 2008 = totalVerifiedEmissions in 2008
  • calculatedEmissions in 2009 = totalVerifiedEmissions in 2009 - totalVerifiedEmissions in 2008
  • calculatedEmissions in 2010 = totalVerifiedEmissions in 2010 - totalVerifiedEmissions in 2009
  • calculatedEmissions in 2011 = totalVerifiedEmissions in 2011 - totalVerifiedEmissions in 2010
  • calculatedEmissions in 2012 = totalVerifiedEmissions in 2012 - totalVerifiedEmissions in 2011

For the resulting values of both euets:calculatedEmissions and euets:calculatedAllowancesSurrendered, we filter out negative numbers. It seems that when an installation is no longer operating, the value of euets:totalOfAllowancesSurrendered and/or euets:totalVerifiedEmissions is set to zero. Under the assumption of a cumulative sum, subtracting the previous year's value from this zero yields a negative number.
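
The calculation can also be expressed as a query over the cumulative values. The following is only a sketch of the rule described above, not the procedure that was actually used to generate the data. It assumes that euets:periodYear and euets:totalVerifiedEmissions can be cast to numbers:

```sparql
PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?installation ?year ?yearlyEmissions WHERE {
    ?installation euets:napInfo ?current .
    ?current euets:periodYear ?year .
    ?current euets:periodCode ?period .
    ?current euets:totalVerifiedEmissions ?total .
    # Look for the record of the preceding year within the same period.
    OPTIONAL {
        ?installation euets:napInfo ?previous .
        ?previous euets:periodCode ?period .
        ?previous euets:periodYear ?previousYear .
        ?previous euets:totalVerifiedEmissions ?previousTotal .
        FILTER (xsd:integer(?previousYear) = xsd:integer(?year) - 1)
    }
    # With no preceding record (e.g. 2005 or 2008), the cumulative total is the yearly value.
    BIND (xsd:decimal(?total) - COALESCE(xsd:decimal(?previousTotal), 0) AS ?yearlyEmissions)
    # Drop the negative values that appear when a total is reset to zero.
    FILTER (?yearlyEmissions >= 0)
} ORDER BY ?installation ?year
```

A year without a record for the preceding year in the same period is treated like the first year of that period. Replacing euets:totalVerifiedEmissions with euets:totalOfAllowancesSurrendered gives the corresponding sketch for surrendered allowances.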

Account data structure

Property usage in SPARQL | Notes
euets:AccountHolder |
euets:identifierInReg |
euets:installationIdentifier | identifier for the installation. NOTE: this is only unique within a country. Installations in other countries may have exactly the same identifier.
euets:accountTypeCode |
euets:accountTypeCodeLookup |
euets:accountStatus |
euets:registryCode |
euets:registryCodeLookup |
euets:contactPerson |
rdf:type | always euets:Account

Person data structure

Property usage in SPARQL | Notes
rdfs:label | same as euets:name
euets:name |
euets:city |
euets:zipCode |
euets:address1 |
euets:address2 |
euets:phoneNumber1 |
euets:phoneNumber2 |
euets:relationshipTypeCode |
euets:email |
euets:faxNumber |
euets:relationshipTypeCodeLookup |
euets:countryCode |
euets:countryCodeLookup |
rdf:type | always euets:Person
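
The account and person structures can be combined to look up the contact for an installation. This is a sketch. It assumes that euets:contactPerson on an account points to a person object of the kind described above. OPTIONAL is used because it is not known here whether every person record has an e-mail address or phone number:

```sparql
PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?installationName ?accountHolder ?personName ?email ?phone WHERE {
    ?installation rdfs:label ?installationName .
    ?installation euets:countryCode "NL" .
    # Installation -> account -> contact person
    ?installation euets:account ?account .
    ?account euets:AccountHolder ?accountHolder .
    ?account euets:contactPerson ?person .
    ?person rdfs:label ?personName .
    OPTIONAL { ?person euets:email ?email . }
    OPTIONAL { ?person euets:phoneNumber1 ?phone . }
} ORDER BY ?accountHolder LIMIT 50
```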

Examples

These can be run by copy/pasting the queries below into the SPARQL endpoint at http://enipedia.tudelft.nl/sparql

All emissions by power plants owned by Essent N.V.

PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?name ?emissions ?year where {
    ?installation euets:countryCode ?countryCode .
    ?installation euets:napInfo ?napInfo .
    ?installation rdfs:label ?name . 
    ?napInfo euets:periodYear ?year . 
    ?napInfo euets:calculatedEmissions ?emissions .
    ?installation euets:account ?account . 
    ?account euets:AccountHolder ?account_holder .
    FILTER (?account_holder = "Essent") . 
} order by DESC(?emissions)

All emissions for installations in the Netherlands

PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?account_holder ?name ?emissions ?year where {
    ?installation euets:napInfo ?napInfo .
    ?installation rdfs:label ?name . 
    ?napInfo euets:periodYear ?year . 
    ?napInfo euets:calculatedEmissions ?emissions .
    ?installation euets:account ?account . 
    ?installation euets:countryCode "NL" . 
    ?account euets:AccountHolder ?account_holder .
} order by ?account_holder ?year ?emissions

All emissions for entries in the EU ETS (limited to 100), listing the contact person and account holder

PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?name ?emissions ?year ?contactPerson ?account_holder ?country where {
    ?installation euets:napInfo ?napInfo .
    ?installation rdfs:label ?name . 
    ?installation euets:contactPerson ?personInfo . 
    ?personInfo rdfs:label ?contactPerson . 
    ?napInfo euets:periodYear ?year . 
    ?napInfo euets:calculatedEmissions ?emissions .
    ?installation euets:account ?account . 
    ?installation euets:countryCode ?country . 
    ?account euets:AccountHolder ?account_holder .
} order by ?installation ?year limit 100

Linking from EU-ETS to the Ownership Links and Enhanced EUTL Dataset

Find all facilities in the EU-ETS owned by "ELECTRICITE DE FRANCE"

This is especially useful for countries (France, Ireland, Italy, Romania) where many, if not all, account holders are persons rather than companies, and for Greece, where account names do not use the Latin alphabet. It should make the work done on Enipedia to identify those accounts easier and faster.

PREFIX eutl: <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset/>
PREFIX eutlprop: <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset/property/>
PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select * where {
   GRAPH <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset> {
      ?eutlInstallation rdf:type eutl:Installation . 
      ?eutlInstallation eutlprop:name_OHA_PHA "ELECTRICITE DE FRANCE" . 
      ?eutlInstallation eutlprop:country_code_OHA_PHA ?countryCode . 
      ?eutlInstallation eutlprop:installation_ID ?installationID . 
   }
   GRAPH <http://enipedia.tudelft.nl/data/EU-ETS> {
      ?euetsInstallation rdf:type euets:Installation . 
      ?euetsInstallation euets:countryCode ?countryCode . 
      ?euetsInstallation euets:installationIdentifier ?installationID . 
   }
} 

Emissions for facilities owned by a specific company

PREFIX eutl: <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset/>
PREFIX eutlprop: <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset/property/>
PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?name ?emissions ?year where {
   GRAPH <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset> {
      ?eutlInstallation rdf:type eutl:Installation . 
      ?eutlInstallation eutlprop:name_OHA_PHA "ELECTRICITE DE FRANCE" . 
      ?eutlInstallation eutlprop:country_code_OHA_PHA ?countryCode . 
      ?eutlInstallation eutlprop:installation_ID ?installationID . 
   }
   GRAPH <http://enipedia.tudelft.nl/data/EU-ETS> {
      ?euetsInstallation rdf:type euets:Installation . 
      ?euetsInstallation euets:installationName ?name . 
      ?euetsInstallation euets:countryCode ?countryCode . 
      ?euetsInstallation euets:installationIdentifier ?installationID . 
      ?euetsInstallation euets:napInfo ?napInfo . 
      ?napInfo euets:periodYear ?year . 
      ?napInfo euets:calculatedEmissions ?emissions . 
   }
} 

Results, pivoted by year ("-" means that no value is available for that year):

name | 2005 | 2006 | 2007 | 2008 | 2009 | 2010
EDF - Vazzio | 455,263 | 425,009 | 522,347 | 412,259 | 418,554 | 392,861
EDF Centrale d'Ambès | 0 | 0 | 0 | - | - | -
EDF TAC DE BRENNILIS | 13,369 | 27,704 | 23,260 | 24,438 | 47,674 | 62,328
EDF TAC DE DIRINON | 24,592 | 39,698 | 14,736 | 15,762 | 19,272 | 33,235
EDF Centrale Le Havre | 3,152,570 | 2,326,240 | 4,046,980 | 4,045,060 | 3,187,300 | 3,368,480
EDF Centrale de Blénod | 2,772,370 | 2,428,500 | 2,506,350 | 1,906,300 | 2,331,820 | 2,611,800
EDF Centrale de Richemont | 381 | 351 | 306 | 359 | 238 | 0
EDF Centrale de La Maxe | 1,989,140 | 1,826,280 | 1,702,500 | 1,195,930 | 1,103,800 | 1,393,600
EDF Centrale de Cordemais | 5,270,080 | 4,637,480 | 4,788,450 | 3,811,130 | 5,053,850 | 4,971,680
EDF TAC de Vitry-Arrighi | 8,192 | 7,547 | 26,197 | 13,980 | 20,887 | 23,563
EDF Centrale de Martigues | 974,496 | 657,724 | 407,216 | 594,863 | 357,101 | 243,681
EDF Centrale de Vitry | 1,451,610 | 839,127 | 1,351,040 | 954,756 | 1,328,530 | 1,104,180
EDF Centrale de Vaires | 194,911 | 0 | 0 | - | - | -
EDF Centrale de Porcheville | 998,709 | 949,855 | 675,929 | 752,250 | 556,302 | 662,880
EDF TAC de Gennevilliers | 14,688 | 18,059 | 32,315 | 12,430 | 7,567 | 0
EDF Centrale d'Aramon | 546,670 | 333,534 | 349,374 | 337,547 | 191,443 | 272,283
EDF Centrale d'Albi | 446,453 | 214,538 | 0 | - | - | -
EDF DDC | 204,594 | 134,600 | 172,645 | 185,932 | 312,473 | 300,753
EDF KOUROU | 6,621 | 1,789 | 4,923 | 10,704 | 79,855 | 7,996
EDF Centrale de Bouchain | 1,077,620 | 840,652 | 530,725 | 739,956 | 643,130 | 671,951
EDF Centrale de Dunkerque | 23,897 | 0 | 0 | - | - | -
EDF LE PORT CENTRALE | 411,311 | 343,832 | 203,898 | 231,561 | 322,705 | 286,228
EDF TAC Baie | 51,991 | 69,252 | 24,661 | 25,807 | 85,068 | 76,730
EDF - Lucciana | 450,124 | 302,859 | 204,921 | 245,008 | 281,964 | 282,468
EDF - Jarry Sud | 101,741 | 57,933 | 144,614 | 169,616 | 314,532 | 304,648
EDF - St barthélémy | 63,631 | 68,473 | 69,765 | 66,922 | 67,648 | 70,534
EDF - St Martin | 55,686 | 57,130 | 62,668 | 57,588 | 66,035 | 78,605
EDF - Jarry Nord | 574,577 | 584,986 | 543,002 | 550,951 | 585,819 | 646,329
EDF - St Martin 2 | 62,323 | 63,765 | 64,417 | 67,838 | 66,430 | 64,022
EDF Pointe des Carrières | 297,509 | 382,658 | 332,023 | 360,292 | 393,371 | 368,200
EDF BELLEFONTAINE | 675,115 | 619,326 | 650,492 | 672,797 | 626,568 | 668,457
EDF - TAC DE VAIRES | - | - | - | - | - | 0


Find Global Ultimate Owners (GUO) for installations and their account holders

This allows for linking facilities defined in the EU-ETS to the top owner in the corporate hierarchy. The work on Property:Subsidiary would be a good point of comparison. There are still caveats, since the GUOs have been computed for the 2005-2007 period and are not necessarily current. In the following example, Transalpina di Energia was a top owner because EDF only owned 50% at that time, and Edison has since divested its Serene assets to BG.
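
The example referred to is not reproduced here. The following is a sketch of how such a lookup might be written, reusing eutlprop:GUO_past_name and eutlprop:name_OHA_PHA from the other queries on this page. It is an assumption that GUO_past_name holds the name of the Global Ultimate Owner and that name_OHA_PHA holds the account holder's name:

```sparql
PREFIX eutlprop: <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset/property/>
SELECT ?installationID ?countryCode ?accountHolder ?guo
FROM <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset>
WHERE {
    ?eutlInstallation eutlprop:installation_ID ?installationID .
    ?eutlInstallation eutlprop:country_code_OHA_PHA ?countryCode .
    # Assumed: name of the holder of the installation's account
    ?eutlInstallation eutlprop:name_OHA_PHA ?accountHolder .
    # Assumed: Global Ultimate Owner as computed for the 2005-2007 period
    ?eutlInstallation eutlprop:GUO_past_name ?guo .
} ORDER BY ?guo ?countryCode LIMIT 100
```

The installation identifier and country code in the result can then be matched against euets:installationIdentifier and euets:countryCode, in the same way as in the linking queries above.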

Find companies with facilities in multiple countries, and count the facilities in each

PREFIX eutl: <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset/>
PREFIX eutlprop: <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset/property/>
PREFIX euets: <http://enipedia.tudelft.nl/data/EU-ETS/>
# Note: sql:group_concat_distinct is not standard SPARQL; it is specific to the endpoint's SPARQL implementation.
select ?company ?country ?installations ?subsidiaries
from <http://enipedia.tudelft.nl/data/Ownership_Links_and_Enhanced_EUTL_Dataset> {
     # Per company and country: number of installations and the names of the account holders
     { select ?company ?country (count(?eutlInstallation) as ?installations)
              (sql:group_concat_distinct(?subsidiary,", ") as ?subsidiaries)  where {
         ?eutlInstallation eutlprop:GUO_past_name ?company .
         ?eutlInstallation eutlprop:name_OHA_PHA ?subsidiary .
         ?eutlInstallation eutlprop:country_code_OHA_PHA ?country .
     } group by ?company ?country }
     # Per company: number of distinct countries, used to keep only multi-country owners
     { select ?company (count(distinct ?country) as ?nb_countries) where {
         ?eutlInstallation eutlprop:GUO_past_name ?company .
         ?eutlInstallation eutlprop:country_code_OHA_PHA ?country .
     } group by ?company }
     FILTER (?nb_countries > 1) .
} order by ?company desc(?installations)

