This page describes the software that we have developed or use for Enipedia. These projects are at various stages of production, and we have made the source code available in the hope that people find it useful for building their own applications.
Category:Documentation contains a collection of the pages involved in the various development efforts of Enipedia.
This is an extension to Semantic MediaWiki that allows you to directly embed SPARQL queries on wiki pages and get the results returned in a variety of formats. This is currently being upgraded to work with SMW 1.6. This software is one of the key enablers of our work on Enipedia.
- Documentation: http://www.mediawiki.org/wiki/Extension:SparqlExtension
- Source code: https://svn.eeni.tbm.tudelft.nl/SparqlExtension/
 Elasticsearch on Enipedia
We use Elasticsearch to run different types of queries across multiple datasets.
- Online version: http://enipedia.tudelft.nl/Elasticsearch.html
- Documentation: Elasticsearch on Enipedia
There are a few interesting applications of this:
- Geographic queries - search across multiple data sets for all facilities within a geographic bounding box
- Fuzzy queries - find matches when alternative spellings may be used
- Automated suggestions of matches between datasets - the queries can be automated to find the best matches for one facility within another data set.
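The first two query types above can be sketched as plain request bodies. This is a minimal Python sketch, not the code used on Enipedia: the field names `location` and `name` are assumptions, and the `bool`/`filter` form follows the query DSL of recent Elasticsearch releases, which may differ from the version running behind the search page.

```python
import json


def bounding_box_query(top, left, bottom, right, field="location"):
    """Build a request body matching documents whose geo_point field
    lies inside the given box. The field name is a hypothetical default."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_bounding_box": {
                        field: {
                            "top_left": {"lat": top, "lon": left},
                            "bottom_right": {"lat": bottom, "lon": right},
                        }
                    }
                }
            }
        }
    }


def fuzzy_name_query(text, field="name"):
    """Build a match query that tolerates small spelling differences.
    The field name is a hypothetical default."""
    return {"query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}}}


# Rough box around the Netherlands, then a deliberately misspelled name.
print(json.dumps(bounding_box_query(53.6, 3.3, 50.7, 7.3), indent=2))
print(json.dumps(fuzzy_name_query("Masvlakte"), indent=2))
```

Either body would be sent to the `_search` endpoint of whichever indices hold the datasets; searching several indices in one request is what lets a single query span multiple datasets.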
 Filtering OpenStreetMap for Industry Data
We maintain an up-to-date copy of the OpenStreetMap database, and create daily extracts that relate to several industrial sectors.
- Daily data extracts:
- Extracting Power Data from OpenStreetMap - includes source code for filtering
- Talk:Extracting Power Data from OpenStreetMap - details for extracts related to industrial sectors, includes source code for filtering
- Working with Electricity Grid Data from OpenStreetMap
- Similar work:
This is an interactive map that allows for exploring Enipedia power plant data via a Google Maps interface. This also includes layers from ITO Map showing electricity grid data from OpenStreetMap, along with the image of the earth at night from NASA. Developed with help from Nono and Martin.
- Online version: http://enipedia.tudelft.nl/maps
- Documentation: Enipedia Maps
- Source code: https://github.com/cbdavis/enipedia-maps
- We also have a sandbox on ScraperWiki, which lets people experiment with the code and try out new ideas without having to worry about breaking the version of the map that we use on the site. New functionality developed in the sandbox usually gets migrated to the GitHub code and the version shown on Enipedia Maps.
 Wikipedia Power Plants
The English-language version of Wikipedia contains at least 4000 articles on power plants around the world, in addition to numerous pages containing lists of power plants in tabular form. We use this code to download the latest data on a daily basis and then load it into the search interface at http://enipedia.tudelft.nl/Elasticsearch.html
- Source code: https://github.com/cbdavis/wikipedia-power-plants
 Visualizing Growth of German PV
- Source code: https://github.com/cbdavis/Visualizing-Growth-of-German-PV
- The code was used to create the video below. It also downloads the source data (multiple Excel spreadsheets) from a German government website and merges them all together into a single table, which can be more convenient to work with than multiple files, each containing sheets for each month.
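The merging step described above can be illustrated with a small sketch. It assumes the monthly sheets have already been read into lists of row dictionaries (the Excel parsing itself is left out), and the month labels, column names, and values are made up for illustration rather than taken from the actual data.

```python
def merge_monthly_sheets(sheets):
    """Combine {month_label: [row_dict, ...]} into one flat list of rows.

    Each output row gets a 'month' column so that its origin is kept
    after the per-month sheets are merged into a single table.
    """
    merged = []
    for month in sorted(sheets):
        for row in sheets[month]:
            combined = {"month": month}
            combined.update(row)
            merged.append(combined)
    return merged


# Hypothetical input: two monthly sheets with invented values.
sheets = {
    "2011-02": [{"postcode": "10115", "capacity_kw": 5.2}],
    "2011-01": [
        {"postcode": "80331", "capacity_kw": 9.8},
        {"postcode": "20095", "capacity_kw": 3.1},
    ],
}

table = merge_monthly_sheets(sheets)
for row in table:
    print(row)
```

Keeping the month as an ordinary column means the whole dataset can be sorted, filtered, or aggregated in one pass instead of sheet by sheet.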
 Visualizing Shipments from Coal Mines to US Power Plants
- KML File: https://github.com/cbdavis/Visualizing-Shipments-from-Coal-Mines-to-US-Power-Plants/raw/master/US-Coal-Mines-and-Powerplants.kmz
- Source code: https://github.com/cbdavis/Visualizing-Shipments-from-Coal-Mines-to-US-Power-Plants
 Wind deployment in Denmark (1978 - 2012)
This is an interactive animated map created by Alfredas that shows the growth of installed wind capacity in Denmark by district.
- Online version: http://enipedia.tudelft.nl/wind/
 Map of CO2 emissions from European Industrial Facilities
- Online version: http://enipedia.tudelft.nl/EPRTR/CO2_source_visualization.html
This is a MediaWiki bot written in R, designed to work with semantic templates. An interesting feature of this bot framework is that it can read in data from a CSV file and then write the data in each row to templates spread across many wiki pages.
- Source code + Documentation: https://github.com/cbdavis/RSemanticMediaWikiBot
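The CSV-to-template idea can be sketched in a few lines. The bot itself is written in R; the Python below is only an illustration of the mapping, not a port of its code. The template name `Powerplant`, the column names, and the sample rows are assumptions, and the step that actually saves text to the wiki is omitted.

```python
import csv
import io


def row_to_template(template_name, row):
    """Render one CSV row as a MediaWiki template call, skipping empty cells."""
    lines = ["{{" + template_name]
    for key, value in row.items():
        if value != "":
            lines.append("|{}={}".format(key, value))
    lines.append("}}")
    return "\n".join(lines)


def csv_to_pages(csv_text, template_name, title_column):
    """Map each row to (page title -> template text), one page per row."""
    pages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = row.pop(title_column)
        pages[title] = row_to_template(template_name, row)
    return pages


# Hypothetical input: the first column names the target wiki page.
sample = (
    "page,country,capacity_mw\n"
    "Example Plant A,Netherlands,1040\n"
    "Example Plant B,Germany,\n"
)

pages = csv_to_pages(sample, "Powerplant", "page")
for title, text in pages.items():
    print(title)
    print(text)
```

Skipping empty cells is a design choice in this sketch: it avoids overwriting a template parameter with a blank value when the CSV has no data for it.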
 Google Refine Reconciliation API
This code matches entries in external datasets to their corresponding entries on Enipedia. It is currently being rewritten because of limitations in the Google Refine Reconciliation process, which only handles a single data field. In practice, we've found that matching over multiple data fields is much more robust than matching on names alone.
- Enipedia Power Plant Dataset Reconciliation API
- Github: https://github.com/cbdavis/cbdavis_code/tree/master/instanceMatching/reconciliationAPI
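To show why matching over several fields helps, here is a minimal sketch that scores candidates on both name similarity and geographic distance. It is not the reconciliation code linked above: the equal weighting, the 10 km cutoff, the record layout, and the sample records are all assumptions chosen for illustration.

```python
import math
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance using the haversine formula."""
    radius = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(h))


def match_score(a, b, max_km=10.0):
    """Average of name similarity and a proximity score that falls
    linearly to zero at max_km (an arbitrary illustrative cutoff)."""
    name = name_similarity(a["name"], b["name"])
    d = distance_km(a["lat"], a["lon"], b["lat"], b["lon"])
    proximity = max(0.0, 1.0 - d / max_km)
    return 0.5 * name + 0.5 * proximity


def best_match(record, candidates):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda c: match_score(record, c))


# Invented records: one candidate shares the exact name but is on another
# continent, the other is abbreviated differently but sits about 1 km away.
record = {"name": "Riverside Power Station", "lat": 52.0, "lon": 4.0}
candidates = [
    {"id": "far", "name": "Riverside Power Station", "lat": 40.0, "lon": -75.0},
    {"id": "near", "name": "Riverside Pwr Stn", "lat": 52.01, "lon": 4.01},
]
print(best_match(record, candidates)["id"])
```

A name-only comparison would pick the distant facility here, since its name is identical; adding the location field is what tips the score toward the nearby one.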
Scripts used to help with cleanup and maintenance of Enipedia data. These are run every day or two under the Chrisbot account.
- Github: https://github.com/cbdavis/EnipediaDataQualityBot
This reads the eGRID dataset from several Excel spreadsheets published by the US EPA and converts it to RDF.
Matlab code to run a query against a SPARQL endpoint, retrieve the data as TSV, and import it into a struct. Developed for the work on Portal:OpenGridData, which performs load flow calculations over power grid data stored as wiki pages.
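The same query-to-struct flow can be sketched in Python. This is not a translation of the Matlab code: the endpoint URL and sample result are made up, the network request itself is left out, and the quote stripping is a simplification that does not handle typed or language-tagged literals.

```python
import csv
import io
from urllib.parse import urlencode


def build_query_url(endpoint, query):
    """Build a GET URL for a SPARQL query. TSV results would be requested
    by sending the header 'Accept: text/tab-separated-values'."""
    return endpoint + "?" + urlencode({"query": query})


def parse_sparql_tsv(text):
    """Parse SPARQL TSV results into a dict with one list per variable,
    similar in shape to a struct with one field per column."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    # The header row lists variables with a leading '?'.
    header = [name.lstrip("?") for name in next(reader)]
    columns = {name: [] for name in header}
    for fields in reader:
        for name, value in zip(header, fields):
            # Simplification: only strips the quotes around plain literals.
            columns[name].append(value.strip('"'))
    return columns


# Hypothetical endpoint and an invented two-row result.
url = build_query_url("http://example.org/sparql", "SELECT * WHERE { ?s ?p ?o }")
sample = (
    "?plant\t?capacity\n"
    "<http://example.org/A>\t1040\n"
    '<http://example.org/B>\t"650"\n'
)

result = parse_sparql_tsv(sample)
print(url)
print(result)
```

Storing the result column by column, rather than row by row, keeps each variable available as one array, which is convenient for numerical work such as load flow calculations.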
 TODO - further code to publish
- Code for working with Elasticsearch - there's some for loading the data, and also for trying to match all entities contained in two different databases.
- IAEA, EU-ETS to RDF
- Extraction of OpenStreetMap data
- Work on filtering power data from a daily updated planet.osm, correlation with Wikipedia data, checks across different language versions of Wikipedia.