Industrial Symbiosis Data Sources

This page documents data sources about Industrial Symbiosis.


Aim and Objectives of the Primary Database Project

Aim of the Primary Database Project

Collect and structure relevant data sets in order to generate a central repository for open industrial symbiosis data.

Objectives of the primary database project

  • Allow for the addition and integration of external data sets that may use proprietary naming conventions.
  • Allow for the easy use and integration of database contents in various applications.
  • Create tools that can easily be updated, verified, and utilized.
  • Adapt to the needs of the research and industrial communities so as to build a critical mass of contributors and application users.
  • Reduce the risk of creating a bad (undesired) connection (e.g. because of toxic properties) by highlighting which connections should never be made.
  • Make more of the tacit information in the field explicit.

Notes on the overall objectives

The objectives of this database are primarily descriptive as opposed to normative. The idea is that many potential normative applications (open or proprietary) could then be built upon such a primary database. Examples include web-based tools that highlight potential connections for direct users, analysis tools for consultants and facilitators, comparative tools for academics, and other tools that learn from and build upon the smart system solutions being implemented around the world today.

Initiatives

In order to collaborate on the database, a few first initiatives have been suggested: 1) creating a common namespace, 2) incorporating existing case studies, and 3) identifying key attributes (properties/characteristics) of materials and energy, and beginning to fill these in for the items populated in the namespace.

  • Namespace - Finding names for the same things, or relationships between them (e.g. subset of, related to).
  • Case studies - Incorporate existing case studies, e.g. from the NISP file attached below.
  • Material properties - chemical level, etc.


The namespace initiative looks to create a namespace for the symbiosis database based on sources such as E-PRTR, EWC, LCI, and other data sets.

Objectives of the namespace task

  • Be able to link existing data sources, or use them rewritten in our own format.
  • Have a working nomenclature for adding existing IS situations.
  • Allow for (if desired) the comparison or sharing of data between applications and datasets.
  • Enable the use of this database in parallel with private data sources (i.e. so that it is not necessary to contribute links from private data sources).
  • In general, lay the foundation for a multitude of applications that can be built upon it.
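
As a rough illustration of what such a namespace could hold, the sketch below maps source-specific labels to one canonical term and records "subset of" links between terms. The class names (`Term`, `Namespace`), the method names, and the example terms are assumptions made for this illustration; they do not describe any existing ISdata tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One canonical entry in the namespace (hypothetical structure)."""
    name: str
    synonyms: set = field(default_factory=set)
    broader: set = field(default_factory=set)  # names of terms this one is a "subset of"
    related: set = field(default_factory=set)  # names of loosely related terms


class Namespace:
    def __init__(self):
        self.terms = {}   # canonical name -> Term
        self._index = {}  # lowercased label -> canonical name

    def add(self, name, synonyms=(), broader=(), related=()):
        """Register a canonical term and index its name and synonyms."""
        term = Term(name, set(synonyms), set(broader), set(related))
        self.terms[name] = term
        for label in {name, *term.synonyms}:
            self._index[label.lower()] = name
        return term

    def resolve(self, label):
        """Return the canonical name for a label, or None if it is unknown."""
        return self._index.get(label.strip().lower())

    def is_subset_of(self, name, ancestor):
        """True if `ancestor` can be reached from `name` via "subset of" links."""
        seen, stack = set(), [name]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            term = self.terms.get(current)
            for parent in (term.broader if term else ()):
                if parent == ancestor:
                    return True
                stack.append(parent)
        return False


# Illustrative entries only; a real namespace would be filled from the data sets.
ns = Namespace()
ns.add("ash")
ns.add("fly ash", synonyms=["pulverised fuel ash", "PFA"], broader=["ash"])
print(ns.resolve("PFA"))                  # fly ash
print(ns.is_subset_of("fly ash", "ash"))  # True
```

Keeping private labels in a separate `Namespace` instance that resolves to the same canonical names would be one way to use the database alongside private data sources without contributing them.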

State of the task

Initial screen scraping of names from wikis and data sets is scheduled for May/June 2013.

Tools and Files within the task


Objectives of the case studies task

  • Input data from historic symbiosis cases (including material information) into the database.

Sources


Objectives of the material properties task

  • Be able to apply technologies that use one material to other materials with similar properties.
  • Use new (or additional) materials in a current technology when they share key attributes with the materials currently used in that process.
  • Be able to substitute feed-in materials with materials of similar key properties (or a combination thereof).
  • Better enable the combining of feedstocks for a solution within a technology.
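
To make the substitution idea above concrete, here is a minimal sketch that lists candidate materials whose key properties all lie within a relative tolerance of a target material. The function name, the property keys, the tolerance, and all numbers are made up for illustration; deciding which attributes are "key" for a given technology is exactly what this task still has to work out.

```python
def similar_materials(target, candidates, keys, tolerance=0.1):
    """Return names of candidates whose value for every key is within
    `tolerance` (relative) of the target's value for that key."""
    matches = []
    for name, props in candidates.items():
        if all(
            key in props
            and key in target
            and abs(props[key] - target[key]) <= tolerance * abs(target[key])
            for key in keys
        ):
            matches.append(name)
    return sorted(matches)


# Placeholder materials with made-up numbers, for illustration only.
current_feedstock = {"moisture_pct": 10.0, "heating_value_mj_kg": 18.0}
candidates = {
    "material A": {"moisture_pct": 10.5, "heating_value_mj_kg": 17.5},
    "material B": {"moisture_pct": 40.0, "heating_value_mj_kg": 18.0},
}
print(similar_materials(current_feedstock, candidates,
                        ["moisture_pct", "heating_value_mj_kg"]))
# ['material A']
```

A candidate that lacks a value for one of the key properties is treated as a non-match rather than guessed at.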

State of the task

Suggestions from George L. on structure (Can upload file here if he approves)

Suggestion from Chris D. on Chembox

Tools and Files within the task

Shared tools and files for the initiatives

ScreenScraping

Data Cleanup/Linking

  • Search interface for finding NACE and EWC codes
    • Currently works well for small pieces of text; suggestions tend to get worse for full paragraphs. At the moment it doesn't consider synonyms or related terms.
  • OpenRefine - Great tool for cleaning up messy data. Its cluster and edit feature can also be used to find synonyms, which can then be used to create lookup tables.
    • See OpenRefine Tutorial for a demo of the different types of functionality.
    • OpenRefine is also quite useful for retrieving coordinates for place names. With this you can very quickly display objects on a map.
  • Generic code for linking data sets is being developed here. Given two different data sets, it tries to match the entities based on the number of words they have in common, along with a weighting factor based on how often those words occur in the entire data set. This reduces the importance of common words while highlighting matches on words that occur infrequently (by calculating the self information of a partitioning). The code is currently a bit rough, but a working demo can be set up without much effort.
  • The Next Big Thing You Missed: Software That Helps Businesses Rid Their Supply Chains of Slave Labor - mentions interesting work on "an algorithm that could predict how likely it was that a given product, sourced from a given place, was made using slave labor." To some extent this addresses the same type of problem that we have with the integration of data sets. These datasets describe multiple levels of a system, and it's possible to use various clues to narrow down the probability of what's actually happening.
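
The word-overlap matching described in the list above can be sketched roughly as follows. This is not the code being developed for the project: the function names and the example names are assumptions, and the weighting used here (the self-information -log2(p) of a word, with p the fraction of names containing it) is just one simple way to down-weight common words.

```python
import math
import re
from collections import Counter


def tokenize(name):
    """Lowercase a name and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def word_weights(names):
    """Weight each word by -log2(p), where p is the share of names containing it.
    Words that appear in every name get weight 0; rare words get high weights."""
    counts = Counter(word for name in names for word in tokenize(name))
    total = len(names)
    return {word: -math.log2(count / total) for word, count in counts.items()}


def best_matches(left, right):
    """For each name in `left`, find the name in `right` with the highest
    summed weight of shared words. Returns {left_name: (right_name, score)};
    right_name is None when no words with positive weight are shared."""
    weights = word_weights(list(left) + list(right))
    right_tokens = [(name, tokenize(name)) for name in right]
    results = {}
    for name in left:
        tokens = tokenize(name)
        best_name, best_score = None, 0.0
        for candidate, candidate_tokens in right_tokens:
            score = sum(weights[w] for w in tokens & candidate_tokens)
            if score > best_score:
                best_name, best_score = candidate, score
        results[name] = (best_name, best_score)
    return results


# Hypothetical entity names from two data sets.
set_a = ["Rotterdam Power Plant", "Delft Cement Plant"]
set_b = ["Power Station Rotterdam", "Cement Works Delft", "Amsterdam Plant"]
for name, (match, score) in best_matches(set_a, set_b).items():
    print(f"{name} -> {match} ({score:.2f})")
```

In this example the shared word "plant" contributes little because it occurs in most names, so "Amsterdam Plant" loses out to the candidates that share rarer words. A score threshold would still be needed in practice to reject weak matches.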

Visualization

Visualization & searchable interface for European Waste Classification codes

Linking Terms, Finding Synonyms

  • LoopLocal has done a bit of work on manual tagging of existing databases.
  • GEMET - GEneral Multilingual Environmental Thesaurus. (Example entry for Benzene).
  • DBpedia Spotlight has an API that allows you to send text and get a list of Wikipedia articles that are mentioned in the text. This can also be extended to work with other data sources besides Wikipedia. The power of this is that it can be used to expand the amount of text available for machine learning algorithms. Several of the classification codes have only short text descriptions. Running these descriptions through DBpedia Spotlight could help us to get additional data (via the RDF & plain text representations) that could be used to describe the underlying concepts.
  • Chembox infobox on Wikipedia; some of its properties can be extracted via DBpedia. There are 5000+ pages that contain this infobox.
    • Wikipedia data is interesting as it (often) allows us to get the names of chemicals in different languages. Furthermore, the redirects for each article let you see alternative names for the same substance. The Wikipedia API can also be used to retrieve these (e.g. the Benzene redirects).
Wikipedia Infobox for Benzene
Redirects for Benzene
Categories for Benzene
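
The redirect lookup mentioned above can be sketched with the MediaWiki Action API (`action=query` with `prop=redirects`). The function names and the User-Agent string below are assumptions made for this example, and the sketch does not follow the API's `continue` parameter, so it only returns the first batch of redirects.

```python
import json
import urllib.parse
import urllib.request


def redirects_url(title, lang="en"):
    """Build the Action API URL that lists redirects pointing to `title`."""
    params = {
        "action": "query",
        "prop": "redirects",
        "titles": title,
        "rdlimit": "max",
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urllib.parse.urlencode(params)


def parse_redirects(response):
    """Extract redirect titles from a decoded JSON response.
    In the default JSON format, `query.pages` is a dict keyed by page id."""
    names = []
    for page in response.get("query", {}).get("pages", {}).values():
        names.extend(item["title"] for item in page.get("redirects", []))
    return sorted(names)


def fetch_redirects(title, lang="en"):
    """Query Wikipedia (requires network access) and return alternative names."""
    request = urllib.request.Request(
        redirects_url(title, lang),
        headers={"User-Agent": "isdata-example/0.1"},  # hypothetical identifier
    )
    with urllib.request.urlopen(request) as reply:
        return parse_redirects(json.load(reply))
```

Calling `fetch_redirects("Benzene")` would return the titles that redirect to the Benzene article; passing a different `lang` queries another language edition, where the article title itself may differ.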

Other files/websites of interest

OpenSYNERGY M.Sc. project

From September 2013 until February 2014, a team of five Industrial Ecology students joined forces in an Interdisciplinary Project Group. More information on the Industrial Ecology programme can be found here: The project was intended to develop a dynamic and applicable web platform for facilitating Industrial Symbiosis. With the help of Industrial Symbiosis indicators and ontological concepts, the platform identifies, extracts and combines relevant information from the major European databases in such a way that it can be used to form Industrial Symbiosis synergies. Data on substance flows, energy supplies, added value (via cost analysis), environmental impact (assessed by calculating CO2 emissions), and problems related to logistics and storage were used in developing this tool: an algorithm for a web platform.

In this video a proposed user interface, called ‘openSynergy’, is presented. The video shows a sketch of a dynamic web platform that would extract and combine information from open source databases (related to industrial processes) along with user-input information, with the aim of facilitating Industrial Symbiosis. The platform would explore the possibilities and provide recommendations for potential synergies involving by-products, waste and/or utilities. The purpose of the ISData project mentioned on this page is to figure out how to create the technical infrastructure (in terms of data and software) that would allow for the realization of a website such as that proposed by this M.Sc. project.

Advanced Ideas

This section concerns ideas that would require a bit of work to implement, but could build upon the initiatives mentioned above.

In general, developing these would involve some experience with Artificial Intelligence, Expert Systems, etc.

  • Automatic matching of potential flows is feasible, but would need a concerted effort to deal with false positives.
  • has a Tagging API that allows you to upload documents and retrieve suggested keywords for the document. A similar set up could be created to suggest industrial classification codes, chemical identifiers, etc.

eSymbiosis conference

"Information synergy of industrial symbiosis", presentation of ISdata can be found at File:Presentation Ben for eSymbiosis.pdf
