Industrial Symbiosis Data Sources
This is a page to document data sources about Industrial Symbiosis.
 Aim and Objectives of the Primary Database Project
 Aim of the Primary Database Project
Collect and structure relevant data sets in order to generate a central repository for open industrial symbiosis data.
 Objectives of the primary database project
- Allow for the addition and integration of external data sets that may use proprietary naming conventions.
- Allow for the easy use and integration of database contents in various applications.
- Create tools that can easily be updated, verified, and utilized.
- Adapt to the needs of the research and industrial communities so as to build a critical mass of contributors and application users.
- Reduce the risk of creating an undesirable connection (e.g. one involving toxic properties) by highlighting which connections should never be made.
- Make more of the field's tacit information explicit.
 Notes on the overall objectives
The objectives of this database are primarily descriptive rather than normative. The idea is that many potential normative applications (open or proprietary) could then be built upon such a primary database. Examples include web-based tools that highlight potential synergies to users, analysis tools for consultants and facilitators, comparative tools for academics, and other tools that learn from and build upon the smart system solutions being implemented around the world today.
To collaborate on the database, a few first initiatives have been suggested: 1) creating a common namespace, 2) incorporating existing case studies, and 3) identifying key attributes (properties/characteristics) of materials and energy and beginning to fill these in for the items populated in the namespace.
- Namespace - Finding common names for the same things, or the relationships between them (e.g. subset of, related to).
- Case studies - incorporate from http://ie.tudelft.nl, NISP file attached below.
- Material properties - chemical level, etc.
 Objectives of the namespace task
- Be able to link existing data sources or use them rewritten in our own format
- Be able to have a working nomenclature for adding existing IS situations
- Allow for (if desired) the comparison or sharing of data between applications and datasets
- Enable the use of this database in parallel with private data sources (i.e. so that it is not necessary to contribute links from private data sources)
- In general, lay the foundation for a multitude of applications that can be built upon it
 State of the task
Initial screen scraping of names from wikis and data sets is scheduled for May/June 2013.
 Tools and Files within the task
- Could use Google Refine to create a lookup table of common words (Chris D. has worked a bit with matching software)
- Simple Knowledge Organization System - This is a vocabulary for creating thesauri. In particular for this task, it is useful as it specifies terms to deal with linkages between concepts - "broader", "narrower", "related". Not everything will be a one-to-one match, and it may not even be clear if two things are an exact match or not. We need to have a way to capture the certainty or the ambiguity of this.
- CPC to ISIC correspondence table
- http://unstats.un.org/unsd/cr/registry/regot.asp?Lg=1 - multiple correspondence tables
- http://wits.worldbank.org/product_concordance.html - multiple correspondence tables
- http://ec.europa.eu/eurostat/ramon/relations/index.cfm?TargetUrl=LST_REL&StrLanguageCode=EN&IntCurrentPage=7 - multiple correspondence tables
- Classification Systems
- LOW - List of Wastes Regulations - https://www.hesa.ac.uk/dox/datacoll/c09042/LOW_Guide_v1_2_%28sustainability%29.pdf
- EWC - European Waste Catalogue
- CPA - Statistical classification of products by activity - http://ec.europa.eu/eurostat/statistics-explained/index.php/Glossary:Statistical_classification_of_products_by_activity_%28CPA%29
- United Nations Standard Products and Services Code
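The SKOS-style relations mentioned in the list above ("broader", "narrower", "related"), together with some measure of certainty, could be recorded roughly as follows. This is a minimal sketch in plain Python rather than an RDF implementation. The `Mapping` class, the `confidence` field, the `candidates` helper, and the example terms and scores are all assumptions made for illustration; only the relation names follow the SKOS mapping properties.

```python
from dataclasses import dataclass

# Relation names borrowed from the SKOS mapping properties.
RELATIONS = {"exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch"}


@dataclass(frozen=True)
class Mapping:
    """One link between a term in a source vocabulary and a term in a target one."""
    source: str
    relation: str
    target: str
    confidence: float  # hypothetical field: 0.0 (a guess) to 1.0 (certain)

    def __post_init__(self):
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


def candidates(mappings, source, min_confidence=0.0):
    """Return the mappings for a source term, most certain first."""
    hits = [m for m in mappings
            if m.source == source and m.confidence >= min_confidence]
    return sorted(hits, key=lambda m: m.confidence, reverse=True)


# Illustrative entries only; the terms and scores are made up.
mappings = [
    Mapping("fly ash", "broadMatch", "ash", 0.9),
    Mapping("fly ash", "relatedMatch", "cement additive", 0.5),
    Mapping("waste heat", "closeMatch", "residual heat", 0.8),
]

for m in candidates(mappings, "fly ash"):
    print(m.relation, m.target, m.confidence)
```

Keeping the certainty next to each link means an application can decide for itself how strict to be, for example by accepting only exact or close matches above a chosen threshold.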
 CASE STUDIES
 Objectives of the case studies task
- Input data from historic symbiosis cases (including material information) into the database
- Incorporate from http://ie.tudelft.nl
- Look into the Global Synergy Database and whether reuse is possible.
- Hebei By-Product Synergy Project
- By-Product Synergy Hub - Case Studies
- Resource Optimization Initiative - Case Studies
- File:NISP Case Studies.rar
- ProSUM is setting up an information platform on "prospecting critical raw materials from e-waste". There might be overlap with some of the other work here.
- "ProSUM is a EU co-founded project funded which will deliver the First Urban Mine Knowledge Data Platform (EU-UMKDP), a centralised database of all available data and information on arisings, stocks, flows and treatment of waste electrical and electronic equipment (WEEE or e-waste), end-of-life vehicles (ELVs), batteries and mining wastes from extraction to end of life products with the ability to reference all spatial and non-spatial data."
- Best Available Technologies
 MATERIAL PROPERTIES
 Objectives of the material properties task
- Be able to apply technologies that use one material to other materials with similar properties
- Use new (or additional) materials in a current technology when they share the key attributes of the materials currently used in that process
- Be able to substitute feed-in materials with materials of similar key properties (or a combination thereof)
- Better enable the combining of feedstocks for a solution within a technology
 State of the task
Suggestions from George L. on structure (Can upload file here if he approves)
Suggestion from Chris D. on Chembox
 Tools and Files within the task
- Examples of screen scraping Wikipedia (and DBpedia) data: https://scraperwiki.com/profiles/cbdavis/
 Data Cleanup/Linking
- http://enipedia.tudelft.nl/ISDATAsearch.html - search interface for finding NACE and EWC codes
- Currently works well for small pieces of text; suggestions tend to get worse for full paragraphs. At the moment it doesn't consider synonyms or related terms.
- OpenRefine - Great tool for cleaning up messy data. Its "cluster and edit" feature can also be used to find synonyms, which helps in creating lookup tables.
- See OpenRefine Tutorial for a demo of the different types of functionality.
- OpenRefine is also quite useful for retrieving coordinates for place names. With this you can very quickly display objects on a map.
- Generic code for linking data sets is being developed here. Given two different data sets, it tries to match the entities based on the number of words they have in common, along with a weighting factor based on how often those words occur in the entire data set. This reduces the importance of common words while highlighting matches on words that occur infrequently (by calculating the self information of a partitioning). The code is currently a bit rough, but a working demo can be set up without much effort.
- The Next Big Thing You Missed: Software That Helps Businesses Rid Their Supply Chains of Slave Labor - mentions interesting work on "an algorithm that could predict how likely it was that a given product, sourced from a given place, was made using slave labor." This to some extent addresses the same type of problem that we have with the integration of data sets. These data sets describe multiple levels of a system, and it's possible to use various clues to narrow down the probability of what's actually happening.
- Visual explorer for NAICS codes (requires a browser supporting HTML5) - this allows users to search and find codes. The source code is on GitHub here. This can easily be modified for the other classification codes as well.
- Visual explorer for European Waste Classification (EWC) codes.
- Sankey Diagram in D3 with modifications allowing for loops in flows
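The weighting idea behind the generic linking code described in the list above can be sketched as follows. This is not the project's actual code, which is not reproduced on this page. It is a minimal illustration that assumes simple whitespace tokenisation and assumes each shared word contributes its self-information, -log2(p), where p is the share of all names (across both data sets) that contain the word. The function names and sample names are hypothetical.

```python
import math
from collections import Counter


def tokens(name):
    """Lower-case a name and split it into a set of words."""
    return set(name.lower().split())


def word_weights(names):
    """Weight each word by its self-information: -log2(share of names containing it)."""
    doc_freq = Counter(word for name in names for word in tokens(name))
    total = len(names)
    return {word: -math.log2(count / total) for word, count in doc_freq.items()}


def best_matches(left, right):
    """For each name in `left`, find the name in `right` with the highest shared-word score."""
    weights = word_weights(list(left) + list(right))
    result = {}
    for a in left:
        scored = [(sum(weights[w] for w in tokens(a) & tokens(b)), b) for b in right]
        score, match = max(scored, default=(0.0, None))
        # A score of zero means no informative word is shared: report no match.
        result[a] = (match, score) if score > 0 else (None, 0.0)
    return result


# Made-up sample names, for illustration only.
left = ["Acme Steel Works", "Northern Paper Mill"]
right = ["Steel Works of Acme Ltd", "Paper Mill North Ltd", "Acme Paper Ltd"]

for name, (match, score) in best_matches(left, right).items():
    print(f"{name!r} -> {match!r} (score {score:.2f})")
```

Words that occur in many names, such as "Ltd" in this sample, receive a low weight, so a match on a rarer word counts for more. A real implementation would also need to handle punctuation, spelling variants, and a minimum score below which a match is rejected.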
 Linking Terms, Finding Synonyms
- LoopLocal has done a bit of work on manual tagging of existing databases
- GEMET - GEneral Multilingual Environmental Thesaurus. (Example entry for Benzene).
- DBpedia Spotlight has an API that allows you to send text and get a list of Wikipedia articles that are mentioned in the text. This can also be extended to work with other data sources besides Wikipedia. The power of this is that it can be used to expand the amount of text available for machine learning algorithms. Several of the classification codes have only short text descriptions. Running these descriptions through DBpedia Spotlight could help us to get additional data (via the RDF & plain text representations) that could be used to describe the underlying concepts.
- Chembox infobox on Wikipedia: see http://dbpedia.org/page/Benzene for some of the properties that can be extracted via DBpedia. More than 5,000 pages contain this infobox.
- Wikipedia data is interesting as it (often) allows us to get the names of chemicals in different languages. Furthermore, the redirects for each article show alternative names for the same substance. The Wikipedia API can also be used to retrieve these (Benzene redirects).
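Retrieving redirects through the Wikipedia (MediaWiki) API, as mentioned in the last item above, could look roughly like the sketch below. It uses the `action=query` module with `prop=redirects`. The helper names and the `User-Agent` string are hypothetical, and the sample response is hand-written to mirror the JSON layout rather than captured from a live call, so the titles in it are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://en.wikipedia.org/w/api.php"


def redirects_url(title, limit="max"):
    """Build a MediaWiki API query URL listing the redirects that point to `title`."""
    params = {
        "action": "query",
        "prop": "redirects",
        "titles": title,
        "rdlimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def redirect_titles(response):
    """Extract redirect titles from a decoded API response (a dict)."""
    pages = response.get("query", {}).get("pages", {})
    return sorted(
        r["title"] for page in pages.values() for r in page.get("redirects", [])
    )


def fetch_redirects(title):
    """Call the API over the network and return redirect titles (not run here)."""
    request = Request(redirects_url(title),
                      headers={"User-Agent": "isdata-example/0.1"})  # placeholder value
    with urlopen(request) as reply:
        return redirect_titles(json.load(reply))


# Hand-written sample in the shape of an API response; titles are illustrative.
sample = {"query": {"pages": {"1": {
    "title": "Benzene",
    "redirects": [{"ns": 0, "title": "Benzol"}, {"ns": 0, "title": "Cyclohexatriene"}],
}}}}
print(redirect_titles(sample))
```

For articles with many redirects the API returns results in batches, so a complete version would also need to follow the continuation values in the response. Switching the host in `API` to another language edition would query that edition instead.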
 Other files/websites of interest
- Industrial Symbiosis
- LCA Digital Commons - Documentation on more than 13,000 processes
- http://eplca.jrc.ec.europa.eu/ResourceDirectory/databaseList.vm - list of LCA databases
- https://bonsai.uno/ - Big Open Network for Sustainability Assessment Information
- https://nexus.openlca.org - Searchable repository of LCA data sets
- European Commission - Environmental Data Centre on Waste
- http://iere.org/research/lci-repository/ - IERE LCI Repository
- http://vocamp.org/wiki/GeoVoCampSB2015 - Work on creating an ontology for LCA.
- http://publictest.calrecycle.ca.gov/lcatoolfrontend - interactive website for used oil LCAs.
- An example of how NACE/SNI Industry codes can be heuristically linked to LCI processes: File:SNI to ELCD and CPD-SPINE processes.xlsx
- File:EWC in CSV.txt
- 10th Annual Industrial Symbiosis Research Symposium (ISRS) files:
- http://biobasedeconomy.nl - allows for exploring different pathways for processing biomass
- CarbonDB Open data life cycle assessment for energy and carbon
 OpenSYNERGY M.Sc. project
From September 2013 until February 2014, a team of five Industrial Ecology students joined forces in an Interdisciplinary Project Group. More information on the Industrial Ecology programme can be found here: http://www.industrialecology.nl. The project aimed to develop a dynamic and applicable web platform for facilitating Industrial Symbiosis which, with the help of Industrial Symbiosis indicators and ontological concepts, identifies, extracts and combines relevant information from the major European databases so that it can be used to form Industrial Symbiosis synergies. The resulting tool, an algorithm for a web platform, drew on data about substance flows, energy supplies, added value (from cost analysis), environmental impact (assessed by calculating CO2 emissions), and problems related to logistics and storage.
In this video a proposed user interface, called "openSynergy", is presented. The video shows a sketch of a dynamic web platform that would extract and combine information from open source databases (related to industrial processes) with user-input information, with the aim of facilitating Industrial Symbiosis. The platform would explore the possibilities and provide recommendations for potential synergies of by-products, waste and/or utilities. The purpose of the ISData project mentioned on this page is to figure out how to create the technical infrastructure (in terms of data and software) that would allow for the realization of a website such as the one proposed by this M.Sc. project.
 Advanced Ideas
This section concerns ideas that would require a bit of work to implement, but could build upon the initiatives mentioned above.
In general, developing these would involve some experience with Artificial Intelligence, Expert Systems, etc.
- Automatic matching of potential flows is possible, but would need a concerted effort to deal with false positives.
- http://www.reegle.info has a Tagging API that allows you to upload documents and retrieve suggested keywords for each document. A similar setup could be created to suggest industrial classification codes, chemical identifiers, etc.
 eSymbiosis conference
"Information synergy of industrial symbiosis", presentation of ISdata can be found at File:Presentation Ben for eSymbiosis.pdf