Checking for duplicates in the generic process network
 Possible Matches
 Duplicates per Wikipedia Page
Goods/Commodities have Wikipedia pages specified for them, which act as a type of unique identifier. Using this, we can check for duplicate pages that point to the same Wikipedia page. We also use the "other names" properties specified on Wikipedia/DBpedia to check for duplicates.
- Do the pages point to the same Wikipedia page?
- Is one page not a redirect of the other page?
- Use a query to check that ?good1 owl:sameAs ?good2 is not set
 Check for pages that are not owl:sameAs of each other but point to the same Wikipedia page
This is likely the most reliable indication of whether duplicate pages exist. The queries further down this page give more information about what is going on behind the scenes with regard to page redirects, etc.
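The logic of this check can be sketched in plain Python, independent of the SPARQL query itself. This is only an illustration: the sample goods, the wikipedia_page mapping and the same_as set are hypothetical stand-ins for the data held on the wiki pages.

```python
from itertools import combinations

# Hypothetical sample data: good page -> Wikipedia page it points to.
wikipedia_page = {
    "Petrol": "Gasoline",
    "Gasoline": "Gasoline",
    "Motor spirit": "Gasoline",
    "Steel": "Steel",
}

# Hypothetical owl:sameAs links (redirects), stored as unordered pairs.
same_as = {frozenset({"Petrol", "Gasoline"})}


def likely_duplicates(wikipedia_page, same_as):
    """Return pairs of goods that share a Wikipedia page but are not owl:sameAs."""
    pairs = []
    # combinations() yields each unordered pair once, so a=b is not repeated as b=a.
    for a, b in combinations(sorted(wikipedia_page), 2):
        same_page = wikipedia_page[a] == wikipedia_page[b]
        if same_page and frozenset({a, b}) not in same_as:
            pairs.append((a, b))
    return pairs


duplicates = likely_duplicates(wikipedia_page, same_as)
for a, b in duplicates:
    print(f"{a} and {b} both point to {wikipedia_page[a]}")
```

In this sample, "Petrol" and "Gasoline" are skipped because one is already marked as a redirect of the other, while "Motor spirit" is reported against both.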
 Check for pages that point to the same Wikipedia page
These are distinct GoodNames that reference the same Wikipedia page, and may have to be merged. This does not check for redirects.
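As a rough sketch of the same idea outside SPARQL, the GoodNames can be grouped by the Wikipedia page they reference, keeping only groups with more than one member. The sample data and the function name are hypothetical, and redirects are deliberately ignored here, as in the check described above.

```python
from collections import defaultdict

# Hypothetical sample data: GoodName -> Wikipedia page it references.
good_to_wikipedia = {
    "Petrol": "Gasoline",
    "Gasoline": "Gasoline",
    "Steel": "Steel",
    "Hard coal": "Coal",
    "Coal": "Coal",
}


def group_by_wikipedia_page(good_to_wikipedia):
    """Map each Wikipedia page to the GoodNames that reference it (groups of 2+ only)."""
    groups = defaultdict(list)
    for good, page in good_to_wikipedia.items():
        groups[page].append(good)
    return {page: sorted(goods) for page, goods in groups.items() if len(goods) > 1}


merge_candidates = group_by_wikipedia_page(good_to_wikipedia)
for page, goods in sorted(merge_candidates.items()):
    print(page, "<-", ", ".join(goods))
```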
 Count the number of pages with the same Wikipedia Page
The query below gives an indication of the number of duplicates that exist. These are GoodNames that need to be merged. The number wpPageCount should be divided by two, since the SPARQL query looks at all ordered combinations of duplicates (i.e. a=b is counted along with b=a).
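The reason for halving the count can be illustrated with a small Python sketch. It assumes that wpPageCount counts ordered pairs of distinct GoodNames per Wikipedia page, which is an interpretation of the description above rather than something confirmed by the query; the sample data is hypothetical.

```python
from itertools import permutations

# Hypothetical sample data: Wikipedia page -> GoodNames that reference it.
goods_by_page = {
    "Gasoline": ["Gasoline", "Motor spirit", "Petrol"],
    "Coal": ["Coal", "Hard coal"],
    "Steel": ["Steel"],
}


def ordered_pair_count(goods):
    """Count ordered pairs (a, b) with a != b, i.e. both a=b and b=a."""
    return sum(1 for _ in permutations(goods, 2))


# Assumed meaning of wpPageCount: ordered pairs per Wikipedia page.
wp_page_count = {page: ordered_pair_count(goods) for page, goods in goods_by_page.items()}

# Halving removes the double counting and leaves the number of distinct pairs.
distinct_pairs = {page: count // 2 for page, count in wp_page_count.items() if count > 0}

print(wp_page_count)
print(distinct_pairs)
```

Note that under this assumption the halved number is a count of duplicate pairs, not of pages: three GoodNames on one Wikipedia page give six ordered pairs and three distinct pairs.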
 Find two GoodNames which may be duplicates based on "other names" defined on Wikipedia
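One way such a comparison could work is sketched below in Python: two GoodNames are flagged when their own names and "other names" overlap after simple normalisation. The sample names, the normalisation rule and the function names are all assumptions made for illustration, not the method the actual query uses.

```python
from itertools import combinations

# Hypothetical sample data: GoodName -> "other names" taken from Wikipedia/DBpedia.
other_names = {
    "Sodium chloride": {"Salt", "Table salt", "Common salt"},
    "Salt": {"Table salt"},
    "Steel": set(),
}


def normalise(name):
    """Assumed normalisation: trim whitespace and ignore case."""
    return name.strip().lower()


def name_set(good, other_names):
    """All names a good is known by: its own name plus its other names."""
    return {normalise(good)} | {normalise(n) for n in other_names[good]}


def possible_duplicates(other_names):
    """Return (good1, good2, shared names) for every pair with overlapping names."""
    result = []
    for a, b in combinations(sorted(other_names), 2):
        shared = name_set(a, other_names) & name_set(b, other_names)
        if shared:
            result.append((a, b, sorted(shared)))
    return result


matches = possible_duplicates(other_names)
for a, b, shared in matches:
    print(f"{a} / {b}: shared names {shared}")
```

Matches found this way are only candidates and still need to be checked by hand, since unrelated goods can share an alternative name.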
 List all products used, and the Wikipedia page they correspond to
This is a sanity check showing which products have Wikipedia pages specified for them. Ideally there should be a Wikipedia page for every GoodName, in order to help with duplicate detection.
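A minimal Python sketch of this sanity check, using hypothetical sample data in which None marks a product without a Wikipedia page, could look like this:

```python
# Hypothetical sample data: product -> Wikipedia page, or None if none is specified.
product_pages = {
    "Gasoline": "Gasoline",
    "Steel": "Steel",
    "Widget": None,
    "Hard coal": "Coal",
}


def wikipedia_coverage(product_pages):
    """Return the products lacking a Wikipedia page and the fraction that have one."""
    missing = sorted(product for product, page in product_pages.items() if not page)
    covered = len(product_pages) - len(missing)
    fraction = covered / len(product_pages) if product_pages else 0.0
    return missing, fraction


missing, fraction = wikipedia_coverage(product_pages)
print("Products without a Wikipedia page:", missing)
print(f"Coverage: {fraction:.0%}")
```

Products listed as missing cannot be caught by the Wikipedia-based duplicate checks, so they are the ones worth filling in first.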
 Dealing with redirects (owl:sameAs)
There are some issues with redirects, especially if other pages link to those redirect pages. The list below shows pages with links that should be readjusted. The underlying problem is that the SPARQL queries do not follow owl:sameAs links automatically. Hopefully we can find better ways around this than having to refactor page properties.
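The readjustment described above can be sketched in Python as follows. The redirects mapping stands in for the owl:sameAs links and the links list for page properties that reference other pages; both, along with the function names, are hypothetical.

```python
# Hypothetical sample data: redirect page -> page it redirects to (owl:sameAs).
redirects = {
    "Petrol": "Gasoline",
    "Motor spirit": "Petrol",
}

# Hypothetical links: (source page, page it links to).
links = [
    ("Refinery", "Petrol"),
    ("Car", "Gasoline"),
    ("Depot", "Motor spirit"),
]


def resolve(page, redirects):
    """Follow redirects until a non-redirect page is reached (guards against cycles)."""
    seen = set()
    while page in redirects and page not in seen:
        seen.add(page)
        page = redirects[page]
    return page


def links_to_readjust(links, redirects):
    """Return (source, current target, suggested target) for links that hit a redirect."""
    return [
        (source, target, resolve(target, redirects))
        for source, target in links
        if target in redirects
    ]


to_fix = links_to_readjust(links, redirects)
for source, old, new in to_fix:
    print(f"{source}: link to {old} should point to {new}")
```

Resolving the redirect in the client like this is one possible workaround; it avoids refactoring page properties but has to be repeated wherever the links are queried.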