A few of us from Logilab attended the 2nd International Workshop on Open Data (WOD) on the 3rd of June.
Although the main focus was an academic take on Open Data, many talks dealt with Semantic Web technologies, and Linked Data in particular.
The full program (and papers) is on the following website. Here is a quick review of the things we thought worth sharing.
- privacy-oriented ontologies: http://l2tap.org/
- interesting work on automatically suggesting alignments when data is first uploaded to an open data website
- some open data platforms have built-in APIs to get files; one example is Socrata: http://dev.socrata.com/
- some work is being done to scale the processing of linked data in the cloud (did you know you could access readily available datasets in the Amazon cloud? DBpedia, for example)
- the data stored in Wikipedia can be a good source of vocabulary for certain machine learning tasks (and, in the future, so can the Wikidata project)
- there is an RDF extension to Google Refine (or OpenRefine), but we haven't managed to get it working out of the box
- WebSmatch uses morphological operators (erosion / dilation) to identify grids and zones in Excel spreadsheets, and then aligns column data on known reference values (e.g. country lists).
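
To give an idea of how erosion and dilation can isolate tables in a spreadsheet, here is a minimal sketch in plain Python. It is not WebSmatch's actual code: the function names (`erode`, `dilate`, `find_zones`), the 2 x 2 structuring element and the sample sheet are all assumptions made for illustration. The sheet is modelled as a grid where 1 means "cell has a value". Eroding and then dilating (a morphological opening) removes isolated cells such as titles or footnotes, and the remaining connected blocks are reported as zones.

```python
def erode(grid, k=2):
    """Mark a cell only if the k x k block starting at it is fully filled."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            if all(grid[r + i][c + j] for i in range(k) for j in range(k)):
                out[r][c] = 1
    return out


def dilate(grid, k=2):
    """Grow every marked cell back into the k x k block starting at it."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c]:
                for i in range(k):
                    for j in range(k):
                        if r + i < rows and c + j < cols:
                            out[r + i][c + j] = 1
    return out


def find_zones(grid, k=2):
    """Return bounding boxes (top, left, bottom, right) of the dense zones."""
    opened = dilate(erode(grid, k), k)  # opening: drops isolated cells
    rows, cols = len(opened), len(opened[0])
    seen, zones = set(), []
    for r in range(rows):
        for c in range(cols):
            if not opened[r][c] or (r, c) in seen:
                continue
            # Flood fill (4-connectivity) to collect one connected zone.
            stack = [(r, c)]
            seen.add((r, c))
            top, left, bottom, right = r, c, r, c
            while stack:
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and opened[ny][nx] and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            zones.append((top, left, bottom, right))
    return zones


# Hypothetical sheet: a title cell, two tables and a stray footnote cell.
SHEET = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]

print(find_zones(SHEET))  # [(2, 0, 4, 2), (3, 4, 4, 5)]
```

The title and the footnote disappear because neither belongs to a fully filled 2 x 2 block, while both tables survive the opening intact. A real tool would presumably work on richer cell features (types, formatting) before aligning the detected columns on reference values.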
We naturally enjoyed the presentation made by Romain Wenz about http://data.bnf.fr with the unavoidable mention of Victor Hugo (and CubicWeb).
Thanks to the organizers of the conference and to the French National Library for hosting the event.