iPaw Conference in Chicago
by Jen Golbeck
This week I was lucky enough to attend the International Provenance and Annotation Workshop in Chicago. There was only a very small contingent of Semantic Web researchers here. While some people knew of RDF, the focus was quite different from what I was used to or expected.
While our community (or at least, me as a representative of it) has considered “provenance” of our metadata to be the person who created it (and perhaps the date), provenance in this context was much broader. The talks at iPaw treated provenance as all of the history and information behind scientific data. Ideally, provenance would contain everything necessary to reproduce a datum. As such, many talks focused on workflows and connecting workflows to data.
I presented two papers. The first was Chris Halaschek’s work on PhotoStuff. This was primarily an annotation paper, with a little support for provenance (the who-said-it kind) on the website end. There were some questions about why the whole workflow (i.e. process of annotating a picture) wasn’t saved. In reality, though, an RDF file contains all of that information. It lets the reader know which ontologies were used, what terms were chosen, which regions of the picture were annotated with which terms, if instances previously defined were used, etc. While we present it as just a metadata file describing a photo, in reality it contains all of the provenance information. Anyone reading an RDF file would know exactly how to reproduce the annotations of that picture.
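To make that claim concrete, here is a minimal sketch in plain Python of how an annotation's history can be read back out of its triples. It is not PhotoStuff's actual output: every URI, property name (hasRegion, depicts, annotatedBy) and helper function below is an assumption invented for illustration, and only rdf:type is a real RDF term.

```python
# Hypothetical triples of the kind a photo annotation file might contain.
# All example.org URIs and property names are invented for illustration.
EX = "http://example.org/photo#"
ONT = "http://example.org/ontology/people#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

triples = [
    (EX + "photo1", EX + "hasRegion", EX + "region1"),
    (EX + "region1", EX + "depicts", EX + "alice"),
    (EX + "alice", RDF_TYPE, ONT + "Person"),
    (EX + "photo1", EX + "annotatedBy", EX + "annotator1"),
]


def namespace(uri):
    """Return the part of a URI up to and including the '#'."""
    return uri.rsplit("#", 1)[0] + "#"


def ontologies_used(triples):
    """Namespaces of the classes that appear as rdf:type objects."""
    return sorted({namespace(o) for s, p, o in triples if p == RDF_TYPE})


def region_terms(triples):
    """Map each annotated region to the class of the instance it depicts."""
    depicts = {s: o for s, p, o in triples if p == EX + "depicts"}
    types = {s: o for s, p, o in triples if p == RDF_TYPE}
    return {region: types.get(inst) for region, inst in depicts.items()}


print(ontologies_used(triples))
print(region_terms(triples))
```

The point of the sketch is only that the ontology, the chosen term, the region, and the annotator are all recoverable from the statements themselves, without a separate workflow log.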
The second talk was on my own work with FilmTrust. I intended to present it as an instance of using provenance (who said things) with the annotations (trust, movie ratings, etc.) to do social computations and interface design. However, the interest in my work was more in how this social information could be connected to other information in the background and workflows to enhance it. Margo Seltzer of Harvard raised some interesting questions about security of provenance that have had me doing a lot of thinking about new applications of the trust work. For example, if someone has the provenance for public data, but the provenance is protected (think “secret sources”), trust is one way that access to the provenance can be determined. As usual, this is not a “secure” solution, but I think that use of trust is a very interesting one. I will be following up on that point by attending the PASS Workshop at Harvard later this month.
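The trust-gated access idea can be sketched very roughly as follows. This is not how FilmTrust or any existing system works; the trust scale, the threshold, the names, and the function are all assumptions made up for the example, and, as noted above, a check like this is a social filter rather than a security mechanism.

```python
# Hypothetical direct trust ratings: (truster, trustee) -> value on an assumed 1-10 scale.
trust_ratings = {
    ("data_owner", "colleague"): 9,
    ("data_owner", "acquaintance"): 4,
}


def can_view_provenance(ratings, owner, requester, threshold=7):
    """Allow access only if the owner trusts the requester at or above the threshold.

    Requesters the owner has never rated are treated as untrusted.
    """
    rating = ratings.get((owner, requester))
    return rating is not None and rating >= threshold


print(can_view_provenance(trust_ratings, "data_owner", "colleague"))
print(can_view_provenance(trust_ratings, "data_owner", "acquaintance"))
print(can_view_provenance(trust_ratings, "data_owner", "stranger"))
```

The data itself stays public in this picture; only the question of who may see where it came from is decided by the owner's trust in the requester.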
Overall, iPaw was a very useful workshop to attend. I think the space of provenance is an interesting place to look at applying Semantic Web technologies. They seem to be a perfect fit for what scientists are looking to achieve.
