Mindswap Convert To RDF Tool


Author: Michael Grove
For: MINDSWAP

Introduction

Writing RDF by hand is difficult, and the data that needs to be converted to RDF is often far too large, or too complex, to be worth converting by hand. Convert To RDF is a tool for automatically converting delimited text data into RDF via a simple mapping mechanism. The original ConvertToRDF tool ran from the command line and took a map file that defined how to perform the conversion. Writing map files by hand was sometimes complicated, so a GUI version of the program has been designed.

The new Convert to RDF provides the user with a table layout. When a delimited text file, or an Excel file, is opened in Convert to RDF, the data is shown in the main table of the program. From this point, creating a mapping is just a matter of a few clicks and drags.

Walkthrough

There are two ways to start a Convert document. You can create a new Convert document from scratch by selecting "File->New...". A simple dialog will pop up that lets you select the input document and the delimiters to use when parsing that document into its constituent data. Convert to RDF has some basic support for reading Excel files; if the file is in Excel format, the delimiters are ignored and the default Excel parsing is used.
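As a rough illustration of what parsing a document "into its constituent data" means, the following Python sketch splits delimited text into a table of rows and cells. It is not the tool's own parser: the function name parse_delimited and the sample values are made up for this example, and only the header names "Name", "Height", and "Weight" come from the walkthrough.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Split delimited text into a list of rows, each a list of cell strings."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # csv.reader yields an empty list for a blank line; drop those rows.
    return [row for row in reader if row]


# Hypothetical tab-delimited input; the data values are invented.
sample = "Name\tHeight\tWeight\nAlice\t170\t60\nBob\t182\t80\n"
table = parse_delimited(sample, delimiter="\t")

for row in table:
    print(row)
```

The resulting list of lists corresponds to the main data table: one inner list per row, one string per cell.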

If you have a map file saved from a previous session with Convert to RDF, you can re-open that map file and pick up where you left off. To open from a map file, select "File->Open Map File..." and select your map file from the local file system. All of the previous work saved in the map file, including all imports and mapping statements, will be re-created.

[Figure: New Convert Document Dialog]
Once the file has been successfully parsed, it will be shown in the main data table. From here you can create the mappings using the simple point and click interface provided by the UI. First, you should import any ontologies that you will use in your mappings. To add an import, right click in the imports window and select "Add Import" from the popup menu. You can also add an import by right clicking anywhere in the table and choosing the "Add Import" option.

The concepts in an imported document are used when creating the dialogs for specifying the mappings. You can create mappings only using resources that have been imported into the program.

[Figure: Adding an import]
Once you have imported the documents you plan to use in your mappings, you can start specifying the mappings. A Data Map is a section of the original data file that represents a chunk of data with a specific type. Each Data Map is associated with a specific class, and each row in the Data Map corresponds to an instance of that class. The columns of the map are the property values of the instance, with each column mapping to a specific property.
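One way to picture a Data Map is as a small record holding the row range, the associated class, the header row, and the column-to-property mappings. The sketch below is only an illustration of that idea; the class DataMap, its field names, and the example.org URIs are assumptions made for this example and do not reflect the tool's internal API or its map file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataMap:
    """A block of table rows that all describe instances of one class."""
    first_row: int                      # first table row in the map (inclusive)
    last_row: int                       # last table row in the map (inclusive)
    use_type: Optional[str] = None      # URI of the class each row is an instance of
    header_row: Optional[int] = None    # index of the row holding the column names
    # Column index -> URI of the property that the column maps to.
    column_mappings: Dict[int, str] = field(default_factory=dict)

    def data_rows(self) -> List[int]:
        """Indices of the rows that become instances (everything but the header)."""
        return [r for r in range(self.first_row, self.last_row + 1)
                if r != self.header_row]


# Hypothetical example: rows 0-3 describe people, row 0 is the header.
people = DataMap(first_row=0, last_row=3,
                 use_type="http://example.org/people#Person",
                 header_row=0)
people.column_mappings[1] = "http://example.org/people#height"
print(people.data_rows())
```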

To create a Data Map, select the rows in the table that you would like to be in the Map, and right click in the selection. You will get a popup menu with a set of operations you can perform on that selection. One of the options will be "Add New Data Map..."; click on that item to mark the selected rows as a Data Map.

The new Data Map will be created and you will see it listed in the "Data Maps" pane to the side of the main table. The selected cells that are part of the Data Map will turn Yellow. A Yellow background means that the cell belongs to a Data Map but is not yet part of a mapping and is not part of a header row. Coloring the table cells according to their status should make it easy to see at a glance what has been mapped, what has not, and how the file is mapped.

[Figure: A new Data Map]
Now that you have created a Data Map, you need to specify which row is the header row for the data. The header row is generally a row with string values identifying the type of data that is in each of the columns. In this example, the header row is the first row, which has values like "Name", "Height", and "Weight." You MUST have a header row to be able to create mappings between the columns and RDF concepts. Future versions will rely on the column index, or allow the user to create a new header row, but for now, you must have a header row. To specify which row is the header, select that row and right click in it. The popup menu will have an option "Set as Header row." The header row will then turn Purple so that it is easy to identify at a glance.

[Figure: Set Data Map Use Type]
Next, you must specify the type for the Data Map. The type for the Data Map specifies which RDF concept each row in the Data Map is an instance of. You must specify a type for each Data Map to be able to generate RDF; otherwise the converter would not know what instances to create when it does the conversion. To specify the type for a Data Map, right click anywhere in the Data Map and select the "Set Map Use Type..." option. A dialog will pop up with a list of all the classes that are in the documents that you have imported into the tool. Select a class from the picklist and hit OK to set the type.

Lastly, you must specify mappings for every column in the Data Map. These mappings state which property each column maps to. When the conversion happens, each row in the Data Map becomes an instance of the Data Map Use Type. Each column in that row holds a property value, and these values are associated with the new instance via the specified mappings.
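The conversion described above can be sketched in a few lines of Python. This is a simplified illustration, not the converter's actual implementation: the function rows_to_triples, the example.org URIs, the blank-node labels of the form _:rowN, and the sample values are all assumptions made for this example. Only the rdf:type URI is the standard one.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/people#"  # hypothetical ontology namespace


def rows_to_triples(header, rows, use_type, mappings):
    """Turn each data row into an instance of use_type with mapped properties.

    mappings maps a header name to a property URI; unmapped columns are skipped.
    """
    triples = []
    for number, row in enumerate(rows, start=1):
        subject = f"_:row{number}"  # illustrative blank-node label
        triples.append((subject, RDF_TYPE, use_type))
        for name, value in zip(header, row):
            if name in mappings:
                triples.append((subject, mappings[name], value))
    return triples


header = ["Name", "Height", "Weight"]
rows = [["Alice", "170", "60"], ["Bob", "182", "80"]]  # invented sample data
mappings = {"Height": EX + "height", "Weight": EX + "weight"}

triples = rows_to_triples(header, rows, EX + "Person", mappings)
for triple in triples:
    print(triple)
```

Each row contributes one type statement plus one statement per mapped column, which is why a Data Map needs both a Use Type and column mappings before useful RDF can be generated.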

[Figure: Add a new Column Mapping]
To create a new mapping, select a column in the Data Map and right click. Select the "Add Column mapping" option from the menu. You will be presented with a simple dialog very similar to the dialog for setting the Use Type for the Data Map. There will be a picklist of properties from the currently loaded ontologies. Select a property from this list to create the mapping. Columns in Data Maps that have a mapping will have a light blue background so it is clear to the user which columns have been mapped, and which have not. When you select a Data Map in the Data Map window to the left of the table, all current mappings will be shown in the Mappings window. This is an easy way to inspect all the mappings for a particular Data Map.
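The color coding used in the table (Yellow, Purple, and light blue) can be summarized as a small decision function. The sketch below is only a reading of the rules stated in this walkthrough; the function name and parameters are hypothetical. Two points are assumptions rather than documented behavior: that a header cell stays Purple even when its column is mapped, and that cells outside any Data Map keep the default background (returned here as None).

```python
def cell_color(row, column, first_row, last_row, header_row, mapped_columns):
    """Return the background color for a table cell, or None for the default."""
    if not (first_row <= row <= last_row):
        return None            # assumption: cells outside the Data Map are uncolored
    if row == header_row:
        return "purple"        # assumption: header color wins over mapping color
    if column in mapped_columns:
        return "light blue"    # column already has a mapping
    return "yellow"            # in a Data Map, but not yet mapped


# Data Map over rows 0-3, header in row 0, column 1 already mapped.
print(cell_color(0, 1, 0, 3, 0, {1}))
print(cell_color(2, 1, 0, 3, 0, {1}))
print(cell_color(2, 0, 0, 3, 0, {1}))
print(cell_color(7, 0, 0, 3, 0, {1}))
```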

One special mapping is for rdf:ID. It specifies that the value of that column is to be used as the rdf:ID of the resultant instance. You do not need to set the rdf:ID for the instances; without one, the instances are generated as bnodes (blank nodes). Setting it is generally good practice, however.
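To make the difference concrete, the following Python sketch writes an RDF/XML node element with and without an rdf:ID attribute. In RDF/XML, a node element that carries no rdf:ID or rdf:about is a blank node. The helper instance_xml, the ex: prefix, and the sample values are assumptions for this example, and the output is a fragment only: the enclosing rdf:RDF element and namespace declarations are omitted, and the exact RDF the tool emits may look different.

```python
from xml.sax.saxutils import escape, quoteattr


def instance_xml(class_qname, properties, rdf_id=None):
    """Return an RDF/XML node element; without rdf_id it denotes a blank node."""
    id_attr = f" rdf:ID={quoteattr(rdf_id)}" if rdf_id is not None else ""
    lines = [f"<{class_qname}{id_attr}>"]
    for prop_qname, value in properties:
        lines.append(f"  <{prop_qname}>{escape(value)}</{prop_qname}>")
    lines.append(f"</{class_qname}>")
    return "\n".join(lines)


props = [("ex:height", "170")]  # invented sample property value

# Column mapped to rdf:ID: the instance gets a name that others can refer to.
named = instance_xml("ex:Person", props, rdf_id="Alice")
# No rdf:ID mapping: the instance is an anonymous blank node.
anonymous = instance_xml("ex:Person", props)

print(named)
print(anonymous)
```

Note that an rdf:ID value must be a valid XML name, so column values containing spaces or starting with a digit would need cleaning first; this sketch does not attempt that.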

[Figure: Convert Table with header row and mappings]
You can also delete Data Maps, remove header rows and mappings, and perform some other operations. The right click menu in the table is context sensitive: it always shows the operations available for your current selection in the table.

You can view the RDF generated from the current conversion by selecting "Tools->View RDF...". This will display the RDF using the mappings you have specified in the tool. You can check the RDF periodically to make sure you are creating the correct mappings and are generating the RDF you expect. To save this RDF to a file, select "File->Save RDF..." and select the file to save the RDF to. Also, you can save your current map file to disk so you are able to pick up your work later, or pass on a map file to a friend or coworker. To save your work as a map file, select "File->Save Map File..." and select a location to save.

Notes

The map file format has a few features that are not yet exposed through the UI. You can map multiple columns to a single property, or a single column to multiple properties, by using built-in functions such as concat and tokenize. You can also reference instances in other Data Maps using a link variable. These features are new to this version of Convert to RDF and add a lot of flexibility and power to the tool. Newer versions will expose this functionality through the UI, but currently you must add these types of mappings by hand.
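The idea behind the two built-in functions can be illustrated in Python. This is not map file syntax, which is not shown in this document; the function signatures, the separators, and the sample values below are assumptions chosen only to show how several columns might be combined into one property value, and how one column might be split into several values.

```python
def concat(values, separator=" "):
    """Join values from several columns into a single property value."""
    return separator.join(values)


def tokenize(value, separator=","):
    """Split one column value into several property values, dropping blanks."""
    return [part.strip() for part in value.split(separator) if part.strip()]


# Multiple columns -> one property (hypothetical first and last name columns).
full_name = concat(["Alice", "Smith"])

# One column -> multiple values of a property (hypothetical list-valued cell).
interests = tokenize("RDF, OWL, , Jena")

print(full_name)
print(interests)
```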

Another goal of future work on this project is to convert some of the API to extend the Jena framework, similar to what the OWL Class Loader and the Multimedia API do.


Links and Resources

Previous Convert To RDF commandline Version
Jena
Mindswap Convert To RDF Javadocs
Download the source.
Download the utilities jar needed to compile Convert To RDF.
The source is also available from SVN here.


MINDSWAP is a W3C member