Mindswap Convert To RDF Tool
Author: Michael Grove
For: MINDSWAP
Introduction
Writing RDF by hand is difficult, and the data that needs to be converted to RDF is often huge:
too much to do by hand, or too complex to be worth doing. Convert To RDF is a tool for automatically
converting delimited text data into RDF via a simple mapping mechanism. The original ConvertToRDF tool
worked from the command line, taking in a map file that defined how to perform the conversion. Writing
map files by hand was sometimes a complicated task, so a GUI version of the program has been designed.
The new Convert To RDF provides the user with a table layout. When a delimited text file or an Excel file
is opened in Convert To RDF, the data is shown in the main table of the program. From this point, creating
a mapping is just a matter of a few clicks and drags.
Walkthrough
There are two ways to start a Convert document. You can create a new Convert document from scratch by
selecting "File->New...". A simple dialog will pop up, allowing you to select the input document
and the delimiters to use when parsing that document into its constituent data. Convert To RDF has some
basic support for reading Excel files; if the input file is in Excel format, the delimiters
are ignored and the default Excel parsing is used.
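As a rough illustration of what parsing a document "into its constituent data" means, here is a minimal sketch in Python. It is not the tool's own parser: the function name parse_delimited and the sample data are made up for this example, and it assumes a single delimiter character, whereas the dialog lets you choose the delimiters.

```python
import csv
import io

def parse_delimited(text, delimiter=","):
    """Split delimited text into a table: a list of rows, each a list of cell strings."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [row for row in reader if row]  # drop completely empty lines

# Made-up sample input: tab-delimited, with a header row on top.
sample = "Name\tHeight\tWeight\nAlice\t170\t60\nBob\t182\t81\n"
table = parse_delimited(sample, delimiter="\t")
```

Each inner list in table corresponds to one row of the main data table, and each string to one cell.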
If you have a map file saved from a previous session with Convert To RDF, you can reopen that map file and pick
up where you left off. To open a map file, select "File->Open Map File..." and choose your map file
from the local file system. All the previous work that was saved in the map file, including all imports
and mapping statements, will be re-created.
New Convert Document Dialog
Once the file has been successfully parsed, it is shown in the main data table. From here you
can create the mappings using the point-and-click interface provided by the UI. First, you should
import any ontologies that you will use in your mappings. To add an import, right-click in the imports
window and select "Add Import" from the popup menu. You can also add an import by right-clicking anywhere
in the table and choosing the "Add Import" option.
The concepts in an imported document are used
to build the dialogs for specifying the mappings. You can create mappings only with resources
that have been imported into the program.
Adding an import
Once you have imported the documents you plan to use in your mappings, you can start
specifying the mappings. A Data Map is a section of the original data file that represents a chunk of
data with a specific type. Each Data Map is associated with a specific class, and each row in the Data Map
corresponds to an instance of that class. The columns of the map are the property values of the instance, with
each column mapping to a specific property.
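One way to picture a Data Map is as a small record holding the rows it covers, its class, its header row, and its column-to-property mappings. The sketch below is only an illustration of that idea, assuming hypothetical field and method names (DataMap, use_type, missing_steps); it does not reflect the tool's actual classes or its map file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class DataMap:
    rows: List[int]                    # indices of the table rows covered by this Data Map
    use_type: Optional[str] = None     # URI of the class each row is an instance of
    header_row: Optional[int] = None   # index of the row that names the columns
    column_mappings: Dict[str, str] = field(default_factory=dict)  # header name -> property URI

    def missing_steps(self, header_names):
        """List what still has to be specified before RDF can be generated."""
        missing = []
        if self.header_row is None:
            missing.append("header row")
        if self.use_type is None:
            missing.append("use type")
        unmapped = [name for name in header_names if name not in self.column_mappings]
        if unmapped:
            missing.append("column mappings for: " + ", ".join(unmapped))
        return missing

# A freshly created Data Map over three rows has nothing specified yet.
data_map = DataMap(rows=[0, 1, 2])
todo = data_map.missing_steps(["Name", "Height"])
```

The three items checked by missing_steps correspond to the steps of the walkthrough: setting a header row, setting the Use Type, and mapping every column.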
To create a Data Map, select the rows in the table that you would like to be in the map, and right-click
in the selection. You will get a popup menu with a set of operations you can perform on that selection.
One of the options will be "Add New Data Map..."; click on that item to mark the selected rows as
a Data Map.
The new Data Map will be created, and you will see it listed in the "Data Maps" pane to the side of the main table.
The selected cells that are part of the Data Map will turn yellow.
Any table cell whose background is yellow belongs to a Data Map, is not currently part of a mapping, and
is not part of a header row. Coloring the table cells according to their status is meant to make
it easy to see at a glance what has been mapped, what has not, and how the file is mapped.
A new Data Map
Now that you have created a Data Map, you need to specify which row is the header row for the data. The
header row is generally a row of string values identifying the type of data in each of the columns.
In this example, the header row is the first row, which has values like "Name", "Height", and "Weight".
You MUST have a header row to be able to create mappings between the columns and RDF concepts. Future versions
will rely on the column index, or allow the user to create a new header row, but for now you must have a header row.
To specify which row is the header, select that row and right-click in it.
The popup menu will have an option "Set as Header row". The header row will then turn purple so it is
clear at a glance which row is the header row.
Set Data Map Use Type
Next, you must specify the type for the Data Map. The type specifies which RDF concept
each row in the Data Map is an instance of. You must specify a type for each Data Map to be able to generate
RDF; otherwise the converter would not know what instances to create when it does the conversion. To specify
the type for a Data Map, right-click anywhere in the Data Map and select the "Set Map Use Type..." option. A
dialog will pop up with a list of all the classes in the documents that you have imported into the tool.
Select a class from the picklist and click OK to set the type.
Lastly, you must specify mappings for every column in the Data Map. These mappings tell the converter which property
each column maps to. When the conversion happens, each row in the Data Map is created as an instance
of the Data Map's Use Type. Each column in that row represents a property value, and these values
are associated with the new instance via the specified mappings.
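The conversion described above can be sketched as a loop that emits one type statement per row and one property statement per mapped column. This is a simplified illustration, not the tool's implementation: the function rows_to_triples, the example.org namespace, and the sample data are assumptions made for this example, and the output is a plain list of (subject, predicate, object) tuples rather than serialized RDF.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def rows_to_triples(table, header_index, data_rows, use_type, column_mappings):
    """Turn each data row into an instance of use_type with one value per mapped column."""
    header = table[header_index]
    triples = []
    for n, row_index in enumerate(data_rows):
        subject = f"_:row{n}"  # blank node label; one new instance per row
        triples.append((subject, RDF_TYPE, use_type))
        for name, value in zip(header, table[row_index]):
            prop = column_mappings.get(name)
            if prop is not None:  # columns without a mapping are skipped in this sketch
                triples.append((subject, prop, value))
    return triples

EX = "http://example.org/people#"  # hypothetical ontology namespace
table = [
    ["Name", "Height", "Weight"],  # header row
    ["Alice", "170", "60"],        # made-up data row
]
triples = rows_to_triples(
    table,
    header_index=0,
    data_rows=[1],
    use_type=EX + "Person",
    column_mappings={"Name": EX + "name", "Height": EX + "height", "Weight": EX + "weight"},
)
```

Here the single data row yields one rdf:type statement plus three property statements, all attached to the same instance.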
Add a new Column Mapping
To create a new mapping, select a column in the Data Map and right-click. Select the "Add Column mapping" option
from the menu. You will be presented with a simple dialog very similar to the dialog for setting the Use Type of
the Data Map. It contains a picklist of properties from the currently loaded ontologies. Select a
property from this list to create the mapping. Columns in Data Maps that have a mapping get a light
blue background, so it is clear which columns have been mapped and which have not. When
you select a Data Map in the Data Maps window to the left of the table, all of its current mappings are shown
in the Mappings window. This is an easy way to inspect all the mappings for a particular Data Map.
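The color rules mentioned throughout the walkthrough (yellow, purple, light blue) can be summarized as a small decision function. This is only a reading of the rules as described in this document; the function name cell_color is hypothetical, and the choice to let the header color take precedence over the mapped-column color is an assumption, since the text does not say how a header cell in a mapped column is drawn.

```python
def cell_color(in_data_map, is_header_row, column_is_mapped):
    """Return the background color of a table cell based on its mapping status."""
    if not in_data_map:
        return "default"     # cell is outside every Data Map
    if is_header_row:
        return "purple"      # assumption: header color wins over the mapped color
    if column_is_mapped:
        return "light blue"  # column already has a property mapping
    return "yellow"          # in a Data Map, but not yet mapped and not a header

unmapped_cell = cell_color(in_data_map=True, is_header_row=False, column_is_mapped=False)
```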
One special mapping is rdf:ID. It specifies that the value of that column is to be used as the rdf:ID
of the resulting instance. You do not need to set an rdf:ID for the instances, since the tool can generate bnodes instead, but
doing so is generally good practice.
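The difference between a named instance and a bnode can be sketched as follows. In RDF/XML, an rdf:ID value names a resource relative to the document's base URI (base URI, then "#", then the ID), while a bnode has only a local label. The function subject_for, the base URI, and the way cell values are cleaned up (spaces replaced by underscores, then percent-encoded) are assumptions for this example, not the tool's documented behavior.

```python
from urllib.parse import quote

def subject_for(row, header, id_column, base_uri, row_number):
    """Pick a subject for a row: a named resource if an rdf:ID column is mapped, else a bnode."""
    if id_column is not None and id_column in header:
        value = row[header.index(id_column)].strip()
        if value:
            # Assumed cleanup so the cell value is usable as a fragment identifier.
            return base_uri + "#" + quote(value.replace(" ", "_"))
    return f"_:b{row_number}"  # no usable ID: fall back to a blank node label

header = ["Name", "Height"]
row = ["Alice Smith", "170"]  # made-up data
named = subject_for(row, header, "Name", "http://example.org/data", 0)
anonymous = subject_for(row, header, None, "http://example.org/data", 0)
```

A named subject can be referenced from other documents, which is one reason setting an rdf:ID is usually worthwhile.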
Convert Table with header row and mappings
You can also delete Data Maps, remove header rows and mappings, and perform some other operations. The right-click
menu in the table is context sensitive: it always shows the operations available
for your current selection in the table.
You can view the RDF generated from the current conversion by selecting "Tools->View RDF...". This
displays the RDF produced by the mappings you have specified in the tool. You can check the RDF periodically
to make sure you are creating the correct mappings and generating the RDF you expect. To save this RDF
to a file, select "File->Save RDF..." and choose the file to save the RDF to. You can also save your
current map file to disk so you can pick up your work later or pass the map file on to a friend or coworker.
To save your work as a map file, select "File->Save Map File..." and choose a location.
Notes
The map file format has a few features that are not yet exposed through the UI. You can map multiple
columns to a single property, or a single column to multiple properties, by using built-in functions
such as concat and tokenize. You can also reference instances in other Data Maps using a link variable.
These features are new in this version of Convert To RDF and add a lot of flexibility and power to
the tool. Newer versions will expose this functionality through the UI, but currently you must add these
types of mappings by hand.
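The map file syntax for these functions is not shown here, so the following Python sketch only illustrates the general idea behind the two names: joining several column values into one property value, and splitting one column value into several pieces. The signatures, default separators, and sample values are assumptions for this example and may differ from what the tool's concat and tokenize actually do.

```python
def concat(values, separator=" "):
    """Join several column values into one property value, skipping empty cells."""
    return separator.join(v.strip() for v in values if v.strip())

def tokenize(value, delimiter=","):
    """Split one column value into several pieces, one per target property or value."""
    return [token.strip() for token in value.split(delimiter) if token.strip()]

# Made-up cell values for illustration.
full_name = concat(["Alice", "Smith"])  # e.g. "First" and "Last" columns -> one name property
colors = tokenize("red, green,blue")    # one cell -> three separate values
```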
Another goal of future work on this project is to convert some of the API to extend the Jena framework,
similar to how the OWL Class Loader and the Multimedia API do.
Links and Resources
Previous Convert To RDF command-line version
Jena
Mindswap Convert To RDF Javadocs
Download the source.
Download the utilities jar needed to compile Convert To RDF.
The source is also available from SVN here.