Agents and The Semantic Web Portal

Group Members

Project Description

The Semantic Web is defined as "an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." This definition suggests software agents that can work autonomously and carry out tasks on behalf of people. It is reasonable to say that the real power of the Semantic Web will be revealed when agents can accomplish tasks without human guidance.

The Nature dataset is particularly interesting from the Semantic Web perspective because of the range of topics it covers. With good coverage of the academic literature, these articles give us an opportunity, one that may not have been available before, to study the overlaps and interrelationships between fields.

The Semantic Web Portal is an implementation of this idea of providing links based on common semantic content. Such an implementation requires several different types of Semantic Web agents.

Existing Systems and Tools

These are descriptions of Semantic Web agents currently available on the web. Their applicability to the Semantic Web Portal and the Nature dataset will be discussed later in this document.

Semantic Web Portal

A Semantic Web portal is several steps beyond today's search engine. Instead of requiring users to enter a series of keywords, the SWPortal can generate results for a term or collection of terms specified in an ontology. That step alone removes much of the ambiguity from search. This can be extended further by incorporating the portal into editors like SMORE. As users edit their pages, a panel dedicated to the Portal can return pages with similar markup, related images and data, or references to other material.

Here is a fictitious screenshot of the Semantic Search Portal as it would be implemented in SMORE:

It represents the basic idea and is by no means complete. There are two main stages involved. In the first stage, the user provides associations between keywords in the search string and actual ontological references (classes, properties, or instances; in the worst case the user leaves an association blank, providing incomplete information). In the second stage, the software performs a combined search using multiple agents (bots) associated with different ontologies. These agents parse the RDF data based on the query, exchange information with one another, and use inference rules to further extend the search domain.

As can be seen in the example, when the user searches for 'Java Programmer in College Park', the search returns a 'Professor (of Programming Languages) at the University of Maryland' (note: no keyword match at all). This happens simply because the user provides associations such as 'programmer' is a class, 'college park' is an instance of 'city', and so on, and leaves the rest to the search agents, which filter data and make inferences based on the RDF query.
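The two stages can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `ex:` terms, the tiny fact set, and the two inference rules are hypothetical and are not taken from SMORE or from any real ontology. It only shows how a query can match a resource that shares no keywords with the search string.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical RDF-like facts; none of these terms come from a real ontology.
FACTS: Set[Triple] = {
    ("ex:jane", "rdf:type", "ex:Professor"),
    ("ex:jane", "ex:teaches", "ex:Java"),
    ("ex:jane", "ex:worksAt", "ex:UMD"),
    ("ex:UMD", "ex:locatedIn", "ex:CollegePark"),
    ("ex:CollegePark", "rdf:type", "ex:City"),
    ("ex:bob", "ex:programsIn", "ex:Java"),
    ("ex:bob", "ex:worksAt", "ex:Acme"),
    ("ex:Acme", "ex:locatedIn", "ex:Baltimore"),
}


def infer(facts: Set[Triple]) -> Set[Triple]:
    """Apply two hand-written rules until no new triples appear."""
    closed = set(facts)
    while True:
        new: Set[Triple] = set()
        for s, p, o in closed:
            # Rule 1 (assumed): whoever teaches a language also programs in it.
            if p == "ex:teaches":
                new.add((s, "ex:programsIn", o))
            # Rule 2 (assumed): a person is based where their employer is located.
            if p == "ex:worksAt":
                for s2, p2, o2 in closed:
                    if s2 == o and p2 == "ex:locatedIn":
                        new.add((s, "ex:basedIn", o2))
        if new <= closed:
            return closed
        closed |= new


# Stage 1: the user ties each keyword to an ontology term (property, value).
ASSOCIATIONS: Dict[str, Tuple[str, str]] = {
    "java programmer": ("ex:programsIn", "ex:Java"),
    "college park": ("ex:basedIn", "ex:CollegePark"),
}


def search(associations: Dict[str, Tuple[str, str]], facts: Set[Triple]) -> List[str]:
    """Stage 2: return subjects that satisfy every association after inference."""
    closed = infer(facts)
    subjects = {s for s, _, _ in closed}
    for pred, obj in associations.values():
        subjects = {s for s in subjects if (s, pred, obj) in closed}
    return sorted(subjects)


print(search(ASSOCIATIONS, FACTS))
```

In this toy data, `ex:jane` is never described as a programmer or as being in College Park; both facts are derived by the assumed rules. A real portal would distribute this work across agents and ontologies rather than run it in one function.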

Agents play a role in several stages of this portal. First, it is safe to assume that users will have their own specific ontologies and that articles will not all be marked up with the same ontology. Thus, to make a first judgement about whether two concepts are related, there needs to be some merging or reconciliation between the ontologies. Doing this manually would be far too difficult and tedious, so there needs to be an agent that can do this resolution. The *OntoMerge* agent takes steps toward achieving this, and the *Semantic Similarity Agent* also begins to address some of these issues.
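As a rough sketch of what reconciliation buys the portal, suppose an agent has already produced a term-to-term alignment between a user's ontology and the ontology used to mark up an article. The alignment table and the `user:` and `pub:` terms below are hypothetical, and the hand-written dictionary merely stands in for the output of a merging agent; it does not reproduce how *OntoMerge* or the *Semantic Similarity Agent* work.

```python
from typing import Dict, Iterable, Set

# Hypothetical alignment: term in the user's ontology -> term in the article's ontology.
ALIGNMENT: Dict[str, str] = {
    "user:Author": "pub:Creator",
    "user:Paper": "pub:Article",
}


def shared_concepts(
    user_terms: Iterable[str],
    article_terms: Iterable[str],
    alignment: Dict[str, str],
) -> Set[str]:
    """Translate the user's terms through the alignment, then intersect."""
    # Terms without a mapping are kept unchanged, so they only match literally.
    mapped = {alignment.get(term, term) for term in user_terms}
    return mapped & set(article_terms)


print(shared_concepts(
    ["user:Paper", "user:Gene"],
    ["pub:Article", "pub:Journal"],
    ALIGNMENT,
))
```

Without the alignment, the two term sets have nothing in common; with it, the portal can see that both refer to the same kind of document.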

Secondly, we would like a method for giving credibility to sources. Users should be able to rank how much credit (or discredit) specific colleagues receive. Using algorithms of the same ilk as those described in *TrustBot*, the results of a search can be ranked according to how much a user should trust them. A similar result could be achieved by an agent such as *WebMate*, which builds and stores a notion of user preferences based on the user's previous web experience.
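One simple way to picture trust-based ranking is to weight each result's relevance by the user's rating of its source. The sketch below assumes ratings between 0 and 1 and a neutral default for unrated sources; the function name, the scoring formula, and the sample data are all illustrative and are not the algorithm described in *TrustBot*.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, str, float]  # (title, source, relevance in [0, 1])


def rank_by_trust(
    results: List[Result],
    trust: Dict[str, float],
    default: float = 0.5,
) -> List[str]:
    """Order titles by relevance multiplied by the user's trust in the source."""
    scored = [
        (relevance * trust.get(source, default), title)
        for title, source, relevance in results
    ]
    # Highest combined score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]


results = [
    ("Paper A", "alice", 0.9),
    ("Paper B", "bob", 0.8),
    ("Paper C", "carol", 0.6),
]
trust = {"alice": 0.2, "bob": 0.9}  # carol is unrated and gets the default

print(rank_by_trust(results, trust))
```

Here the most relevant result drops to last place because the user has given its source a low rating, which is the effect the portal is after.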

To build this knowledge of articles and their semantic contents, a crawler such as the *DAML Crawler* or *OCRA* is needed to spider files and build a repository of knowledge about pages and their corresponding RDF.
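The crawling step can be sketched as a breadth-first traversal that records the triples found on each page. To keep the example self-contained, the web is replaced by an in-memory dictionary; the URLs, triples, and the `fetch` interface are assumptions for illustration and do not describe how the *DAML Crawler* or *OCRA* is built.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Page = Tuple[List[Triple], List[str]]  # (triples on the page, outgoing links)

# Stand-in for the web: URL -> page. All content here is made up.
FAKE_WEB: Dict[str, Page] = {
    "http://example.org/a": ([("a", "cites", "b")], ["http://example.org/b"]),
    "http://example.org/b": (
        [("b", "topic", "genomics")],
        ["http://example.org/a", "http://example.org/c"],
    ),
    "http://example.org/c": ([], []),
}


def crawl(
    seed: str,
    fetch: Callable[[str], Optional[Page]],
    limit: int = 100,
) -> Dict[str, List[Triple]]:
    """Visit pages breadth-first and map each visited URL to its triples."""
    repository: Dict[str, List[Triple]] = {}
    queue = deque([seed])
    while queue and len(repository) < limit:
        url = queue.popleft()
        if url in repository:
            continue  # already visited
        page = fetch(url)
        if page is None:
            continue  # unreachable or unknown page
        triples, links = page
        repository[url] = list(triples)
        queue.extend(link for link in links if link not in repository)
    return repository


repo = crawl("http://example.org/a", FAKE_WEB.get)
print(sorted(repo))
```

A real crawler would replace `FAKE_WEB.get` with a function that downloads a page and parses its RDF, and would add politeness delays and error handling.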

On the Nature Dataset

While marking up an article in a portal-enabled editor, a user could find related articles in potentially different research areas (assuming access to all of the Nature articles, properly marked up). This could provide additional references, images, or just a general link to research in related disciplines that may be of use.

Links

Existing Systems and Tools

Tools for Agent Development

Semantic Search Tools

Related Papers

