Mindswap Weblog

Google and Owl - huh (aka how many ontologies??)

by James Hendler

I blogged a while ago that a good way to find ontologies on Google is to use “filetype:owl” (e.g. “person filetype:owl” will find ontologies with the term person in them). I also noted that if you use Google’s “-” trick to exclude a word, and choose one that appears in few if any ontologies (say “-xyzzy filetype:owl”), the result count is a good approximation of the number of files Google has found with “.owl” extensions. I did this last year around October, and the number was 6500, which seemed to make some sense.
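For anyone who wants to repeat the probe, here is a minimal sketch of how the two query strings are put together. The helper names (`owl_query`, `search_url`) and the base URL are assumptions made for illustration. The sketch only builds the strings; it does not fetch or parse any result counts.

```python
from urllib.parse import urlencode


def owl_query(term=None, exclude=None):
    """Build a query restricted to files with the .owl extension.

    term    -- a word the file should contain (e.g. "person")
    exclude -- a word the file should NOT contain (the "-" trick)
    """
    parts = []
    if term:
        parts.append(term)
    if exclude:
        parts.append(f"-{exclude}")
    parts.append("filetype:owl")
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    # The base URL is an assumption; swap in whichever engine you are probing.
    return f"{base}?{urlencode({'q': query})}"


print(owl_query(term="person"))     # ontologies mentioning "person"
print(owl_query(exclude="xyzzy"))   # (nearly) every .owl file
print(search_url(owl_query(exclude="xyzzy")))
```

The excluded word only has to be rare enough that leaving it out removes almost nothing, which is why a nonsense word like “xyzzy” is used.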

But here’s something interesting - today I ran the Google query “-xyzzy filetype:owl” and I got the answer 107,000!! While I’d love this to be a meaningful number, I’m suspicious. I’ve tried probing around - a few of these seem to be people using “.owl” to fool you into coming to their pages (have we hit the big time??) - but that doesn’t seem to be the answer. Some of this seems to be large numbers of files where people are either streaming things into OWL files (way cool) or creating OWL files automatically - but it still doesn’t quite seem to explain it. I’d welcome any theories people have about what is happening…

till then, of course, I’ll be happy to claim 100,000 ontologies (and my slides from 2003 refer to 257 - so we’ve come a long way!) :-)

cheers and happy (secular, Gregorian) New Year

Jim H.

[note to folks from outside US: I’ve already gotten a couple of responses that this doesn’t work - it appears Google US has not yet propagated something to at least Germany. If you still get a number <10,000, I assume it is because of something about how Google propagates these things.]

5 Responses to “Google and Owl - huh (aka how many ontologies??)”

  1. Richard Cyganiak Says:

    Well today I get 5700 results for the same query.

    I can produce more meaningful numbers by analyzing coffee grounds.

  2. Gunnar Grimnes Says:

    I get:

    “-xyzzy filetype:owl” - 5,070
    “-gunnar filetype:owl” - 4,870

    Wow! Several hundred ontologies mention my name!

    BUT

    “gunnar filetype:owl” - a single hit :)

    (Try also -word that rhymes with duck to see how rude ontology engineers are!)

  3. James Hendler Says:

    I updated the original post to make it clear that for some reason this seems to be different in the US and at least Germany.

    the point Gunnar makes, that the numbers don’t seem to add up is exactly right — that’s the point of my blog - that Google searches in this space don’t make much sense at all.

    in my google (US) -gunnar also finds about 107k, gunnar = 1 - so at least it seems more consistent :-)

  4. Raza Kashif Says:

    Well, I did so in a meta engine that returned some different results from Yahoo, MSN, Google, and so forth, about 10 engines.

    I explored all I could and found that most of it was useless data for a person trying to find an ontology.

    And yes, Google needs to narrow its general search for accuracy! What do you think?

    Is there any way we could do a search like ” *.owl ” or ” *.* .owl ” in Google or any other search engine? What do you suggest?

    Happy new year all !

    regards rk

  5. Vladislav Chernyshov Says:

    I did this request “-xyzzy filetype:owl” from Russia (I live here) and I got 11 500 results. huh!)

    I think we should stop praying to Google on this question. IMHO, in this case it is quality (of the ontologies and their interconnectivity) that matters, not quantity.

    I’ve been studying the semantic web for almost half a year, but… I’m still confused about how to make something REALLY useful out of all this… And I’m very confused when comparing the growth rates of the semantic web and the WWW itself: I’ve read the historical chronicles and figured out that the WWW was born in 1989 and 9 years later, in 1998, it was MUCH bigger and more USEFUL than the semantic web is now, in 2007, 9 years after its birth in 1998…

    I REALLY want to make something useful. I want to make an intelligent web agent, but as I said, I’m still confused about what I can do…

    It would be great to have some discussion on this.

MINDSWAP is a W3C member