Mindswap Weblog

Ontology for the Intelligence Community: Part I

by Aaron Mannes

The perfect is the enemy of the good – a theme echoing through this post.  On November 30 and December 1 I attended a conference entitled Ontology for the Intelligence Community: Towards Effective Exploitation and Integration of Intelligence Resources.  I took extensive notes, had many ideas, and was going to write a veritable Ontologiad about it.  Now, a month later, I’d better at least get some of the highlights down.

The main organizer was Prof. Barry Smith of the National Center for Ontological Research (NCOR) at the University at Buffalo.  Unsurprisingly, a fair amount of the conference went over my head – so what follows is a summary of what caught my attention.  The slides for most of the presentations are available online.  I will share my own comments in italics.  On a personal note, I was pleasantly surprised to meet several people who had actually read papers I had written.

The conference led off with a presentation (before rushing to download – it is 103 slides long) by Prof. Smith, Ontology: A Guide for the Intelligence Analyst.  The main points were an overview of the concept of ontology, what ontologies can do for the intelligence community, and the NCOR’s role in making this happen.  Since, according to Smith, the primary purpose of ontologies is to facilitate fusing information from different sources, he stressed the importance of developing thoroughly tested, intelligible, and useful ontologies that can be efficiently glued together.  He compared this to the importance of standardizing railway gauges to promote trans-continental travel.  Because of OWL there are a lot of “light” ontologies out there that are not well designed.  This proliferation of bad ontologies will, according to Smith, lead to ghettos.  The NCOR exists to establish best practices for ontology development and implement rigorous testing according to the scientific method.  One area where NCOR has had success is its collaboration with The National Center for Biomedical Ontology.

Smith discussed the importance of ontologies focusing on types, not instances, and not being based on what is in people’s heads.  There is no red state and blue state chemistry – only chemistry.  Ontologies need to represent reality, and there is only one reality.

One interesting question was whether an ontology might be counterproductive in the intelligence world, where relationships are not always clear.  Smith responded that getting the basics right – naming and organizing things properly – would be a valuable improvement in and of itself.

Smith’s presentation was interesting and informative.  I did a lengthy summary because it set the tone for much of what followed.  However, as an ontology builder I questioned his focus on ontologies depicting a single representation of reality.  There may not be red and blue state chemistry – but there are different modeling approaches to information.  I regularly make judgments about shaping the PiT ontology that reflect my needs.  There are sound reasons why another ontology writer would make different choices.  On the other hand, it is crucial that these different ontologies talk to each other, so establishing best practices would certainly be useful.  It would also be helpful to have a toolbox of approaches for ontology construction.  (I had to commit the cardinal sin of making everything up out of my head.)  Unfortunately, I did not have a chance to ask a question about this issue.

Maureen Baginski, a long-time analyst with the NSA who helped set up the FBI Intelligence center, is the intel analyst equivalent of a rock star, and she gave a dynamic presentation (no slides, however).  She talked about her early days in the business as a “Soviet power analyst.”  In the late 1970s and early 1980s data on the Soviet energy infrastructure was extremely hard to come by, so while most of the work was intellectual, the process was driven by the data.  Now the paradigm has shifted.  Enormous resources have been invested in data collection, and there is now so much data that the analyst is its victim rather than its master.

She told several stories about the challenges of intelligence sharing, especially between the cleared and uncleared worlds.  This is a growing problem because decision-makers may now be police officers who do not have a security clearance but need the information.  She mentioned an unnamed police chief who was excited to get a security clearance but, when he finally got access, found there was too much information and very little of it was useful.

Broadly speaking, she said that the intelligence process is still driven by what information can be obtained.  But it needs to be driven by the analysts and what they need to know.

One particular comment struck me.  She noted that bandwidth was expanding in every realm except one: inside the skulls of human beings.  To that end, she said she was not tremendously impressed with a lot of the software designed to aid analysts, describing most of it as little more than good presentation tools.  However, in the question and answer period, when people asked her what kinds of tools she would like, her responses were not particularly illuminating.

Baginski’s comments did help me make some connections with my own work.  PiT is in fact an attempt to build a tool that will help the analyst absorb and work with the information, and re-focusing on that point is always helpful.  Many of the people I talked to at the conference described how they could – using ontologies – develop systems that would provide me (the analyst) exactly what I needed.  I hope that is true.  Apparently, because of legacy systems, an analyst may need to dive into dozens of different databases to get what they need.  But my take on the process is that even if – rather than the equivalent of scanning fifty books for the info I need – I now have that compressed into three books of probably relevant info, I still have a tough task ahead.  I still have to take those three books of data, detect patterns, and shape them into a narrative.  Some helpful functions may be automated, but it is still a substantial task.

With the above in mind, I had a question at the final Q&A session, in which all of the day’s speakers participated.  Many of the other attendees had discussed the difficulties of getting their agencies to buy into this “Ontology” stuff.  So, after introducing myself as working at MINDSWAP and describing PiT as an effort to build a tool by and for analysts, I posited:

The key constituency is the analysts.  If you want in-depth ontologies you will need the analysts to work on them.  But analysts do not get promoted for ontology writing.  Why not try to work with analysts in building small-scale tools that help them on a practical level?  Then, when they see what the technology can do for them, they will start buying in.

Prof. Smith replied (based on my notes) that the Semantic Web was a problem because it allowed people to make ontologies too easily and too many of these ontologies were bad.  (In another context, it was noted that NCOR’s successes with biomedical data did not necessarily translate to intel data, which is a lot more complex and tougher to categorize by type.)  There was then some discussion about the Semantic Web vs. Ontologies that I didn’t quite follow.

I didn’t know about this internecine conflict – nor, as an analyst, do I care about it.  I need tools that help me do my job.  I am also wide open to suggestions about how to build ontologies.  I was heartened that, when chatting with other attendees, many told me that my approach had some merit.

I’ll post briefs of the other presentations.  But the above were my main takeaways.

2 Responses to “Ontology for the Intelligence Community: Part I”

  1. Bob DuCharme Says:

    “Before rushing to download…” I was looking forward to checking out the 103 slides, but that link and most of the others here seem to start with an “a” start tag that has no attributes, so they don’t work as links. Is it possible to download those slides? Thanks…

  2. schematique.org » Blog Archive » Getting the measure of things Says:

    […] The problem of ontology alignment presents an interesting diversion from this sort of dialectic. Ontology alignment, as a discipline, focuses on algorithms for alignment. There is in fact a competition, held each year, to find more effective algorithms, in terms of precision and recall. Precision, in information retrieval terms, is the extent of correct matches the algorithm reaches against some pre-established result; recall is the extent of correct matches the algorithm reaches as a percentage of the total matches it finds. Irrespective of how the algorithm operates - and there are a number of techniques it can exploit - it has no knowledge of conceptual schemes. It merely operates on the symbols available to it. The question then becomes - how is the pre-established result in fact established? Who guarantees that the result is itself sensitive to the possibility of incommensurability? There remains the very likely possibility that precision and recall in fact represent varying degrees of false positives - automated matches which correlate to some predefined match, which for some other interpreter, is no match at all. In this sense a matching algorithm can only be measured against a human interpretation; it is therefore a result which can be totally inverted when faced, at the extreme, with a diametrically opposing interpretation. The incommensurabilist will uphold this likelihood; the commensurabilist will see it as a deviate case, to be consciously avoided by diligent professionals. (Something like these attitudes can be found in an interesting description of ontologies in the intelligence community). […]

