Pychinko: Rete-based CWM Clone
(Yet another forward chainer)
Bijan
Parsia, Yarden Katz, Kendall Clark
MINDSWAP
University of
Maryland, College Park
FOAF Camp,
Enschede, Holland, Aug 19 - Aug 20, 2004
Motivation
- CWM is slow
- CWM is very
slow
- CWM is uberslow!
- Introducing a clean, easy to understand implementation
of a classic AI algorithm (Rete)
- Understanding CWM and providing an
alternative engine
- The point: to be faster and cleaner than
CWM, not to do anything fancy
Our solution
- Read the computer science literature from over 20 years
ago (Forgy, 1982) and implement the classic, well-studied algorithm
- CWM uses topological sort
How the Rete works
- Relies on two properties of rule-systems: a) most rules
share components structurally, and b) new additions to the KB tend to
change the network in one local area - i.e., changes don't affect most
of the network
- Builds a network of rules at startup: a new fact is
filtered through the rule network, doing the least amount of redundant
checks
- Rete rocks the house when you have: lots of rules, lots
of recursive application, and a persistent Rete
Example Rete network:
Benchmarks
(generated this
morning!)
Data Set 1
1-pattern rules (matching all triples (?s ?p ?o)), initial fact
base: 700 facts, 700 inferences
Pychinko time: 6.2
sec (take that, CLIPS!)
CWM time: 1 min, 54 sec
For bigger set (2000 facts)...
Pychinko time: 21.35 sec
CWM time: 11
min, 34.4 sec
Data
Set 2
Rules that reuse the same variables on left sides, initial fact
base: 2000
639 inferred facts
Pychinko time: 18.9 sec
CWM time: 1 min, 24.2 sec
Data Set 3
Enzymes data (8k) file, RDFS inference rules
Pychinko time: 172 sec
CWM time: I don't know, it's
still running!
Other data sets (bio-related) have shown results of 16 sec versus ~8
hrs..
This is our
worst-case (very little recursive applications, only a handful of
rules) but CWM's average-best case.
Problems
- Why bother? Just use CLIPS!
- ...But SemWeb community likes Python
- Portability (Pychinko on the Nokia phones = cool
applications)
What do we have?
- Rule processing
- Unoptimized query (integrated to RDFlib & sbp's
Afon parser)
- An implementation of the math/string CWM builtins (we
subclass CWM code)
- Currently in the process of tighter integration with CWM
For the Brave FOAFers..
Future work
- Still a few algorithmic optimizations to be made
- Explore Gator networks, other recent Rete work
- Adding native Python syntax for the rule language, like
Drools for Java (currently there's no integrated, native rule engine
available for Python)
Further reading
Forgy, C.L.: Rete: A Fast Algorithm for the Many Pattern/Many Object
Pattern Match Problem
Artificial Intelligence, 19(1982) 17-37