Drools

Drools Documentation

Mark Proctor

Michael Neale

Michael Frandsen

Sam Griffith Jr.

Edson Tirelli

Fernando Meyer

Kris Verlaenen

4.0.4


Preface
I. Reference Manual
1. Drools 4.0 Release Notes
1.1. What is new in Drools 4.0
1.1.1. Language Expressiveness Enhancements
1.1.2. Core Engine Enhancements
1.1.3. IDE Enhancements
1.1.4. Business Rules Management System - BRMS
1.1.5. Miscellaneous Enhancements
1.2. Upgrade tips from Drools 3.0.x to Drools 4.0.x
1.2.1. API changes
1.2.1.1. Working Memory creation
1.2.1.2. Working Memory Actions
1.2.2. Rule Language Changes
1.2.2.1. Working Memory Actions
1.2.2.2. Primitive support and unboxing
1.2.3. Drools Update Tool
1.2.4. DSL Grammars in Drools 4.0
1.2.5. Rule flow Update for 4.0.2
2. The Rule Engine
2.1. What is a Rule Engine?
2.1.1. Introduction and Background
2.2. Why use a Rule Engine?
2.2.1. Advantages of a Rule Engine
2.2.2. When should you use a Rule Engine?
2.2.3. When not to use a Rule Engine
2.2.4. Scripting or Process Engines
2.2.5. Strong and Loose Coupling
2.3. Knowledge Representation
2.3.1. First Order Logic
2.4. Rete Algorithm
2.5. The Drools Rule Engine
2.5.1. Overview
2.5.2. Authoring
2.5.3. RuleBase
2.5.4. WorkingMemory and Stateful/Stateless Sessions
2.5.4.1. Facts
2.5.4.2. Insertion
2.5.4.3. Retraction
2.5.4.4. Update
2.5.4.5. Globals
2.5.4.6. Shadow Facts
2.5.4.7. Property Change Listener
2.5.4.8. Initial Fact
2.5.5. StatefulSession
2.5.6. Stateless Session
2.5.7. Agenda
2.5.7.1. Conflict Resolution
2.5.7.2. Agenda Groups
2.5.7.3. Agenda Filters
2.5.8. Truth Maintenance with Logical Objects
2.5.8.1. Example Scenario
2.5.8.2. Important note: Equality for Java objects
2.5.9. Event Model
2.5.10. Sequential Mode
3. Installation and Setup (Core and IDE)
3.1. Installing and using
3.1.1. Dependencies and jars
3.1.2. Runtime
3.1.3. Installing IDE (Rule Workbench)
3.1.3.1. Installing GEF (a required dependency)
3.1.3.2. Installing from zip file
3.1.3.3. Installing from the update site
3.2. Setup from source
3.3. Source Checkout
3.4. Build
3.4.1. Building the Source
3.4.2. Building the Manual
3.5. Eclipse
3.5.1. Generating Eclipse Projects
3.5.2. Importing Eclipse Projects
3.5.3. Exporting the IDE Plugin
3.5.4. Building the update site
4. Decision Tables
4.1. Decision tables in spreadsheets
4.1.1. When to use Decision tables
4.1.2. Overview
4.1.3. How decision tables work
4.1.4. Keywords and syntax
4.1.4.1. Syntax of templates
4.1.4.2. Keywords
4.1.5. Creating and integrating Spreadsheet based Decision Tables
4.1.6. Managing business rules in decision tables.
4.1.6.1. Workflow and collaboration.
4.1.6.2. Using spreadsheet features
5. The (Eclipse based) Rule IDE
5.1. Introduction
5.1.1. Features outline
5.1.2. Creating a Rule project
5.1.3. Creating a new rule and wizards
5.1.4. Textual rule editor
5.1.5. Guided editor (rule GUI)
5.1.6. Views
5.1.6.1. The Working Memory View
5.1.6.2. The Agenda View
5.1.6.3. The Global Data View
5.1.6.4. The Audit View
5.1.7. Domain Specific Languages
5.1.7.1. Editing languages
5.1.8. The Rete View
5.1.9. Large drl files
5.1.10. Debugging rules
5.1.10.1. Creating breakpoints
5.1.10.2. Debugging rules
6. The Rule Language
6.1. Overview
6.1.1. A rule file
6.1.2. What makes a rule
6.1.3. Reserved words
6.2. Comments
6.2.1. Single line comment
6.2.2. Multi line comment
6.3. Package
6.3.1. import
6.3.2. expander
6.3.3. global
6.4. Function
6.5. Rule
6.5.1. Rule Attributes
6.5.1.1. no-loop
6.5.1.2. lock-on-active
6.5.1.3. salience
6.5.1.4. agenda-group
6.5.1.5. auto-focus
6.5.1.6. activation-group
6.5.1.7. dialect
6.5.1.8. date-effective
6.5.1.9. date-expires
6.5.1.10. duration
6.5.2. Left Hand Side (when) Conditional Elements
6.5.2.1. Pattern
6.5.2.2. 'and'
6.5.2.3. 'or'
6.5.2.4. 'eval'
6.5.2.5. 'not'
6.5.2.6. 'exists'
6.5.2.7. 'forall'
6.5.2.8. From
6.5.2.9. 'collect'
6.5.2.10. 'accumulate'
6.5.3. The Right Hand Side (then)
6.5.4. A note on auto boxing/unboxing and primitive types
6.6. Query
6.7. Domain Specific Languages
6.7.1. When to use a DSL
6.7.2. Editing and managing a DSL
6.7.3. Using a DSL in your rules
6.7.4. Adding constraints to facts
6.7.5. How it works
6.7.6. Creating a DSL from scratch
6.7.7. Scope and keywords
6.7.8. DSLs in the BRMS and IDE
6.8. Rule Flow
6.8.1. Assigning rules to a ruleflow group
6.8.2. A simple ruleflow
6.8.3. How to build a rule flow
6.8.4. Using a rule flow in your application
6.8.5. Different types of nodes in a ruleflow
6.9. XML Rule Language
6.9.1. When to use XML
6.9.2. The XML format
6.9.3. Legacy Drools 2.x XML rule format
6.9.4. Automatic transforming between formats (XML and DRL)
7. Deployment and Testing
7.1. Deployment options
7.1.1. Deployment using the RuleAgent
7.1.2. Deployment using drl source
7.1.3. Deploying rules in your classpath
7.1.4. Deployable objects, RuleBase, Package etc.
7.1.4.1. DRL and PackageDescr
7.1.4.2. Package
7.1.4.3. RuleBase
7.1.4.4. Serializing
7.1.5. Deployment patterns
7.1.5.1. In process rule building
7.1.5.2. Out of process rule building
7.1.5.3. Some deployment scenarios
7.1.6. Web Services
7.1.7. Future considerations
7.2. Testing
7.2.1. Testing frameworks
7.2.2. FIT for Rules - a rule testing framework
8. The Java Rule Engine API
8.1. Introduction
8.2. How To Use
8.2.1. Building and Registering RuleExecutionSets
8.2.2. Using Stateful and Stateless RuleSessions
8.2.2.1. Globals
8.3. References
9. The BRMS (Business Rule Management System)
9.1. Introduction
9.1.1. What is a BRMS?
9.1.1.1. When to use a BRMS
9.1.1.2. Who uses a BRMS
9.1.2. Features outline
9.2. Administration guide
9.2.1. Installation
9.2.1.1. Supported and recommended platforms
9.2.2. Database configuration
9.2.2.1. Changing the location of the data store
9.2.2.2. Configuring the BRMS to use an external RDBMS
9.2.2.3. Searching and indexing, Version storage
9.2.3. Security
9.2.3.1. Using your containers security and LDAP
9.2.4. Data management
9.2.4.1. Backups
9.2.4.2. Asset list customization
9.2.4.3. Customised selectors for package building
9.2.4.4. Adding your own logos or styles to the BRMS web GUI
9.2.4.5. Import and Export
9.3. Architecture
9.3.1. Building from source
9.3.1.1. Modules
9.3.1.2. Working with Maven 2
9.3.1.3. Working with GWT
9.3.1.4. Debugging, Editing and running with Eclipse
9.3.2. Re-usable components
9.3.3. Versioning and Storage
9.3.4. Contributing
9.4. Quick start guide
9.4.1. Quick start guide
9.4.1.1. Supported browser platforms
9.4.1.2. Initial configuration
9.4.1.3. Writing some rules
9.4.1.4. Finding stuff
9.4.1.5. Deployment
9.4.2. BRMS concepts
9.4.2.1. Rules are assets
9.4.2.2. Categorisation
9.4.2.3. The asset editor
9.4.2.4. Rule authoring
9.4.2.5. Templates of assets/rules
9.4.2.6. Status management
9.4.2.7. Package management
9.4.2.8. Version management
9.4.2.9. Deployment management
9.4.2.10. Navigating and finding rules
9.4.3. The business user perspective
9.4.4. Deployment: Integrating rules with your applications
9.4.4.1. The Rule Agent
9.4.4.2. Manual deployment
10. Examples
10.1. Getting the examples
10.1.1. Hello World
10.1.2. State Example
10.1.2.1. Understanding the State Example
10.1.3. Banking Tutorial
10.1.4. Fibonacci Example
10.1.5. Golfing Example
10.1.5.1. The riddle
10.1.5.2. Launching the example
10.1.5.3. The matching rule
10.1.5.4. Conclusion
10.1.6. Trouble Ticket
10.1.6.1. Executing the Example
10.1.6.2. Platinum gets the best service
10.1.6.3. Silver and Gold
10.1.6.4. Escalating
10.1.6.5. Running it
10.1.7. Pricing Rule Decision Table Example
10.1.7.1. Executing the example
10.1.7.2. The decision table
10.1.8. Shopping Example
10.1.8.1. Running the example
10.1.8.2. Discounts and purchases
10.1.8.3. Calculating the discount
10.1.9.
10.1.9.1. Pet Store Example
10.1.10. Honest Politician Example
10.1.11. Sudoku Example
10.1.11.1. Sudoku Overview
10.1.11.2. Running the Example
10.1.11.3. Java Source and Rules Overview
10.1.11.4. Sudoku Validator Rules (validatorSudoku.drl)
10.1.11.5. Sudoku Solving Rules (solverSudoku.drl)
10.1.11.6. Suggestions for Future Developments
10.1.12.
10.1.12.1. Number Guess
10.1.13. Miss Manners and Benchmarking
10.1.13.1. Introduction
10.1.13.2. In-depth look
10.1.13.3. Output Summary
10.1.14. Conways Game Of Life Example
10.1.15. Insurance Company Risk Factor and Policy price (using BRMS)
10.1.15.1. BRMS editors
10.1.15.2. Introduction
10.1.15.3. The insurance logic
10.1.15.4. Downloading and installing the BRMS
10.1.15.5. Deploying the insurance example in your application server
10.1.15.6. Running the example from the web page
Index

Preface

Part I. Reference Manual

Chapter 1. Drools 4.0 Release Notes

1.1. What is new in Drools 4.0

Drools 4.0 is a major update over the previous Drools 3.0.x series. A whole new set of features was developed, with a special focus on language expressiveness, engine performance and tool availability. The following is a list of the most interesting changes.

1.1.1. Language Expressiveness Enhancements

  • New Conditional Elements: from, collect, accumulate and forall

  • New Field Constraint operators: not matches, not contains, in, not in, memberOf, not memberOf

  • New Implicit Self Reference field: this

  • Full support for Conditional Elements nesting, for First Order Logic completeness.

  • Support for multi-restrictions and constraint connectives && and ||

  • Parser improvements to remove previous language limitations, like character escaping and keyword conflicts

  • Support for pluggable dialects and full support for MVEL scripting language

  • Complete rewrite of DSL engine, allowing for full l10n

  • Fact attributes auto-vivification for return value restrictions and inline-eval constraints

  • Support for nested accessors, property navigation and simplified collection, arrays and maps syntax

  • Improved support for XML rules

1.1.2. Core Engine Enhancements

  • Native support for primitive types, avoiding constant autoboxing

  • Support for transparent optional Shadow Facts

  • Rete Network performance improvements for complex rules

  • Support for Rule-Flows

  • Support for Stateful and Stateless working memories (rule engine sessions)

  • Support for Asynchronous Working Memory actions

  • Rules Engine Agent for hot deployment and BRMS integration

  • Dynamic salience for rules conflict resolution

  • Support for Parameterized Queries

  • Support for halt command

  • Support for sequential execution mode

  • Support for pluggable global variable resolver

1.1.3. IDE Enhancements

  • Support for rule break-points on debugging

  • WYSIWYG support for rule-flows

  • New guided editor for rules authoring

  • Upgrade to support all new engine features

1.1.4. Business Rules Management System - BRMS

  • New BRMS tool

  • User friendly web interface with Web 2.0 Ajax features

  • Package configuration

  • Rule authoring: rules are easy to edit with both the guided editor (drop-down menus) and the text editor

  • Package compilation and deployment

  • Easy deployment with Rule Agent

  • Assets are easy to organize with categories and easy to search

  • Versioning enabled: you can easily replace your assets with previously saved versions

  • JCR compliant rule assets repository

1.1.5. Miscellaneous Enhancements

  • Slimmed down dependencies and smaller memory footprint

1.2. Upgrade tips from Drools 3.0.x to Drools 4.0.x

As mentioned before, Drools 4.0 is a major update over the previous Drools 3.0.x series. Unfortunately, in order to achieve the goals set for this release, some backward compatibility issues were introduced, as discussed on the mailing list and blogs.

This section of the manual is a work in progress and will document a simple how-to on upgrading from Drools 3.0.x to Drools 4.0.x.

1.2.1. API changes

There are a few API changes that are visible to regular users and need to be fixed.

1.2.1.1. Working Memory creation

Drools 3.0.x had only one working memory type, which worked like a stateful working memory. Drools 4.0.x introduces separate APIs for Stateful and Stateless working memories, which are now called Rule Sessions. In Drools 3.0.x, the code to create a working memory was:

Example 1.1. Drools 3.0.x: Working Memory Creation

WorkingMemory wm = rulebase.newWorkingMemory();


In Drools 4.0.x it must be changed to:

Example 1.2. Drools 4.0.x: Stateful Rule Session Creation

StatefulSession wm = rulebase.newStatefulSession();


The StatefulSession object has the same behavior as the Drools 3.0.x WorkingMemory (it even extends the WorkingMemory interface), so there should be no other problems with this fix.

1.2.1.2. Working Memory Actions

Drools 4.0.x now supports pluggable dialects and has built-in support for the Java and MVEL scripting languages. In order to avoid keyword conflicts, the working memory actions were renamed as shown below:

Table 1.1. Working Memory Actions equivalent API methods

Drools 3.0.x                          Drools 4.0.x
WorkingMemory.assertObject()          WorkingMemory.insert()
WorkingMemory.assertLogicalObject()   WorkingMemory.insertLogical()
WorkingMemory.modifyObject()          WorkingMemory.update()

1.2.2. Rule Language Changes

The DRL Rule Language also has some backward incompatible changes, as detailed below.

1.2.2.1. Working Memory Actions

The Working Memory actions in rule consequences were also changed in a similar way to the change made in the API. The following table summarizes the change:

Table 1.2. Working Memory Actions equivalent DRL commands

Drools 3.0.x      Drools 4.0.x
assert()          insert()
assertLogical()   insertLogical()
modify()          update()

1.2.2.2. Primitive support and unboxing

Drools 3.0.x did not have native support for primitive types and, consequently, it auto-boxed all primitives in their respective wrapper classes. That way, any use of a boxed variable binding required a manual unbox.

Drools 4.0.x has full support for primitive types and does not wrap values anymore. So, all previous unwrap method calls must be removed from the DRL.

Example 1.3. Drools 3.0.x manual unwrap

rule "Primitive int manual unbox"
when
    $c : Cheese( $price : price )
then
    $c.setPrice( $price.intValue() * 2 );
end

The above rule in 4.0.x would be:

Example 1.4. Drools 4.0.x primitive support

rule "Primitive support"
when
    $c : Cheese( $price : price )
then
    $c.setPrice( $price * 2 );
end

1.2.3. Drools Update Tool

The Drools Update Tool is a simple program to help with the upgrade of DRL files from Drools 3.0.x to Drools 4.0.x.

At this point, its main objective is to upgrade the memory action calls from 3.0.x to 4.0.x, but expect it to grow over the next few weeks to cover additional scenarios. It is important to note that it does not do a dumb text search and replace in the rules file; it actually parses the rules file and tries to make sure it is not doing anything unexpected. As such, it is a safe tool to use for upgrading large sets of rule files.

The Drools Update Tool is available as a Maven project in the source repository at http://anonsvn.labs.jboss.com/labs/jbossrules/trunk/experimental/drools-update/ and only needs to be checked out and built by running the Maven clean and install goals against the project's pom.xml file. After resolving all the class path dependencies, you can run the tool with the following command:

java -cp $CLASSPATH org.drools.tools.update.UpdateTool -f <filemask> [-d <basedir>] [-s <sufix>]

The program parameters are as follows:

  • -h, --help: shows a short usage summary

  • -d: your source base directory

  • -f: pattern for the files to be updated. The format is the same as used by Ant: * matches a single file or directory name, and ** matches any level of subdirectories. Example: src/main/resources/**/*.drl matches all DRL files inside any subdirectory of /src/main/resources

  • -s, --sufix: the suffix to be added to all updated files
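To get a feel for how a file mask such as the one passed to -f selects files, the following sketch uses the JDK's own glob matcher, whose * and ** wildcards behave much like the Ant-style ones described above. It is only an approximation and not the Update Tool's matching code: the class name FileMaskSketch and the sample paths are made up for illustration, and the JDK glob expects at least one directory level where **/ appears, whereas Ant's ** also matches zero directories.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Illustration only: FileMaskSketch is a made-up class, not part of the Update Tool.
final class FileMaskSketch {

    // Returns true when the path is selected by the glob-style mask.
    static boolean matches(String mask, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + mask);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        String mask = "src/main/resources/**/*.drl";
        // ** crosses directory boundaries, so nested DRL files are selected.
        System.out.println(matches(mask, "src/main/resources/rules/pricing/discount.drl"));
        // Only files ending in .drl are selected.
        System.out.println(matches(mask, "src/main/resources/rules/readme.txt"));
        // A single * stays within one path segment, so this nested file is not selected.
        System.out.println(matches("src/main/resources/*.drl",
                "src/main/resources/rules/discount.drl"));
    }
}
```

Trying a mask against a few representative paths like this before running the tool on a large rule base is a cheap way to check that it selects what you expect.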

1.2.4. DSL Grammars in Drools 4.0

It is important to note that the DSL template engine was rewritten from scratch to improve flexibility. One of the new features of DSL grammars is support for regular expressions. You can now write your mappings using regexps for additional flexibility, as explained in the DSL chapter. As a consequence, you now have to escape characters that have a special meaning in regexps. For example, if previously you had a mapping like:

Example 1.5. Drools 3.0.x mapping

[when][]- the {attr} is in [ {values} ]={attr} in ( {values} )

Now, you need to escape '[' and ']' characters, as they have special meaning in regexps. So, the same mapping in Drools 4.0 would be:

Example 1.6. Drools 4.0.x mapping with escaped characters

[when][]- the {attr} is in \[ {values} \]={attr} in ( {values} )
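Why the brackets need escaping can be seen with plain java.util.regex, independently of Drools. The sketch below is not how the DSL engine performs its expansion; the class DslEscapeSketch, the helper expand and the hand-written patterns (with the {attr} and {values} placeholders replaced by capture groups) are assumptions made purely for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: a hand-rolled stand-in for a DSL mapping, not the Drools DSL engine.
final class DslEscapeSketch {

    // Rewrites the sentence with the replacement if the whole sentence matches the regex;
    // returns null when it does not match.
    static String expand(String regex, String replacement, String sentence) {
        Matcher m = Pattern.compile(regex).matcher(sentence);
        return m.matches() ? m.replaceFirst(replacement) : null;
    }

    public static void main(String[] args) {
        String sentence = "- the color is in [ \"red\", \"blue\" ]";

        // Escaped brackets are matched literally.
        String escaped = "- the (\\w+) is in \\[ (.*) \\]";
        System.out.println(expand(escaped, "$1 in ( $2 )", sentence));

        // Unescaped, "[ (.*) ]" is a character class that matches one single character,
        // so the sentence no longer matches and null is printed.
        String unescaped = "- the (\\w+) is in [ (.*) ]";
        System.out.println(expand(unescaped, "$1 in ( $2 )", sentence));
    }
}
```

With the escaped pattern the sentence is rewritten to color in ( "red", "blue" ), while the unescaped pattern silently fails to match, which is the kind of surprise an unescaped 3.0.x mapping can cause after the upgrade.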

1.2.5. Rule flow Update for 4.0.2

The Rule Flow feature was updated for 4.0.2, and all your ruleflows must now declare a package name.

Figure 1.1. Rule Flow properties


Chapter 2. The Rule Engine

2.1. What is a Rule Engine?

2.1.1. Introduction and Background

Artificial Intelligence (A.I.) is a very broad research area that focuses on "making computers think like people" and includes disciplines such as Neural Networks, Genetic Algorithms, Decision Trees, Frame Systems and Expert Systems. Knowledge representation is the area of A.I. concerned with how knowledge is represented and manipulated. Expert Systems use knowledge representation to facilitate the codification of knowledge into a knowledge base which can be used for reasoning, i.e. we can process data with this knowledge base to infer conclusions. Expert Systems are also known as Knowledge-based Systems and Knowledge-based Expert Systems and are considered 'applied artificial intelligence'. The process of developing with an Expert System is called Knowledge Engineering.

EMYCIN was one of the first "shells" for an Expert System, and was created from the MYCIN medical diagnosis Expert System. Whereas early Expert Systems had their logic hard coded, "shells" separated the logic from the system, providing an easy to use environment for user input. Drools is a Rule Engine that uses the rule-based approach to implement an Expert System and is more correctly classified as a Production Rule System.

The term "Production Rule" originates from formal grammar, which is described as "an abstract structure that describes a formal language precisely, i.e., a set of rules that mathematically delineates a (usually infinite) set of finite-length strings over a (usually finite) alphabet" (Wikipedia).

Business Rule Management Systems build additional value on top of a general purpose Rule Engine by providing business-user-focused systems for rule creation, management, deployment, collaboration, analysis and end user tools. Further adding to this value is the fast evolving and popular methodology "Business Rules Approach", which is helping to formalize the role of Rule Engines in the enterprise.

The term Rule Engine is quite ambiguous in that it can be any system that uses rules, in any form, that can be applied to data to produce outcomes; this includes simple systems like form validation and dynamic expression engines. The book "How to Build a Business Rules Engine" (2004) by Malcolm Chisholm exemplifies this ambiguity. The book is actually about how to build and alter a database schema to hold validation rules, and then shows how to generate VB code from those validation rules to validate data entry. While this is a valid and useful topic for some, it came as quite a surprise to this author, who was unaware at the time of the subtle differences between Rule Engines and was hoping to find some hidden secrets to help improve the Drools engine. JBoss jBPM uses expressions and delegates in its Decision nodes, which control the transitions in a Workflow. Each node it evaluates has a rule set that dictates the transition to undertake; this is also a Rule Engine. While a Production Rule System is a kind of Rule Engine and also an Expert System, the validation and expression evaluation Rule Engines mentioned previously are not Expert Systems.

A Production Rule System is Turing complete, with a focus on knowledge representation to express propositional and first order logic in a concise, unambiguous and declarative manner. The brain of a Production Rule System is an Inference Engine that is able to scale to a large number of rules and facts. The Inference Engine matches facts and data against Production Rules, also called Productions or just Rules, to infer conclusions which result in actions. A Production Rule is a two-part structure using First Order Logic for knowledge representation.

when
    <conditions>
then
    <actions>

The process of matching the new or existing facts against Production Rules is called Pattern Matching, which is performed by the Inference Engine. There are a number of algorithms used for Pattern Matching by Inference Engines including:

  • Linear

  • Rete

  • Treat

  • Leaps

Drools implements and extends the Rete algorithm; Leaps used to be supported but was removed because it was poorly maintained. The Drools Rete implementation is called ReteOO, signifying that Drools has an enhanced and optimized implementation of the Rete algorithm for Object Oriented systems. Other Rete based engines also have marketing terms for their proprietary enhancements to Rete, like RetePlus and Rete III. It is important to understand that names like Rete III are purely marketing terms: unlike the original published Rete algorithm, no details of the implementation are published. This makes questions such as "Does Drools implement Rete III?" nonsensical. The most common enhancements are covered in "Production Matching for Large Learning Systems (Rete/UL)" (1995) by Robert B. Doorenbos.

The Rules are stored in the Production Memory, and the facts that the Inference Engine matches against are stored in the Working Memory. Facts are asserted into the Working Memory, where they may then be modified or retracted. A system with a large number of rules and facts may result in many rules being true for the same fact assertion; these rules are said to be in conflict. The Agenda manages the execution order of these conflicting rules using a Conflict Resolution strategy.

Figure 2.1. A Basic Rete network
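The interplay of Production Memory, Working Memory and Agenda can be sketched in a few lines of plain Java. This is a deliberately naive illustration using the "Linear" matching approach (every rule is tested against every fact), not Rete and not the Drools API: the classes AgendaSketch, Rule and Activation are hypothetical, the "then" part is reduced to recording which rule fired for which fact, and salience (a numeric priority) stands in for a full Conflict Resolution strategy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Illustration only: hypothetical classes, not the Drools API.
final class AgendaSketch {

    // A production rule: a name, a priority and a condition (the "when" part).
    static final class Rule {
        final String name;
        final int salience;
        final Predicate<Object> when;

        Rule(String name, int salience, Predicate<Object> when) {
            this.name = name;
            this.salience = salience;
            this.when = when;
        }
    }

    // A rule whose condition is true for a particular fact, waiting on the agenda.
    static final class Activation {
        final Rule rule;
        final Object fact;

        Activation(Rule rule, Object fact) {
            this.rule = rule;
            this.fact = fact;
        }
    }

    static List<String> fire(List<Rule> productionMemory, List<Object> workingMemory) {
        // Linear pattern matching: test every rule against every fact.
        List<Activation> agenda = new ArrayList<>();
        for (Rule rule : productionMemory) {
            for (Object fact : workingMemory) {
                if (rule.when.test(fact)) {
                    agenda.add(new Activation(rule, fact));
                }
            }
        }
        // Conflict resolution: highest salience first; the stable sort keeps match order for ties.
        agenda.sort(Comparator.comparingInt((Activation a) -> a.rule.salience).reversed());

        // "Fire" each activation; here the action only records what fired.
        List<String> fired = new ArrayList<>();
        for (Activation a : agenda) {
            fired.add(a.rule.name + " <- " + a.fact);
        }
        return fired;
    }

    static List<String> demo() {
        List<Rule> rules = Arrays.asList(
                new Rule("any order", 0, f -> f instanceof Integer),
                new Rule("big order", 10, f -> f instanceof Integer && (Integer) f > 100));
        List<Object> facts = Arrays.<Object>asList(50, 150);
        return fire(rules, facts);
    }

    public static void main(String[] args) {
        // Prints [big order <- 150, any order <- 50, any order <- 150]
        System.out.println(demo());
    }
}
```

Both rules are true for the fact 150, so they are in conflict; the agenda resolves this by firing the rule with the higher salience first. A real engine also re-evaluates matches whenever an action modifies the Working Memory, which this single-pass sketch leaves out.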


A Production Rule System's Inference Engine is stateful and able to enforce truthfulness, which is called Truth Maintenance. A logical relationship can be declared by actions, which means the action's state depends on the inference remaining true; when it is no longer true, the logically dependent action is undone. The "Honest Politician" is an example of Truth Maintenance, which always ensures that hope can only exist for a democracy while we have honest politicians.

when
    an honest Politician exists
then
    logically assert Hope

when
   Hope exists
then
   print "Hurrah!!! Democracy Lives" 

when
   Hope does not exist
then
   print "Democracy is Doomed" 
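The bookkeeping behind these pseudo-rules can be sketched in plain Java: a logically asserted fact stays in memory only while at least one justification still supports it. The class TruthMaintenanceSketch and its methods are hypothetical names chosen for this illustration; this is a simplified model of the idea, not the Drools implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: a simplified model of logical assertions, not the Drools implementation.
final class TruthMaintenanceSketch {

    // Logically asserted fact -> the justifications that currently support it.
    private final Map<String, Set<String>> justifications = new HashMap<>();

    // Assert a fact whose existence depends on the given justification.
    void insertLogical(String fact, String justification) {
        justifications.computeIfAbsent(fact, k -> new HashSet<>()).add(justification);
    }

    // Withdraw a justification; facts left without any support are undone automatically.
    void retractJustification(String justification) {
        justifications.values().forEach(support -> support.remove(justification));
        justifications.values().removeIf(Set::isEmpty);
    }

    boolean exists(String fact) {
        return justifications.containsKey(fact);
    }

    String report() {
        return exists("Hope") ? "Hurrah!!! Democracy Lives" : "Democracy is Doomed";
    }

    public static void main(String[] args) {
        TruthMaintenanceSketch memory = new TruthMaintenanceSketch();
        memory.insertLogical("Hope", "honest politician A");
        memory.insertLogical("Hope", "honest politician B");
        System.out.println(memory.report()); // Hope exists

        memory.retractJustification("honest politician A");
        System.out.println(memory.report()); // still supported by politician B

        memory.retractJustification("honest politician B");
        System.out.println(memory.report()); // no support left, Hope has been undone
    }
}
```

Note that nothing ever retracts Hope explicitly: it disappears as a consequence of its last justification going away, which is the essence of a logical assertion.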

There are two methods of execution for a Production Rule System, Forward Chaining and Backward Chaining; systems that implement both are called Hybrid Production Rule Systems. Understanding these two modes of operation is key to understanding why a Production Rule System is different and how to get the best from it. Forward chaining is 'data-driven' and thus reactive: facts are asserted into the working memory, which results in one or more rules being concurrently true and scheduled for execution by the Agenda. We start with a fact, it propagates, and we end in a conclusion. Drools is a forward chaining engine.

Figure 2.2. Forward Chaining
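The data-driven loop can be sketched in plain Java as follows. This is a minimal illustration over string facts, with hypothetical names (ForwardChainingSketch, Rule, infer) and made-up example rules; it repeatedly applies every rule until no new fact appears, and says nothing about how Drools itself propagates facts.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustration only: a naive forward chaining loop, not how Drools propagates facts.
final class ForwardChainingSketch {

    // If all premises are known facts, the conclusion becomes a fact too.
    static final class Rule {
        final List<String> premises;
        final String conclusion;

        Rule(String conclusion, String... premises) {
            this.conclusion = conclusion;
            this.premises = Arrays.asList(premises);
        }
    }

    // Start from the given facts and keep firing rules until nothing new is inferred.
    static Set<String> infer(Set<String> initialFacts, List<Rule> rules) {
        Set<String> facts = new LinkedHashSet<>(initialFacts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                // Set.add returns true only when the conclusion is a new fact.
                if (facts.containsAll(rule.premises) && facts.add(rule.conclusion)) {
                    changed = true;
                }
            }
        }
        return facts;
    }

    static Set<String> demo() {
        List<Rule> rules = Arrays.asList(
                new Rule("slippery", "wet ground"),
                new Rule("wet ground", "rain"));
        return infer(Collections.singleton("rain"), rules);
    }

    public static void main(String[] args) {
        // Prints [rain, wet ground, slippery]
        System.out.println(demo());
    }
}
```

The rules are listed in the "wrong" order on purpose: the single asserted fact still propagates through both of them, because the loop is driven by the data rather than by the order in which the rules were written.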


Backward chaining is 'goal-driven', meaning that we start with a conclusion which the engine tries to satisfy. If it can't, it then searches for conclusions that it can satisfy, known as 'sub goals', that will help satisfy some unknown part of the current goal. It continues this process until either the initial conclusion is proven or there are no more sub goals. Prolog is an example of a Backward Chaining engine; Drools will be adding support for Backward Chaining in its next major release.

Figure 2.3. Backward Chaining
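The goal-driven search can be sketched as a small recursive procedure in plain Java. As with any illustration of this kind, the names (BackwardChainingSketch, Rule, prove) and the example rules are made up, and the sketch only handles simple string facts; it is not a description of Prolog's resolution or of any planned Drools feature.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: a naive goal-driven search over string facts.
final class BackwardChainingSketch {

    // The conclusion holds if every premise (sub goal) can be proven.
    static final class Rule {
        final List<String> premises;
        final String conclusion;

        Rule(String conclusion, String... premises) {
            this.conclusion = conclusion;
            this.premises = Arrays.asList(premises);
        }
    }

    static boolean prove(String goal, Set<String> facts, List<Rule> rules) {
        return prove(goal, facts, rules, new HashSet<String>());
    }

    private static boolean prove(String goal, Set<String> facts, List<Rule> rules,
                                 Set<String> inProgress) {
        if (facts.contains(goal)) {
            return true; // the goal is already a known fact
        }
        if (!inProgress.add(goal)) {
            return false; // already trying to prove this goal: avoid endless cycles
        }
        try {
            for (Rule rule : rules) {
                if (!rule.conclusion.equals(goal)) {
                    continue; // this rule cannot help with the current goal
                }
                boolean allProven = true;
                for (String subGoal : rule.premises) {
                    if (!prove(subGoal, facts, rules, inProgress)) {
                        allProven = false;
                        break;
                    }
                }
                if (allProven) {
                    return true;
                }
            }
            return false; // no rule could satisfy the goal
        } finally {
            inProgress.remove(goal);
        }
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
                new Rule("slippery", "wet ground"),
                new Rule("wet ground", "rain"));
        Set<String> facts = Collections.singleton("rain");
        System.out.println(prove("slippery", facts, rules)); // true: slippery <- wet ground <- rain
        System.out.println(prove("sunny", facts, rules));    // false: no fact or rule supports it
    }
}
```

In contrast to forward chaining, nothing is inferred here unless it is needed for the goal being asked about: the search starts from the conclusion and works back towards the known facts.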


2.2. Why use a Rule Engine?

Some frequently asked questions:

  1. When should you use a rule engine?

  2. What advantage does a rules engine have over hand coded "if...then" approaches?

  3. Why should you use a rule engine instead of a scripting framework, like BeanShell?

We will attempt to address these questions below.

2.2.1. Advantages of a Rule Engine

  • Declarative Programming

    Rule engines allow you to say "What to do" not "How to do it".

    The key advantage of this point is that using rules can make it easy to express solutions to difficult problems and consequently have those solutions verified (rules are much easier to read than code).

    Rule systems are capable of solving very, very hard problems, providing an explanation of how the solution was arrived at and why each "decision" along the way was made (not so easy with other types of AI systems like neural networks or the human brain - I have no idea why I scratched the side of the car).

  • Logic and Data Separation

    Your data is in your domain objects, the logic is in the rules. This is fundamentally breaking the OO coupling of data and logic, which can be an advantage or a disadvantage depending on your point of view. The upshot is that the logic can be much easier to maintain as there are changes in the future, as the logic is all laid out in rules. This can be especially true if the logic is cross-domain or multi-domain logic. Instead of the logic being spread across many domain objects or controllers, it can all be organized in one or more very distinct rules files.

  • Speed and Scalability

    The Rete algorithm, the Leaps algorithm, and their descendants such as Drools' ReteOO (and Leaps) provide very efficient ways of matching rule patterns to your domain object data. These are especially efficient when you have datasets that do not change entirely (as the rule engine can remember past matches). These algorithms are battle-proven.

  • Centralization of Knowledge

    By using rules, you create a repository of knowledge (a knowledgebase) which is executable. This means it's a single point of truth, for business policy (for instance) - ideally rules are so readable that they can also serve as documentation.

  • Tool Integration

    Tools such as Eclipse (and in future, Web based UIs) provide ways to edit and manage rules and get immediate feedback, validation and content assistance. Auditing and debugging tools are also available.

  • Explanation Facility

    Rule systems effectively provide an "explanation facility" by being able to log the decisions made by the rule engine along with why the decisions were made.

  • Understandable Rules

    By creating object models and, optionally, Domain Specific Languages that model your problem domain, you can set yourself up to write rules that are very close to natural language. They lend themselves to logic that is understandable to domain experts, who may be nontechnical, as the rules are expressed in their language (all the program plumbing, the "How", is in the usual code, hidden away).

2.2.2. When should you use a Rule Engine?

The shortest answer to this is "when there is no satisfactory traditional programming approach to solve the problem". Given that short answer, some more explanation is required. The reason why there is no "traditional" approach is possibly one of the following:

  • The problem is just too fiddly for traditional code.

    The problem may not be complex, but you can't see a non-fragile way of building it.

  • The problem is beyond any obvious algorithm based solution.

    It is a complex problem to solve: there are no obvious traditional solutions, or the problem isn't fully understood.

  • The logic changes often

    The logic itself may be simple (but doesn't have to be) but the rules change quite often. In many organizations software releases are few and far between and rules can help provide the "agility" that is needed and expected in a reasonably safe way.

  • Domain experts (or business analysts) are readily available, but are nontechnical.

    Domain experts are often a wealth of knowledge about business rules and processes. They typically are nontechnical, but can be very logical. Rules can allow them to express the logic in their own terms. Of course, they still have to think critically and be capable of logical thinking (many people in "soft" nontechnical positions do not have training in formal logic, so be careful and work with them, as by codifying business knowledge in rules, you will often expose holes in the way the business rules and processes are currently understood).

If rules are a new technology for your project teams, the overhead in getting going must be factored in. It is not a trivial technology, but this document tries to make it easier to understand.

Typically in a modern OO application you would use a rule engine to contain key parts of your business logic (what that means of course depends on the application), especially the really messy parts! This is an inversion of the OO concept of encapsulating all the logic inside your objects. This is not to say that you throw out OO practices; on the contrary, in any real world application, business logic is just one part of the application. If you ever notice lots of "if", "else", "switch", an overabundance of strategy patterns and/or other messy logic in your code that just doesn't feel right (and you keep coming back to fix it, either because you got it wrong or because the logic or your understanding changes), think about using rules. If you are faced with tough problems for which there are no algorithms or patterns, consider using rules.

Rules could be used embedded in your application or perhaps as a service. Often rules work best as a "stateful" component, hence they are often an integral part of an application. However, there have been successful cases of creating reusable rule services which are stateless.

In your organization it is important to think about the process you will use for updating rules in systems that are in production (the options are many, but different organizations have different requirements - often they are out of the control of the application vendors/project teams).

2.2.3. When not to use a Rule Engine

To quote a Drools mailing list regular (Dave Hamu): "It seems to me that in the excitement of working with rules engines, that people forget that a rules engine is only one piece of a complex application or solution. Rules engines are not really intended to handle workflow or process executions nor are workflow engines or process management tools designed to do rules. Use the right tool for the job. Sure, a pair of pliers can be used as a hammering tool in a pinch, but that's not what it's designed for."

As rule engines are dynamic (dynamic in the sense that the rules can be stored and managed and updated as data), they are often looked at as a solution to the problem of deploying software (most IT departments seem to exist for the purpose of preventing software being rolled out). If this is the reason you wish to use a rule engine, be aware that rule engines work best when you are able to write declarative rules. As an alternative, you can consider data-driven designs (lookup tables), or script/process engines where the scripts are managed in a database and are able to be updated on the fly.

2.2.4. Scripting or Process Engines

Hopefully the preceding sections have explained when you may want to use a rule engine.

Alternatives are script-based engines that provide the dynamism for "changes on the fly" (there are many solutions here).

Alternatively, Process Engines (also capable of workflow) such as jBPM allow you to graphically (or programmatically) describe steps in a process. Those steps can also involve decision points, which are in themselves simple rules. Process engines and rules can often work nicely together, so it is not an either-or proposition.

One key point to note is that some rule engines are really scripting engines. The downside of scripting engines is that you are tightly coupling your application to the scripts (if they are rules, you are effectively calling rules directly), and this may cause more difficulty in future maintenance, as scripts tend to grow in complexity over time. The upside of scripting engines is that they can be easier to implement at first, they give quick results, and they are conceptually simpler for imperative programmers.

Many people have also implemented data-driven systems successfully in the past (where there are control tables that store meta-data that changes your application's behavior) - these can work well when the control can remain very limited. However, they can quickly grow out of control if extended too much (such that only the original creators can change the application's behavior), or they cause the application to stagnate because they are too inflexible.

2.2.5. Strong and Loose Coupling

No doubt you have heard terms like "tight coupling" and "loose coupling" in systems design. Generally people assert that "loose" or "weak" coupling is preferable in design terms, due to the added flexibility it affords. Similarly, you can have "strongly coupled" and "weakly coupled" rules. Strongly coupled in this sense means that one rule "firing" will clearly result in another rule firing, and so on; in other words there is a clear (probably obvious) chain of logic. If your rules are all strongly coupled, the chances are that the rules will be inflexible in future and, more significantly, that a rule engine is overkill: the logic is a clear chain of rules and can be hard coded (a Decision Tree may be in order). This is not to say that strong or weak coupling is inherently bad, but it is a point to keep in mind when considering a rule engine and how you capture the rules. "Loosely" coupled rules should result in a system that allows rules to be changed, removed and added without requiring changes to other, unrelated rules.

2.3. Knowledge Representation

2.3.1. First Order Logic

Rules are written using First Order Logic, or predicate logic, which extends Propositional Logic. Emil Leon Post was the first to develop an inference based system using symbols to express logic - as a consequence of this he was able to prove that any logical system (including mathematics) could be expressed with such a system.

A proposition is a statement that can be classified as true or false. If the truth can be determined from the statement alone it is said to be a "closed statement". In programming terms this is an expression that does not reference any variables:

10 == 2 * 5

Expressions that evaluate against one or more variables, the facts, are "open statements", in that we cannot determine whether the statement is true until we have a variable instance to evaluate against:

Person.sex == "male"

In SQL terms, if we treat the conclusion's action as simply returning the matched fact to the user, the same open statement becomes:

select * from People where People.sex = 'male'

For any rows, which represent our facts, that are returned we have inferred that those facts are male people. The following diagram shows how the above SQL statement and People table can be represented in terms of an Inference Engine.


Figure 2.4. SQL as a simplistic Inference Engine


So in Java we can say that a simple proposition is of the form 'variable' 'operator' 'value', where we often refer to 'value' as being a literal value; a proposition can be thought of as a field constraint. Further to this, propositions can be combined with conjunctive and disjunctive connectives, which is the logic theorist's way of saying '&&' and '||'. The following shows two open propositional statements connected together with a single disjunctive connective.

      
      person.getEyeColor().equals("blue") || person.getEyeColor().equals("green") 
      
    

This can be expressed using a disjunctive Conditional Element connective - which actually results in the generation of two rules, to represent the two possible logic outcomes.

      
      Person( eyeColor == "blue" ) || Person( eyeColor == "green" )
      
    

Disjunctive field constraint connectives could also be used and would not result in multiple rule generation.

      
      Person( eyeColor == "blue" || == "green" )
      
    

Propositional Logic is not Turing complete, limiting the problems you can define, because it cannot express criteria for the structure of data. First Order Logic (FOL), or Predicate Logic, extends Propositional Logic with two new quantifier concepts to allow expressions defining structure - specifically universal and existential quantifiers. Universal quantifiers allow you to check that something is true for everything; normally supported by the 'forall' conditional element. Existential quantifiers check for the existence of something, in that it occurs at least once - this is supported with 'not' and 'exists' conditional elements.
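
As a loose analogy in plain Java (not Drools syntax), the three conditional elements behave like the stream operations anyMatch, allMatch and noneMatch. The sketch below is an illustration under that assumption; the class name QuantifierSketch, its method names and the sample scores are hypothetical.

```java
import java.util.List;

// Plain-Java analogy only; Drools evaluates these quantifiers inside the Rete network.
class QuantifierSketch {

    // Existential: at least one score is below the pass mark (compare 'exists').
    static boolean existsBelow(List<Integer> scores, int passMark) {
        return scores.stream().anyMatch(s -> s < passMark);
    }

    // Universal: every score is at or above the pass mark (compare 'forall').
    static boolean forAllAtLeast(List<Integer> scores, int passMark) {
        return scores.stream().allMatch(s -> s >= passMark);
    }

    // Negated existential: no score is below the pass mark (compare 'not').
    static boolean noneBelow(List<Integer> scores, int passMark) {
        return scores.stream().noneMatch(s -> s < passMark);
    }

    public static void main(String[] args) {
        List<Integer> scores = List.of(72, 38, 55); // hypothetical sample data
        System.out.println(existsBelow(scores, 40));   // true
        System.out.println(forAllAtLeast(scores, 40)); // false
        System.out.println(noneBelow(scores, 40));     // false
    }
}
```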

Imagine we have two classes - Student and Module. Module represents each of the courses the Student attended that semester; a Student references its Modules through a List collection. At the end of the semester each Module has a score. If the Student has a Module score below 40 then they will fail that semester - the existential quantifier can be used with the "less than 40" open proposition to check for the existence of a Module that is true for the specified criteria.

    
    public class Student {
        private String name;
        private List modules;

        ...
    }



    public class Module {
        private String name;
        private String studentName;
        private int score;

        ...
    }
    
    

Java is Turing complete in that you can write code, among other things, to iterate data structures to check for existence. The following should return a List of students who have failed the semester.

    
    List failedStudents = new ArrayList();

    // Outer loop: every student. Inner loop: stop at the first failed module.
    for ( Iterator studentIter = students.iterator(); studentIter.hasNext(); ) {
        Student student = ( Student ) studentIter.next();
        for ( Iterator it = student.getModules().iterator(); it.hasNext(); ) {
            Module module = ( Module ) it.next();
            if ( module.getScore() < 40 ) {
                failedStudents.add( student );
                break;
            }
        }
    }
    
    

Early SQL implementations were not Turing complete as they did not provide quantifiers to access the structure of data. Modern SQL engines do allow nesting of SQL, which can be combined with keywords like 'exists' and 'in'. The following shows SQL and a Rule to return a set of Students who have failed the semester.


    select
        *
    from
        Students s
    where exists (
        select
            *
        from
            Modules m
        where
            m.student_name = s.name and
            m.score < 40
    )

    


    rule "Failed_Students"
    when
        exists( $student : Student() && Module( student == $student, score < 40 ) )
    
    

2.4. Rete Algorithm

The RETE algorithm was invented by Dr. Charles Forgy and documented in his PhD thesis in 1978-79. A simplified version of the paper was published in 1982 (http://citeseer.ist.psu.edu/context/505087/0). The word RETE is Latin for "net", meaning network. The RETE algorithm can be broken into 2 parts: rule compilation and runtime execution.

The compilation algorithm describes how the Rules in the Production Memory are processed to generate an efficient discrimination network. In non-technical terms, a discrimination network is used to filter data. The idea is to filter data as it propagates through the network. At the top of the network the nodes would have many matches, and as we go down the network there would be fewer matches. At the very bottom of the network are the terminal nodes. In Dr. Forgy's 1982 paper, he described 4 basic nodes: root, 1-input, 2-input and terminal.


Figure 2.5. Rete Nodes


The root node is where all objects enter the network. From there, they immediately go to the ObjectTypeNode. The purpose of the ObjectTypeNode is to make sure the engine doesn't do more work than it needs to. For example, say we have 2 objects: Account and Order. If the rule engine tried to evaluate every single node against every object, it would waste a lot of cycles. To make things efficient, the engine should only pass the object to the nodes that match the object type. The easiest way to do this is to create an ObjectTypeNode and have all 1-input and 2-input nodes descend from it. This way, if an application asserts a new Account, it won't propagate to the nodes for the Order object. In Drools, when an object is asserted, the engine retrieves a list of valid ObjectTypeNodes via a lookup in a HashMap keyed on the object's Class; if this list doesn't exist, it scans all the ObjectTypeNodes to find valid matches, which it caches in the list. This enables Drools to match against any Class type that matches with an instanceof check.


Figure 2.6. ObjectTypeNodes
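
The cached lookup described above can be sketched in plain Java as follows. This is a simplified model, not the Drools implementation: the class TypeLookupSketch and its members are hypothetical, and a node is represented only by the Class it accepts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch of the cached type lookup; not the Drools implementation.
class TypeLookupSketch {
    // The types for which some rule has a pattern (one "ObjectTypeNode" each).
    private final List<Class<?>> objectTypes = new ArrayList<>();
    // Cache: concrete class of an inserted object -> matching node types.
    private final Map<Class<?>, List<Class<?>>> cache = new HashMap<>();
    int scans = 0; // counts full scans, to make the effect of the cache visible

    void addObjectType(Class<?> type) {
        objectTypes.add(type);
        cache.clear(); // a new node may match classes that were cached earlier
    }

    List<Class<?>> matchingTypes(Object fact) {
        return cache.computeIfAbsent(fact.getClass(), cls -> {
            scans++;
            List<Class<?>> matches = new ArrayList<>();
            for (Class<?> type : objectTypes) {
                if (type.isInstance(fact)) { // the instanceof check
                    matches.add(type);
                }
            }
            return matches;
        });
    }

    public static void main(String[] args) {
        TypeLookupSketch net = new TypeLookupSketch();
        net.addObjectType(Number.class);
        net.addObjectType(String.class);
        System.out.println(net.matchingTypes(42)); // Integer is matched by the Number node
        System.out.println(net.matchingTypes(7));  // same class, answered from the cache
        System.out.println(net.scans);             // 1
    }
}
```

Because the cache is keyed on the concrete class, subclasses and interface implementations are resolved once per class rather than once per inserted object.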


ObjectTypeNodes can propagate to AlphaNodes, LeftInputAdapterNodes and BetaNodes. AlphaNodes are used to evaluate literal conditions. Although the 1982 paper only covers equality conditions, many RETE implementations support other operations. For example, Account.name == "Mr Trout" is a literal condition. When a rule has multiple literal conditions for a single object type, they are linked together. This means that if an application asserts an Account object, it must first satisfy the first literal condition before it can proceed to the next AlphaNode. In Dr. Forgy's paper, he refers to these as IntraElement conditions. The following shows the AlphaNode combinations for Cheese( name == "cheddar", strength == "strong" ):


Figure 2.7. AlphaNodes


Drools extends Rete by optimizing the propagation from ObjectTypeNode to AlphaNode using hashing. Each time an AlphaNode is added to an ObjectTypeNode, the literal value is added as a key to a HashMap with the AlphaNode as the value. When a new instance enters the ObjectTypeNode, rather than propagating to each AlphaNode, it can instead retrieve the correct AlphaNode from the HashMap, avoiding unnecessary literal checks.
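
A minimal sketch of the hashing idea, assuming a single field compared against String literals. AlphaHashSketch and its nested AlphaNode are illustrative stand-ins, not the Drools classes of the same name.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch; node and field names are illustrative, not Drools classes.
class AlphaHashSketch {
    // Stand-in for an AlphaNode that tests name == literal.
    static final class AlphaNode {
        final String literal;
        int propagations = 0;
        AlphaNode(String literal) { this.literal = literal; }
        void propagate(String name) { propagations++; }
    }

    // literal value -> the AlphaNode that tests for it
    private final Map<String, AlphaNode> byLiteral = new HashMap<>();

    AlphaNode addAlphaNode(String literal) {
        return byLiteral.computeIfAbsent(literal, AlphaNode::new);
    }

    // One hash lookup replaces an equality test against every AlphaNode.
    boolean insert(String cheeseName) {
        AlphaNode node = byLiteral.get(cheeseName);
        if (node == null) {
            return false; // no rule is interested in this value
        }
        node.propagate(cheeseName);
        return true;
    }

    public static void main(String[] args) {
        AlphaHashSketch typeNode = new AlphaHashSketch();
        AlphaNode cheddar = typeNode.addAlphaNode("cheddar");
        typeNode.addAlphaNode("stilton");
        typeNode.insert("cheddar");
        typeNode.insert("brie"); // dropped without touching any AlphaNode
        System.out.println(cheddar.propagations); // 1
    }
}
```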

There are two two-input nodes, JoinNode and NotNode; both are types of BetaNode. BetaNodes are used to compare 2 objects, and their fields, to each other. The objects may be the same or different types. By convention we refer to the two inputs as left and right. The left input for a BetaNode is generally a list of objects; in Drools this is a Tuple. The right input is a single object. Two Nots can be used to implement 'exists' checks. BetaNodes also have memory. The left input is called the Beta Memory and remembers all incoming tuples. The right input is called the Alpha Memory and remembers all incoming objects. Drools extends Rete by performing indexing on the BetaNodes. For instance, if we know that a BetaNode is performing a check on a String field, as each object enters we can do a hash lookup on that String value. This means that when facts enter from the opposite side, instead of iterating over all the facts to find valid joins, we do a lookup that returns potentially valid candidates. At any point where a valid join is found, the Tuple is joined with the Object (this is referred to as a partial match) and then propagated to the next node.
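
The indexing idea can be sketched as a hash join in plain Java. This is a deliberate simplification: tuples and facts are plain Strings, both memories are indexed on one String key, and IndexedJoinSketch is a hypothetical class rather than Drools code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified indexed join on a String key; not the Drools BetaNode code.
class IndexedJoinSketch {
    // Left memory, indexed by join key (e.g. the cheese name bound on the left).
    private final Map<String, List<String>> leftByKey = new HashMap<>();
    // Right memory, indexed by the same key (e.g. a person's favourite cheese).
    private final Map<String, List<String>> rightByKey = new HashMap<>();
    final List<String> partialMatches = new ArrayList<>();

    void assertLeft(String key, String tuple) {
        leftByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(tuple);
        // Only right-side facts stored under the same key are join candidates.
        for (String fact : rightByKey.getOrDefault(key, List.of())) {
            partialMatches.add(tuple + "+" + fact);
        }
    }

    void assertRight(String key, String fact) {
        rightByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(fact);
        // Only left-side tuples stored under the same key are join candidates.
        for (String tuple : leftByKey.getOrDefault(key, List.of())) {
            partialMatches.add(tuple + "+" + fact);
        }
    }

    public static void main(String[] args) {
        IndexedJoinSketch join = new IndexedJoinSketch();
        join.assertLeft("cheddar", "Cheese(cheddar)");
        join.assertRight("cheddar", "Person(A)");
        join.assertRight("stilton", "Person(B)"); // no left tuple under this key yet
        System.out.println(join.partialMatches);  // [Cheese(cheddar)+Person(A)]
    }
}
```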


Figure 2.8. JoinNode


To enable the first Object, in the above case Cheese, to enter the network we use a LeftInputAdapterNode - this takes an Object as an input and propagates a single Object Tuple.

Terminal nodes are used to indicate a single rule has matched all its conditions - at this point we say the rule has a full match. A rule with an 'or' conditional disjunctive connective results in subrule generation for each possible logical branch; thus one rule can have multiple terminal nodes.

Drools also performs node sharing. Many rules repeat the same patterns; node sharing allows us to collapse those patterns so that they don't have to be re-evaluated for every single instance. The following two rules share the same first pattern, but not the last:

    
    rule
    when
        Cheese( $cheddar : name == "cheddar" )
        $person : Person( favouriteCheese == $cheddar )
    then
        System.out.println( $person.getName() + " likes cheddar" );
    end
    
   
    
    rule
    when
        Cheese( $cheddar : name == "cheddar" )
        $person : Person( favouriteCheese != $cheddar )
    then
        System.out.println( $person.getName() + " does not like cheddar" );
    end
    
  

As you can see below, the compiled Rete network shows the alpha node is shared, but the beta nodes are not. Each beta node has its own TerminalNode. Had the second pattern been the same it would have also been shared.


Figure 2.9. Node Sharing
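
One way to picture node sharing is a registry that hands back an existing node whenever an identical constraint is attached again. The sketch below assumes constraints can be compared by their text and ignores the parent node that a real network would also take into account; NodeSharingSketch is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: constraints are compared by text, parents are ignored.
class NodeSharingSketch {
    static final class Node {
        final String constraint;
        Node(String constraint) { this.constraint = constraint; }
    }

    // constraint text -> the single node that evaluates it
    private final Map<String, Node> nodes = new HashMap<>();

    // Reuse the existing node for an identical constraint, otherwise create one.
    Node attach(String constraint) {
        return nodes.computeIfAbsent(constraint, Node::new);
    }

    int nodeCount() {
        return nodes.size();
    }

    public static void main(String[] args) {
        NodeSharingSketch net = new NodeSharingSketch();
        // First rule
        net.attach("Cheese( name == \"cheddar\" )");
        net.attach("Person( favouriteCheese == $cheddar )");
        // Second rule: same first pattern, different second pattern
        net.attach("Cheese( name == \"cheddar\" )");
        net.attach("Person( favouriteCheese != $cheddar )");
        System.out.println(net.nodeCount()); // 3, not 4: the Cheese node is shared
    }
}
```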


2.5. The Drools Rule Engine

2.5.1. Overview

Drools is split into two main parts: Authoring and Runtime.

The authoring process involves the creation of DRL or XML files for rules, which are fed into a parser defined by an Antlr 3 grammar. The parser checks for correctly formed grammar and produces an intermediate structure, the "descr", which is the AST that "describes" the rules. The AST is then passed to the Package Builder, which produces Packages. Package Builder also undertakes any code generation and compilation that is necessary for the creation of the Package. A Package object is self contained and deployable, in that it's a serialized object consisting of one or more rules.


Figure 2.10. Authoring Components


A RuleBase is a runtime component which consists of one or more Packages. Packages can be added and removed from the RuleBase at any time. A RuleBase can instantiate one or more WorkingMemories at any time; a weak reference is maintained, unless configured otherwise. The Working Memory consists of a number of sub components, including Working Memory Event Support, Truth Maintenance System, Agenda and Agenda Event Support. Object insertion may result in the creation of one or more Activations. The Agenda is responsible for scheduling the execution of these Activations.


Figure 2.11. Runtime Components


2.5.2. Authoring


Figure 2.12. PackageBuilder


Four classes are used for authoring: DrlParser, XmlParser, ProcessBuilder and PackageBuilder. The two parser classes produce "descr" (description) AST models from a provided Reader instance. ProcessBuilder reads in an XStream serialization of the Rule Flow. PackageBuilder provides convenience APIs so that you can mostly forget about those classes. The three convenience methods are "addPackageFromDrl", "addPackageFromXml" and "addRuleFlow"; all take an instance of Reader as an argument. The example below shows how to build a package from XML and DRL rule files and a ruleflow file, all of which are on the classpath. Note that all added package sources must be of the same package namespace for the current PackageBuilder instance!

Example 2.1. Building a Package from Multiple Sources

PackageBuilder builder = new PackageBuilder();
builder.addPackageFromDrl( new InputStreamReader( getClass().getResourceAsStream( "package1.drl" ) ) );
builder.addPackageFromXml( new InputStreamReader( getClass().getResourceAsStream( "package2.xml" ) ) );
builder.addRuleFlow( new InputStreamReader( getClass().getResourceAsStream( "ruleflow.rfm" ) ) );
Package pkg = builder.getPackage();      

It is essential that you always check your PackageBuilder for errors before attempting to use it. While the ruleBase does throw an InvalidRulePackage when a broken Package is added, the detailed error information is stripped and only a toString() equivalent is available. If you interrogate the PackageBuilder itself much more information is available.

Example 2.2. Checking the PackageBuilder for errors

PackageBuilder builder = new PackageBuilder();
builder.addPackageFromDrl( new InputStreamReader( getClass().getResourceAsStream( "package1.drl" ) ) );
PackageBuilderErrors errors = builder.getErrors();

PackageBuilder is configurable using PackageBuilderConfiguration class.


Figure 2.13. PackageBuilderConfiguration


It has default values that can be overridden programmatically via setters or on first use via property settings. At the heart of the settings is the ChainedProperties class, which searches a number of locations looking for drools.packagebuilder.conf files; as it finds them it adds the properties to the master properties list, which provides a level of precedence. In order of precedence those locations are: System Properties, user defined file in System Properties, user home directory, working directory, various META-INF locations. Further to this, the drools-compiler jar has the default settings in its META-INF directory.
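
The precedence rule can be illustrated with plain java.util.Properties objects, searched from highest to lowest precedence. This is a sketch of the idea only, not the ChainedProperties implementation; PrecedenceSketch, resolve and the "ECLIPSE" property value are assumptions made for the example.

```java
import java.util.List;
import java.util.Properties;

// Illustrative only; this is not the Drools ChainedProperties class.
class PrecedenceSketch {
    // Sources are listed from highest to lowest precedence.
    static String resolve(String key, List<Properties> sources) {
        for (Properties source : sources) {
            String value = source.getProperty(key);
            if (value != null) {
                return value; // the first (highest-precedence) hit wins
            }
        }
        return null; // not defined anywhere
    }

    public static void main(String[] args) {
        Properties jarDefaults = new Properties(); // lowest precedence
        jarDefaults.setProperty("drools.dialect.java.compiler", "ECLIPSE"); // assumed value
        Properties userHome = new Properties();    // higher precedence
        userHome.setProperty("drools.dialect.java.compiler", "JANINO");

        System.out.println(
            resolve("drools.dialect.java.compiler", List.of(userHome, jarDefaults))); // JANINO
    }
}
```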

Currently the PackageBuilderConfiguration handles the registry of Accumulate functions, the registry of Dialects and the main ClassLoader.

Drools has a pluggable Dialect system, which allows other languages to compile and execute expressions and blocks; the two currently supported dialects are Java and MVEL. Each has its own DialectConfiguration implementation; the javadocs provide details for each setter/getter and the property names used to configure them.


Figure 2.14. JavaDialectConfiguration


The JavaDialectConfiguration allows the compiler and language levels to be specified. You can override the defaults by setting the "drools.dialect.java.compiler" property in a packagebuilder.conf file that the ChainedProperties instance will find, or you can do it at runtime as shown below.

Example 2.3. Configuring the JavaDialectConfiguration to use JANINO via a setter

PackageBuilderConfiguration cfg = new PackageBuilderConfiguration( );
JavaDialectConfiguration javaConf = (JavaDialectConfiguration) cfg.getDialectConfiguration( "java" );
javaConf.setCompiler( JavaDialectConfiguration.JANINO );            

If you do not have Eclipse JDT Core in your classpath you must override the compiler setting before you instantiate the PackageBuilder. You can either do that with a packagebuilder properties file that the ChainedProperties class will find, or you can do it programmatically as shown below; note that this time properties are used to inject the value at startup.

Example 2.4. Configuring the JavaDialectConfiguration to use JANINO

Properties properties = new Properties();
properties.setProperty( "drools.dialect.java.compiler",
                        "JANINO" );
PackageBuilderConfiguration cfg = new PackageBuilderConfiguration( properties );
JavaDialectConfiguration javaConf = (JavaDialectConfiguration) cfg.getDialectConfiguration( "java" );
assertEquals( JavaDialectConfiguration.JANINO,
              javaConf.getCompiler() ); // demonstrate that the compiler is correctly configured            

Currently it allows alternative compilers (Janino, Eclipse JDT) to be specified, different JDK source levels ("1.4" and "1.5") and a parent class loader. The default compiler is Eclipse JDT Core at source level "1.4" with the parent class loader set to "Thread.currentThread().getContextClassLoader()".

The following shows how to specify the JANINO compiler programmatically:

Example 2.5. Configuring the PackageBuilder to use JANINO via a property

PackageBuilderConfiguration conf = new PackageBuilderConfiguration();
conf.setCompiler( PackageBuilderConfiguration.JANINO );
PackageBuilder builder = new PackageBuilder( conf );

The MVELDialectConfiguration is much simpler and only allows strict mode to be turned on and off. By default strict is true, which means all method calls must be type safe either by inference or by explicit typing.


Figure 2.15. MvelDialectConfiguration


2.5.3. RuleBase


Figure 2.16. RuleBaseFactory


A RuleBase is instantiated using the RuleBaseFactory; by default this returns a ReteOO RuleBase. Packages are added, in turn, using the addPackage method. You may specify packages of any namespace, and multiple packages of the same namespace may be added.

Example 2.6. Adding a Package to a new RuleBase

RuleBase ruleBase  = RuleBaseFactory.newRuleBase();
ruleBase.addPackage( pkg  );        


Figure 2.17. RuleBase


A RuleBase contains one or more packages of rules, ready to be used, i.e., they have been validated/compiled etc. A Rule Base is serializable so it can be deployed to JNDI or other such services. Typically, a rulebase would be generated and cached on first use, to save on continually re-generating the Rule Base, which is expensive.

A Rule Base instance is thread safe, in the sense that you can have the one instance shared across threads in your application, which may be a web application, for instance. The most common operation on a rulebase is to create a new rule session; either stateful or stateless.

The Rule Base also holds references to any stateful session that it has spawned, so that if rules are changing (or being added/removed etc. for long running sessions), they can be updated with the latest rules (without necessarily having to restart the session). You can specify not to maintain a reference, but only do so if you know the Rule Base will not be updated. References are not stored for stateless sessions.

ruleBase.newStatefulSession();  // maintains a reference.
ruleBase.newStatefulSession( false ); // do not maintain a reference    

Packages can be added and removed at any time - all changes will be propagated to the existing stateful sessions; don't forget to call fireAllRules() for resulting Activations to fire.

ruleBase.addPackage( pkg );  // Add a package instance
ruleBase.removePackage( "org.com.sample" );  // remove a package, and all its parts, by its namespace
ruleBase.removeRule( "org.com.sample", "my rule" ); // remove a specific rule from a namespace         

While there is a method to remove an individual rule, there is no method to add an individual rule - to achieve this just add a new package with a single rule in it.

RuleBaseConfiguration can be used to specify additional behavior of the RuleBase. A RuleBaseConfiguration is set to immutable after it has been added to a Rule Base. Nearly all the engine optimizations can be turned on and off from here, and the execution behavior can also be set. Users will generally be concerned with insertion behavior (identity or equality) and cross product behavior (remove or keep identity equals cross products).

RuleBaseConfiguration conf = new RuleBaseConfiguration();
conf.setAssertBehaviour( AssertBehaviour.IDENTITY );
conf.setRemoveIdentities( true );
RuleBase ruleBase = RuleBaseFactory.newRuleBase( conf );

Figure 2.18. RuleBaseConfiguration


2.5.4. WorkingMemory and Stateful/Stateless Sessions


Figure 2.19. WorkingMemory


The WorkingMemory holds references to all data that has been "inserted" into it (until retracted), and it is the place where the interaction with your application occurs. Working memories are stateful objects. They may be short-lived or long-lived.

2.5.4.1. Facts

Facts are objects (beans) from your application that you insert into the working memory. Facts are any Java objects which the rules can access. The rule engine does not "clone" facts at all; it is all references/pointers at the end of the day. Facts are your application's data. Strings and other classes without getters and setters are not valid Facts and can't be used with Field Constraints, which rely on the JavaBean standard of getters and setters to interact with the object.

2.5.4.2. Insertion

"Insert" is the act of telling the WorkingMemory about the facts, for example WorkingMemory.insert(yourObject). When you insert a fact, it is examined for matches against the rules. This means ALL of the matching work is done during insertion; however, no rules are executed until you call "fireAllRules()". You don't call "fireAllRules()" until after you have finished inserting your facts. It is a common misunderstanding that the work happens when you call "fireAllRules()". Expert systems typically use the term "assert" or "assertion" to refer to facts made available to the system; however, because "assert" has become a keyword in most languages, we have moved to the "insert" keyword, so expect to hear the two used interchangeably.

When an Object is inserted, a FactHandle is returned. This FactHandle is the token used to represent your inserted Object inside the WorkingMemory; it is also how you will interact with the Working Memory when you wish to retract or modify an object.

Cheese stilton = new Cheese("stilton");
FactHandle stiltonHandle = session.insert( stilton );      
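
The split between matching at insertion time and execution at fireAllRules() time, described at the start of this section, can be modelled with a toy agenda. The sketch is plain Java with hypothetical names (AgendaSketch, pendingActivations); it has a single rule, uses Integer facts, and fires activations in insertion order, which is a simplification of real conflict resolution.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Toy model of the insert/fire split; real matching happens in the Rete network.
class AgendaSketch {
    private final Predicate<Integer> condition;               // the rule's LHS
    private final Deque<Integer> agenda = new ArrayDeque<>(); // pending activations
    final List<String> log = new ArrayList<>();               // consequences that ran

    AgendaSketch(Predicate<Integer> condition) {
        this.condition = condition;
    }

    // Matching happens here: a matching fact creates an activation immediately.
    void insert(int fact) {
        if (condition.test(fact)) {
            agenda.addLast(fact);
        }
    }

    int pendingActivations() {
        return agenda.size();
    }

    // Only now are the consequences (the RHS) executed.
    int fireAllRules() {
        int fired = 0;
        while (!agenda.isEmpty()) {
            log.add("fired for " + agenda.removeFirst());
            fired++;
        }
        return fired;
    }

    public static void main(String[] args) {
        AgendaSketch session = new AgendaSketch(score -> score < 40);
        session.insert(35);
        session.insert(80);
        System.out.println(session.pendingActivations()); // 1, but nothing has run yet
        System.out.println(session.fireAllRules());       // 1
    }
}
```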

As mentioned in the Rule Base section, a Working Memory may operate in two assertion modes, equality and identity; identity is the default.

Identity means the Working Memory uses an IdentityHashMap to store all asserted Objects. New instance assertions always result in the return of a new FactHandle; if an instance is asserted twice then it returns the previous fact handle, i.e. it ignores the second insertion for the same fact.

Equality means the Working Memory uses a HashMap to store all asserted Objects. New instance assertions will only return a new FactHandle if no equal object has already been asserted.
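
The difference between the two modes can be reproduced with the two map types named above. The sketch below is a simplified model in which AssertModeSketch is a hypothetical class and handles are plain ints rather than FactHandle objects.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Sketch of the two assert modes; handles are plain ints here, not FactHandles.
class AssertModeSketch {
    private final Map<Object, Integer> handles;
    private int nextHandle = 1;

    AssertModeSketch(boolean identityMode) {
        // Identity compares keys with ==, equality with equals()/hashCode().
        if (identityMode) {
            this.handles = new IdentityHashMap<>();
        } else {
            this.handles = new HashMap<>();
        }
    }

    // Returns the existing handle if the fact is already known, else a new one.
    int insert(Object fact) {
        Integer existing = handles.get(fact);
        if (existing != null) {
            return existing;
        }
        handles.put(fact, nextHandle);
        return nextHandle++;
    }

    public static void main(String[] args) {
        String a = new String("stilton");
        String b = new String("stilton"); // equal to a, but a different instance

        AssertModeSketch identity = new AssertModeSketch(true);
        System.out.println(identity.insert(a)); // 1
        System.out.println(identity.insert(b)); // 2: different instance, new handle

        AssertModeSketch equality = new AssertModeSketch(false);
        System.out.println(equality.insert(a)); // 1
        System.out.println(equality.insert(b)); // 1: equal object, same handle
    }
}
```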

2.5.4.3. Retraction

"Retraction" is when you retract a fact from the Working Memory, which means it will no longer track and match that fact, and any rules that are activated and dependent on that fact will be cancelled. Note that it is possible to have rules that depend on the "non existence" of a fact, in which case retracting a fact may cause a rule to activate (see the 'not' and 'exists' keywords). Retraction is done using the FactHandle that was returned during the insert.

Cheese stilton = new Cheese("stilton");
FactHandle stiltonHandle = session.insert( stilton );
....
session.retract( stiltonHandle );            

2.5.4.4. Update

The Rule Engine must be notified of modified Facts, so that they can be re-processed. Internally, a modification is actually a retract followed by an insert, so the fact is removed from the WorkingMemory and then processed again from scratch. Use the update method to notify the Working Memory of changed objects, for objects that are not able to notify the Working Memory themselves. Notice that update always takes the modified object as a second parameter; this allows you to specify new instances for immutable objects. The update() method can only be used with objects that have shadow proxies turned on. If you do not use shadow proxies then you must call session.modifyRetract() before making your changes and session.modifyInsert() after the changes.

Cheese stilton = new Cheese("stilton");
FactHandle stiltonHandle = workingMemory.insert( stilton );
....
stilton.setPrice( 100 );
workingMemory.update( stiltonHandle, stilton );              

2.5.4.5. Globals

Globals are named objects that can be passed in to the rule engine; without needing to insert them. Most often these are used for static information, or services that are used in the RHS of a rule, or perhaps a means to return objects from the rule engine. If you use a global on the LHS of a rule, make sure it is immutable. A global must first be declared in the drl before it can be set on the session.

global java.util.List list        

With the Rule Base now aware of the global identifier and its type, any session is able to call session.setGlobal; failure to declare the global type and identifier first will result in an exception being thrown. To set the global on the session use session.setGlobal(identifier, value):

List list = new ArrayList();
session.setGlobal("list", list);           

If a rule evaluates on a global before you set it you will get a NullPointerException.

2.5.4.6. Shadow Facts

A shadow fact is a shallow copy of an asserted object. Shadow facts are cached copies of objects asserted to the working memory. The term shadow facts is commonly known as a feature of JESS (Java Expert System Shell).

The origins of shadow facts trace back to the concept of truth maintenance. The basic idea is that an expert system should guarantee the derived conclusions are accurate. A running system may alter a fact during evaluation. When this occurs, the rule engine must know a modification occurred and handle the change appropriately. There are generally two ways to guarantee truthfulness. The first is to lock all the facts during the inference process. The second is to make a cache copy of an object and force all modifications to go through the rule engine. This way, the changes are processed in an orderly fashion. Shadow facts are particularly important in multi-threaded environments, where an engine is shared by multiple sessions. Without truth maintenance, a system has a difficult time proving the results are accurate. The primary benefit of shadow facts is that they make development easier. When developers are forced to keep track of fact modifications, it can lead to errors which are difficult to debug. Building a moderately complex system using a rule engine is hard enough without adding the burden of tracking changes to facts and when they should notify the rule engine.

Drools 4.0 has full support for Shadow Facts implemented as transparent lazy proxies. Shadow facts are enabled by default and are not visible from external code, not even inside code blocks on rules.

Although shadow facts are a great way of ensuring the engine integrity, they add some overhead to the reasoning process. For this reason, Drools 4.0 supports fine grained control over them, with the ability to enable/disable them for each individual class. To disable shadow facts for all classes, set the following property in a configuration file or system property:

drools.shadowProxy = false

Alternatively, it is possible to disable through an API call:

RuleBaseConfiguration conf = new RuleBaseConfiguration();
conf.setShadowProxy( false );
...
RuleBase ruleBase = RuleBaseFactory.newRuleBase( conf );

To disable the shadow proxy for a list of classes only, use the following property instead:

drools.shadowproxy.exclude = org.domainy.* org.domainx.ClassZ

As shown above, a space separated list is used to specify more than one class, and '*' is used as a wild card.

IMPORTANT: disabling shadow facts for a class inhibits the ability of the engine to keep track of changes to that class's attributes. This means that, once asserted, a fact of that class MUST NOT change any of its attributes, or the engine may start to exhibit unpredictable behavior. It does not help to use update(). The only way to safely change an attribute of a fact whose shadow fact is disabled is to call modifyRetract() before changing the attribute, change the attribute and then call modifyInsert().

2.5.4.7. Property Change Listener

If your fact objects are Java Beans, you can implement a property change listener for them, and then tell the rule engine about it. This means that the engine will automatically know when a fact has changed, and behave accordingly (you don't need to tell it that it is modified). There are proxy libraries that can help automate this (a future version of Drools will bundle some to make it easier). To use the Object in dynamic mode specify true for the second insert parameter.

Cheese stilton = new Cheese("stilton");
FactHandle stiltonHandle = workingMemory.insert( stilton, true );  //specifies that this is a dynamic fact            

To make a JavaBean dynamic, add a PropertyChangeSupport field member along with the two add/remove methods, and make sure that each setter notifies the PropertyChangeSupport instance of the change.

private final PropertyChangeSupport changes = new PropertyChangeSupport( this );
...
public void addPropertyChangeListener(final PropertyChangeListener l) {
    this.changes.addPropertyChangeListener( l );
}

public void removePropertyChangeListener(final PropertyChangeListener l) {
    this.changes.removePropertyChangeListener( l );
}
...

public void setState(final String newState) {
    String oldState = this.state;
    this.state = newState;
    this.changes.firePropertyChange( "state",
                                      oldState,
                                      newState );
}              
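For reference, the fragments above can be assembled into a complete bean. The following is a minimal sketch that uses only the java.beans package; the class names DynamicCheese and DynamicCheeseDemo, the "state" values and the recording listener are assumptions made for illustration. The listener merely stands in for whatever listener the engine registers on a dynamic fact.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

// Illustrative bean (hypothetical name); mirrors the fragments shown above.
class DynamicCheese {
    private final PropertyChangeSupport changes = new PropertyChangeSupport( this );
    private String state;

    DynamicCheese(final String state) {
        this.state = state;
    }

    public void addPropertyChangeListener(final PropertyChangeListener l) {
        this.changes.addPropertyChangeListener( l );
    }

    public void removePropertyChangeListener(final PropertyChangeListener l) {
        this.changes.removePropertyChangeListener( l );
    }

    public String getState() {
        return this.state;
    }

    public void setState(final String newState) {
        final String oldState = this.state;
        this.state = newState;
        // PropertyChangeSupport delivers no event when old and new
        // values are equal and non-null.
        this.changes.firePropertyChange( "state", oldState, newState );
    }
}

class DynamicCheeseDemo {
    // Returns "property=newValue" for every event a listener receives.
    static List<String> recordChanges() {
        final List<String> seen = new ArrayList<String>();
        final DynamicCheese cheese = new DynamicCheese( "fresh" );
        cheese.addPropertyChangeListener(
            event -> seen.add( event.getPropertyName() + "=" + event.getNewValue() ) );
        cheese.setState( "ripe" );
        cheese.setState( "ripe" ); // unchanged value, so no event
        cheese.setState( "aged" );
        return seen;
    }
}
```

Only setters that actually change the value notify the listener, which is why a setter that forgets to call firePropertyChange leaves the listener unaware of the change.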

2.5.4.8. Initial Fact

To support conditional elements like "not" (which will be covered later on), there is a need to "seed" the engine with something known as the "Initial Fact". This fact is a special fact that is not intended to be seen by the user.

On the first working memory action (assert, fireAllRules) on a fresh working memory, the Initial Fact will be propagated through the RETE network. This allows rules that have no LHS, or that do not use normal facts (such as rules that use "from" to pull data from an external source), to be activated. For instance, if a new working memory is created and no facts are asserted, calling fireAllRules will cause the Initial Fact to propagate, possibly activating rules (otherwise nothing would happen, as there are no other facts to start with).

2.5.5. StatefulSession

StatefulSession

Figure 2.20. StatefulSession


The StatefulSession extends the WorkingMemory class. It simply adds async methods and a dispose() method. The RuleBase retains a reference to each StatefulSession it creates, so that it can update them when new rules are added. Calling dispose() is needed to release the StatefulSession reference from the RuleBase; without it you can get memory leaks.

Example 2.7. Creating a StatefulSession

StatefulSession session = ruleBase.newStatefulSession();

2.5.6. Stateless Session

StatelessSession

Figure 2.21. StatelessSession


The StatelessSession wraps the WorkingMemory instead of extending it. Its main focus is on decision service type scenarios.

Example 2.8. Creating a StatelessSession

StatelessSession session = ruleBase.newStatelessSession();
session.execute( new Cheese( "cheddar" ) );

The API is reduced for the problem domain and is thus much simpler, which in turn can make maintenance of those services easier. The RuleBase never retains a reference to the StatelessSession, so dispose() is not needed. A StatelessSession only has an execute() method that takes an object, an array of objects or a collection of objects; there is no insert or fireAllRules. The execute method iterates the objects, inserting each one, and calls fireAllRules() at the end, at which point the session is finished. Should the caller need access to any results information, it can use the executeWithResults method, which returns a StatelessSessionResult. The reason for this is that in remoting situations you do not always want the return payload, so this way it is optional.

setAgendaFilter, setGlobal and setGlobalResolver share their state across sessions, so each call to execute() will use the AgendaFilter that was set, see any previously set globals, and so on.

StatelessSessions do not currently support PropertyChangeListeners.

Async versions of the execute method are supported. Remember to override the ExecutorService implementation when in special managed thread environments such as JEE.

StatelessSessions also support sequential mode, which is a special optimised mode that uses less memory and executes faster; please see the Sequential Mode section for more details.

StatelessSessionResult

Figure 2.22. StatelessSessionResult


StatelessSession.executeWithResults(....) returns a minimal API to examine the session's data. The inserted Objects can be iterated over, queries can be executed and globals retrieved. Once the StatelessSessionResult is serialised it loses the reference to the underlying WorkingMemory and RuleBase, so queries can no longer be executed; however, globals can still be retrieved and objects iterated. To retrieve globals they must be exported from the StatelessSession; the GlobalExporter strategy is set with StatelessSession.setGlobalExporter( GlobalExporter globalExporter ). Two implementations of GlobalExporter are available and users may implement their own strategies. CopyIdentifiersGlobalExporter copies named identifiers into a new GlobalResolver that is passed to the StatelessSessionResult; the constructor takes a String[] array of identifiers, and if no identifiers are specified it copies all identifiers declared in the RuleBase. ReferenceOriginalGlobalExporter just passes a reference to the original GlobalResolver; the latter should be used with care, as identifier instances can be changed at any time by the StatelessSession and the GlobalResolver may not be serialisation friendly.

Example 2.9. GlobalExporter with StatelessSessions

StatelessSession session = ruleBase.newStatelessSession();
session.setGlobalExporter( new CopyIdentifiersGlobalExporter( new String[]{"list"} ) );
StatelessSessionResult result = session.executeWithResults( new Cheese( "stilton" ) );
List list = ( List ) result.getGlobal( "list" );

2.5.7. Agenda

Two Phase Execution

Figure 2.23. Two Phase Execution


The Agenda is a RETE feature. During a Working Memory Action rules may become fully matched and eligible for execution; a single Working Memory Action can result in multiple eligible rules. When a rule is fully matched an Activation is created, referencing the Rule and the matched facts, and placed onto the Agenda. The Agenda controls the execution order of these Activations using a Conflict Resolution strategy.

The engine operates in a "2 phase" mode which is recursive:

  1. Working Memory Actions - this is where most of the work takes place, in either the Consequence or the main Java application process. Once the Consequence has finished or the main Java application process calls fireAllRules(), the engine switches to the Agenda Evaluation phase.

  2. Agenda Evaluation - attempts to select a rule to fire. If no rule is found it exits; otherwise it attempts to fire the found rule, switching the phase back to Working Memory Actions, and the process repeats until the Agenda is empty.

Two Phase Execution

Figure 2.24. Two Phase Execution


The process recurses until the agenda is clear, in which case control returns to the calling application. When Working Memory Actions are taking place, no rules are being fired.
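The alternation between the two phases can be modelled in a few lines of plain Java. This is only a sketch under stated assumptions: TwoPhaseSketch, its two hard-coded "rules" (price-order and send-invoice) and the fact names are hypothetical and do not use the Drools API. It shows that insertion only schedules activations, and that each fired consequence can perform further working memory actions before the next agenda evaluation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the two-phase loop; not the Drools implementation.
class TwoPhaseSketch {
    private final Deque<Runnable> agenda = new ArrayDeque<Runnable>();
    final List<String> log = new ArrayList<String>();

    // Phase 1: a working memory action. Rules may become activated,
    // but nothing fires while the action is taking place.
    void insert(final String fact) {
        log.add( "insert " + fact );
        if ( "order".equals( fact ) ) {
            agenda.push( () -> {
                log.add( "fire price-order" );
                insert( "invoice" ); // the consequence is itself a working memory action
            } );
        } else if ( "invoice".equals( fact ) ) {
            agenda.push( () -> log.add( "fire send-invoice" ) );
        }
    }

    // Phase 2: agenda evaluation. Repeats until the agenda is empty,
    // then control returns to the caller.
    int fireAllRules() {
        int fired = 0;
        while ( !agenda.isEmpty() ) {
            agenda.pop().run();
            fired++;
        }
        return fired;
    }
}
```

Calling insert( "order" ) followed by fireAllRules() fires both hypothetical rules, the second one only because the first consequence inserted a new fact.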

2.5.7.1. Conflict Resolution

Conflict resolution is required when there are multiple rules on the agenda. As firing a rule may have side effects on working memory, the rule engine needs to know in what order the rules should fire (for instance, firing ruleA may cause ruleB to be removed from the agenda).

The default conflict resolution strategies employed by Drools are: Salience and LIFO (last in, first out).

The most visible one is "salience" or priority, in which case a user can specify that a certain rule has a higher priority (by giving it a higher number) than other rules. In that case, the rule with the higher salience will always be preferred. LIFO prioritises based on the assigned Working Memory Action counter value; multiple rules created from the same action have the same value, and the execution order of these is considered arbitrary.
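The ordering described above (salience first, then LIFO on the action counter) can be expressed as a comparator. The sketch below is an illustration only: SketchActivation and SalienceLifoOrder are hypothetical names, not Drools classes, and the real engine's data structures may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for an activation; not the Drools Activation class.
class SketchActivation {
    final String ruleName;
    final int salience;
    final long actionCounter; // Working Memory Action counter value at creation

    SketchActivation(final String ruleName, final int salience, final long actionCounter) {
        this.ruleName = ruleName;
        this.salience = salience;
        this.actionCounter = actionCounter;
    }
}

class SalienceLifoOrder {
    // Higher salience first; for equal salience, the most recent action first.
    // Activations with equal salience and equal counter compare as equal,
    // so their relative order is arbitrary.
    static final Comparator<SketchActivation> ORDER = (a, b) -> {
        if ( a.salience != b.salience ) {
            return Integer.compare( b.salience, a.salience );
        }
        return Long.compare( b.actionCounter, a.actionCounter );
    };

    // Drains a priority queue built with ORDER and returns the rule names.
    static List<String> firingOrder(final List<SketchActivation> activations) {
        final PriorityQueue<SketchActivation> agenda =
            new PriorityQueue<SketchActivation>( 11, ORDER );
        agenda.addAll( activations );
        final List<String> order = new ArrayList<String>();
        while ( !agenda.isEmpty() ) {
            order.add( agenda.poll().ruleName );
        }
        return order;
    }
}
```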

As a general rule, it is a good idea not to count on the rules firing in any particular order, and try and author the rules without worrying about a "flow".

Custom conflict resolution strategies can be specified by setting the Class in the RuleBaseConfiguration method setConflictResolver, or using the property "drools.conflictResolver".

2.5.7.2. Agenda Groups

Agenda groups are a way to partition rules (activations, actually) on the agenda. At any one time, only one group has "focus", which means that only the activations for rules in that group will take effect. You can also have rules "auto focus", which means the focus for a rule's agenda group is taken when that rule's conditions are true.

They are sometimes known as "modules" in CLIPS terminology. Agenda groups are a handy way to create a "flow" between grouped rules. You can switch the group which has focus either from within the rule engine, or from the API. If your rules have a clear need for multiple "phases" or "sequences" of processing, consider using agenda groups for this purpose.

Each time setFocus(...) is called it pushes that Agenda Group onto a stack. When the focus group is empty it is popped off and the next one on the stack evaluates. An Agenda Group can appear in multiple locations on the stack. The default Agenda Group is "MAIN"; all rules which do not specify an Agenda Group are placed there. It is also always the first group on the stack and is given focus by default.
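The focus stack can be pictured with a small model. The following sketch is an assumption-laden illustration in plain Java: FocusStackSketch, addActivation and the group and rule names are hypothetical, and only the push/pop behaviour described above is modelled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the agenda group focus stack; not Drools code.
class FocusStackSketch {
    private final Deque<String> stack = new ArrayDeque<String>();
    private final Map<String, Deque<String>> pending = new HashMap<String, Deque<String>>();

    FocusStackSketch() {
        stack.push( "MAIN" ); // MAIN is always the first group on the stack
    }

    void addActivation(final String group, final String ruleName) {
        Deque<String> queue = pending.get( group );
        if ( queue == null ) {
            queue = new ArrayDeque<String>();
            pending.put( group, queue );
        }
        queue.addLast( ruleName );
    }

    // Pushing the same group twice is allowed; it then appears twice on the stack.
    void setFocus(final String group) {
        stack.push( group );
    }

    // Fires activations of the focused group; an empty group is popped
    // and the next group on the stack takes over.
    List<String> fireAll() {
        final List<String> fired = new ArrayList<String>();
        while ( !stack.isEmpty() ) {
            final Deque<String> queue = pending.get( stack.peek() );
            if ( queue == null || queue.isEmpty() ) {
                stack.pop();
            } else {
                fired.add( queue.pollFirst() );
            }
        }
        stack.push( "MAIN" ); // restore the default focus for later calls
        return fired;
    }
}
```

Because the stack is last in, first out, the group given focus most recently is evaluated first, and MAIN is evaluated once every other group on the stack has emptied.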

2.5.7.3. Agenda Filters

AgendaFilters

Figure 2.25. AgendaFilters


Filters are optional implementations of the filter interface which are used to allow or deny an activation from firing (what you filter on is entirely up to the implementation). Drools provides the following convenience default implementations:

  • RuleNameEndsWithAgendaFilter

  • RuleNameEqualsAgendaFilter

  • RuleNameStartsWithAgendaFilter

  • RuleNameMatchesAgendaFilter

To use a filter, specify it when calling fireAllRules. The following example permits only rules whose names end with the text "Test" to fire; all other rules are filtered out:

workingMemory.fireAllRules( new RuleNameEndsWithAgendaFilter( "Test" ) );      

2.5.8. Truth Maintenance with Logical Objects

In a regular insert, you need to explicitly retract a fact. With logical assertions, the fact that was asserted will be automatically retracted when the conditions that asserted it in the first place are no longer true (it's actually more clever than that! If there are no possible conditions that could support the logical assertion, only then will it be retracted).

Normal insertions are said to be “STATED” (i.e. the Fact has been stated, just like the intuitive concept). Using a HashMap and a counter we track how many times a particular equality is STATED; this means we count how many different instances are equal. When we logically insert an object we are said to justify it, and it is justified by the firing rule. For each logical insertion there can only be one equal object; each subsequent equal logical insertion increases the justification counter for this logical assertion. As justifications are removed, once there are none left the logical object is automatically retracted.
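The justification counter can be illustrated with a HashMap keyed by equality. This is a simplified sketch, not the engine's truth maintenance system: JustificationSketch and its method names are hypothetical, and the interaction with STATED facts is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of justification counting for logical insertions.
class JustificationSketch {
    // Keyed by equals()/hashCode(), so equal instances share one entry.
    private final Map<Object, Integer> justifications = new HashMap<Object, Integer>();

    // Returns true if this call introduced the logical object,
    // false if it only added another justification to an equal one.
    boolean logicalInsert(final Object fact) {
        final Integer count = justifications.get( fact );
        justifications.put( fact, count == null ? 1 : count + 1 );
        return count == null;
    }

    // Returns true if the object was retracted because its last
    // justification was removed.
    boolean removeJustification(final Object fact) {
        final Integer count = justifications.get( fact );
        if ( count == null ) {
            return false;
        }
        if ( count == 1 ) {
            justifications.remove( fact );
            return true;
        }
        justifications.put( fact, count - 1 );
        return false;
    }

    boolean isPresent(final Object fact) {
        return justifications.containsKey( fact );
    }
}
```

With two equal objects logically inserted by two different rule firings, the object stays present after the first justification is removed and disappears only after the second.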

If we logically insert an object when there is an equal STATED object it will fail and return null. If we STATE an object that has an existing equal object that is JUSTIFIED we override the Fact; how this override works depends on the configuration setting "WM_BEHAVIOR_PRESERVE". When the property is set to discard we use the existing handle and replace the existing instance with the new Object (this is the default behavior); otherwise we override it to STATED but we create a new FactHandle.

This can be confusing on a first read, so hopefully the flow charts below help. When it says that it returns a new FactHandle, this also indicates the Object was propagated through the network.

Stated Insertion

Figure 2.26. Stated Insertion


Logical Insertion

Figure 2.27. Logical Insertion


2.5.8.1. Example Scenario

An example may make things clearer. Imagine a credit card processing application, processing transactions for a given account (and we have a working memory accumulating knowledge about a single account's transactions). The rule engine is doing its best to decide if transactions are possibly fraudulent or not. Imagine this rule base basically has rules that kick in when there is "reason to be suspicious" and when "everything is normal".

Of course there are many rules that operate no matter what (performing standard calculations, etc.). Now there are possibly many reasons as to what could trigger a "reason to be suspicious": someone notifying the bank, a sequence of large transactions, transactions for geographically disparate locations or even reports of credit card theft. Rather than scattering all the little conditions across lots of rules, imagine there is a fact class called "SuspiciousAccount".

Then there can be a series of rules whose job is to look for things that may raise suspicion, and if they fire, they simply insert a new SuspiciousAccount() instance. All the other rules just have conditions like "not SuspiciousAccount()" or "SuspiciousAccount()" depending on their needs. Note that this has the advantage of allowing there to be many rules around raising suspicion, without touching the other rules. When the facts causing the SuspiciousAccount() insertion are removed, the rule engine reverts back to the normal "mode" of operation (and for instance, a rule with "not SuspiciousAccount()" may kick in which flushes through any interrupted transactions).

If you have followed this far, you will note that truth maintenance, like logical assertions, allows rules to behave a little like a human would, and can certainly make the rules more manageable.

2.5.8.2. Important note: Equality for Java objects

It is important to note that for Truth Maintenance (and logical assertions) to work at all, your Fact objects (which may be JavaBeans) must override the equals and hashCode methods (from java.lang.Object) correctly. As the truth maintenance system needs to know when 2 different physical objects are equal in value, BOTH equals and hashCode must be overridden correctly, as per the Java standard.

Two objects are equal if and only if their equals methods return true for each other and if their hashCode methods return the same values. See the Java API for more details (but do keep in mind you MUST override both equals and hashCode).
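A value-based fact class that overrides both methods might look like the following minimal sketch. The class name SuspiciousAccountFact and its accountId field are assumptions chosen to echo the scenario above; the field is assumed to be non-null.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical fact class; equality is defined by accountId only.
final class SuspiciousAccountFact {
    private final String accountId; // assumed non-null in this sketch

    SuspiciousAccountFact(final String accountId) {
        this.accountId = accountId;
    }

    @Override
    public boolean equals(final Object other) {
        if ( this == other ) {
            return true;
        }
        if ( !(other instanceof SuspiciousAccountFact) ) {
            return false;
        }
        return this.accountId.equals( ((SuspiciousAccountFact) other).accountId );
    }

    // Must be consistent with equals: equal objects return equal hash codes.
    @Override
    public int hashCode() {
        return this.accountId.hashCode();
    }
}

class EqualityDemo {
    // Counts how many distinct facts a hash-based collection sees.
    static int distinctCount(final SuspiciousAccountFact... facts) {
        final Set<SuspiciousAccountFact> set = new HashSet<SuspiciousAccountFact>();
        for ( final SuspiciousAccountFact fact : facts ) {
            set.add( fact );
        }
        return set.size();
    }
}
```

Two separately constructed instances with the same accountId are then treated as one value by any hash-based structure, which is the property a HashMap-based justification count relies on.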

2.5.9. Event Model

The event package provides means to be notified of rule engine events, including rules firing, objects being asserted, etc. This allows you to separate out logging/auditing activities from the main part of your application (and the rules) - as events are a cross cutting concern.

There are three types of event listeners: WorkingMemoryEventListener, AgendaEventListener and RuleFlowEventListener.

WorkingMemoryEventListener

Figure 2.28. WorkingMemoryEventListener


AgendaEventListener

Figure 2.29. AgendaEventListener


RuleFlowEventListener

Figure 2.30. RuleFlowEventListener


Both stateful and stateless sessions implement the EventManager interface, which allows event listeners to be added to the session.

EventManager

Figure 2.31. EventManager


All EventListeners have default implementations that implement each method but do nothing. These are convenience classes that you can inherit from to save having to implement each method: DefaultAgendaEventListener, DefaultWorkingMemoryEventListener and DefaultRuleFlowEventListener. The following shows how to extend DefaultAgendaEventListener and add it to the session; the example prints a statement only when a rule is fired:

session.addEventListener( new DefaultAgendaEventListener() {                            
   public void afterActivationFired(AfterActivationFiredEvent event) {
       super.afterActivationFired( event );
       System.out.println( event );
   }
});       

Drools also provides DebugWorkingMemoryEventListener, DebugAgendaEventListener and DebugRuleFlowEventListener, which implement each method with a debug print statement:

session.addEventListener( new DebugWorkingMemoryEventListener() );        

The Eclipse based Rule IDE also provides an audit logger and graphical viewer, so that the rule engine can log events for later viewing, and auditing.

2.5.10. Sequential Mode

With Rete you have a stateful session where objects can be asserted and modified over time, and where rules can also be added and removed. Now what happens if we assume a stateless session, where after the initial data set no more data can be asserted or modified (no rule re-evaluations) and rules cannot be added or removed? This means we can start to make assumptions to minimize the work the engine has to do:

  1. Order the Rules by salience and position in the ruleset (just sets a sequence attribute on the rule terminal node).

  2. Create an array, one element for each possible rule activation; element position indicates firing order.

  3. Turn off all node memories, except the right-input Object memory.

  4. Disconnect the LeftInputAdapterNode propagation, and have the Object plus the Node referenced in a Command object, which is added to a list on the WorkingMemory for later execution.

  5. Assert all objects, when all assertions are finished and thus right-input node memories are populated check the Command list and execute each in turn.

  6. All resulting Activations should be placed in the array, based upon the determined sequence number of the Rule. Record the first and last populated elements, to reduce the iteration range.

  7. Iterate the array of Activations, executing each populated element in turn.

  8. If we have a maximum number of allowed rule executions, we can exit our network evaluations early to fire all the rules in the array.

The LeftInputAdapterNode no longer creates a Tuple, adds the Object, and then propagates the Tuple; instead a Command Object is created and added to a list in the Working Memory. This Command Object holds a reference to the LeftInputAdapterNode and the propagated Object. This stops any left-input propagations at insertion time, so that we know that a right-input propagation will never need to attempt a join with the left-inputs (removing the need for left-input memory). All nodes have their memory turned off, including the left-input Tuple memory but excluding the right-input Object memory; i.e. the only node memory that remembers an insertion propagation is the right-input Object memory. Once all the assertions are finished, and all right-input memories populated, we can then iterate the list of LeftInputAdapterNode Command objects, calling each in turn. They will propagate down the network, attempting to join with the right-input objects, but they are not remembered in the left input, as we know there will be no further object assertions and thus no further propagations into the right-input memory.

There is no longer an Agenda with a priority queue to schedule the Tuples; instead there is simply an array sized for the number of rules. The sequence number of the RuleTerminalNode indicates the element within the array in which to place the Activation. Once all Command Objects have finished we can iterate our array, checking each element in turn and firing the Activations if they exist. To improve performance, we record the first and last populated cells in the array. The network is constructed so that each RuleTerminalNode is given a sequence number, based on its salience number and its order of being added to the network.
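The array-based firing order can be sketched in plain Java. This is a simplified illustration, not the engine's implementation: SequentialSketch and its methods are hypothetical, each rule is assumed to produce at most one activation, and the matching step is reduced to a boolean flag per rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the sequential mode activation array.
class SequentialSketch {
    // Assigns a sequence number to each rule: higher salience first,
    // then the order in which the rule was added (its array position).
    static int[] sequenceNumbers(final int[] salience) {
        final List<Integer> positions = new ArrayList<Integer>();
        for ( int i = 0; i < salience.length; i++ ) {
            positions.add( i );
        }
        positions.sort( (a, b) -> salience[a] != salience[b]
            ? Integer.compare( salience[b], salience[a] )
            : Integer.compare( a, b ) );
        final int[] sequence = new int[salience.length];
        for ( int s = 0; s < positions.size(); s++ ) {
            sequence[positions.get( s )] = s;
        }
        return sequence;
    }

    // Places each matched rule in the slot given by its sequence number,
    // records the first and last populated slots, then fires in slot order.
    static List<String> fire(final String[] ruleNames, final int[] salience, final boolean[] matched) {
        final int[] sequence = sequenceNumbers( salience );
        final String[] slots = new String[ruleNames.length];
        int first = Integer.MAX_VALUE;
        int last = -1;
        for ( int i = 0; i < ruleNames.length; i++ ) {
            if ( !matched[i] ) {
                continue;
            }
            slots[sequence[i]] = ruleNames[i];
            first = Math.min( first, sequence[i] );
            last = Math.max( last, sequence[i] );
        }
        final List<String> fired = new ArrayList<String>();
        // When nothing matched, first > last and the loop body never runs.
        for ( int s = first; s <= last; s++ ) {
            if ( slots[s] != null ) {
                fired.add( slots[s] );
            }
        }
        return fired;
    }
}
```

No priority queue is involved: the firing order is fixed when the sequence numbers are assigned, so firing is a single pass over the populated range of the array.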

Ty