This component is a forward-chaining rule engine. The rule engine is programmed via a specific XML dialect described in this document, and the rule XML is interpreted by this component. P6R's rule engine uses XPath 2.0 as its expression language, which also allows an application to extend the rule engine's behavior via the XPath interface P6R::p6IXpathVariables (which lets an application define its own functions and variables that are recognized in an XPath expression). XPath 2.0 also contains powerful functionality such as regular expressions and string functions. P6R's rule engine is unique in that it supports both XML and JSON encoded data for its stated fact tree (see below).
An XPath Enabled Rule Engine - https://www.p6r.com/articles/2008/05/22/an-xpath-enabled-rule-engine/
Compiled Rule Sets - https://www.p6r.com/articles/2008/09/17/compiled-rule-sets/
Rule Throttling - https://www.p6r.com/articles/2008/12/17/rule-throttling/
The basic building block is the rule and it has the following form:
<rule name='abc' setname='rover' priority='5'>
    <if test='XPath expression' />
    <then>
        One or more actions
    </then>
</rule>
Every rule must have an 'if' and 'then' element. The 'if' element defines the conditions when the rule is true and thus can be executed. The 'then' element defines one or more actions which are evaluated when a rule is executed. (Note, that the 'if' element is an empty element which must appear before the 'then' element.)
Rules can be grouped into rule sets, where each set is defined by a unique name. A rule defines which set it belongs to by the use of the optional attribute 'setname'. If no 'setname' attribute is defined then the rule is placed in the default rule set, which is named '#default'. (Note, that the rule set name '#default' is reserved.)
Rules are grouped under a parent element of '<rulesets> ... </rulesets>', allowing multiple rules and rule sets to be defined by a single XML document.
Rules can also use the optional attribute 'priority', which defines the relative importance of a rule compared with all the other rules in the same rule set. If no 'priority' attribute is used then a priority of zero (the lowest level) is the default.
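The pieces described so far (the 'rulesets' parent, the optional 'setname' and 'priority' attributes, and the 'if' / 'then' pair) can be put together as in the minimal sketch below. The rule names, the 'kitchen' rule set and the '/menu' fact paths are illustrative assumptions, not names defined by the rule engine.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0'>
    <!-- No 'setname' and no 'priority': this rule lands in the reserved
         '#default' rule set with priority zero. -->
    <rule name='check-menu'>
        <if test='exists(/menu)' />
        <then>
            <halt />
        </then>
    </rule>

    <!-- Two rules in the hypothetical 'kitchen' rule set. Priority 8 is
         more important than priority 3 within that set. -->
    <rule name='soup-first' setname='kitchen' priority='8'>
        <if test='exists(/menu/maincourse/soup)' />
        <then>
            <halt />
        </then>
    </rule>
    <rule name='dessert-later' setname='kitchen' priority='3'>
        <if test='exists(/menu/dessert/item)' />
        <then>
            <halt />
        </then>
    </rule>
</rulesets>
```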
In the 'if' element, the value of the 'test' attribute can be any valid XPath 2.0 expression. These expressions are evaluated against the XML (or JSON) 'fact' tree created by a call to the startFacts() API method. This method allows the application to define 'stated' facts to be used in the evaluation of rules.
When a rule's 'if' element evaluates to true and the rule is selected for execution, each action defined in its 'then' element is executed in order of appearance. There are several possible actions; each is defined in the next couple of sections.
Note, that because we use XML as our base language several rule engine features require the use of XML namespaces. These are described in detail below.
There are two types of global variables. The first kind is defined by the 'variable' element, for example: <variable name='$g1' setname='example1' select='XPath expression' /> (where 'select' and 'setname' are optional). Variable declarations can occur anywhere after the initial 'rulesets' element, but must not be inside a rule definition.
A global variable belongs to a rule set, which is named by the 'setname' attribute. If no setname is defined then the '#default' set is used. A global variable can have its value initialized to any valid XPath value via the 'select' attribute. A global variable can have its value changed at any time by the use of the 'set-variable' action in a 'then' element (see below, "The Rule 'Then' Element" for details). A global variable is only visible to rules inside its rule set, thus multiple rule sets can define a variable with the same name.
Variable names must start with a dollar sign ('$'), for example '$g1'. This is an XPath requirement and allows the XPath parser to recognize variables. Thus any global variable can be passed into any function inside an XPath expression.
The second kind of global variable is defined by the application using the rule engine. These variables also start with a dollar sign but are NOT defined by any 'variable' element. To define their value the calling application must create an implementation of the P6R::p6IXpathVariables interface and pass that object into the rule engine API function setExternalFunctions( p6IXpathVariables *pConnector ). The p6IXpathVariables interface lets the calling application extend the rule engine functionality by defining any number of global variables and any number of functions that can be called inside an XPath expression. A global variable defined in this manner can have any valid XPath type and can have a different value each time it is accessed. In addition, the calling application can call the setExternalFunctions() function multiple times between calls to evaluate() if desired.
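As a rough illustration of the two kinds of global variables, the sketch below declares two variables with the 'variable' element and also reads one that is not declared anywhere in the rule XML. The names '$limit', '$scratch', '$externalVar1' and 'use-globals' are assumptions made for this example; '$externalVar1' would have to be supplied by the application's own p6IXpathVariables implementation.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0'>
    <!-- Declared globals, both in the '#default' rule set. -->
    <variable name='$limit' select='67' />
    <variable name='$scratch' />

    <rule name='use-globals'>
        <!-- $externalVar1 has no 'variable' element: it is assumed to be
             provided by the application via setExternalFunctions(). -->
        <if test='$externalVar1 and $limit gt 50' />
        <then>
            <set-variable name='$scratch' select='$limit * 2' />
            <halt />
        </then>
    </rule>
</rulesets>
```

Because both declared variables live in '#default', a rule in any other rule set would not see them and would need its own declarations.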
1) <call func='XPath Expression' /> The 'func' parameter is required. The func expression should contain a call to a function that is either a built-in XPath 2.0 function, or one defined by the calling application via the P6R::p6IXpathVariables interface and the rule engine setExternalFunctions() method. For example, the XPath expression could be something like "setclick( 'abc', $g1 )", which would be a function defined by the application with two parameters being passed to it. Note, that XPath 2.0 allows any number of parameters to be passed to an application defined function; the parameters to that function must be valid XPath types (e.g., strings, integers, XML node sequences). This action is a good way to call out to application defined functions to execute an external action when a specific condition is satisfied (i.e., when the 'if' element condition is true).
2) <set-variable name='$g1' select='XPath expression' /> The 'name' and 'select' parameters are required. This action first evaluates the XPath expression defined in the 'select' parameter. The result of that expression is assigned to the global variable designated in the 'name' parameter. If that global variable does not already exist (i.e., it has not been defined by a 'variable' element; see the section above on Global Variables), a fatal error will occur: the evaluate() API call will fail and rule execution will stop immediately. The rule set that this action is defined in is the rule set used to look up the global variable (e.g., for "rule setname="example1" ...", the 'example1' rule set would be used for this action).
3) <set-focus setname='over' /> The 'setname' parameter is required. This action changes the rule set that is currently being evaluated. The change will occur after the current rule is executed. Thus any actions that follow the set-focus action in the same rule will not be affected. If the rule set identified by the setname parameter does not exist then a fatal error will occur: the evaluate() API call will fail and rule execution will stop immediately.
4) <set-fact name='rover' separator='-+-' select='XPath expression' location='/P6R:infer ...' /> The 'name' parameter is required; 'select', 'location' and 'separator' are optional. This action adds one or more inferred facts onto the inferred fact XML tree under "/P6R:infer". The value of the 'name' parameter defines the XML node to be added, so if "name='rover'" the path to the new inferred fact will be "/P6R:infer/rover".
However, if the 'location' parameter is also defined, for example "location='/P6R:infer/part1'", then the new inferred fact will be added at the path "/P6R:infer/part1/rover", assuming that "part1" is already in the tree. The 'select' parameter will evaluate to a valid XPath type value and that value will be added at the location in the inferred fact tree. Thus it is the value of the select parameter that determines the value of the inferred fact.
The action <set-fact name='section1' /> will simply add a new XML node at the path "/P6R:infer/section1". This is useful when different rule sets need to create separate subtrees in the inferred fact tree, each for the use of one rule set only. Likewise, the action "<set-fact name='paragraph44' location='/P6R:infer/section1' />" would create the following nodes in the inferred fact tree: "/P6R:infer/section1/paragraph44". Once this path is created, any number of facts can be "hung off of" any part of that tree (i.e., at the 'section1' branch or at the 'paragraph44' branch).
The "separator" attribute is optional and is a string value. It can be used to separate the elements of an XPath node set which may end up as a result of the "select" attribute. If a node set is not the result of the select attribute then the separator is ignored. If no separator is specified then a single space is used as the default separator.
5) <clear-facts /> This action deletes the entire inferred fact tree (i.e., "/P6R:infer"). After this action new inferred facts can be added to a new inferred fact tree. The result of this action is identical to the result of the API call P6R::p6IRuleEngine::reset( P6RULE_CLEAR_FACTS ).
6) <delete-fact location='/P6R:infer ...' /> This action removes all the inferred facts defined in the inferred fact tree located at the path defined by the 'location' parameter. After this action, new inferred facts can always be added back onto the tree at the same location that was deleted.
7) <halt /> There are no parameters to this action. This action stops rule engine evaluation immediately. Any additional actions that are defined after the 'halt' are ignored. This action results in a successful return of the rule engine's evaluate() API call (i.e., it does not represent an error condition).
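Several of the actions listed above can be combined in one small rule file, as in the sketch below. It assumes stated facts shaped like '/menu/maincourse/soup' and '/menu/dessert/item', a hypothetical application function 'notify' registered through setExternalFunctions(), and that the '#default' rule set is the one being evaluated first; none of these are guaranteed by the rule engine itself.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0' xmlns:P6R='http://www.p6r.com/ruleengine'>
    <variable name='$soup' />

    <rule name='record-soup' priority='5'>
        <if test='exists(/menu/maincourse/soup)' />
        <then>
            <!-- Copy a stated fact into a global of the '#default' set. -->
            <set-variable name='$soup' select='/menu/maincourse/soup' />
            <!-- 'notify' is a hypothetical application-defined function. -->
            <call func="notify( 'soup found', $soup )" />
            <!-- Create /P6R:infer/meal, then hang facts off that branch. -->
            <set-fact name='meal' />
            <set-fact name='soup' select='$soup' location='/P6R:infer/meal' />
            <!-- If the select yields a node set, its elements are
                 separated by the given separator string. -->
            <set-fact name='desserts' separator=', '
                      select='/menu/dessert/item' location='/P6R:infer/meal' />
            <!-- Takes effect only after this rule has finished. -->
            <set-focus setname='cleanup' />
        </then>
    </rule>

    <rule name='finish' setname='cleanup'>
        <if test='exists(/P6R:infer/meal)' />
        <then>
            <delete-fact location='/P6R:infer/meal/desserts' />
            <!-- Successful stop: evaluate() returns without error. -->
            <halt />
        </then>
    </rule>
</rulesets>
```

Placing 'set-focus' last in the first rule is only for readability: the text above states that the switch happens after the current rule completes regardless of where the action appears.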
The "Agenda" is a standard rule engine concept. There is one agenda for each rule set and basically it is just a list of rules whose 'if' element has evaluated to true. None of the rules on the agenda have been executed but they are on the list because they could be executed.
The rule engine works in multiple passes. Once a rule set is selected all the rules in that rule set have their 'if' elements evaluated. All rules that have their 'if' elements evaluate to true are added to the agenda. The rules in the agenda are organized by rule priority. And so the rule that actually gets to run is the first rule with the highest priority. This selection of the rule to run, out of several possible rules, is referred to as conflict resolution.
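A small sketch of conflict resolution, assuming both rules below are in the rule set currently being evaluated: each 'if' is true, so both rules are placed on the agenda, and the priority 8 rule is the one expected to run first. The rule and fact names are made up for the example.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0' xmlns:P6R='http://www.p6r.com/ruleengine'>
    <rule name='low' priority='2'>
        <if test='true()' />
        <then>
            <set-fact name='ran-low' />
        </then>
    </rule>

    <!-- Higher priority: selected ahead of 'low' when both are on the agenda. -->
    <rule name='high' priority='8'>
        <if test='true()' />
        <then>
            <set-fact name='ran-high' />
            <halt />
        </then>
    </rule>
</rulesets>
```

Because 'high' ends with a 'halt' action, 'low' would stay on the agenda without being executed during that evaluate() call.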
The rule that is selected to be run is removed from the agenda, but all the others remain. The rule engine API allows the application to clear the agendas in two ways if desired. First, the function P6R::p6IRuleEngine::reset() can be called with the P6RULERESETS flag 'P6RULE_CLEAR_AGENDAS' between calls to the evaluate() method. Second, the initialize() method can be called with flags that contain the 'P6RULE_CLEARAGENDAS' value. When the initialize() flag is used, agendas will be cleared each and every time after a rule has been selected for execution.
The agenda is an excellent optimization since it reduces the number of times that 'if' elements have to be evaluated, each of which can involve evaluating an XPath expression against the fact tree.
P6R's rule engine has two types of facts that can be accessed in any XPath expression. The first type of fact is referred to as a "stated" fact and is declared to the system by the API functions P6R::p6IRuleEngine::startFacts() and P6R::p6IRuleEngine::continueFacts(), which both take buffers of XML (or JSON with the P6RULE_USEJSON flag). Any XML (or JSON) that the application wishes to use can be streamed into these functions. The result is a standard DOM tree created by using the P6R::p6IDOMXML component. In addition, this tree is then automatically referenceable via the XPath 2.0 expression language embedded into our rule engine language.
If JSON is used as the data format for stated facts, then the XPath step paths need to follow the rules defined in the XPath component. Basically, a path to a fact "/menu/lunch/drinks" in the DOM tree becomes "/JSON-document/menu/lunch/drinks", see XPath 2.0 and DOM Tree Implementation.
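As a hedged illustration of the path rule above, the following rule tests a JSON stated fact using the "/JSON-document" prefix. The JSON document shown in the comment and the rule name are assumptions for this sketch; the exact JSON to DOM mapping is defined by the XPath component documentation, not here.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0' xmlns:P6R='http://www.p6r.com/ruleengine'>
    <!-- Assumed stated facts, streamed in with the P6RULE_USEJSON flag:
         { "menu": { "lunch": { "drinks": "tea" } } } -->
    <rule name='has-drinks'>
        <if test='exists(/JSON-document/menu/lunch/drinks)' />
        <then>
            <set-fact name='drinks' select='/JSON-document/menu/lunch/drinks' />
            <halt />
        </then>
    </rule>
</rulesets>
```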
The second type of fact is an "inferred" fact and is one that is created by a 'set-fact' element which can appear inside any 'then' element (see section "The Rule 'Then' Element" above). The inferred fact that is created by this element is added onto the same DOM XML tree that stated facts are added to. Thus both stated and inferred facts share one DOM XML tree. This tree is global to the rule engine so that any rule and any XPath expression can access it.
All inferred facts are placed by default under the DOM XML tree's leaf at the path "/P6R:infer", so for example if an inferred fact had the name "rover" it would appear in the XPath path as "/P6R:infer/rover". In addition to this default behavior there is an optional attribute to the 'set-fact' element called 'location'. The 'location' attribute specifies where in the DOM XML tree under '/P6R:infer' to place the new inferred fact. So for example, if we had <set-fact name='rover' location='/P6R:infer/betamax/tape' />, then the new fact would be placed in the DOM XML tree at "/P6R:infer/betamax/tape/rover". This works as long as the branches 'betamax' and 'tape' already exist.
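Since stated and inferred facts share one tree, one rule can create a fact that a second rule then tests. The sketch below assumes a stated fact at '/menu/maincourse/soup'; the rule names and the 'note' fact are invented for the example.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0' xmlns:P6R='http://www.p6r.com/ruleengine'>
    <!-- Step 1: derive an inferred fact from a stated fact, once. -->
    <rule name='infer-soup'>
        <if test='exists(/menu/maincourse/soup) and not(exists(/P6R:infer/mealitem))' />
        <then>
            <set-fact name='mealitem' select='/menu/maincourse/soup' />
        </then>
    </rule>

    <!-- Step 2: becomes true only after 'infer-soup' has run. -->
    <rule name='use-soup'>
        <if test='exists(/P6R:infer/mealitem)' />
        <then>
            <set-fact name='note' select="'soup was offered'"
                      location='/P6R:infer/mealitem' />
            <halt />
        </then>
    </rule>
</rulesets>
```

The 'location' used in the second rule works here because the 'mealitem' branch already exists by the time that rule can fire.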
The following namespace is required to access inferred facts via XPath expressions:
<rulesets version='1.0' xmlns:P6R='http://www.p6r.com/ruleengine'>
Thus it is interesting to note that all facts in P6R's rule engine are organized, and can be created, in a tree format and accessed by the powerful XPath 2.0 language. In XML an item does not have to have a unique name. So for example the following XML is totally valid:
<book>
    <chapter> ... </chapter>
    <chapter> ... </chapter>
    <chapter> ... </chapter>
</book>
In this example, there are 3 chapters in the book. The standard XPath expression "/book/chapter" refers to all three chapter elements and constructs a set of all three. Thus it is possible to create inferred facts with the same name, that is, the same path in the XML tree. In addition, using this tree structure also allows easy segregation of facts. That is, each rule set can create its own branch under "/P6R:infer" in order to keep its data separate from other rule sets. This control is totally in the hands of the rule writer.
Passing the 'P6RULE_KEEPSTATS' flag into the P6R::p6IRuleEngine::initialize() method enables a powerful run-time feature. A subtree '/P6R:stats' is automatically created (similar to the subtree '/P6R:infer' defined above) off of the fact tree. In this tree the following counts are generated: (1) how many times each rule is executed, and (2) the total number of rules that evaluated to true and the total number that evaluated to false. Each count is kept per rule set. After an evaluation this information can be output in a similar way to the inferred facts (i.e., P6R::p6IRuleEngine::outputStatistics()).
The following namespace is required to access execution statistics via XPath expressions:
<rulesets version='1.0' xmlns:P6R2='http://www.p6r.com/ruleengine'>
Below is an example taken from one of our unit tests:
<?xml version="1.0" encoding="utf-8"?>
<P6R:stats xmlns:P6R="http://www.p6r.com/ruleengine">
    <redrover>
        <P6R:eval-true>2</P6R:eval-true>
        <P6R:eval-false>2</P6R:eval-false>
        <newset>2</newset>
    </redrover>
    <P6R:defaultset>
        <P6R:eval-true>4</P6R:eval-true>
        <Irvin>2</Irvin>
        <stoprules>2</stoprules>
    </P6R:defaultset>
</P6R:stats>
In the example above, 'redrover' is the name of a defined rule set. 'P6R:defaultset' is the XML friendly form used for the '#default' rule set (see above). 'P6R:eval-true' represents the count of the number of rules that evaluated to true, while 'P6R:eval-false' represents the opposite. The 'newset' element is a rule in the redrover rule set that ran twice. So from these counts we can see what gets executed and how often.
These counts can be accumulated over one or many P6R::p6IRuleEngine::evaluate() calls or can be reset at any time by calling P6R::p6IRuleEngine::reset() with P6RULE_CLEAR_STATS set. Note that these counts are totally separate from the inferred facts that are generated.
Even more interesting is that during rule execution these values are available for access just like all inferred facts (as described above). Since '/P6R:stats' (just like '/P6R:infer') is in the DOM tree, all its elements are accessible via XPath. So the value '2' of newset can be referenced in a test condition, select attribute, XPath function call, etc. This feature allows for all kinds of functionality including "rule throttling".
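One way rule throttling might look, reusing the 'redrover' / 'newset' names from the statistics sample above: the rule stops qualifying once its own execution count reaches three. This is a sketch that assumes the engine was initialized with P6RULE_KEEPSTATS, that 'redrover' is the rule set being evaluated, and that a stated fact '/menu/maincourse' exists; the 'ping' fact is invented.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0' xmlns:P6R='http://www.p6r.com/ruleengine'>
    <rule name='newset' setname='redrover'>
        <!-- The general comparison '>=' is false when the count element
             does not exist yet, so not(...) lets the first run through. -->
        <if test='exists(/menu/maincourse) and not(/P6R:stats/redrover/newset >= 3)' />
        <then>
            <set-fact name='ping' select="'sent'" />
        </then>
    </rule>
</rulesets>
```

The general comparison operator is used on purpose: an XPath 2.0 value comparison such as 'ge' treats untyped node content as a string, which does not compare against an integer.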
Rules and rule sets can be deleted and modified by the use of the following additional XML elements:
1) <delete-rule name='redcar' setname='rover' /> This command will delete the rule with the unique name set:rover, rule:redcar, for example.
2) <delete-ruleset setname='rover' /> This command will delete the rule set with the unique name defined in the setname parameter. All the rules contained in the selected rule set will also be deleted.
3) <delete-variable name='$g1' setname='rover' /> This command will delete the global variable with the unique name set:rover, name:$g1, for example.
4) <modify-rule ... This element has the exact same syntax as the basic <rule> element. However, it finds an existing rule and overwrites it with the values it defines. A 'modify-rule' must completely define the rule even if only a small part of the old rule is being changed. So for example, if only the priority attribute is being changed, the entire contents of the rest of the rule definition must be re-declared since this operation is an overwrite. If the rule to be modified does not exist, a fatal error will occur.
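The sketch below shows how these modification elements might be combined in one document passed to modifyRules(). It assumes the 'rover' rule set already holds rules named 'redcar' and 'bluecar' and a variable '$g1'; 'bluecar' and its contents are hypothetical.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0'>
    <!-- Remove one rule and one global variable from 'rover'. -->
    <delete-rule name='redcar' setname='rover' />
    <delete-variable name='$g1' setname='rover' />

    <!-- Only the priority is meant to change, but the whole rule is
         restated because modify-rule overwrites the old definition. -->
    <modify-rule name='bluecar' setname='rover' priority='9'>
        <if test='exists(/menu/maincourse)' />
        <then>
            <halt />
        </then>
    </modify-rule>
</rulesets>
```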
It is important to understand that the rule engine will continue to execute as long as there is a single rule that can be executed. So it is possible for the same rule to be run over and over again if its 'if' element continually evaluates to true. So when writing rules it is important to consider how to stop the evaluation.
There are two brute force ways to stop the rule engine evaluate() function. One way is to use the '<halt/>' action. Another way is to limit the number of rules that can be executed by each evaluate() call by calling the rule engine API method setEvalLimits( number ), where 'number' indicates the maximum number of rules to run. So a call to setEvalLimits(1) will ensure that only one rule is executed per call to evaluate(), while setEvalLimits(0) means no limit and is the default setting.
Other ways to ensure that the same rule does not run over and over again are to (a) in a rule's set of actions make sure one of the actions invalidates the 'if' element's expression, and (b) set the 'set-focus' action to change to another rule set where there are no rules or rules that are less likely to have their 'if' elements evaluate to true.
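Option (a) can be sketched with a counter held in a global variable: each execution moves the variable closer to the point where the rule's own 'if' becomes false. The names '$runs' and 'bounded' are illustrative, and the sketch assumes the 'if' element is evaluated again after each execution.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0'>
    <variable name='$runs' select='0' />

    <rule name='bounded'>
        <!-- True for $runs = 0, 1 and 2 only. -->
        <if test='$runs lt 3' />
        <then>
            <!-- This action eventually invalidates the test above. -->
            <set-variable name='$runs' select='$runs + 1' />
        </then>
    </rule>
</rulesets>
```

Under those assumptions the rule would run three times and then no rule would remain that can be executed, so evaluation ends without needing a 'halt' action or setEvalLimits().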
A detailed trace of rule compilation and execution can be obtained by calling the rule engine's API initialize() with the P6RULE_TRACEBASIC flag. This will show the rules on the agendas, which rules run, and even the actions that execute for each rule. The P6RULE_TRACEVERBOSE flag will generate a very large trace since it turns on the logging for all P6R components that comprise the rule engine (e.g., the XPath logging will also be turned on).
P6R's rule engine supports forward chaining by the use of the 'set-fact' action, which places inferred facts into a subtree of the "/P6R:infer" branch of a DOM XML tree. This inferred fact tree (i.e., "working memory" in rule engine jargon) can retain its values as long as the p6IRuleEngine instance exists. That is, the facts in the tree can live across multiple calls to the evaluate() API function. This behavior is referred to as stateful sessions.
In addition to stateful sessions, P6R's rule engine supports stateless sessions by either the use of the 'clear-facts' action during rule engine execution or via the API function reset( P6RULE_CLEAR_INFER | P6RULE_CLEAR_AGENDAS ) before the next call to the API function evaluate(). The P6R::p6IRuleEngine::reset() function is a cheap operation. The result of clearing the generated inferred facts is that each new run of the rule engine runs with no previous state. Creation of a new inferred fact tree
Truth maintenance is concerned with keeping the Agenda filled with only valid rules (i.e., rules whose 'if' element is true). Remember that the Agenda is a type of optimization mechanism. All rules that can be executed are placed on an Agenda. However, what happens when a "fact" (either stated or inferred) is no longer true (i.e., not in the DOM XML tree)? All rules that depend on that fact being true need to be removed from all Agendas. To do this one of two methods can be used: (1) call initialize( P6RULE_RESETAGENDAS ) when the rule engine is created, or (2) call reset( P6RULE_CLEAR_AGENDAS ) before each call to the evaluate() function. Both of these methods clear out all rules on the agendas, thus forcing all 'if' elements to be re-evaluated.
In addition, it is possible to have negative rules (i.e., a rule will be on an agenda when an XPath not() or exists() condition fails). In this case, the rule can be run on a negative condition. So if a fact is removed from the DOM XML tree then a rule should be placed on the Agenda. This case takes care of itself the next time the evaluate() function is called.
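A negative rule could look like the following sketch, which qualifies only while an inferred fact is absent (for example after a 'delete-fact' action removed it). The fact name 'mealitem' is borrowed from the sample rules later in this document; the rule name and the placeholder value are assumptions.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0' xmlns:P6R='http://www.p6r.com/ruleengine'>
    <rule name='missing-mealitem'>
        <!-- True only while /P6R:infer/mealitem does not exist. -->
        <if test='not(exists(/P6R:infer/mealitem))' />
        <then>
            <!-- Restoring the fact also makes the test false again. -->
            <set-fact name='mealitem' select="'none'" />
        </then>
    </rule>
</rulesets>
```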
Often there is a need for a rule engine to perform a task quickly. Even in these cases complex regular expressions can be required. The P6R rule engine incorporates P6R's P6R::p6IRegex regex engine. p6IRegex provides the means to limit execution with the API method P6R::p6IRegex::setBackTrackLimits( P6UINT32 maxBackStack, P6UINT32 maxBackTracks ). This method is exposed by the rule engine's API method P6R::p6IRuleEngine::setRegexLimits( P6UINT32 maxBackStack, P6UINT32 maxBackTracks ). This API call allows the application to prevent runaway regular expressions and thus limit evaluation time for performance restricted applications.
These added functions require the use of the P6R namespace: <rulesets version='1.0' xmlns:P6R='http://www.p6r.com/ruleengine/extensions'>
1) P6R:system-property
This function is a rule engine defined extension to the XPath language. It provides the caller with product information about the current version of the product (e.g., version number of the release).
Argument    Data Type    Meaning
input       string       One of the following are allowed: { version, vendor, vendor-url, product-version, product }
result      string       For example, system-property('vendor') results in "Project6Research".
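A possible use of this extension function inside a rule is sketched below. It follows the prefix convention of the sample rules at the end of this document, where 'P6R' is bound to the extensions namespace and 'P6R2' to the rule engine namespace; the rule name and the 'vendor' fact are illustrative.

```xml
<?xml version='1.0' encoding='utf-8'?>
<rulesets version='1.0'
          xmlns:P6R='http://www.p6r.com/ruleengine/extensions'
          xmlns:P6R2='http://www.p6r.com/ruleengine'>
    <rule name='record-vendor'>
        <!-- Run once: skip if the fact was already recorded. -->
        <if test='not(exists(/P6R2:infer/vendor))' />
        <then>
            <set-fact name='vendor' select="P6R:system-property('vendor')" />
            <halt />
        </then>
    </rule>
</rulesets>
```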
The rule engine (P6R::p6IRuleEngine) can be used in a simple mode of execution where a set of rules are first compiled (via the modifyRules() method) and then evaluated to generate inferred facts (via the evaluate() method). For some applications this mode of execution will be acceptable.
However, for high performance applications the same set of rules will need to be executed concurrently by several threads. Two additional interfaces support this complex mode of execution. First, the p6IRuleSets interface allows the caller to extract a compiled set of rules. Second, the compiled rules are represented by the P6R::p6IRuleCompiled interface. The compiled set of rules does not contain any execution state and thus can be shared with multiple p6IRuleEngine components at the same time. (p6IRuleSets is obtained by calling the p6IRuleEngine::queryInterface() method.)
In the complex mode of execution, one or more "compiled" sets of rules are generated by performing the following calling sequence:
p6IRuleEngine::queryInterface( p6IRuleSets, &pRules )
p6IRuleEngine::modifyRules(...)
pRules->getRules( &pCompiled_1 )
. . .
p6IRuleEngine::modifyRules(...)
pRules->getRules( &pCompiled_N )
The P6R::p6IRuleEngine::modifyRules() can be called on any "loaded" set of compiled rules, thus allowing a set of compiled rules to be modified. Note, that such modification to a rule set will occur globally across every p6IRuleEngine component sharing the compiled rules. Proper behavior is ensured by P6R reader / writer locks.
Next in the complex mode of execution, the compiled set of rules can be run concurrently across several p6IRuleEngine components. The P6R::p6IRuleSets interface also allows a compiled set of rules to be "loaded" into a p6IRuleEngine for execution. In fact, the same compiled set of rules could be loaded to any number of p6IRuleEngine components at the same time.
When a compiled set of rules is no longer needed the caller invokes the standard "release()" method on the component to free it.
In addition to sharing a compiled set of rules, the stated facts, which are normally parsed via the P6R::p6IRuleEngine::startFacts() and P6R::p6IRuleEngine::continueFacts() methods, can instead be "loaded" by a call to the P6R::p6IRuleEngine::startFactsWithDOM() method. This requires that the caller parse the stated facts into a DOM tree, but then that DOM tree can be shared across multiple instances of the P6R::p6IRuleEngine component. This improves performance by avoiding redundant parsing of stated fact documents. Note, that generated inferred facts remain local to each separate p6IRuleEngine component as local, run-time state.
As an example of what can be done with P6R's rule engine it is possible to build a feedback loop between the calling application and the rules loaded into the rule engine. To do this, first the application loads a set of initial rules into the rule engine and calls evaluate(). These initial rules use "call func=''" actions to return the state of the rule engine evaluation and even inferred facts via the use of global variables back to the calling application.
Next, when the rule engine's evaluate() API method returns, the calling application uses the passed back data to add and/or delete and/or modify the rules loaded in the rule engine. Then once again the calling application can call the evaluate() API method and restart where it left off. The calling application can even use the reset( flags ) or startFacts() API calls to change the environment that the rules will run in.
The following group of rules can be loaded into the rule engine with a single call to the modifyRules() API function.
<?xml version='1.0' encoding='ISO-8859-1'?>
<rulesets version='1.0' xmlns:P6R='http://www.p6r.com/ruleengine/extensions' xmlns:P6R2='http://www.p6r.com/ruleengine'>
    <variable name='$g1'/>
    <variable name='$g5'/>
    <rule name='stoprules'>
        <if test='true()'/>
        <then>
            <set-variable name='$g1' select='/P6R2:infer/mealitem'/>
            <set-variable name='$g5' select='/P6R2:infer/mealitem/12345'/>
            <delete-fact location='/P6R2:infer/mealitem/sunshine' />
            <halt/>
        </then>
    </rule>
    <variable name='$g1' setname='over1'/>
    <variable name='$g2' setname='over1' select='67' />
    <variable name='$g3' setname='over1' />
    <delete-ruleset setname='freeme' />
    <rule name='greencar'>
        <if test='menu/maincourse'/>
        <then>
            <set-fact name='sum' select='55 * 3' />
        </then>
    </rule>
    <rule name='yellocar' setname='over1' priority='8'>
        <if test='$externalVar1'/>
        <then>
            <set-variable name='$g3' select='/menu/dessert/item'/>
            <call func="setclickurl( 'www.amazon.com', $g3 )" />
            <set-focus setname='#default' />
            <set-fact name='mealitem' select='/menu/maincourse/soup' />
            <set-fact name='12345' select="'Hi There Henry'" location='/P6R2:infer/mealitem' />
            <set-fact name='6789A' location='/P6R2:infer/mealitem' />
            <set-fact name='nextbranch' />
            <set-fact name='12345' select="'Hi There Irvin'" location='/P6R2:infer/nextbranch' />
            <set-fact name='sunshine' select='15*4' location='/P6R2:infer/mealitem' />
        </then>
    </rule>
    <rule name='hotcar' setname='over1'>
        <if test='$g2'/>
        <then>
            <call func="setclickurl( 'www.cnn.com' )" />
        </then>
    </rule>
</rulesets>