NAME

kasykconstraint - description of constraints in Kasyk query XML (<constraint>)


DESCRIPTION

The <constraint> container in both Kasyk query XML and Kasyk caching query server configuration XML allows the specification of expressions that limit which documents are eligible for the initial result set.


<constraint>...</constraint>

A <constraint> expression describes various properties and the values they must have for a document to be considered for the initial result set of a Kasyk hitlist XML. It does not operate on the text of a document.

A property of a document in a Kasyk index is a named quantity having a flag (boolean), numeric or string typed value. The set of allowable properties in any particular Kasyk index is specified by the <property> containers in the Kasyk configuration XML of that Kasyk index.

The valid operators in a constraint expression consist of:

"&"

Logical "and" operator. True if both the left hand side and the right hand side of the expression are true.

Please note that because a bare & character is not legal XML, the "and" operator must always be specified as "&amp;".

"|"

Logical "or" operator. True if either left hand side or right hand side of the expression is true.

"!"

Logical "not" operation. True if right hand side of expression is not true.

"="

Logical "equal" operation. True if left hand side of expression is equal to the right hand side of expression.

Please note that unlike many programming languages, a single "=" is used to denote equality comparison, rather than a double equal sign "==".

"!="

Logical "not equal" operation. True if left hand side of expression is not equal to the right hand side of expression.

"<"

Logical "less than" operation. True if left hand side of expression is less than the right hand side of expression.

Please note that because a bare < character is not legal XML, the "less than" operator must always be specified as "&lt;".

"<="

Logical "less than or equal" operation. True if left hand side of expression is less than or equal to the right hand side of expression.

Please note that because a bare < character is not legal XML, the "less than or equal" operator must always be specified as "&lt;=".

">"

Logical "greater than" operation. True if left hand side of expression is greater than the right hand side of expression.

Please note that, in XML, the "greater than" operator must always be specified as "&gt;".

">="

Logical "greater than or equal" operation. True if left hand side of expression is greater than or equal to the right hand side of expression.

Please note that, in XML, the "greater than or equal" operator must always be specified as "&gt;=".

"( )"

Parentheses for changing precedence.

"like"

The "like" operator allows the comparison of a string valued property against a simple regular expression. The simple regular expression can contain '*' (match any sequence of characters) and '?' (match exactly a single character). For instance:

 filename like ".log"

will allow all documents having a <filename> property containing the text "log" anywhere, e.g. "/tmp/log" or "/tmp/log/jack".

 filename like "log*d"

will match "logd", "log/d" and "logged".

 filename like "log?d"

will match "log/d" and "logdd" but not "logd".

Currently the "like" operator performs its string matching in a case-insensitive manner.

Please note that the "like" operator can be used to simulate multi-valued properties that need to be able to have constraints applied to them. For instance, a "categories" property in a document that has the conceptual categories "foo" and "baz":

 <document>
  <properties>
   <categories>:foo:baz:</categories>
  </properties>
  <text><!-- text of the document here --></text>
 </document>

In this case, the constraint:

 categories like ":baz:"

would select all documents that have the conceptual category "baz" set.

"in ()"

The "in" operator provides a shortcut to doing many comparisons. The "in" operator expects on the right side a list of values to compare against the property on the left side. For example:

 id in (1,2,3,5,7,11,13,17,19)

is equivalent to:

 id=1 | id=2 | id=3 | id=5 | id=7 | id=11 | id=13 | id=17 | id=19

Apart from being a shortcut, it also provides internal optimizations to allow a faster determination of whether a document is selected or not.

It can be used to provide "ad-hoc", category-like constraints on an index, for instance when the set of allowed values is determined by an external source such as a database query (e.g. all the records of a specific client).
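As a rough illustration of the database scenario above, the following Python sketch assembles an "in" constraint from a list of record ids. The function name build_in_constraint and the sample ids are hypothetical. The sketch assumes that the ids are integers and that an empty list is not meaningful; neither assumption is confirmed by Kasyk itself.

```python
from typing import Iterable


def build_in_constraint(prop: str, values: Iterable[int]) -> str:
    """Return a constraint such as 'id in (3,7,11)' for integer values.

    Hypothetical helper: it only builds the expression text and does not
    talk to Kasyk in any way.
    """
    # Remove duplicates and sort so the same id set always yields the same text.
    unique = sorted({int(v) for v in values})
    if not unique:
        # Assumption: an empty "in ()" list is treated as an error here.
        raise ValueError("the 'in' list needs at least one value")
    return f"{prop} in ({','.join(str(v) for v in unique)})"


# Example: ids that an external database query might have returned.
client_ids = [7, 3, 3, 11]
print(build_in_constraint("id", client_ids))
```

The resulting text contains no characters that need XML escaping, so it can be placed inside a <constraint> container as it is.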

The current version of Kasyk does not allow multi-valued properties (properties with the multiplicity attribute set to "*" in the Kasyk configuration XML) to be used as part of a constraint expression. This is intended to change in a future version. In many cases the limitation can be worked around by using (many) properties of type "flag" instead, or by using a delimited string in an unkeyed string property (see the example above with the "like" operator).
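The delimited string workaround can be sketched in Python as follows. The helper names encode_categories and category_constraint are hypothetical, and the ":" delimiter is simply the one used in the "categories" example above. The sketch rejects category names containing the delimiter or the wildcard characters, on the assumption that these would make the resulting "like" pattern ambiguous.

```python
DELIMITER = ":"          # delimiter taken from the ":foo:baz:" example
WILDCARDS = ("*", "?")   # characters with special meaning to "like"


def _check_name(name: str) -> None:
    """Reject names that would collide with the delimiter or wildcards."""
    if not name or DELIMITER in name or any(w in name for w in WILDCARDS):
        raise ValueError(f"unsuitable category name: {name!r}")


def encode_categories(categories) -> str:
    """Return the property value, e.g. ':foo:baz:' for ['foo', 'baz']."""
    names = list(categories)
    for name in names:
        _check_name(name)
    if not names:
        return ""
    # Leading and trailing delimiters let every name be matched as ':name:'.
    return DELIMITER + DELIMITER.join(names) + DELIMITER


def category_constraint(prop: str, category: str) -> str:
    """Return the constraint text selecting one category."""
    _check_name(category)
    return f'{prop} like "{DELIMITER}{category}{DELIMITER}"'


print(encode_categories(["foo", "baz"]))
print(category_constraint("categories", "baz"))
```

Surrounding every name with the delimiter on both sides prevents a search for "baz" from also selecting a category such as "bazaar".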

Properties are referenced by name. A property of type "flag" cannot be compared to anything else; it provides a boolean (true / false) value by itself. Other property types must be compared against a value to return a boolean value.
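The escaping rules noted for the "&", "<" and ">" operators can be applied mechanically when a constraint is generated by a program. The following is a minimal Python sketch that uses the standard library function xml.sax.saxutils.escape. The helper name constraint_element is hypothetical and not part of Kasyk.

```python
from xml.sax.saxutils import escape


def constraint_element(expression: str) -> str:
    """Wrap a plain constraint expression in a <constraint> container.

    escape() replaces "&", "<" and ">" with "&amp;", "&lt;" and "&gt;",
    so the expression can be written with the plain operator characters.
    """
    return f"<constraint> {escape(expression)} </constraint>"


print(constraint_element("pressrelease & date >= 20030101"))
print(constraint_element("date < 20040101 | !pressrelease"))
```

Writing the expression with plain operators and escaping it in one place avoids escaping the same text twice by accident.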


EXAMPLES

For instance, if the following properties are defined in the Kasyk configuration XML:

 <property name="pressrelease" type="flag"/>
 <property name="date"         type="number"/>

then you can specify these example constraints:

selecting documents of a specific type

 <constraint> pressrelease </constraint>

Only include a document in the initial result set if the <pressrelease> property of that document is set. This would usually correspond to all press releases in a database.

selecting documents in a specific time frame

 <constraint> pressrelease &amp; date &gt;= 20030101 </constraint>

Only include a document in the initial result set if the <pressrelease> property of that document is set and the <date> property of the document has a value at least equal to "20030101". This would usually correspond to press releases of the year 2003 and later in a database.

selecting documents that are not of a specific type

 <constraint> !pressrelease </constraint>

Only include a document in the initial result set if the <pressrelease> property of that document is not set. This would usually correspond to everything but press releases in a database.
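The time frame example can be extended to a closed range. The Python sketch below builds a constraint limited to a single calendar year. It assumes, as the example value "20030101" suggests, that the "date" property stores dates as YYYYMMDD numbers. The helper names as_number and year_constraint are hypothetical.

```python
from datetime import date


def as_number(d: date) -> int:
    """Encode a date as a YYYYMMDD number, e.g. 20030101."""
    return d.year * 10000 + d.month * 100 + d.day


def year_constraint(prop: str, year: int) -> str:
    """Return an unescaped constraint covering one calendar year."""
    first = as_number(date(year, 1, 1))
    last = as_number(date(year, 12, 31))
    return f"{prop} >= {first} & {prop} <= {last}"


# Press releases of the year 2003 only.
print("pressrelease & " + year_constraint("date", 2003))
```

The text produced here uses the plain operator characters; "&", "<" and ">" still have to be written as "&amp;", "&lt;" and "&gt;" before the expression is placed in a <constraint> container.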

Note that string-based comparisons are case-sensitive, while the "like" operator is case-insensitive. Also, string values must be enclosed in single or double quotes.
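To make the difference between the two kinds of string matching concrete, the following Python sketch emulates the "like" operator with a regular expression. This is an approximation for experimenting with patterns, not Kasyk's implementation. It assumes that a pattern may match anywhere within the property value, which is inferred from the ":baz:" example and not confirmed. The function names like_to_regex and like are hypothetical.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a "like" pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any sequence of characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    # IGNORECASE mirrors the documented case-insensitive matching.
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like(value: str, pattern: str) -> bool:
    """Assumption: the pattern may match anywhere in the value."""
    return like_to_regex(pattern).search(value) is not None


print(like("logged", "log*d"))      # True
print(like("logd", "log?d"))        # False, "?" needs one character
print(like(":foo:baz:", ":BAZ:"))   # True, case is ignored
print("Baz" == "baz")               # False, like a case-sensitive "="
```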


SEE ALSO

Kasyk home, Kasyk query XML, Kasyk caching query server configuration XML, Kasyk hitlist XML, Kasyk configuration XML, Kasyk document sequence XML, Kasyk searcher (kasyk), Kasyk server (kasykd), Kasyk caching query server (kasykcqd).

See http://www.kasyk.nl/xml/kasykconstraint.html for the most up-to-date version of this information.


COPYRIGHT

Copyright © 2003 Dijkmat BV

This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Kasyk XML Information: Kasyk version 1.0.0, XML version http://www.kasyk.org/1.0, generated on Tue Nov 25 12:09:47 2003.