Tuesday, April 1, 2008

TQL data model (1/3): simple kernel

With this post we start a short tutorial on the TQL data model and its use. The kernel data model presented here is not the complete data model supported by the current TQL implementation; it is the subset presented in the main TQL articles.
We will complete the data model, step by step, in the following posts.

The original TQL data model is inspired by ambients and is defined with respect to a set of labels L.
It defines unordered, unranked trees labelled over L, and is generated by the following grammar:

I ::= 0         a single node
      a[I]      a single edge labelled a (an element of L) leading to a subtree I
      I | I'    juxtaposition of two trees (merging roots and forgetting order)
 
It is essentially a "nested" commutative monoid freely generated from an alphabet of labels, i.e. the operator _ | _ is associative and commutative and has 0 as its neutral element.
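
To make the grammar and the monoid laws concrete, here is a minimal Python sketch. It is not how TQL itself represents data: the names ZERO, edge and par are hypothetical, and the choice of a sorted tuple of (label, subtree) pairs as a canonical form is an assumption made only for illustration.

```python
# Illustrative encoding only (not TQL's own representation):
# an information tree is a sorted tuple of (label, subtree) edges.
# Sorting gives a canonical form, so equality ignores child order,
# while repeated edges are kept (no idempotence is assumed).

ZERO = ()  # 0: the tree made of a single node


def edge(label, subtree=ZERO):
    """a[I]: a single edge labelled `label` leading to `subtree`.

    `subtree` is expected to be already in canonical form.
    """
    return ((label, subtree),)


def par(left, right):
    """I | I': merge the roots of two trees, forgetting order."""
    return tuple(sorted(left + right))


a = edge("a", edge("c"))   # a[c[0]]
b = edge("b")              # b[0]
d = edge("d")              # d[0]

assert par(a, b) == par(b, a)                      # commutativity
assert par(par(a, b), d) == par(a, par(b, d))      # associativity
assert par(ZERO, a) == a and par(a, ZERO) == a     # 0 is neutral
```

With this encoding the three laws hold by construction, because concatenation followed by sorting does not depend on how the operands were grouped or ordered.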

XML can easily be represented as information trees, provided that we forget about order, adopt some convention for representing attributes (e.g. using a sorting discipline), and assume that PCDATA and all other XML basic types can be encoded into our labels.
On the other hand, information trees can be serialized into XML, but in doing so we nondeterministically choose an order among the children of each node.
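
The XML-to-tree direction can be sketched in Python as follows. This is only an illustration, not the TQL loader: the function names are hypothetical, attributes are encoded with an assumed convention (an edge labelled "@name" leading to an edge labelled with the value), text content is encoded as a leaf edge labelled with the text, and text that follows a child element (mixed content) is simply ignored.

```python
import xml.etree.ElementTree as ET


def children_of(elem):
    """Edges below `elem`, as a sorted tuple of (label, subtree) pairs."""
    edges = []
    # Assumed attribute convention: @name[value[0]]
    for name, value in elem.attrib.items():
        edges.append(("@" + name, ((value, ()),)))
    # Assumed text convention: the text itself becomes a leaf label.
    text = (elem.text or "").strip()
    if text:
        edges.append((text, ()))
    for child in elem:
        edges.append((child.tag, children_of(child)))
    # Sorting forgets the document order of the children.
    return tuple(sorted(edges))


def xml_to_tree(xml_text):
    """Parse an XML string into an information tree with one root edge."""
    root = ET.fromstring(xml_text)
    return ((root.tag, children_of(root)),)


t1 = xml_to_tree("<r><a/><b>x</b></r>")
t2 = xml_to_tree("<r><b>x</b><a/></r>")
assert t1 == t2  # same information tree, different XML serializations
```

The two documents differ only in the order of the children of r, so they map to the same tree. Going back to XML, either of them would be an acceptable serialization.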
Information trees may also be produced by other means (e.g. connections to JDBC databases), since TQL also supports extensible functions that can be coded directly in Java and that produce information trees. The current release of TQL includes a demo of a JDBC connection, in the function named scan.

The TQL system (since version 2.0) can load data as XML files from the file system or the web, and results of a query can be serialized into XML files and stored on the file system.
