Types for XML with Application to Xcerpt

Detta är en avhandling från Institutionen för datavetenskap

Sammanfattning: XML data is often accompanied by type information, usually expressed by some schema language. Sometimes XML data can be related to ontologies defining classes of objects, such classes can also be interpreted as types. Type systems proved to be extremely useful in programming languages, for instance to automatically discover certain kinds of errors. This thesis deals with an XML query language Xcerpt, which originally has no underlying type system nor any provision for taking advantage of existing type information. We provide a type system for Xcerpt; it makes possible type inference and checking type correctness.The system is descriptive: the types associated with Xcerpt constructs are sets of data terms and approximate the semantics of the constructs. A formalism of Type Definitions is adapted to specify such sets. The formalism may be seen as a simplification and abstraction of XML schema languages. The type inference method, which is the core of this work, may be seen as abstract interpretation. A non standard way of assuring termination of fixed point computations is proposed, as standard approaches are too inefficient. The method is proved correct wrt. the formal semantics of Xcerpt.We also present a method for type checking of programs. A success of type checking implies that the program is correct wrt. its type specification. This means that the program produces results of the specified type whenever it is applied to data of the given type. On the other hand, a failure of type checking suggests that the program may be incorrect. Under certain conditions (on the program and on the type specification), the program is actually incorrect whenever the proof attempt fails.A prototype implementation of the type system has been developed and usefulness of the approach is illustrated on example programs.In addition, the thesis outlines possibility of employing semantic types (ontologies) in Xcerpt. Introducing ontology classes into Type Definitions makes possible discovering some errors related to the semantics of data queried by Xcerpt. We also extend Xcerpt with a mechanism of combining XML queries with ontology queries. The approach employs an existing Xcerpt engine and an ontology reasoner; no modifications are required.