
  • Number of slides: 25

XML Tools
Leonidas Fegaras
CSE 6331
© Leonidas Fegaras

XML Processing
XML document -> parser (well-formedness checks & reference expansion) -> XML infoset -> document validator (DTD or XML schema) -> XML infoset (annotated) -> application / storage system

Tools for XML Processing
• DOM: a language-neutral interface for manipulating XML data
  – requires that the entire document be in memory
• SAX: push-based stream processing
  – hard to write non-trivial applications
• XPath: a declarative tree-navigation language
  – beautiful and easy to use
  – is part of many other languages
• XSLT: a language for transforming XML based on templates
  – very ugly!
• XQuery: full-fledged query language
  – influenced by OQL
• XmlPull: pull-based stream processing
  – far better than SAX, but not a standard yet

DOM
The Document Object Model (DOM) is a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content and structure of XML documents. The following is part of the DOM interface:

    public interface Node {
        public String getNodeName ();
        public String getNodeValue ();
        public NodeList getChildNodes ();
        public NamedNodeMap getAttributes ();
    }

    public interface Element extends Node {
        // returns all descendant elements with the given tag name
        public NodeList getElementsByTagName ( String name );
    }

    public interface Document extends Node {
        public Element getDocumentElement ();
    }

    public interface NodeList {
        public int getLength ();
        public Node item ( int index );
    }

DOM Example
    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;

    // /*[dept/text()="cse"]/tel/text()
    class Test {
        public static void main ( String args[] ) throws Exception {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(new File("depts.xml"));
            NodeList nodes = doc.getDocumentElement().getChildNodes();
            for (int i=0; i
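The listing above breaks off at the loop header, so the rest of the slide's code is unknown. The following is a minimal sketch of how such a DOM traversal could be finished, not the author's original code. It assumes a hypothetical document layout (children of the root, each with a dept and a tel child) and parses an inline string instead of depts.xml so that it runs on its own. The names DomQuery, SAMPLE, childText and phones are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomQuery {
    // Hypothetical sample data; the real layout of depts.xml is not shown in the slides.
    static final String SAMPLE =
        "<depts>"
      + "<entry><dept>cse</dept><tel>555-0100</tel></entry>"
      + "<entry><dept>math</dept><tel>555-0199</tel></entry>"
      + "</depts>";

    // Text of the first direct child element named tag, or null if there is none.
    static String childText(Element parent, String tag) {
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node k = kids.item(i);
            if (k.getNodeType() == Node.ELEMENT_NODE && k.getNodeName().equals(tag))
                return k.getTextContent();
        }
        return null;
    }

    // Collects the tel text of every child of the root whose dept child equals dept.
    static List<String> phones(String xml, String dept) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        List<String> result = new ArrayList<>();
        NodeList nodes = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n.getNodeType() != Node.ELEMENT_NODE) continue; // skip text and comments
            Element e = (Element) n;
            if (dept.equals(childText(e, "dept"))) {
                String tel = childText(e, "tel");
                if (tel != null) result.add(tel);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(phones(SAMPLE, "cse")); // prints [555-0100]
    }
}
```

Even for this one-line XPath query the DOM code needs two nested scans over child lists, which is the verbosity the next slide tries to reduce.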

Better Programming
    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;
    import java.util.Vector;

    class Sequence extends Vector {
        Sequence () { super(); }

        Sequence ( String filename ) throws Exception {
            super();
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(new File(filename));
            add((Object) doc.getDocumentElement());
        }

        Sequence child ( String tagname ) {
            Sequence result = new Sequence();
            for (int i = 0; i
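The body of child is cut off in the listing, so the sketch below only illustrates the idea of a node sequence with an XPath-like child step. It is an assumption about how such a class could look, not a reconstruction of the slide. It uses a typed ArrayList instead of the raw Vector, and the names NodeSeq, parse and texts are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// A sequence of DOM nodes that supports chained, XPath-like child steps.
class NodeSeq extends ArrayList<Node> {
    NodeSeq() { super(); }

    // Builds a one-element sequence holding the root element of the given XML text.
    static NodeSeq parse(String xml) throws Exception {
        NodeSeq s = new NodeSeq();
        s.add(DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement());
        return s;
    }

    // For every node in this sequence, collects its child elements named tagname.
    NodeSeq child(String tagname) {
        NodeSeq result = new NodeSeq();
        for (Node n : this) {
            NodeList kids = n.getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                Node k = kids.item(i);
                if (k.getNodeType() == Node.ELEMENT_NODE && k.getNodeName().equals(tagname))
                    result.add(k);
            }
        }
        return result;
    }

    // Text content of each node in the sequence, in order.
    List<String> texts() {
        List<String> out = new ArrayList<>();
        for (Node n : this) out.add(n.getTextContent());
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b><c>1</c></b><b><c>2</c><d>3</d></b></a>";
        System.out.println(parse(xml).child("b").child("c").texts()); // prints [1, 2]
    }
}
```

Because each step returns another sequence, a path such as b/c becomes a chain of method calls rather than nested loops in the caller.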

SAX
• SAX is a Simple API for XML that allows you to process a document as it's being read
  – in contrast to DOM, which requires the entire document to be read before it takes any action
• The SAX API is event based
  – The XML parser sends events, such as the start or the end of an element, to an event handler, which processes the information

Parser Events
• Receive notification of the beginning of a document
    void startDocument ()
• Receive notification of the end of a document
    void endDocument ()
• Receive notification of the beginning of an element
    void startElement ( String namespace, String localName, String qName, Attributes atts )
• Receive notification of the end of an element
    void endElement ( String namespace, String localName, String qName )
• Receive notification of character data
    void characters ( char[] ch, int start, int length )

SAX Example: a Printer
    import java.io.FileReader;
    import javax.xml.parsers.*;
    import org.xml.sax.*;          // Attributes, InputSource, SAXException
    import org.xml.sax.helpers.*;  // DefaultHandler

    class Printer extends DefaultHandler {
        public Printer () { super(); }

        public void startDocument () {}

        public void endDocument () { System.out.println(); }

        // print the opening tag (attributes are ignored)
        public void startElement ( String uri, String name, String tag, Attributes atts ) {
            System.out.print("<" + tag + ">");
        }

        // print the closing tag
        public void endElement ( String uri, String name, String tag ) {
            System.out.print("</" + tag + ">");
        }

        public void characters ( char text[], int start, int length ) {
            System.out.print(new String(text, start, length));
        }
    }

The Child Handler
    class Child extends DefaultHandler {
        DefaultHandler next;   // the next handler in the pipeline
        String ptag;           // the tagname of the child
        boolean keep;          // are we keeping or skipping events?
        short level;           // the depth level of the current element

        public Child ( String s, DefaultHandler n ) {
            super();
            next = n;
            ptag = s;
            keep = false;
            level = 0;
        }

        public void startDocument () throws SAXException {
            next.startDocument();
        }

        public void endDocument () throws SAXException {
            next.endDocument();
        }

The Child Handler (cont.)
        public void startElement ( String nm, String ln, String qn, Attributes a ) throws SAXException {
            // an element that opens at depth 1 is a child of the root: decide whether to keep it
            if (level++ == 1)
                keep = ptag.equals(qn);
            if (keep)
                next.startElement(nm, ln, qn, a);
        }

        public void endElement ( String nm, String ln, String qn ) throws SAXException {
            if (keep)
                next.endElement(nm, ln, qn);
            // back at depth 1: the kept child has been closed
            if (--level == 1)
                keep = false;
        }

        public void characters ( char[] text, int start, int length ) throws SAXException {
            if (keep)
                next.characters(text, start, length);
        }
    }

Forming the Pipeline
    class SAX {
        public static void main ( String args[] ) throws Exception {
            SAXParserFactory pf = SAXParserFactory.newInstance();
            SAXParser parser = pf.newSAXParser();
            DefaultHandler handler = new Child("gradstudent", new Child("name", new Printer()));
            parser.parse(new InputSource(new FileReader("cs.xml")), handler);
        }
    }

SAX parser -> Child: gradstudent -> Child: name -> Printer

Example
Pipeline: Input Stream -> SAX Events -> Child: gradstudent -> Child: name -> Printer
(SD/ED = start/end document, SE/EE = start/end element, C = characters)

    Input Stream          SAX Events
                          SD:
    <department>          SE: department
    <deptname>            SE: deptname
    Computer Science      C: Computer Science
    </deptname>           EE: deptname
    <gradstudent>         SE: gradstudent
    <name>                SE: name
    <lastname>            SE: lastname
    Smith                 C: Smith
    </lastname>           EE: lastname
    <firstname>           SE: firstname
    John                  C: John
    </firstname>          EE: firstname
    </name>               EE: name
    </gradstudent>        EE: gradstudent
    . . .
    </department>         EE: department
                          ED:
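The pipeline traced above can be tried end to end with a self-contained sketch. It reuses the Child filtering logic from the slides unchanged, but two things are assumptions made for the sake of a runnable example: the input is an inline string rather than cs.xml, and the Printer is replaced by a hypothetical Collector handler that appends to a buffer so the result can be inspected.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxPipeline {
    // Stand-in for Printer: collects the markup in a buffer instead of printing it.
    static class Collector extends DefaultHandler {
        final StringBuilder out = new StringBuilder();
        @Override public void startElement(String uri, String name, String tag, Attributes atts) {
            out.append('<').append(tag).append('>');
        }
        @Override public void endElement(String uri, String name, String tag) {
            out.append("</").append(tag).append('>');
        }
        @Override public void characters(char[] text, int start, int length) {
            out.append(text, start, length);
        }
    }

    // Forwards only the events of root children named ptag (same logic as the slides).
    static class Child extends DefaultHandler {
        final DefaultHandler next;
        final String ptag;
        boolean keep = false;
        int level = 0;
        Child(String s, DefaultHandler n) { ptag = s; next = n; }
        @Override public void startElement(String nm, String ln, String qn, Attributes a)
                throws SAXException {
            if (level++ == 1) keep = ptag.equals(qn);
            if (keep) next.startElement(nm, ln, qn, a);
        }
        @Override public void endElement(String nm, String ln, String qn) throws SAXException {
            if (keep) next.endElement(nm, ln, qn);
            if (--level == 1) keep = false;
        }
        @Override public void characters(char[] text, int start, int length)
                throws SAXException {
            if (keep) next.characters(text, start, length);
        }
    }

    // Runs the gradstudent/name pipeline over the given XML text.
    static String run(String xml) throws Exception {
        Collector sink = new Collector();
        DefaultHandler handler = new Child("gradstudent", new Child("name", sink));
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return sink.out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<department><deptname>Computer Science</deptname>"
                   + "<gradstudent><name><lastname>Smith</lastname>"
                   + "<firstname>John</firstname></name></gradstudent></department>";
        // prints <name><lastname>Smith</lastname><firstname>John</firstname></name>
        System.out.println(run(xml));
    }
}
```

Note that the first Child forwards the gradstudent element itself, so the second Child sees gradstudent at depth 0 and name at depth 1. That is why the same depth test works at every stage of the pipeline.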

XmlPull
Unlike SAX, you pull events from the document.
• Create a pull parser:
    XmlPullParser xpp;
    xpp = factory.newPullParser();
• Read the type of the current event with xpp.getEventType(); advance to the next event with xpp.next()
• Type of events:
  – START_TAG
  – END_TAG
  – TEXT
  – START_DOCUMENT
  – END_DOCUMENT
• More information at: http://www.xmlpull.org/
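XmlPull is a separate library, so the sketch below swaps in StAX (javax.xml.stream), the pull API that ships with the JDK, to show the same pull style. This is a substitution, not the XmlPull API from the slide. The event labels are chosen to mirror the XmlPull event names, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class PullDemo {
    // Pulls every event from the document and records a readable label for each one.
    static List<String> events(String xml) throws Exception {
        XMLInputFactory f = XMLInputFactory.newInstance();
        // Ask the parser to deliver adjacent character data as a single event.
        f.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader r = f.createXMLStreamReader(new StringReader(xml));
        List<String> out = new ArrayList<>();
        // A StAX reader starts positioned on START_DOCUMENT, before the first next().
        if (r.getEventType() == XMLStreamConstants.START_DOCUMENT) out.add("START_DOCUMENT");
        while (r.hasNext()) {
            switch (r.next()) {   // the application, not the parser, drives the loop
                case XMLStreamConstants.START_ELEMENT:
                    out.add("START_TAG " + r.getLocalName());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    out.add("END_TAG " + r.getLocalName());
                    break;
                case XMLStreamConstants.CHARACTERS:
                    out.add("TEXT " + r.getText());
                    break;
                case XMLStreamConstants.END_DOCUMENT:
                    out.add("END_DOCUMENT");
                    break;
                default:
                    break;        // comments, processing instructions, etc. are ignored
            }
        }
        r.close();
        return out;
    }

    public static void main(String[] args) throws Exception {
        for (String e : events("<name>John</name>")) System.out.println(e);
    }
}
```

The difference from SAX is that the loop lives in application code: the caller decides when to ask for the next event, which makes it straightforward to wrap the parser in an iterator.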

Better XmlPull Events
    class Attributes {
        public String[] names;
        public String[] values;
    }

    abstract class Event {
    }

    class StartTag extends Event {
        public String tag;
        public Attributes attributes;
    }

    class EndTag extends Event {
        public String tag;
    }

    class CData extends Event {
        public String text;
    }

    class EOS extends Event {}

Iterators
    import org.xmlpull.v1.XmlPullParser;
    import org.xmlpull.v1.XmlPullParserFactory;

    abstract class Iterator {
        abstract public void open ();    // open the stream iterator
        abstract public void close ();   // close the stream iterator
        abstract public Event next ();   // get the next tuple from stream
    }

    abstract class Filter extends Iterator {
        Iterator input;
    }

Document Reader
    class Document extends Iterator {
        String path;
        int state;
        FileReader reader;
        XmlPullParser xpp;
        static XmlPullParserFactory factory;

        Event getEvent () {
            int eventType = xpp.getEventType();
            if (eventType == XmlPullParser.START_TAG) {
                int len = xpp.getAttributeCount();
                String[] names = new String[len];
                String[] values = new String[len];
                for (int i = 0; i

Document Reader (cont.)
        public void open () {
            reader = new FileReader(path);
            xpp = factory.newPullParser();
            xpp.setInput(reader);
            state = 0;
        }

        public void close () {
            reader.close();
        }

        public Event next () {
            if (state > 0) {
                state++;
                if (state == 2)
                    return new EOS();
            }
            Event e = getEvent();
            if (xpp.getEventType() != XmlPullParser.END_DOCUMENT)
                xpp.next();
            return e;
        }

The Child Iterator
    class Child extends Filter {
        String tag;
        short nest;     // the nesting level of the event
        boolean keep;   // are we in keeping mode?

        public void open () {
            keep = false;
            nest = 0;
            input.open();
        }

        public Event next () {
            while (true) {
                Event t = input.next();
                if (t instanceof EOS)
                    return t;
                else if (t instanceof StartTag) {
                    if (nest++ == 1) {
                        keep = tag.equals(((StartTag) t).tag);
                        if (!keep)
                            continue;
                    }
                } else if (t instanceof EndTag) {
                    if (--nest == 1 && keep) {
                        keep = false;
                        return t;
                    }
                }
                if (keep)
                    return t;
            }
        }
    }
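To exercise the Child iterator without the XmlPull library, the sketch below feeds it from a prepared list of events. The next() logic is the one from the slide. The ListSource class, the render helper, the constructors, and the omission of attributes are simplifications assumed here, and all classes are nested in a hypothetical ChildIteratorDemo wrapper to keep the example in one file.

```java
import java.util.Arrays;
import java.util.List;

class ChildIteratorDemo {
    // Event classes mirroring the slides (attributes omitted for brevity).
    static abstract class Event {}
    static class StartTag extends Event { final String tag; StartTag(String t) { tag = t; } }
    static class EndTag extends Event { final String tag; EndTag(String t) { tag = t; } }
    static class CData extends Event { final String text; CData(String t) { text = t; } }
    static class EOS extends Event {}

    static abstract class Iterator {
        abstract void open();
        abstract void close();
        abstract Event next();
    }

    // Hypothetical stand-in for the Document reader: replays a fixed list, then EOS.
    static class ListSource extends Iterator {
        final List<Event> events;
        int pos;
        ListSource(List<Event> events) { this.events = events; }
        void open() { pos = 0; }
        void close() {}
        Event next() { return pos < events.size() ? events.get(pos++) : new EOS(); }
    }

    // Passes through only the root's children named tag, together with their content.
    static class Child extends Iterator {
        final Iterator input;
        final String tag;
        int nest;
        boolean keep;
        Child(String tag, Iterator input) { this.tag = tag; this.input = input; }
        void open() { keep = false; nest = 0; input.open(); }
        void close() { input.close(); }
        Event next() {
            while (true) {
                Event t = input.next();
                if (t instanceof EOS) return t;
                else if (t instanceof StartTag) {
                    if (nest++ == 1) {
                        keep = tag.equals(((StartTag) t).tag);
                        if (!keep) continue;
                    }
                } else if (t instanceof EndTag) {
                    if (--nest == 1 && keep) { keep = false; return t; }
                }
                if (keep) return t;
            }
        }
    }

    // Drains an iterator and renders the events it yields as markup.
    static String render(Iterator it) {
        StringBuilder sb = new StringBuilder();
        it.open();
        for (Event e = it.next(); !(e instanceof EOS); e = it.next()) {
            if (e instanceof StartTag) sb.append('<').append(((StartTag) e).tag).append('>');
            else if (e instanceof EndTag) sb.append("</").append(((EndTag) e).tag).append('>');
            else if (e instanceof CData) sb.append(((CData) e).text);
        }
        it.close();
        return sb.toString();
    }

    // Events for <a><b>x</b><c>y</c></a>, filtered down to the b children.
    static String demo() {
        List<Event> events = Arrays.asList(
            new StartTag("a"), new StartTag("b"), new CData("x"), new EndTag("b"),
            new StartTag("c"), new CData("y"), new EndTag("c"), new EndTag("a"));
        return render(new Child("b", new ListSource(events)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints <b>x</b>
    }
}
```

Unlike the SAX version, each filter here is demand driven: nothing is read from the input until the consumer calls next(), and a filter can skip events with a plain loop.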

XSL Transformation
A stylesheet specification language for converting XML documents into various forms (XML, HTML, plain text, etc).
• Can transform each XML element into another element, add new elements into the output file, or remove elements.
• Can rearrange and sort elements, test and make decisions about which elements to display, and much more.
• Based on XPath

XSLT Templates
• XSL uses XPath to define parts of the source document that match one or more predefined templates.
• When a match is found, XSLT will transform the matching part of the source document into the result document.
• The parts of the source document that do not match a template are processed by the default templates, so only their text content ends up in the result document.
Form:
    <xsl:template match="XPath expression">
        ...
    </xsl:template>
The default (implicit) templates visit all nodes and strip out all tags:
    <xsl:template match="*|/">
        <xsl:apply-templates/>
    </xsl:template>
    <xsl:template match="text()|@*">
        <xsl:value-of select="."/>
    </xsl:template>
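The effect of the default templates is easy to observe from Java. The sketch below is illustrative only: the stylesheets and input are made up for this example, and the names TemplateDemo, NO_TEMPLATES, ONE_TEMPLATE and INPUT are hypothetical. It runs the same document through a stylesheet with no templates at all, and then through one with a single template for b elements.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplateDemo {
    static final String NS = "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'";

    // No templates: only the built-in rules apply, so tags are stripped.
    static final String NO_TEMPLATES =
        "<xsl:stylesheet version='1.0' " + NS + ">"
      + "<xsl:output method='text'/>"
      + "</xsl:stylesheet>";

    // One explicit template for b; everything else still falls to the built-in rules.
    static final String ONE_TEMPLATE =
        "<xsl:stylesheet version='1.0' " + NS + ">"
      + "<xsl:template match='b'><bold><xsl:value-of select='.'/></bold></xsl:template>"
      + "</xsl:stylesheet>";

    static final String INPUT = "<a><b>x</b><c>y</c></a>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(NO_TEMPLATES, INPUT)); // prints xy
        System.out.println(transform(ONE_TEMPLATE, INPUT)); // b is rewritten, c keeps only its text
    }
}
```

With no templates the output is just the concatenated text. Adding the b template changes only the part of the tree it matches, while c is still reduced to its text by the built-in rules.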

Other XSLT Elements
<xsl:value-of select="XPath expression"/>
    select the value of an XML element and add it to the output stream of the transformation.
<xsl:copy-of select="XPath expression"/>
    copy the entire XML element to the output stream of the transformation.
<xsl:apply-templates select="XPath expression"/>
    apply the template rules to the elements that match the XPath expression.
<xsl:element name="{XPath expression}"> ... </xsl:element>
    add an element to the output with a tag-name derived from the XPath.

Copy the Entire Document
    <xsl:stylesheet version='1.0' xmlns:xsl='http://www.
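The stylesheet on this slide is cut off, so the following is not a reconstruction of it. It is a sketch of one common way to copy a whole document, the identity template built from xsl:copy and xsl:apply-templates, wrapped in a small Java driver so it can be run. The namespace URI used is the standard XSLT 1.0 namespace. The names CopyDemo, IDENTITY and transform are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CopyDemo {
    // Identity template: copy each node or attribute, then recurse into its
    // attributes and children so they are copied by the same rule.
    static final String IDENTITY =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(IDENTITY, "<a id=\"1\"><b>x</b></a>"));
    }
}
```

The identity template is a useful starting point for stylesheets that change only a few elements: a more specific template added next to it takes precedence for the nodes it matches, and everything else is copied as is.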

More on XSLT
• Conflict resolution: more specific templates override more general templates. Templates are assigned default priorities, but these can be overridden using priority="n" in a template.
• Modes can be used to group together templates. A template without a mode belongs to the empty mode.
• Conditional and loop statements:
    <xsl:if test="XPath expression"> body </xsl:if>
    <xsl:for-each select="XPath expression"> body </xsl:for-each>
• Variables can be used to name data:
    <xsl:variable name="x"> value </xsl:variable>
  Variables are used as {$x} in XPaths.
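A small, made-up stylesheet shows a loop, a conditional, and a variable working together. Everything in it is an assumption for illustration: the people/person input, the variable name min, and the class name ControlDemo. It lists the names of all persons whose age exceeds the value held in the variable.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ControlDemo {
    // xsl:variable names a value, xsl:for-each loops, xsl:if filters.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:variable name='min' select='30'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='people/person'>"
      + "<xsl:if test='age &gt; $min'>"
      + "<xsl:value-of select='name'/><xsl:text>;</xsl:text>"
      + "</xsl:if>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical input document.
    static final String INPUT =
        "<people>"
      + "<person><name>Ann</name><age>41</age></person>"
      + "<person><name>Bob</name><age>25</age></person>"
      + "<person><name>Cy</name><age>35</age></person>"
      + "</people>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(STYLESHEET, INPUT)); // prints Ann;Cy;
    }
}
```

Inside an XPath expression such as the test attribute, the variable is referenced as $min. The braces form is needed only when a variable is used inside an attribute of a literal result element.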

Using XSLT
    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.w3c.dom.*;
    import javax.xml.transform.*;
    import javax.xml.transform.dom.*;
    import javax.xml.transform.stream.*;
    import java.io.*;

    class XSLT {
        public static void main ( String argv[] ) throws Exception {
            File stylesheet = new File("x.xsl");
            File xmlfile = new File("a.xml");
            // parse the input document into a DOM tree
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document document = db.parse(xmlfile);
            // compile the stylesheet into a transformer
            StreamSource stylesource = new StreamSource(stylesheet);
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer transformer = tf.newTransformer(stylesource);
            // apply the transformation and write the result to standard output
            DOMSource source = new DOMSource(document);
            StreamResult result = new StreamResult(System.out);
            transformer.transform(source, result);
        }
    }