XML
CISC437/637, Lecture #24
Ben Carterette
5/23/10
Copyright © Ben Carterette

Semistructured Data Model
• XML (eXtensible Markup Language) is an alternative to the relational model
  – Rather than tuples of attributes, data is represented using nested tags with content
• Markup:
  – Based on tags, e.g. <title>, <author>, <isbn>
• Extensible:
  – Users specify tags and tag semantics; no fixed catalog
• Language:
  – Consists of a set of symbols, a syntax, and a semantics

Advantages of XML
• Semistructured data model
  – Relational data from heterogeneous sources
  – Textual data with tags and links
• Extensible
  – And "self-documenting"
• Flexible exchange format
  – Highly useful for exchanging data between databases/organizations

XML Syntax
• Tags: e.g. book, title, author, …
  – Start tag: <book>
  – End tag: </book>
• Elements: e.g. <book>…</book>
  – Elements may be nested
  – Empty elements may be present, e.g. <book/>
• Attributes: e.g. price
  – <book price="55">…</book>
• Oids and references: e.g. id="o555"; idref="o555"
  – Define keys and foreign keys
• An XML document has a single root element
  – It is well-formed if all tags have matching, properly nested end tags

XML Semantics
• An XML document is a tree
  – Elements are nodes
  – Leaf nodes are content
• XML allows trees to be ragged
  – Relational data can be viewed as a balanced tree
• XML subtrees need not have identical elements
  – Relational data can be viewed as isomorphic subtrees connected by a root node

XML Data Typing
• The relational model uses the schema for data typing
• XML uses the Document Type Definition (DTD)
  – The DTD describes the valid trees that can occur
  – An XML document that does not conform to the DTD is invalid
    • (Invalid XML can still be well-formed)

DTD Syntax
• <!DOCTYPE root [
  – Defines the root element of the XML document
• <!ELEMENT tag (CONTENT)>
  – Content can be:
    • A regular expression formed from other elements
    • Text-only, CONTENT = #PCDATA
    • Empty, CONTENT = EMPTY
    • Any, CONTENT = ANY
    • Mixed, CONTENT = (#PCDATA | tagA | tagB | …)*
• <!ATTLIST tag attr1 TYPE DEFAULT attr2 TYPE DEFAULT … >
  – TYPE can be CDATA (string), ID (key), IDREF (foreign key), or an enumeration (A | B | C | …)
  – DEFAULT can be #REQUIRED, #IMPLIED, "value", or #FIXED "value"
• ]>
  – End of DTD

Querying XML Data
• How do we query semi-structured data?
• Take advantage of the tree semantics:
  – Define a template describing traversals from the root
• XPath is the basis for the template
  – Used for selecting data in the tree
• XQuery is the complete query language
  – Select data and construct output

XPath Summary
  /bib                            Matches a bibliography element at the root
  /bib/paper                      Matches a paper in the bibliography
  /bib/*                          Matches every child of bibliography
  /bib/book//lastname             Matches lastnames at any depth under books
  //lastname                      Matches lastnames at any depth
  //(book|paper)                  Matches a book or a paper
  /bib/book/@price                Matches the "price" attribute of a book
  /bib/book[@price<60]/author     Matches the authors of books that cost less than $60
  /bib/book/author[1]/lastname    Matches the lastname of the first author of each book
  /bib/book/author[lastname]      Matches the authors of books that have a lastname child element

XQuery Summary
• XQuery is based on keywords in FLWOR expressions
  – for (SQL FROM)
  – let defines temporary variables
  – where (SQL WHERE)
  – order by (SQL ORDER BY)
  – return (SQL SELECT)
• XQuery supports joins, nested queries, grouping, aggregation, functions, if/else clauses, …
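
As a rough illustration of the XPath patterns summarized in these slides, the sketch below runs several of them with Python's standard-library `xml.etree.ElementTree`. The sample document, its titles, names, and prices are hypothetical and exist only for this example. ElementTree implements just a subset of XPath (relative paths only, no attribute selection, no numeric comparisons), so those patterns are emulated in plain Python. The final expression is only a Python analogue of a FLWOR query, not XQuery itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; tag and attribute names mirror the slide examples.
BIB_XML = """
<bib>
  <book price="55">
    <title>Databases</title>
    <author><firstname>Ann</firstname><lastname>Smith</lastname></author>
    <author><firstname>Bob</firstname><lastname>Jones</lastname></author>
  </book>
  <book price="80">
    <title>Algorithms</title>
    <author><lastname>Lee</lastname></author>
  </book>
  <paper>
    <title>Indexing XML</title>
    <author><lastname>Kim</lastname></author>
  </paper>
</bib>
"""

# fromstring returns the root element (<bib>), so the paths below are relative to /bib.
root = ET.fromstring(BIB_XML.strip())


def texts(elements):
    """Return the text content of each element, in document order."""
    return [e.text for e in elements]


# /bib/* : every child of the bibliography
child_tags = [child.tag for child in root.findall("*")]

# //lastname : lastnames at any depth
all_lastnames = texts(root.findall(".//lastname"))

# /bib/book//lastname : lastnames at any depth under books
book_lastnames = texts(root.findall("book//lastname"))

# /bib/book/author[1]/lastname : lastname of the first author of each book
first_author_lastnames = texts(root.findall("book/author[1]/lastname"))

# /bib/book/author[lastname] : book authors that have a lastname child
authors_with_lastname = root.findall("book/author[lastname]")

# /bib/book/@price : ElementTree paths cannot select attributes,
# so the attribute is read from each matched element instead.
prices = [book.get("price") for book in root.findall("book")]

# /bib/book[@price<60]/author : numeric comparisons are not supported in
# ElementTree predicates, so the filter is written in Python.
cheap_book_authors = [
    author.findtext("lastname")
    for book in root.findall("book")
    if float(book.get("price")) < 60
    for author in book.findall("author")
]

# Rough Python analogue of a FLWOR expression (for / where / order by / return):
# titles of books under $60, ordered by title.
cheap_titles = sorted(
    book.findtext("title")
    for book in root.findall("book")
    if float(book.get("price")) < 60
)

print(child_tags)
print(all_lastnames)
print(cheap_book_authors)
print(cheap_titles)
```

A full XPath or XQuery processor would evaluate the original expressions directly; the standard library is used here only so the sketch needs no third-party packages.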