Tuesday, June 01, 2010

Making XML Schema less of a pain by parsing text with XSLT

Allow me to get to the point immediately. XML Schema can be a royal pain.

Don't get me wrong; I'm glad it exists. It's powerful, serves a clear purpose, is well-supported, yadda, yadda, yadda. Unfortunately, it's also quite complex, has a lot of pitfalls (elementFormDefault!), and is terribly verbose.

For instance, would you rather have this:


http://blog.jwbroek.com/nifty-namespace
thingamabob        ; This is a comment.
  foo xsd:string
  bar xsd:boolean  ; Set to true to enable bar.
  baz
    alice
      count xsd:integer?  ; Count is optional.
      description  ; Type defaults to string.
    bobs           ; List of 0 or more bobs.
      bob xsd:boolean*
    charles +      ; At least one charles.


Or this:


<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:tns="http://blog.jwbroek.com/nifty-namespace"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified"
            attributeFormDefault="unqualified"
            targetNamespace="http://blog.jwbroek.com/nifty-namespace">
   <xsd:element name="thingamabob">
      <xsd:annotation>
         <xsd:documentation>This is a comment.</xsd:documentation>
      </xsd:annotation>
      <xsd:complexType>
         <xsd:sequence>
            <xsd:element name="foo" type="xsd:string"/>
            <xsd:element name="bar" type="xsd:boolean">
               <xsd:annotation>
                  <xsd:documentation>Set to true to enable bar.</xsd:documentation>
               </xsd:annotation>
            </xsd:element>
            <xsd:element name="baz">
               <xsd:complexType>
                  <xsd:sequence>
                     <xsd:element name="alice">
                        <xsd:complexType>
                           <xsd:sequence>
                              <xsd:element name="count" type="xsd:integer" minOccurs="0">
                                 <xsd:annotation>
                                    <xsd:documentation>Count is optional.</xsd:documentation>
                                 </xsd:annotation>
                              </xsd:element>
                              <xsd:element name="description" type="xsd:string">
                                 <xsd:annotation>
                                    <xsd:documentation>Type defaults to string.</xsd:documentation>
                                 </xsd:annotation>
                              </xsd:element>
                           </xsd:sequence>
                        </xsd:complexType>
                     </xsd:element>
                     <xsd:element name="bobs">
                        <xsd:annotation>
                           <xsd:documentation>List of 0 or more bobs.</xsd:documentation>
                        </xsd:annotation>
                        <xsd:complexType>
                           <xsd:sequence>
                              <xsd:element name="bob" type="xsd:boolean" minOccurs="0" maxOccurs="unbounded"/>
                           </xsd:sequence>
                        </xsd:complexType>
                     </xsd:element>
                     <xsd:element name="charles" type="xsd:string" maxOccurs="unbounded">
                        <xsd:annotation>
                           <xsd:documentation>At least one charles.</xsd:documentation>
                        </xsd:annotation>
                     </xsd:element>
                  </xsd:sequence>
               </xsd:complexType>
            </xsd:element>
         </xsd:sequence>
      </xsd:complexType>
   </xsd:element>
</xsd:schema>


Both describe the same XML structure, but if you ask me, the first one is much clearer, and much quicker to write as well.

Granted, we're not using any of the fancy bells and whistles of XML Schema here. However, this would be quite sufficient for most of the things I see Schema being used for.

Wouldn't it be nice if you could actually write your Schemas using the first syntax?

Well, you're in luck: you can! The Schema above was entirely generated by applying the XSLT below to the simple syntax at the top. Hope you'll enjoy it as much as I do. :-)

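(Tip: use Kernow to execute the XSLT; any XSLT 2.0 processor should do. The stylesheet ignores the content of its XML source document and reads the text input from C:\dev\projects\schemagen\test\input.txt, so put your input there, or override the input-file parameter to use a file of your choice.)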
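
If your processor makes it awkward to pass parameters, one way to override the input file is a tiny wrapper stylesheet. This is only a sketch: the file name schemagen.xsl and the path in the parameter are assumptions of mine, so substitute whatever you actually use.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Assumption: the generator stylesheet is saved as schemagen.xsl next to this file. -->
   <xsl:import href="schemagen.xsl"/>

   <!-- This declaration has higher import precedence than the one in schemagen.xsl, -->
   <!-- so it replaces the default value. The path is a placeholder. -->
   <xsl:param name="input-file" select="'file:///home/me/schemas/input.txt'"/>

</xsl:stylesheet>
```

Run the wrapper instead of the generator itself. Everything else is inherited through the import.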


<!--
Copyright 2010 J.W. van den Broek

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:jws="http://blog.jwbroek.com/xslt/xsd/functions"
   exclude-result-prefixes="#all">
  
   <xsl:output indent="yes"/>
  
   <!-- Override this to read your file. -->
   <xsl:param name="input-file" select="'file:///C:/dev/projects/schemagen/test/input.txt'"/>
  
   <xsl:template match="/">
      <!-- Sequence of all non-empty lines in the input. Splitting on '\r?\n' handles both Windows (CRLF) and Unix (LF) line endings. -->
      <xsl:variable name="lines" select="tokenize(unparsed-text($input-file),'\r?\n')[not(matches(.,'^\s*$'))]"/>
     
      <!-- Create the schema. Make the root schema element here, taking the target namespace from the first line of input. -->
      <xsd:schema elementFormDefault="qualified" attributeFormDefault="unqualified" targetNamespace="{$lines[1]}">
         <xsl:namespace name="tns" select="$lines[1]"/>
         <!-- Pass all other lines on to the element-declarations function, which will create the element declarations. -->
         <xsl:sequence select="jws:element-declarations(subsequence($lines,2,count($lines)-1))"/>
      </xsd:schema>
   </xsl:template>
  
   <!-- Create element declarations based on lines of input. -->
   <xsl:function name="jws:element-declarations" as="element()*">
      <xsl:param name="rawLines" as="xsd:string*"/>
     
      <!-- Only continue if we have lines of input remaining. -->
      <xsl:if test="exists($rawLines)">
         <!-- Take the indentation from the first line. We'll create declarations for all elements with this level of indentation. -->
         <!-- We'll recursively create declarations for elements at higher indentation. -->
         <xsl:variable name="curIndent" select="replace($rawLines[1],'^(\s*).+$','$1')"/>
         <!-- Remove the base indentation from all lines. The elements we're going to make declarations for now have no indentation. -->
         <xsl:variable name="lines" select="for $l in $rawLines return substring-after($l, $curIndent)"/>
         <!-- Determine indices for all elements without indentation. We'll use this info to efficiently access the right lines of input. -->
         <xsl:variable name="indicesAtRoot" select="index-of((for $l in $lines return matches($l, '^\i+.*')), true())"/>
         <!-- Contains the root indices, but also the end of input. We'll use this to create subsequences for our recursive calls. -->
         <xsl:variable name="indicesAndBound" select="$indicesAtRoot, count($lines)+1"/>
        
         <!-- Create declarations for all root elements. (And recursively all child elements as well.) -->
         <xsl:for-each select="$indicesAtRoot">
            <!-- Current line of input. -->
            <xsl:variable name="curLine" select="$lines[current()]"/>
            <!-- Name of current element. -->
            <xsl:variable name="name" select="replace($curLine,'^(\i\c*).*$','$1')"/>
            <!-- Type of current element. May be empty, in which case we'll use xsd:string as default later on. -->
            <xsl:variable name="type" select="replace($curLine,'^\i\c*\s*([^?*+;\s]*)?.*$','$1')"/>
            <!-- Occurrence of current element. ?: optional, *: 0 or more, +: 1 or more. Empty is XSD default (1). -->
            <xsl:variable name="occurrence" select="replace($curLine,'^[^?*+;]*(\?|\*|\+).*$','$1')"/>
            <!-- Documentation. Will go into a documentation annotation. -->
            <xsl:variable name="doc" select="replace($curLine,'^[^;]+(;\s*(.*))?$','$2')"/>
            <!-- Current position in the $indicesAtRoot sequence. -->
            <xsl:variable name="pos" select="position()"/>
            <!-- Select the subsequence of all lines that contain children of the current element. -->
            <xsl:variable name="children" select="subsequence($lines, $indicesAndBound[$pos]+1, $indicesAndBound[$pos+1] - $indicesAndBound[$pos] - 1)"/>
           
            <!-- Create the element declaration. -->
            <xsd:element name="{$name}">
               <!-- No type attribute if there are children; the element gets an inline complex type declaration instead. -->
               <xsl:if test="empty($children)">
                  <xsl:choose>
                     <!-- On empty type, we default to string. -->
                     <xsl:when test="$type = ''">
                        <xsl:attribute name="type" select="'xsd:string'"/>
                     </xsl:when>
                     <xsl:otherwise>
                        <xsl:attribute name="type" select="$type"/>
                     </xsl:otherwise>
                  </xsl:choose>
               </xsl:if>
              
               <!-- Set minOccurs and maxOccurs. -->
               <xsl:choose>
                  <xsl:when test="$occurrence='?'">
                     <xsl:attribute name="minOccurs" select="'0'"/>
                  </xsl:when>
                  <xsl:when test="$occurrence='*'">
                     <xsl:attribute name="minOccurs" select="'0'"/>
                     <xsl:attribute name="maxOccurs" select="'unbounded'"/>
                  </xsl:when>
                  <xsl:when test="$occurrence='+'">
                     <xsl:attribute name="maxOccurs" select="'unbounded'"/>
                  </xsl:when>
               </xsl:choose>
              
               <!-- Set documentation annotation. -->
               <xsl:if test="$doc != ''">
                  <xsd:annotation>
                     <xsd:documentation>
                        <xsl:sequence select="$doc"/>
                     </xsd:documentation>
                  </xsd:annotation>
               </xsl:if>
              
               <!-- Recursively do child declarations. -->
               <xsl:if test="exists($children)">
                  <xsd:complexType>
                     <xsd:sequence>
                        <xsl:sequence select="jws:element-declarations($children)"/>
                     </xsd:sequence>
                  </xsd:complexType>
               </xsl:if>
            </xsd:element>
         </xsl:for-each>
      </xsl:if>
   </xsl:function>
  
</xsl:stylesheet>
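
To see how a single input line gets pulled apart, here is a small standalone sketch that applies the same four replace() expressions to one sample line. The line parameter and the text output are mine, purely for illustration; the regular expressions are copied from the stylesheet above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="text"/>

   <!-- Hypothetical sample line; override it to try other inputs. -->
   <xsl:param name="line" select="'count xsd:integer?  ; Count is optional.'"/>

   <!-- Run against any XML document; its content is ignored. -->
   <xsl:template match="/">
      <!-- Leading name characters. -->
      <xsl:variable name="name" select="replace($line,'^(\i\c*).*$','$1')"/>
      <!-- Everything after the name up to an occurrence marker, comment or whitespace. -->
      <xsl:variable name="type" select="replace($line,'^\i\c*\s*([^?*+;\s]*)?.*$','$1')"/>
      <!-- First ?, * or + that appears before any comment. -->
      <xsl:variable name="occurrence" select="replace($line,'^[^?*+;]*(\?|\*|\+).*$','$1')"/>
      <!-- Text after the first semicolon. -->
      <xsl:variable name="doc" select="replace($line,'^[^;]+(;\s*(.*))?$','$2')"/>

      <xsl:value-of separator="&#10;"
         select="concat('name=', $name),
                 concat('type=', $type),
                 concat('occurrence=', $occurrence),
                 concat('doc=', $doc)"/>
   </xsl:template>

</xsl:stylesheet>
```

For the sample line this prints name=count, type=xsd:integer, occurrence=? and doc=Count is optional., each on its own line. One thing to be aware of: replace() returns its input unchanged when the pattern does not match, so for a line without an occurrence marker $occurrence holds the whole line rather than an empty string. The generator copes with this because its xsl:choose only tests for '?', '*' and '+'.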
