Syntax::Construct + XML::XSH2

E. Choroba

Czech Republic, Prague

I. Syntax::Construct—explicitly declare the syntactic constructs your programme uses.
II. XML::XSH2—XML Shell wrapper around XML::LibXML

PerlMonks, CPAN, YouTube, …

choroba@matfyz.cz

Part I

Syntax::Construct

Features (1)

You want to use say in your programme.

Features (2)

Non-features (1)

You want to use the defined-or operator (//).

Non-features (2)

But...

Solution?

#!/usr/bin/perl
use warnings;
use strict;

use Syntax::Construct qw(  //  ?|  );

Benefits for

Current Status

Part II

XML::XSH2

XML (1)

Extensible Markup Language

XML (2)

Example

<?xml version="1.0" encoding="utf-8"?>
<root xmlns:ns="namespace-uri">
  <ns:empty_element />
  <element attribute="value">
    Text &amp; <![CDATA[(<&>)]]>
  </element>
  <!-- Comment -->
  <?Processing-instruction data ?>
</root>

Why Another XML Tool?

How can XML be parsed?

XSH

XML Shell

XPath (1)

XPath 1.0 Expressions

ls /root/ns:child[@hidden="false"]//grand-child[sum(num) > 0]

XPath (2)

“Why do I Need Another Language?”

Expressions versus Commands (1)

Expressions versus Commands (2)

XPath Extension Functions

Navigation

Modifying the Document (1)

Modifying the Document (2)

Difference between append and into

set /scratch/a/text() 'one' ;
set /scratch/b/text() 'two' ;
insert text three append //a ;
insert text four  append //b/text() ;
insert text five  into   //a ;
insert text six   into   //b/text() ;
save :f generated/append-into.xml ;
<?xml version="1.0" encoding="utf-8"?>
<scratch>
  <a>onethreefive</a>
  <b>six</b>
</scratch>

Sorting Nodes

Interaction with Perl

Interaction with the Shell

On MSWin, use cygwin

Regular Expressions

open regex.xml ;
echo :n Before: ;
for /names/name {
    echo { "\t" } (.) ;
    if xsh:matches(., ' ') {
        my $first = xsh:match(., '^((.).*) ') ;
        set . xsh:subst(., concat('^', $first[1]), 
                           concat($first[2], '.')) ;
    }
}
echo :n After: ;
for /names/name echo { "\t" } (.) ;
Before:	 Joe Smith
	 John Doe
	 Michael Angel
	 chromatic
After:	 J. Smith
	 J. Doe
	 M. Angel
	 chromatic

Retrenching XML::LibXML Code (1)

Commenting an element out

<start>
  <a>text<b>test</b>more text<c>text</c></a>
  <d>text</d>
</start>
#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;

my $xml       = 'XML::LibXML'->load_xml(location => 'xml-libxml-2.xml');
my ($element) = $xml->findnodes('/start/a[1]');
my $comment   = $xml->createComment($element->toString);
$element->replaceNode($comment);
print $xml->toString;

Retrenching XML::LibXML Code (2)

Commenting an element out

<start>
  <a>text<b>test</b>more text<c>text</c></a>
  <d>text</d>
</start>
open xml-libxml-2.xml ;
insert comment xsh:serialize(/start/a[1]) prepend /start ;
delete /start/a[1] ;
ls / ;

Retrenching XML::LibXML Code (3)

Changing an attribute's value

<root>
  <parameters>
    <key name="Repetitions" value="500" />
    <key name="Mode" value="COLD" />
  </parameters>
</root>
#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;

my $file    = 'xml-libxml-1.xml';
my $xml     = 'XML::LibXML'->load_xml( location => $file );
my $root    = $xml->documentElement;
my ($value) = $root->findnodes('parameters/key[@name="Repetitions"]/@value');
$value->setValue($value->getValue + 1);
rename $file, "$file~" or die $!;
$xml->toFile($file);

Retrenching XML::LibXML Code (4)

Changing an attribute's value

<root>
  <parameters>
    <key name="Repetitions" value="500" />
    <key name="Mode" value="COLD" />
  </parameters>
</root>
open xml-libxml-1.xml ;
cd /root/parameters/key[@name="Repetitions"]/@value ;
set . .+1 ;
save :b ;

Namespaces (1)

<?xml version="1.0" encoding="utf-8"?>
<root xmlns="http://namespa.ce/1"
      xmlns:x="http://namespa.ce/2">

  <a xmlns="http://namespa.ce/2">A</a>
  <b>B</b>
  <x:c>C</x:c>
</root>

Namespaces (2)

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use XML::LibXML;

my $xml = 'XML::LibXML'->load_xml(location => 'namespace.xml');
my $xpc = 'XML::LibXML::XPathContext'->new;

$xpc->registerNs('n', 'http://namespa.ce/1');
$xpc->registerNs('y', 'http://namespa.ce/2');

say $_->textContent for $xpc->findnodes('( x:a | n:b | y:c )',
                                        $xml->documentElement);

Namespaces (3)

register-namespace n http://namespa.ce/1 ;
register-namespace y http://namespa.ce/2 ;

open namespace.xml ;
cd /n:root ;
for ( x:a | n:b | y:c ) echo (.) ;
A
B
C

Handling HTML

Hashing (1)

<?xml version="1.0" encoding="utf-8"?>
<store>
  <product id="f01" price="40" count="10">bread</product>
  <product id="f02" price="18" count="2">beer</product>
  <product id="t00" price="220" count="0">pincers</product>
  <product id="c33" price="10000" count="1">laptop</product>
  <product id="c55" price="6000" count="0">smartphone</product>
</store>
<?xml version="1.0" encoding="utf-8"?>
<order>
  <product id="f01">2</product>
  <product id="f02">1</product>
  <product id="c55">1</product>
</order>

Hashing (2)

quiet ;
open products.xml ;
my $store := hash @id /store/product ;
my $sum = 0 ;
open order.xml ;
for /order/product {
    my $p = xsh:lookup('store', @id) ;
    if (. <= $p/@count) {
        $sum = $sum + $p/@price * . ;
    } else {
        echo Not enough: $p ;
    }
}
echo Total: $sum ;
Not enough: smartphone
Total: 98

Streaming

Large files that don't fit into memory

open supply.xml ;
my $supply := hash ../@id /order/product/text() ;
stream :f products.xml :F generated/stream.xml select product {
    my $plus = xsh:lookup('supply', @id) ;
    if $plus set @count (@count + $plus) ;
} 
<?xml version='1.0'?><store>
  <product count='12' price='40' id='f01'>bread</product>
  <product count='3' price='18' id='f02'>beer</product>
  <product count='0' price='220' id='t00'>pincers</product>
  <product count='1' price='10000' id='c33'>laptop</product>
  <product count='1' price='6000' id='c55'>smartphone</product>
</store>

Example (1)

xml2dot.xsh

open { $ARGV[0] } ;
my $count ;

def processNonRec $type $parent $node {
    my $name = $type ;
    perl { $name .= $count->{"*$type"}++ } ;
    echo :s '"' $parent '"' " -> " '"' $name '"' ;
    echo :s '"' $name '" [label="' $type '()"]' ;
}

def processNode $parent $node {
    my $label = name($node);
    my $name ;
    perl { $name = $label . "=" . $count->{$label}++ } ;
    echo :s '"' $parent '"' " -> " '"' $name '"' ;
    if $node/self::* {
        if (count($node/../*[name() = $label]) > 1) {
            my $num = count($node/preceding-sibling::*[name() = $label]) ;
            $label = concat($label, "[", $num + 1, "]") ;
        }
    } else {
        $label = concat("@", $label) ;
    }
    echo :s '"' $name '" [label="' $label '"]' ;
    for ($node/node() | $node/@*) {
        if self::*                         processNode $name (.) ;
        if (count(.|../@*) = count(../@*)) processNode $name (.) ;
        if self::text()                    processNonRec 'text'    $name (.) ;
        if self::comment()                 processNonRec 'comment' $name (.) ;
        if self::processing-instruction()  processNonRec 'pi'      $name (.) ;
    }
}

echo 'strict digraph' name() '{' ;
echo 'node [shape=box]' ;
processNode "document" /* ;
echo '}' ;

Example (2)

xml2dot.xsh

strict digraph  {
node [shape=box]
"document" -> "root=0"
"root=0" [label="root"]
"root=0" -> "text0"
"text0" [label="text()"]
"root=0" -> "ns:empty_element=0"
"ns:empty_element=0" [label="ns:empty_element"]
"root=0" -> "text1"
"text1" [label="text()"]
"root=0" -> "element=0"
"element=0" [label="element"]
"element=0" -> "attribute=0"
"attribute=0" [label="@attribute"]
"element=0" -> "text2"
"text2" [label="text()"]
"element=0" -> "text3"
"text3" [label="text()"]
"element=0" -> "text4"
"text4" [label="text()"]
"root=0" -> "text5"
"text5" [label="text()"]
"root=0" -> "comment0"
"comment0" [label="comment()"]
"root=0" -> "text6"
"text6" [label="text()"]
"root=0" -> "pi0"
"pi0" [label="pi()"]
"root=0" -> "text7"
"text7" [label="text()"]
}

Example (3)

To Do

Thank you.

Questions?