XSH2 acts as a command interpreter. Individual commands must be separated with a semicolon. In the interactive shell, backslash may be used at the end of a line to indicate that a command continues on the next line. Output redirection can be used to pipe output of some XSH command to some external program, or to capture the output to a variable. See Redirection for more info.
XSH2 command help provides a complete reference, instantly from the command-line:
help command gives a list of all XSH2 commands.
help type gives a list of all argument types.
help topic followed by a documentation chapter gives more information on a given topic.
help toc displays the table of contents.
XSH2 is designed as an environment for querying and manipulating XML and HTML documents. Use the open or create commands to load an XML or HTML document from a local file, external URL (such as http:// or ftp://), string or pipe. XSH2 can optionally validate the document during parsing (see validation and load-ext-dtd). Parsed documents are stored in memory as DOM trees, which can be navigated and manipulated with XSH2 commands and the XPath language. The command names and syntax make working with the DOM tree feel like working in a UNIX filesystem.
A parsed document is usually stored in a variable. XSH2 shares variables with the XPath engine, so if e.g. $doc is an XSH2 variable holding a document (or, more generally, any node-set), then $doc//section/title is an XPath expression selecting all title subelements of all section elements within the (sub)tree of $doc.
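The selection described above can be tried outside XSH2 as well. The following is only a rough sketch in Python, using the standard xml.etree.ElementTree module (which supports a limited XPath subset, not the full XPath 1.0 used by XSH2). The sample document and the helper name section_titles are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """
<article>
  <title>Article title</title>
  <section><title>Intro</title>
    <section><title>Background</title></section>
  </section>
  <section><title>Results</title></section>
</article>
"""


def section_titles(root):
    """Return the text of every title that is a direct child of a section."""
    # './/section/title' plays the role of $doc//section/title in XSH2.
    return [title.text for title in root.findall(".//section/title")]


doc = ET.fromstring(SAMPLE)
titles = section_titles(doc)
print(titles)
```

The title of the article itself is not selected, because its parent is not a section element.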
Although XSH2 is able to parse remote documents via http:// or ftp://, it is only able to save them locally. To upload a document to a remote server (e.g. using FTP) or to store it in a database, use the save command with the --pipe parameter, in connection with an external program able to store its standard input (XML) to the desired location. You can also use a similar parameter with open in order to parse documents from the standard output of some external program.
Example 1. Store an XSH2 document on a remote machine using the Secure Shell
xsh> save --pipe "ssh my.remote.org 'cat > test.xml'" $doc
turn on backup file creation
use a catalog file during all parsing processes
set on/off changing current document to newly open/created files
clone a given document
close document (without saving)
make a new document from a given XML fragment
specifying documents
specifying filenames
display a list of open documents
index a static document for faster XPath lookup
turn off backup file creation
specifying names of DOM nodes
load an XML, HTML, or Docbook SGML document from a file, pipe or URI
load and insert XInclude sections
save a document as XML or HTML
change filename or URL associated with a document
process selected elements from an XML stream (EXPERIMENTAL)
name of a sub-routine
With XSH2, it is possible to browse a document tree (XML data represented as a DOM tree) as if it were a local filesystem, except that XPath expressions are used instead of ordinary directory paths.
To mimic filesystem navigation as closely as possible, XSH2 contains several commands named by analogy with UNIX filesystem commands, such as cd, ls and pwd.
The current position in the document tree is called the current node. The current node's XPath may be queried with the pwd command. In the interactive shell, the current node is also displayed in the command line prompt. (Since there may be multiple document trees open at the same time, XSH2 tries to locate a variable holding the current document and use it to fully qualify the current node's XPath in the XSH2 prompt.) Remember that, besides the cd command, the current node (and document) is also silently changed by the open and create commands, and temporarily also by the node-list variant of the foreach loop without a loop variable.
XPath expressions are always evaluated in context
of the current node. Different documents can be accessed
through variables: $doc/foo[1]/bar.
Example 2. XSH2 shell
$scratch:/> $docA := open "testA.xml"
$docA/> $docB := open "testB.xml"
$docB/> pwd
/
$docB/> cd $docA/article/chapter[title='Conclusion']
$docA/article/chapter[5]> pwd
/article/chapter[5]
$docA/article/chapter[5]> cd previous-sibling::chapter
$docA/article/chapter[4]> cd ..
$docA/article> cd $docB
$docB:/> ls
<?xml version="1.0" encoding="utf-8"?>
<article>...</article>
serialize nodes to canonical XML
change current context node
mark elements to be folded by list command
list a given part of a document as XML
show a given node location (as a canonical XPath)
show current context node location (as a canonical XPath)
define XPath extension function (EXPERIMENTAL)
unfold elements folded with fold command
undefine extension function (EXPERIMENTAL)
XPath expression
XSH2 not only provides ways to browse and inspect the DOM tree but also many commands to modify its content by various operations, such as copying, moving, and deleting its nodes as well as creating completely new nodes or XML fragments and attaching them to it. It is quite easy to learn these commands since their names or aliases mimic their well-known filesystem analogies. On the other hand, many of these commands have two versions, one of which is prefixed with the letter "x". This "x" stands for "cross", thus e.g. xcopy should be read as "cross copy". Let's explain the difference using xcopy as an example.
In a copy operation, you have to specify what nodes are to be copied and where to; in other words, you have to specify the source and the target. XSH2 is very much XPath-based, so XPath is used here to specify both of them. However, there might be more than one node that satisfies an XPath expression. So, the rule of thumb is that the "cross" variant of a command places each and every source node at the location of each and every destination node, while the plain variant works one-by-one, placing the first source node at the first destination, the second source node at the second destination, and so on (as long as there are both source nodes and destinations left).
$scratch/> $a := create "<X><A/><Y/><A/></X>";
$a/> $b := create "<X><B/><C/><B/><C/><B/></X>";
$b/> xcopy $a//A replace $b//B;
$b/> copy $b//C before $a//A;
$b/> ls $a;
<?xml version="1.0" encoding="utf-8"?>
<X><C/><A/><Y/><C/><A/></X>
$b/> ls $b;
<?xml version="1.0" encoding="utf-8"?>
<X><A/><A/><C/><A/><A/><C/><A/><A/></X>
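The difference between the plain and the "cross" pairing can also be sketched independently of XSH2. The Python snippet below is only an illustration of the pairing rule (it does not touch any DOM tree); the function names one_to_one and all_to_every and the node labels are invented for this sketch.

```python
def one_to_one(sources, targets):
    """Plain variant: pair the i-th source with the i-th target.

    Pairing stops as soon as either list runs out.
    """
    return list(zip(sources, targets))


def all_to_every(sources, targets):
    """Cross variant: every source node is paired with every target node."""
    return [(source, target) for target in targets for source in sources]


# Labels standing in for the nodes matched by two XPath expressions.
sources = ["A1", "A2"]
targets = ["B1", "B2", "B3"]

plain_pairs = one_to_one(sources, targets)
cross_pairs = all_to_every(sources, targets)

print(plain_pairs)
print(cross_pairs)
```

With two sources and three targets, the plain pairing yields two operations and leaves the third target untouched, while the cross pairing yields six operations, two per target.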
As already indicated by the example, another issue of tree modification is the way in which the destination node determines the target location. Should the source node be placed before, after, or somewhere among the children of the destination node? Or should it replace it completely? This information has to be given in the location argument that usually precedes the destination XPath.
Now, what happens if source and destination nodes are of incompatible types? XSH2 tries to avoid this by implicitly converting between node types when necessary. For example, if a text, comment, or attribute node is copied into, before or after an attribute node, the original value of the attribute is replaced, prepended or appended respectively with the textual content of the source node. Note, however, that element nodes are never converted into text, attribute or any other textual node. There are many combinations here, so try it yourself and see the results.
You may even use a more sophisticated way to convert between node types, as shown in the following example, where an element is first commented out and then uncommented again. Note that the particular approach used for resurrecting the commented XML material works only for well-balanced chunks of XML.
Example 3. Using string variables to convert between different types of nodes
$doc := create <<EOF;
<?xml version='1.0'?>
<book>
  <chapter>
    <title>Intro</title>
  </chapter>
  <chapter>
    <title>Rest</title>
  </chapter>
</book>
EOF

# comment out the first chapter
ls //chapter[1] |> $chapter_xml;
insert comment $chapter_xml replace //chapter[1];
ls / 0;
# OUTPUT:
<?xml version="1.0"?>
<book>
<!--  <chapter>
    <title>Intro</title>
  </chapter>
-->
  <chapter>
    <title>Rest</title>
  </chapter>
</book>

# un-comment the chapter
$comment = string(//comment()[1]);
insert chunk $comment replace //comment()[1];
ls / 0;
# OUTPUT:
<?xml version="1.0"?>
<book>
  <chapter>
    <title>Intro</title>
  </chapter>
  <chapter>
    <title>Rest</title>
  </chapter>
</book>
change namespace prefix (EXPERIMENTAL)
change namespace URI (EXPERIMENTAL)
clone a given document
copy nodes (in the one-to-one mode)
create a special attribute declaring an XML namespace (EXPERIMENTAL)
edit parts of an XML document in a text editor
edit a string or variable in a text editor
expression argument type
index selected nodes by some key value
create a node on a given target location
relative destination specification (such as after, before, etc.)
transform node value/data using Perl or XPath expression
move nodes (in the one-to-one mode)
node type specification (such as element, attribute, etc.)
normalizes adjacent textnodes
load and insert XInclude sections
remove given nodes
quickly rename nodes with in-line Perl code
create or modify document content (EXPERIMENTAL)
set document's DTD declaration
set document's charset (encoding)
set namespace of the current node (EXPERIMENTAL)
set document's standalone flag
sort a given node-list by given criteria
strip leading and trailing whitespace
wrap given nodes into elements
wrap spans of nodes into elements
copy nodes (in the all-to-every mode)
create nodes on all target locations
move nodes (in the all-to-every mode)
XPath expression
compile a XSLT stylesheet and/or transform a document with XSLT
apply XUpdate commands on a document
Like almost every scripting language, XSH2 supports subroutines, various conditional statements, loops and even exceptions.
single XSH2 command or a block of XSH2 commands
indirect call to a user-defined routine (macro)
sub-routine declaration
execute a given block of commands
evaluate given expression as XSH commands
exit XSH2 shell
loop iterating over a node-list or a perl array
if statement
conditionally include another XSH2 source in current position
include another XSH2 source in current position
iterate a block over current subtree
immediately exit an enclosing loop
start the next iteration of an enclosing loop
restart an iteration on a previous node
restart the innermost enclosing loop block
return from a subroutine
switch into normal execution mode (quit test-mode)
process selected elements from an XML stream (EXPERIMENTAL)
do not execute any command, only check the syntax
throw an exception
try/catch statement
undefine sub-routine or variable
negated if statement
simple while loop
Besides the possibility to browse the DOM tree and list some parts of it (as described in Navigation), XSH2 provides commands to obtain other information related to open documents as well as the XSH2 interpreter itself. These commands are listed below.
search the documentation
serialize nodes to canonical XML
calculate an expression and enumerate node-lists
displays various information about a document
calculate a given expression and return the result.
display a list of open documents
on-line documentation
print line-numbers corresponding to matching nodes
list a given part of a document as XML
list all user-defined subroutines
show document's DTD
show a given node location (as a canonical XPath)
list namespaces available in the context of given nodes
list current settings using XSH2 syntax
print stuff on standard or standard error output
show document's original character encoding
show current context node location (as a canonical XPath)
validate a document against a DTD, RelaxNG, or XSD schemas
list global variables
show version information
Namespaces provide a simple method for qualifying element and attribute names in XML documents. Namespaces are represented by a namespace URI but, since the URI can be very long, element and attribute names are associated with a namespace using a namespace prefix (see the W3C recommendation for details). In an XML document, a prefix can be associated with a namespace URI using a declaration, which takes the form of a special attribute xmlns:prefix="namespace uri" on an element. The scope of the namespace declaration is then the subtree of the element carrying the special xmlns:prefix attribute (and includes attributes of the element).
Moreover, a default namespace can be declared using just xmlns="namespace uri". In that case all unprefixed element names in the scope of such a declaration belong to the namespace. An unprefixed element which is not in the scope of a default namespace declaration does not belong to any namespace. It is recommended not to combine namespaced elements and non-namespaced elements in a single document.
Note that regardless of default namespace declarations, unprefixed attributes do not belong to any namespace (because they are uniquely determined by their name and the namespace and name of the element which carries them).
XSH2 tries to deal with namespace declarations transparently (creating them if necessary when nodes are copied between different documents or scopes of namespace declarations). Most commands which create new elements or attributes provide means to indicate a namespace. In addition, XSH2 provides the commands declare-ns, set-ns, change-ns-uri, and change-ns-prefix to directly manipulate XML namespace declarations on the current node.
Since XSH2 is heavily XPath-based, it is important to remember that XPath 1.0 maps prefixes to namespaces independently of the declarations in the current document. The mapping is instead provided via the so-called XPath context. Namespaces can be tested in XPath using the built-in namespace-uri() function, but it is more convenient to use namespace prefixes associated with namespace URIs in the XPath context. This association is independent of the documents to which the XPath expression is applied and can be established using the command register-namespace. Additionally, XSH2 automatically propagates the namespace association in the scope of the current node to the XPath context, so that per-document prefixes in the current scope can also be used.
IMPORTANT: XPath 1.0 has no concept of a default namespace. Unprefixed names in XPath only match names which have no namespace. So, if the document uses a default namespace, it is required to associate a non-empty prefix with the default namespace via register-namespace and add that prefix to names in XPath expressions intended to match nodes in the default namespace.
Example 4. Manipulating nodes in XHTML documents
open "index.xhtml";
$xhtmlns = "http://www.w3.org/1999/xhtml";
register-namespace x $xhtmlns;
wrap --namespace $xhtmlns '<font color="blue">' //x:a[@href];
# or
wrap '<x:font color="blue">' //x:a[@href];
In the preceding example we associate the (typically default) namespace of XHTML documents with the prefix x. We then use this prefix to match all links (a elements) in the document. Note that we do not write @x:href to match the @href attribute because unprefixed attributes do not belong to the default namespace. The wrap command is used to create new containing elements for the nodes matched by the XPath expression. We may either specify the namespace of the containing element explicitly, using the --namespace option, or implicitly, by using a prefix associated with the namespace in the XPath context. In the latter case, XSH2 chooses a suitable prefix declared for the namespace in the document scope (in this case the default, i.e. no, prefix), adding a new namespace declaration if necessary.
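The rule that unprefixed names in XPath never match elements in a default namespace is not specific to XSH2. As a hedged illustration, the following Python sketch shows the same behavior with the standard xml.etree.ElementTree module, where a prefix-to-URI dictionary passed to findall plays a role similar to register-namespace. The XHTML fragment is invented for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical XHTML fragment that declares a default namespace.
PAGE = f"""
<html xmlns="{XHTML_NS}">
  <body>
    <a href="one.html">one</a>
    <a name="anchor">no href here</a>
    <a href="two.html">two</a>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Unprefixed names only match elements in no namespace: nothing is found.
unprefixed = root.findall(".//a[@href]")

# Associating the prefix "x" with the XHTML namespace makes the match work.
# The attribute stays unprefixed, because unprefixed attributes belong to
# no namespace.
prefixed = root.findall(".//x:a[@href]", {"x": XHTML_NS})
links = [a.get("href") for a in prefixed]

print(len(unprefixed), links)
```

The prefix x exists only in the query context; the document itself keeps using its default namespace declaration.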
change namespace prefix (EXPERIMENTAL)
change namespace URI (EXPERIMENTAL)
create a special attribute declaring an XML namespace (EXPERIMENTAL)
list namespaces available in the context of given nodes
register namespace prefix to use XPath expressions
register a prefix for the XHTML namespace
register a prefix for the XSH2 namespace
set namespace of the current node (EXPERIMENTAL)
unregister namespace prefix
XSH2 commands accept arguments of various types, usually expressed as Perl or XPath expressions. Unlike in most languages, individual XSH2 commands may evaluate the same expression differently, usually to enforce a result of a certain type (such as a node-list, a string, a number, a filename, a node name, etc.). See expression and individual argument types for more information.
single XSH2 command or a block of XSH2 commands
List of XSH2 commands and their general syntax
specifying documents
character encoding (codepage) identifier
expression argument type
specifying filenames
relative destination specification (such as after, before, etc.)
specifying names of DOM nodes
node type specification (such as element, attribute, etc.)
in-line code in Perl programming language
name of a sub-routine
XPath expression
In XSH2, like in Perl and XPath, variable names are prefixed with a dollar sign ($). Variables can contain an arbitrary Perl scalar (string, number, array reference, hash reference or an object reference). XPath objects are transparently mapped to Perl objects via XML::LibXML objects. Values can be assigned to variables either by simple assignments of the form $variable = expression, where the right hand side is an expression, or by command assignments of the form $variable := command, where the right hand side is an XSH2 command, or by capturing the output of some command with a variable redirection of the following form: command |> $variable;
XSH2 expressions are evaluated either by the XPath engine or by Perl (the latter only happens if the entire expression is enclosed in braces {...}), and both Perl and XPath can access all XSH2 variables transparently (Perl expressions may even assign to them). A simple expression consisting of a variable name (e.g. $variable) is always evaluated by the XPath engine and the result is the content of the variable as it appears to the XPath data model. Since an XPath object cannot be void (undefined), the XPath engine complains if the value of the variable is undefined. On the other hand, expressions like {$variable} are evaluated by Perl, which results in the value of the variable as seen by Perl.
Variables can also be used as macros for complicated XPath expressions. Any occurrence of a substring of the form ${variable} in an XPath expression is interpolated to the value of $variable (if $variable contains an object rather than a string or number, then the object is cast to a string first) before the entire expression is evaluated. So, for example, if $variable contains the string "chapter[title]" (without the quotes), then the XPath expression //sect1/${variable}/para interpolates to //sect1/chapter[title]/para prior to evaluation.
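The interpolation step can be pictured as plain text substitution performed before the XPath engine sees the expression. The following Python sketch mimics it with string.Template from the standard library. It is only an analogy: the helper name interpolate is invented here, and unlike XSH2, Template would also expand the brace-less form $variable.

```python
from string import Template


def interpolate(xpath_template, **variables):
    """Expand ${name} placeholders, casting each value to a string first."""
    as_text = {name: str(value) for name, value in variables.items()}
    return Template(xpath_template).substitute(as_text)


# The value is treated as a piece of XPath source text, not as a node-set.
expanded = interpolate("//sect1/${variable}/para", variable="chapter[title]")
print(expanded)
```

Only after this substitution is the resulting string parsed and evaluated as an XPath expression.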
To display the current value of a variable, use either print or (in the case of global variables; the distinction is discussed below) the command variables:
xsh> $b="my_document";
xsh> $file="${b}s.xml";
xsh> $f := open $file;
xsh> ls //${b}[count(descendant::para)>10]
xsh> print $b
my_document
xsh> variables
...
$b='my_document';
...
$file='my_documents.xml';
...
Variables can also serve as containers for documents and can be used to store lists of nodes that result from evaluating an XPath expression (a.k.a. XPath node-sets). This is especially useful when a sequence of commands is to be performed on some fixed set of nodes and repeated evaluation of the same XPath expression would be lengthy. XPath node-sets are represented by XML::LibXML::NodeList Perl objects (which are simply array references blessed into that class, which provides some simple operator overloading). In XPath, a node-set by definition can only contain a single copy of each node, and the nodes within a node-set are processed in the same order as they appear in the XML document. Representing XPath node-sets by lists gives us the possibility to process the list in a different order than the one implied by the document (which is what happens if a variable containing a node-list is evaluated by Perl rather than XPath); see the example below.
xsh> $creatures = //creature[@status='alive']
# process creatures in the document order:
xsh> foreach $creatures print @name;
# process creatures in the reverse document order:
xsh> foreach { reverse @$creatures } print @name;
# append some more nodes to a node-list (using a variant of
# a simple assignment)
xsh> $creatures += //creature[@status='dead'];
# again, we can process creatures in order implied by the document:
xsh> foreach $creatures print @name;
# but we can also process first living and then dead creatures,
# since this is how they are listed in $creatures
xsh> foreach {$creatures} print @name;
# same as the above is
xsh> foreach {@$creatures} print @name;
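The distinction between the order in which nodes are stored in a list and the order implied by the document can be sketched in Python as well. The snippet below is only an analogy using the standard xml.etree.ElementTree module; the sample data and variable names are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
WORLD = """
<creatures>
  <creature name="Bilbo" status="alive"/>
  <creature name="Smaug" status="dead"/>
  <creature name="Gandalf" status="alive"/>
</creatures>
"""

root = ET.fromstring(WORLD)
alive = root.findall("creature[@status='alive']")
dead = root.findall("creature[@status='dead']")

# List order: first the living, then the dead, exactly as appended.
node_list = alive + dead
list_order = [node.get("name") for node in node_list]

# Document order: sort the same nodes by their position in the tree.
position = {node: index for index, node in enumerate(root.iter())}
document_order = [
    node.get("name") for node in sorted(node_list, key=position.__getitem__)
]

print(list_order)
print(document_order)
```

The same set of nodes is visited in both cases; only the traversal order differs.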
XSH2 variables are either globally or lexically scoped. Global variables need not be declared (they can be directly assigned to), whereas lexical variables must be declared using the command my. A global variable assignment may also be made temporary for the enclosing block, using local.
$var1 = "foo";            # a global variable requires no declaration
local $var1 $var2 $var3;  # localizes global variables
$var1 = "bar";            # assignment to a localized variable is temporary
local $var4 = "foo";      # localized assignment
my $var1 $var $var3;      # declares lexical variables
my $var1 = "foo";         # lexical variable declaration with assignment
Lexical variables are only defined in the scope of the current block or subroutine. There is no way to refer to a lexical variable from outside of the block it was declared in, nor from within a nested subroutine call. Of course, lexical variables can be referred to from nested blocks or Perl expressions (where they behave just like Perl's lexical variables).
On the other hand, global or localized XSH2 variables are just Perl scalar variables belonging to the XML::XSH2::Map namespace, which is also the default namespace for any Perl code evaluated from XSH2 (so there's no need to use this prefix explicitly in Perl expressions, unless of course there is a lexical variable in the current scope with the same name).
Localizing a variable using the local
keyword makes all assignments to it occurring in the enclosing block
temporary. The variable itself remains global, only its
original value is restored at the end of the block that localized it.
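The save-and-restore behavior of local can be modeled in a few lines. The following Python sketch is only an analogy, not how XSH2 implements it: the dictionary variables stands in for the table of global variables, and the context manager localized is a name invented for this illustration.

```python
from contextlib import contextmanager

# Stand-in for the table of global variables.
variables = {"var1": "foo"}

_MISSING = object()


@contextmanager
def localized(name):
    """Remember the value of a global and restore it when the block ends."""
    saved = variables.get(name, _MISSING)
    try:
        yield
    finally:
        if saved is _MISSING:
            variables.pop(name, None)  # the variable did not exist before
        else:
            variables[name] = saved


with localized("var1"):
    variables["var1"] = "bar"  # the assignment is visible inside the block
    inside = variables["var1"]

after = variables["var1"]  # the original value is back
print(inside, after)
```

The variable stays global throughout; only the assignments made inside the block are undone when the block ends.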
In all above cases, it is possible to arbitrarily intermix XSH2 and Perl assignments:
xsh> ls //chapter[1]/title
<title>Introduction</title>
xsh> $a=string(//chapter[1]/title)
xsh> perl { $b="CHAPTER 1: ".uc($a); }
xsh> print $b
CHAPTER 1: INTRODUCTION
Although all XSH2 variables are in fact Perl scalars, it is still possible to store a Perl array or hash in an XSH2 variable via a reference. The following example demonstrates using Perl hashes to collect and print some simple racial statistics about the population of Middle-Earth:
my $races;
foreach a:/middle-earth/creature {
my $race=string(@race);
perl { $races->{$race}++ };
}
print "Middle-Earth Population (race/number of creatures)";
print { map "$_/$races->{$_}\n", keys(%$races); };
variable assignment
specifying documents
expression argument type
temporarily assign new value to a variable
name of a sub-routine
XPath expression
WARNING: XSH2 redirection syntax is not yet finished. It is currently the same as in XSH1 but this may be drastically changed in the future releases.
Output redirection can be used to pipe output of some XSH command to some external program, or to capture it to a variable. Redirection of output of more than one XSH command can be achieved using the do command.
The syntax for redirecting the output of an XSH command to an external program is xsh-command | shell-command ;, where xsh-command is any XSH2 command and shell-command is any command (or code) recognized by the default shell interpreter of the operating system (i.e. on UNIX systems by /bin/sh or /bin/csh, on Windows systems by cmd). The shell command may contain further redirections (as supported by the system shell interpreter), but should not contain semicolons, except when the whole shell command is enclosed in brackets.
Example 5. Use well-known UNIX commands to filter XPath-based XML listing from a document and count the results
xsh> ls //something/* | grep foo | wc
The commands listed below can be used to modify the default behavior of the XML parser or XSH2 itself. Some of the commands switch between two different modes according to a given expression (which is expected to result either in zero or non-zero value). Other commands also working as a flip-flop have their own explicit counterparts (e.g. verbose and quiet or debug and nodebug). This inconsistency is due to historical reasons.
The encoding and query-encoding settings allow you to specify the character encodings of the user's input and XSH2's own output. This is particularly useful when you work with UTF-8 encoded documents on a console which only supports 8-bit characters.
The settings command displays current settings by means of XSH2 commands. Thus it can not only be used to review current values, but also to store them for future use, e.g. in ~/.xshrc file.
xsh> settings | cat > ~/.xshrc
turn on backup file creation
set on/off changing current document to newly open/created files
turn on/off parser's ability to fill default attribute values
display many annoying debugging messages
turn on/off serialization of empty tags
character encoding (codepage) identifier
choose output charset
turn on/off pretty-printing
turn on/off ignorable whitespace preservation
turn on/off external DTD fetching
turn off backup file creation
turn off debugging messages
list current settings using XSH2 syntax
turn on/off parser's tendency to expand entities
turn on/off transparent XInclude insertion by parser
make the parser more pedantic
declare the charset of XSH2 source files and terminal input
turn off many XSH2 messages
turn on/off parser's ability to fix broken XML
define XPath extension function (EXPERIMENTAL)
register namespace prefix to use XPath expressions
register a prefix for the XHTML namespace
register a prefix for the XSH2 namespace
switch into normal execution mode (quit test-mode)
turn on/off serialization of DTD DOCTYPE declaration
do not execute any command, only check the syntax
undefine extension function (EXPERIMENTAL)
unregister namespace prefix
turn on/off validation in XML parser
make XSH2 print many messages
sets TAB completion for axes in xpath expressions in the interactive mode
turn on/off TAB completion for xpath expressions in the interactive mode
map predefined XSH2 XPath extension functions to no or other namespace
Along with XPath, Perl is one of the two XSH2 expression languages, and lends XSH2 its great expressive power. Perl is a language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. It has built-in regular expressions and powerful yet easy to learn data structures (scalars, arrays, hash tables). It's also a good language for many system management tasks. XSH2 itself is written in Perl (except for the XML engine, which uses the libxml2 library written in C by Daniel Veillard).
Perl expressions or blocks of code can be used as arguments to any XSH2 command. One of them is the perl command, which simply evaluates the given Perl block. Other commands, such as map, even require a Perl expression argument and allow you to quickly change DOM node content. Perl expressions may also provide lists of strings to iterate over with a foreach loop, or serve as conditions for if, unless, and while statements.
To prevent conflict between XSH2 internals and the evaluated
Perl code, XSH2 runs such code in the context of a special
namespace XML::XSH2::Map. As described in
the section Variables, XSH2 string
variables may be accessed and possibly assigned from Perl
code in the most obvious way, since they actually
are Perl variables defined in the
XML::XSH2::Map namespace.
The interaction between XSH2 and Perl actually works the
other way round as well, so that you may call back XSH2 from the
evaluated Perl code. For this, Perl function
xsh is defined in the
XML::XSH2::Map namespace. All parameters
passed to this function are interpreted as XSH2 commands.
Moreover, the following Perl helper functions are defined:
xsh(string,....) - evaluates
given string(s) as XSH2 commands.
call(name) - call a given
XSH2 subroutine.
count(string) - evaluates
given string as an XPath expression and returns
either literal value of the result (in case of
boolean, string and float result type) or
number of nodes in a returned node-set.
literal(string|object) -
if passed a string, evaluates it as a XSH2 expression
and returns the literal value of the result;
if passed an object, returns literal value of
the object.
For example,
literal('$doc/expression') returns the same
value as count('string($doc/expression)').
serialize(string|object) -
if passed a string, it first evaluates the string
as an XSH2 expression to obtain a node-list object.
Then it serializes the object into XML.
The resulting string is equal to the output of the XSH2 command ls applied to the same expression or object,
only without indentation and folding.
type(string|object) -
if passed a string, it first evaluates
the string as XSH2 expression to obtain a node-list object.
It returns a list of strings representing
the types of nodes in the node-list
(ordered in the canonical document order).
The returned type strings are:
element,
attribute,
text,
cdata,
pi,
entity_reference,
document,
chunk,
comment,
namespace,
unknown.
nodelist(string|object,...) -
converts its arguments to objects if necessary
and returns a node-list consisting of the objects.
xpath(string, node?) -
evaluates a given string as an XPath expression
in the context of a given node and returns
the result.
echo(string,...) - prints
given strings on XSH2 output.
Note, that in the interactive mode,
XSH2 redirects all output to a specific terminal file handle
stored in the variable $OUT.
So, if you for example mean to pipe the result
to a shell command, you should avoid using STDOUT filehandle
directly. You may either use the usual print
without a filehandle,
use the echo function,
or use $OUT as a filehandle.
In the following examples we use Perl to populate the
Middle-Earth with Hobbits whose names are read from a text
file called hobbits.txt, unless there are
some Hobbits in Middle-Earth already.
Example 7. Use Perl to read text files
unless (//creature[@race='hobbit']) {
  perl {
    open my $fh, "hobbits.txt";
    chomp(@hobbits=<$fh>);   # read all names, stripping trailing newlines
    close $fh;
  }
  foreach { @hobbits } {
    copy xsh:new-element("creature","name",.,"race","hobbit")
      into m:/middle-earth/creatures;
  }
}
Example 8. The same code as a single Perl block
perl {
  # count() takes the XPath expression as a string
  unless (count(q{//creature[@race='hobbit']})) {
    open my $file, "hobbits.txt";
    foreach (<$file>) {
      chomp;   # strip the trailing newline from the name
      xsh(qq{insert element "<creature name='$_' race='hobbit'>"
        into m:/middle-earth/creatures});
    }
    close $file;
  }
};
XSH2 allows users to extend the set of XPath functions by
providing extension functions written in Perl. This can
be achieved using the register-function
command. The perl code implementing an extension function
works as a usual perl routine accepting its arguments in
@_ and returning the result. The
following conventions are used:
The arguments passed to the perl implementation by the XPath
engine are simple scalars for string, boolean and float
argument types and
XML::LibXML::NodeList objects for node-set
argument types. The implementation is
responsible for checking the argument number and types. The
implementation may use general Perl functions as well as
XML::LibXML
methods to process the arguments and return the result.
Documentation for the XML::LibXML Perl module
can be found for example at http://search.cpan.org/~pajas/XML-LibXML/.
Extension functions SHOULD NOT MODIFY the document DOM tree. Doing so could not only confuse the XPath engine but possibly even result in a critical error (such as a segmentation fault). Calling XSH2 commands from extension function implementations is also dangerous and isn't generally recommended.
The extension function implementation must return
a single value, which can be of
one of the following types: simple scalar (a number or
string), XML::LibXML::Boolean object
reference (result is a boolean value),
XML::LibXML::Literal object reference
(result is a string), XML::LibXML::Number
object reference (result is a float),
XML::LibXML::Node (or derived) object
reference (result is a node-set consisting of a single node),
or XML::LibXML::NodeList (result is a
node-set). For convenience, simple (non-blessed) array
references consisting of
XML::LibXML::Node objects can also be
used for a node-set result instead of an
XML::LibXML::NodeList.
In the interactive mode, XSH2 interprets all lines starting
with the exclamation mark (!) as shell
commands and invokes the system shell to interpret the line
(this is to mimic FTP and similar command-line interpreters).
xsh> !ls -l
-rw-rw-r-- 1 pajas pajas 6355 Mar 14 17:08 Artistic
drwxrwxr-x 2 pajas users 128 Sep 1 10:09 CVS
-rw-r--r-- 1 pajas pajas 14859 Aug 26 15:19 ChangeLog
-rw-r--r-- 1 pajas pajas 2220 Mar 14 17:03 INSTALL
-rw-r--r-- 1 pajas pajas 18009 Jul 15 17:35 LICENSE
-rw-rw-r-- 1 pajas pajas 417 May 9 15:16 MANIFEST
-rw-rw-r-- 1 pajas pajas 126 May 9 15:16 MANIFEST.SKIP
-rw-r--r-- 1 pajas pajas 20424 Sep 1 11:04 Makefile
-rw-r--r-- 1 pajas pajas 914 Aug 26 14:32 Makefile.PL
-rw-r--r-- 1 pajas pajas 1910 Mar 14 17:17 README
-rw-r--r-- 1 pajas pajas 438 Aug 27 13:51 TODO
drwxrwxr-x 5 pajas users 120 Jun 15 10:35 blib
drwxrwxr-x 3 pajas users 1160 Sep 1 10:09 examples
drwxrwxr-x 4 pajas users 96 Jun 15 10:35 lib
-rw-rw-r-- 1 pajas pajas 0 Sep 1 16:23 pm_to_blib
drwxrwxr-x 4 pajas users 584 Sep 1 21:18 src
drwxrwxr-x 3 pajas users 136 Sep 1 10:09 t
-rw-rw-r-- 1 pajas pajas 50 Jun 16 00:06 test
drwxrwxr-x 3 pajas users 496 Sep 1 20:18 tools
-rwxr-xr-x 1 pajas pajas 5104 Aug 30 17:08 xsh
To invoke a system shell command or program from the non-interactive mode or from a complex XSH2 construction, use the exec command.
Since UNIX shell commands are a very powerful tool for
processing textual data, XSH2 supports direct redirection of
the output of XSH2 commands to a system shell command. This is very
similar to the redirection known from UNIX shells, except
that here, of course, the first command in the pipeline
is an XSH2 command. Since the semicolon (;)
is used in XSH2 to separate commands, it has to be prefixed
with a backslash if it should be used for other purposes.
Example 9. Use grep and less to display context of `funny'
xsh> ls //chapter[5]/para | grep funny | less
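The escaping rule for semicolons can be modelled outside XSH2. The following Python sketch is not XSH2's actual parser: `split_commands` is a hypothetical helper that ignores quoting and only illustrates how a backslash-prefixed semicolon might stay inside one command while a bare semicolon ends it.

```python
def split_commands(line: str) -> list[str]:
    """Split a command line on unescaped semicolons (illustrative model only)."""
    commands: list[str] = []
    current: list[str] = []
    i = 0
    while i < len(line):
        ch = line[i]
        # A backslash directly followed by a semicolon yields a literal ';'
        if ch == "\\" and line[i + 1 : i + 2] == ";":
            current.append(";")
            i += 2
            continue
        if ch == ";":
            # A bare semicolon terminates the current command
            text = "".join(current).strip()
            if text:
                commands.append(text)
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        commands.append(tail)
    return commands
```

Under these assumptions, the escaped semicolon is handed on as part of the first command (and so would reach the system shell), while the bare one separates two commands.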
change system working directory
execute a shell command
expression argument type
index selected nodes by some key value
transform node value/data using Perl or XPath expression
in-line code in Perl programming language
evaluate in-line Perl code
quickly rename nodes with in-line Perl code
Like many other shells, XSH2 provides means for customizing
the format of its interactive shell prompt. The prompt is
displayed according to the content of the variable
$PROMPT on which the following
substitutions and interpolations are performed
(in this order):
1. Prompt-string replacements
%% - percent sign
%p - XPath location of the current node
%P - like %p but without an initial document variable
%l - XPath location of the current node with ID-shortcuts
%L - like %l but without an initial document variable
%n - name of the current node
%N - local name of the current node
%c - canonical XPath name of the current node
%y - type of the current node (element,attribute,...)
%i - ID of the current node
%d - current document variable
%h - the hostname up to the first '.'
%H - the hostname
%s - XSH shell name (basename of $0)
%t - the current time in 24-hour HH:MM:SS format
%T - the current time in 12-hour HH:MM:SS format
%@ - the current time in 12-hour am/pm format
%A - the current time in 24-hour HH:MM format
%u - the username of the current user
%v - the version of XSH2 (e.g., 2.1.0)
%V - the revision number of XML::XSH2::Functions (e.g. 2.40)
%w - current working directory (on the local filesystem)
%W - basename of %w
2. Variable, XPath and Perl interpolations
Substrings of the forms ${variable},
${{...perl...}} and
${(...xpath...)} are interpolated as in XSH2
expressions.
3. Special character substitution
\n - newline character
\r - carriage return character
\t - tab character
\a - bell character
\b - backspace character
\f - form feed character
\e - escape character (\033)
\\ - backslash character
\nnn - the character corresponding to the octal number nnn (useful for non-printable terminal control characters)
The default value of $PROMPT is "%p>".
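The ordering of the substitution phases can be illustrated with a small Python model. This is only a sketch under stated assumptions, not XSH2's implementation: `expand_prompt` and its `values` argument are hypothetical names, the values that XSH2 would compute for each %-code are passed in by the caller, and phase 2 (variable, XPath and Perl interpolation) is left out entirely.

```python
import re

# Phase 3 table: backslash escapes and the characters they stand for
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "a": "\a",
           "b": "\b", "f": "\f", "e": "\033", "\\": "\\"}

def expand_prompt(template: str, values: dict[str, str]) -> str:
    """Apply phase 1 (%-codes) and then phase 3 (backslash escapes)."""
    def percent(match: re.Match) -> str:
        code = match.group(1)
        if code == "%":
            return "%"
        # Unknown codes are left untouched in this model
        return values.get(code, match.group(0))

    out = re.sub(r"%(.)", percent, template)

    # Phase 2 would interpolate ${...} constructs here; omitted in this sketch.

    def backslash(match: re.Match) -> str:
        body = match.group(1)
        if body in ESCAPES:
            return ESCAPES[body]
        return chr(int(body, 8))  # \nnn: octal character code

    return re.sub(r"\\([0-7]{3}|[nrtabfe\\])", backslash, out)
```

For instance, with `values = {"p": "$doc/book"}` the default template `%p>` expands to `$doc/book>`, and a template such as `\033[1m%n\033[0m` wraps the node name in terminal control sequences because the octal escapes are resolved last.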
Note that you must escape ${...}
interpolators like \${...} if you
want them to be evaluated at each prompt
rather than at the time of the assignment to $PROMPT.
For example:
Example 11. Let `uname` be computed once, `date` at every prompt
$PROMPT="[${{ chomp($u=`uname`);$u }} \${{ chomp($d=`date`);$d }}] %p>"
This section briefly describes differences between XSH2 and previous XSH 1.x releases. The list should not be considered complete. Some syntax variations or amendments in the semantics of various commands may not be documented in this section, nor are the various improvements in the XSH interpreter.
In XSH2, subroutines can be called without the call command. They can be redefined and undefined. The call command can still be used, but its use only makes sense in indirect calls, where the subroutine's name is computed from an expression.
def foo $param1 $param2 {
# $param1 and $param2 are lexical (a.k.a. my)
ls $param1;
echo $param2
}
foo (//chapter)[1] (//chapter)[1]/title
def inc $param1 { return ($param1 + 1) }
$two := inc 1;
XSH2 uses variables of the form $variable for all kinds of objects, including node-sets (which, if evaluated as Perl expressions, preserve node order). Node-list variables of XSH 1.x have been deprecated.
$var = //foo/bar; # node set
$var = "hallo world"; # string
$var = xsh:new-element("foo"); # node object
$var = { ['a','b','c'] }; # Perl array reference
$var = {{ 'a'=>'A', 'b'=>'B' }}; # Perl hash reference
XSH2 allows variables to be used in XPath just as they are used in XSLT:
$var = //foo/bar; ls //baz[ . = $var[@test=1]/any ]
Variable interpolation is still available in XSH2 via ${var}, but its importance is diminished compared to XSH 1.x, because the XPath engine now evaluates variables directly. Interpolation can still be used for things like "XPath-macros":
$filter = "[ . = $var[@test=1]/any ]";
ls //baz${filter};
XSH2 equally supports XPath and Perl expressions (written in braces { ... }). Unfortunately, Perl expressions can't be embedded in XPath expressions, but one can still use variables as an agent:
perl { use MIME::Base64 };
my $encoded = { encode_base64('open sesame') };
ls //secret-cave[string(password) = $encoded];
We can, however, use Perl-only expressions complemented with auto-conversion to do things like:
copy { encode_base64('Pe do mellon a minno!') } replace //secret-cave/password/text();
Commands return values (see := assignment, or &{ } expressions).
$moved_paras := xmove //para replace .;
$chapter := wrap chapter $moved_paras;
ls $chapter;
# or just
ls &{ wrap chapter &{ xmove //para replace . } };
XSH2 deprecates "string" expressions of XSH 1.x. However, for convenience, some XSH2 commands interpret name-like XPath expressions on certain argument positions as strings (mostly commands that expect file-name or node-name arguments):
insert element my_document into .;
insert text "foo" into my_document;
$doc := open my_document;          # opens file named "my_document"
$doc := open "my_document";        # same
$doc := open (my_document);        # opens file named "foo"
$doc := open string(my_document);  # same
In XSH2, XML documents have no ID. They are referred to using variables (which fits in well with the unified variable concept):
$doc1 := open "foo1.xml";
$doc2 := open "foo2.xml";
ls ($doc1//para|$doc2//para);
cd $doc1;
ls id('intro'); # finds ID intro in the current document ($doc1)
ls xsh:id2($doc2, 'intro'); # finds ID intro in $doc2
XSH2 commands have options and flags instead of many optional (positional) arguments. Options/flags usually have both long forms (like --flag) and equivalent short forms (like :f) (colon is borrowed from Scheme, because dash is reserved for XPath minus).
$doc := open --format html "version1.html";
save --file "version2.xml" $doc;
ls --fold /;
ls :f /;
ls --depth 1 /;
ls :d 1 /;
# all the same:
$sorted = sort --key @name --locale --descending //user;
$sorted = sort :l:d:k@name //user;
$sorted = sort --key @name --compare { use locale; $b cmp $a } //user;
validate --relaxng --file "test.rng" $mydoc;
validate --public "-//OASIS//DTD DocBook XML V4.1.2//EN" $mydoc;
validate --yesno $mydoc;
Finally, eval is no longer
an alias for perl in XSH2,
but instead evaluates strings containing XSH2 commands
(so eval $string now practically works like old ugly
perl { xsh($string) }). See the documentation for
eval for a handy usage example
(no more PHP, XSLT and XPathScript :-)).
Example 12. Open command has changed.
XSH1: foo = file.xml; or foo = "file.xml";
XSH2:
$foo := open file.xml; # file.xml is a bareword in file-name context
or
$foo := open "file.xml"; # "file.xml" is an XPath string
or
$foo := open {"file.xml"}; # "file.xml" is a Perl string
or
$foo = xsh:open("file.xml"); # righthand side is an XPath extension function
Example 13. XSH2 commands have options
XSH1: open HTML FILE foo2 = "file.html";
XSH2: $foo2 := open --format html "file.html";
Example 14. documents
XSH1: foo = file.xml; ls foo:(//bar|//baz);
XSH2: $foo := open file.xml; ls ($foo//bar|$foo//baz);
Example 15. variable interpretation
XSH1:
$family = "Arial";
ls //font[@family="$family"]; # interpolation
or
ls //font[@family="${family}"]; # interpolation
XSH2:
$family = "Arial";
ls //font[@family=$family]; # evaluation by XPath engine
or
ls //font[@family="${family}"]; # interpolation
Example 16. adding new nodes
XSH1: insert attribute "foo=bar" into /scratch;
XSH2:
insert attribute "foo=bar" into /scratch;
or
copy xsh:new-attribute("foo","bar") into /scratch;
Example 17. foreach with perl expression
XSH1:
foreach { glob('*.xml') } {
open doc = $__;
...
}
XSH2:
foreach { glob('*.xml') } {
my $doc := open .;
...
}
Example 18. foreach (perl expression) with variable
XSH2:
foreach my $filename in { glob('*.xml') } {
my $doc := open $filename;
...
}
Example 19. sorting nodes
XSH1:
%list = //player;
sort @best_score { $a <=> $b } %list;
copy %list into .;
XSH2:
$list := sort --numeric --key @best_score //player;
copy { $list } into .;
or
copy &{ sort --numeric --key @best_score //player } into .;
or (using short options)
copy &{ sort :n :k @best_score //player } into .;
apropos [--fulltext] [--regexp] expression
Print all help topics containing given expression
in their short description. The
--fulltext flag forces
the search to be performed over the full text
of help.
--regexp indicates
that the given expression
is a regular expression instead of a literal string.
[assign] $variable = expression
[assign] $variable := command
[assign] $variable [-= | += | *= | /= | %= | x= | .= | ||= | &&= ] expression
[assign] $variable [-:= | +:= | *:= | /:= | %:= | x:= | .:= | ||:= | &&:= ] command
Evaluate the expression (= assignment) or command (:= assignment) on the right side of the assignment and store the result in a given variable. Optionally a Perl operator (- subtraction, + addition, * multiplication, / division, % modulo, x repeat string n-times, . concatenation, || logical OR, && logical AND) can precede the assignment, in which case the variable is assigned the result of applying given operator on its previous value and the value of the right side of the assignment.
Example 20. Assign XPath (node-set, string), or Perl results
xsh> $a=chapter/title;
xsh> $b="hallo world";
xsh> $c={ `uname` };
xsh> ls $a;
Example 21. Arithmetic expressions (XPath)
xsh> $a=5*100                  # assign 500 to $a
xsh> $a += 20                  # add 20 to $a
xsh> $a = (($a+5) div 10)
Example 22. Arithmetic expressions (Perl)
xsh> $a={ 5*100 }
xsh> $a = { join ';', split //,"hallo" }   # assigns "h;a;l;l;o" to $a
Example 23. Command result assignment
xsh> $doc := open "file.xml"               # open a document
xsh> $copies := xcopy //foo into //bar     # copy elements and store the copies
xsh> $wrappers := wrap "wrapper" $copies   # wrap each node from $copies to a new element "wrapper" and store the wrapping elements
Enable creating backup files on save (default).
This command is equivalent to setting the
$BACKUPS variable to 1.
call expression [expression ...]
Call a subroutine whose name is computed by evaluating the first argument expression. All other expressions are evaluated too and the results are passed to the subroutine as arguments.
This command should only be used if the name of the subroutine isn't known at compile time. Otherwise it is recommended to use a direct subroutine call of the form:
subroutine-name [argument1 argument2 ...]
def a $arg { echo "A says" $arg }
def b $arg { echo "B says" $arg }
a "hallo!"; # call subroutine a
b "hallo!"; # call subroutine b
call { chr(ord("a")+rand(2)) } "surprise!"; # call a or b randomly
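The difference between a direct call and an indirect call through a computed name can be modelled in Python as a lookup in a table of subroutines. The sketch below is an analogy only: `subs`, `define` and `call` are hypothetical names and do not reflect how XSH2 stores or dispatches subroutines internally.

```python
import random

subs = {}  # subroutine name -> implementation

def define(name):
    """Register a function under a subroutine name (stands in for def)."""
    def decorator(fn):
        subs[name] = fn
        return fn
    return decorator

@define("a")
def sub_a(arg):
    return f"A says {arg}"

@define("b")
def sub_b(arg):
    return f"B says {arg}"

def call(name_expr, *args):
    """Indirect call: evaluate the name first, then dispatch to that subroutine."""
    name = name_expr() if callable(name_expr) else name_expr
    return subs[name](*args)

# Direct calls know the name in advance
direct = sub_a("hallo!")

# Indirect call: the name is computed at run time, here randomly "a" or "b"
surprise = call(lambda: chr(ord("a") + random.randrange(2)), "surprise!")
```

The direct call corresponds to writing the subroutine name as a command, while `call` corresponds to the case where the name is only known once the first expression has been evaluated.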
canonical [--comments|:c] [--filter|:f xpath] [expression]
This command prints the canonical XML representation of the nodes specified by its argument (or of the current node, if no argument is given).
--comments or
:c removes comments
from the resulting XML.
--filter or :f
can be used to filter
the resulting XML so that it only contains
nodes explicitly included in the given node-set.
For details see "Canonical XML" or "Exclusive XML Canonicalization" W3C recommendations.
catalog filename
cd [expression]
Evaluate given expression to a node-list and change current context node to the first node in it.
change-ns-prefix expression [expression]
This command takes one or two arguments. The first argument is a new prefix and the second, optional, argument is an old namespace prefix. It changes the prefix of a namespace declaration of the context node to the new value. If no old prefix is given, the change applies to the declaration on the context node whose prefix equals the prefix of the context node; otherwise the command changes the declaration with the given old prefix.
The command throws an exception if the new prefix is already taken by some other declaration in the scope.
change-ns-uri expression [expression]
This command takes one or two arguments. The first argument is a new namespace URI and the second, optional, argument is a namespace prefix. It changes the URI value of a namespace declaration of the context node to the new value. If no prefix is given, the change applies to the declaration on the context node whose prefix equals the prefix of the context node; otherwise the change applies to the declaration with the given prefix.
$doc := clone document
Create and return a copy of a given document. Unless the switch-to-new-documents configuration flag is turned off, the root node of the new document becomes the current node.
Calling this command only makes sense if
either
switch-to-new-documents is set, or
if the result is assigned to a variable or
passed to another XSH2 command using the &{...}
syntax, since otherwise the newly
created copy of the document is automatically garbage-collected and
destroyed.
close [document]
Close a given document (or, if called with no argument, the current document) by trying to remove all references from XSH2 variables to nodes belonging to the document. If no references to the document are left, the garbage-collector destroys the DOM tree and frees the memory it occupied for later reuse (depending on architecture, this may or may not give the allocated memory back to the system).
copy [--respective|:r] expression location expression
$results := copy [--respect