For Gauche 0.9.15

12.61 sxml.tools - Manipulating SXML structure

Module: sxml.tools

This module is a port of Kirill Lisovsky’s sxml-tools, a collection of convenient procedures that work on SXML structures. The current version is derived from sxml-tools CVS revision 3.13.

The manual entry is mainly derived from the comments in the original source code.


12.61.1 SXML predicates

Function: sxml:empty-element? obj

{sxml.tools} A predicate which returns #t if the given element obj is empty. An empty element has no nested elements, text nodes, PIs, comments or entities, but it may contain attributes or a namespace-id. It is the SXML counterpart of an XML empty-element.

Function: sxml:shallow-normalized? obj

{sxml.tools} Returns #t if the given obj is a shallow-normalized SXML element. The element itself has to be normalized but its nested elements are not tested.

Function: sxml:normalized? obj

{sxml.tools} Returns #t if the given obj is a normalized SXML element. The element itself and all its nested elements have to be normalized.

Function: sxml:shallow-minimized? obj

{sxml.tools} Returns #t if the given obj is a shallow-minimized SXML element. The element itself has to be minimized but its nested elements are not tested.

Function: sxml:minimized? obj

{sxml.tools} Returns #t if the given obj is a minimized SXML element. The element itself and all its nested elements have to be minimized.
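As a minimal sketch of how these predicates behave, the following applies some of them to hand-written elements. The element literals and the names full and short are illustrative assumptions, not part of the module.

```scheme
(use sxml.tools)

;; Hypothetical sample elements, written by hand for illustration.
(define full  '(p (@) (@@) "text"))  ; empty attr-list and aux-list spelled out
(define short '(p "text"))           ; the same element with both lists omitted

(sxml:empty-element? '(br))                  ; ⇒ #t
(sxml:empty-element? '(br (@ (class "x"))))  ; ⇒ #t, attributes do not count
(sxml:empty-element? short)                  ; ⇒ #f, it has a text node

(sxml:normalized? full)   ; ⇒ #t
(sxml:minimized? full)    ; ⇒ #f, the empty lists could be dropped
(sxml:minimized? short)   ; ⇒ #t
```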


12.61.2 SXML accessors

Function: sxml:name obj

{sxml.tools} Returns the name of the given SXML node. It’s just an alias of car, but introduced for the sake of encapsulation.

Function: sxml:element-name obj

{sxml.tools} A version of sxml:name which returns #f if the given obj is not an SXML element. Otherwise returns its name.

Function: sxml:node-name obj

{sxml.tools} Safe version of sxml:name, which returns #f if the given obj is not an SXML node. Otherwise returns its name.

Function: sxml:ncname obj

{sxml.tools} Returns the Local Part of the Qualified Name (Namespaces in XML production [6]) for the given obj, which is the ":"-separated suffix of its Qualified Name. If the name of the given node is an NCName (Namespaces in XML production [4]), then it is returned as is. Please note that while an SXML name is a symbol, this function returns a string.

Function: sxml:name->ns-id sxml-name

{sxml.tools} Returns the namespace-id part of the given name, or #f if it’s a LocalName.
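The name accessors differ mainly in what they accept and in the type they return. A small sketch, assuming a hand-written element bound to the hypothetical name rect:

```scheme
(use sxml.tools)

;; Hypothetical element with a qualified name.
(define rect '(svg:rect (@ (width "10"))))

(sxml:name rect)                       ; ⇒ svg:rect (a symbol)
(sxml:ncname rect)                     ; ⇒ "rect" (a string)
(sxml:ncname '(rect))                  ; ⇒ "rect", an NCName is returned as is

(sxml:element-name rect)               ; ⇒ svg:rect
(sxml:element-name "plain text")       ; ⇒ #f, a string is not an element
(sxml:element-name '(@ (width "10")))  ; ⇒ #f, an attr-list is not an element
(sxml:node-name '(@ (width "10")))     ; ⇒ @, but it is still a node
```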

Function: sxml:content obj

{sxml.tools} Returns the content of the given SXML element or nodeset (just text and element nodes), representing it as a list of strings and nested elements in document order. This list is empty if obj is an empty element or an empty list.

Function: sxml:content-raw obj

{sxml.tools} Returns all the content of a normalized SXML element except the attr-list and aux-list. Thus it includes PI, COMMENT and ENTITY nodes as well as the TEXT and ELEMENT nodes returned by sxml:content. Returns a list of nodes in document order, or an empty list if obj is an empty element or an empty list. This function is faster than sxml:content.
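The difference between the two accessors shows up when an element contains a comment node. A sketch, assuming a hand-written normalized element bound to the hypothetical name para:

```scheme
(use sxml.tools)

;; Hypothetical normalized element that contains a comment node.
(define para
  '(p (@ (class "intro")) (@@)
      "Hello, " (b "world") (*COMMENT* "draft")))

(sxml:content para)
;; ⇒ ("Hello, " (b "world"))

(sxml:content-raw para)
;; ⇒ ("Hello, " (b "world") (*COMMENT* "draft"))

(sxml:content '(br))
;; ⇒ ()
```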

In SXML normal form, an element is represented by a list like this:

  (name attr-list aux-list content …)

where attr-list is a list beginning with @, and aux-list is a list beginning with @@.

In the minimized form, the aux-list can be omitted when it is empty. The attr-list can be omitted when it is empty and the aux-list is absent.

The following procedures extract attr-list and aux-list.

Function: sxml:attr-list-node obj

{sxml.tools} Returns attr-list for a given obj, or #f if it is absent.

Function: sxml:attr-as-list obj

{sxml.tools} Returns attr-list wrapped in a list, or '((@)) if it is absent and aux-list is present, or '() if both lists are absent.

Function: sxml:aux-list-node obj

{sxml.tools} Returns aux-list for a given obj, or #f if it is absent.

Function: sxml:aux-as-list obj

{sxml.tools} Returns aux-list wrapped in a list, or '() if it is absent.
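A short sketch of the four procedures above, assuming three hand-written elements (the names with-attrs, with-aux and bare are hypothetical). The "as-list" variants return a list in every case, which makes them convenient to splice with ,@ when rebuilding an element.

```scheme
(use sxml.tools)

;; Hypothetical sample elements.
(define with-attrs '(a (@ (href "index.html")) "Home"))
(define with-aux   '(a (@) (@@ (*NAMESPACES*)) "Home"))
(define bare       '(a "Home"))

(sxml:attr-list-node with-attrs)  ; ⇒ (@ (href "index.html"))
(sxml:attr-list-node bare)        ; ⇒ #f
(sxml:attr-as-list with-attrs)    ; ⇒ ((@ (href "index.html")))
(sxml:attr-as-list bare)          ; ⇒ ()

(sxml:aux-list-node with-aux)     ; ⇒ (@@ (*NAMESPACES*))
(sxml:aux-list-node with-attrs)   ; ⇒ #f
(sxml:aux-as-list with-aux)       ; ⇒ ((@@ (*NAMESPACES*)))
(sxml:aux-as-list bare)           ; ⇒ ()
```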

Function: sxml:attr-list-u obj

{sxml.tools} Returns the list of attributes for given element or nodeset. Analog of ((sxpath '(@ *)) obj). Empty list is returned if there is no list of attributes.

The -u suffix indicates that it can be used on a non-normalized SXML node (’u’ stands for ’universal’).

Function: sxml:aux-list obj

{sxml.tools} Returns the list of auxiliary nodes for given element or nodeset. Analog of ((sxpath '(@@ *)) obj). Empty list is returned if a list of auxiliary nodes is absent.

Function: sxml:aux-list-u obj

{sxml.tools} Returns the list of auxiliary nodes for given element or nodeset. Analog of ((sxpath '(@@ *)) obj). Empty list is returned if a list of auxiliary nodes is absent.

The -u suffix indicates that it can be used on a non-normalized SXML node (’u’ stands for ’universal’).

Function: sxml:aux-node obj aux-name

{sxml.tools} Returns the first aux-node named aux-name in SXML element obj, or #f if such a node is absent. Note: it returns just the first node found even if multiple nodes are present, so it’s mostly intended for nodes with unique names.

Function: sxml:aux-nodes obj aux-name

{sxml.tools} Returns a list of the aux-nodes named aux-name in SXML element obj, or '() if no such node is present.

Function: sxml:attr obj attr-name

{sxml.tools} Accessor for the attribute attr-name of the given SXML element obj. It returns the value of the attribute if the attribute is present, or #f if there is no such attribute in the given element.

Function: sxml:num-attr obj attr-name

{sxml.tools} Accessor for the numerical attribute attr-name of the given SXML element obj. It returns the value of the attribute as a number if the attribute is present and its value can be converted to a number using string->number, or #f if there is no such attribute in the given element or its value can’t be converted to a number.

Function: sxml:attr-u obj attr-name

{sxml.tools} Accessor for the attribute attr-name of the given SXML element obj, which may also be an attributes-list or a nodeset (usually the content of an SXML element).

It returns the value of the attribute if the attribute is present, or #f if there is no such attribute in the given element.

The -u suffix indicates that it can be used on a non-normalized SXML node (’u’ stands for ’universal’).
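The attribute accessors can be sketched as follows. The element bound to the hypothetical name img is an illustrative assumption.

```scheme
(use sxml.tools)

;; Hypothetical element with string-valued and numeric-looking attributes.
(define img '(img (@ (src "logo.png") (width "640") (alt "Logo"))))

(sxml:attr img 'src)         ; ⇒ "logo.png"
(sxml:attr img 'height)      ; ⇒ #f, no such attribute

(sxml:num-attr img 'width)   ; ⇒ 640
(sxml:num-attr img 'alt)     ; ⇒ #f, "Logo" is not a number
(sxml:num-attr img 'height)  ; ⇒ #f, no such attribute

;; sxml:attr-u also accepts an attributes-list directly.
(sxml:attr-u '(@ (src "logo.png")) 'src)  ; ⇒ "logo.png"
```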

Function: sxml:ns-list obj

{sxml.tools} Returns the list of namespaces for the given element. Analog of ((sxpath '(@@ *NAMESPACES* *)) obj). Empty list is returned if there is no list of namespaces.

Function: sxml:ns-id->nodes obj namespace-id

{sxml.tools} Returns the list of namespace-assoc’s for given namespace-id in SXML element obj. Analog of ((sxpath '(@@ *NAMESPACES* namespace-id)) obj). Empty list is returned if there is no namespace-assoc with namespace-id given.

Function: sxml:ns-id->uri obj namespace-id

{sxml.tools} Returns a URI for namespace-id given, or #f if there is no namespace-assoc with namespace-id given.

Function: sxml:ns-uri->id obj uri

{sxml.tools} Returns a namespace-id for namespace URI given.

Function: sxml:ns-id ns-assoc

{sxml.tools} Returns namespace-id for given namespace-assoc list.

Function: sxml:ns-uri ns-assoc

{sxml.tools} Returns URI for given namespace-assoc list.

Function: sxml:ns-prefix ns-assoc

{sxml.tools} Returns the namespace prefix for the given namespace-assoc list. The original prefix (as in the XML document) for the given namespace-id has to be stored as the third element in the namespace-assoc list if it is different from namespace-id. If the original prefix is omitted in namespace-assoc then namespace-id is used instead.
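The namespace accessors read the *NAMESPACES* node in the aux-list. The sketch below assumes a hand-written element (the hypothetical rect) whose namespace-id is the symbol svg; the names rect and ns-assoc are introduced only for illustration.

```scheme
(use sxml.tools)

;; Hypothetical element carrying one namespace association in its aux-list.
(define rect
  '(svg:rect (@ (width "10"))
             (@@ (*NAMESPACES* (svg "http://www.w3.org/2000/svg")))))

(sxml:ns-list rect)             ; ⇒ ((svg "http://www.w3.org/2000/svg"))
(sxml:ns-id->uri rect 'svg)     ; ⇒ "http://www.w3.org/2000/svg"
(sxml:ns-id->uri rect 'xlink)   ; ⇒ #f, no such namespace-id
(sxml:ns-uri->id rect "http://www.w3.org/2000/svg")  ; ⇒ svg

;; Accessors on a single namespace-assoc list.
(define ns-assoc (car (sxml:ns-list rect)))
(sxml:ns-id ns-assoc)       ; ⇒ svg
(sxml:ns-uri ns-assoc)      ; ⇒ "http://www.w3.org/2000/svg"
(sxml:ns-prefix ns-assoc)   ; ⇒ svg, no original prefix is stored
(sxml:ns-prefix '(svg "http://www.w3.org/2000/svg" s))  ; ⇒ s
```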


12.61.3 SXML modifiers

Constructors and mutators for normalized SXML data. These functions are optimized for normalized SXML data. They are not applicable to arbitrary non-normalized SXML data.

Most of the functions are provided in two variants:

  1. Side-effecting functions for linear update of the given elements. Their names end with an exclamation mark. Note that the return value of this variant is unspecified, unless explicitly noted. An example: sxml:change-content!.
  2. Pure functions without side effects, which return modified elements. An example: sxml:change-content.

Function: sxml:change-content obj new-content
Function: sxml:change-content! obj new-content

{sxml.tools} Change the content of given SXML element to new-content. If new-content is an empty list then the obj is transformed to an empty element. The resulting SXML element is normalized.

Function: sxml:change-attrlist obj new-attrlist
Function: sxml:change-attrlist! obj new-attrlist

{sxml.tools} Change the attribute list of the given SXML element to new-attrlist. The resulting SXML element is normalized. If new-attrlist is empty, the cadr of obj is (@).

Function: sxml:change-name obj new-name
Function: sxml:change-name! obj new-name

{sxml.tools} Change the name of the SXML element obj to new-name. sxml:change-name! does so destructively.

Function: sxml:add-attr obj attr

{sxml.tools} Returns SXML element obj with attribute attr added, or #f if the attribute with given name already exists. attr is (attr-name attr-value). Pure functional counterpart to sxml:add-attr!.

Function: sxml:add-attr! obj attr

{sxml.tools} Add an attribute attr for an element obj. Returns #f if the attribute with given name already exists. The resulting SXML node is normalized. Linear update counterpart to sxml:add-attr.

Function: sxml:change-attr obj attr

{sxml.tools} Returns SXML element obj with the value of attribute attr changed, or #f if there is no attribute with the given name. attr is (attr-name attr-value).

Function: sxml:change-attr! obj attr

{sxml.tools} Change the value of the attribute for element obj. attr is (attr-name attr-value). Returns #f if there is no such attribute.

Function: sxml:set-attr obj attr
Function: sxml:set-attr! obj attr

{sxml.tools} Set attribute attr of element obj. If there is no such attribute the new one is added.
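The following sketch exercises the pure variants and one linear-update variant. The element literals and the names link, relabeled, titled, moved and fresh are illustrative assumptions. The in-place example builds its element with list rather than a quoted literal, since mutating literal data is not something to rely on.

```scheme
(use sxml.tools)

;; Hypothetical element used with the pure (non-destructive) variants.
(define link '(a (@ (href "index.html")) "Home"))

(define relabeled (sxml:change-content link '("Start " (b "here"))))
;; relabeled ⇒ (a (@ (href "index.html")) "Start " (b "here"))

(define titled (sxml:add-attr link '(title "Front page")))
(sxml:attr titled 'title)                  ; ⇒ "Front page"
(sxml:add-attr link '(href "other.html"))  ; ⇒ #f, href already exists
(sxml:change-attr link '(title "x"))       ; ⇒ #f, there is no title to change

(define moved (sxml:set-attr link '(href "other.html")))
(sxml:attr moved 'href)                    ; ⇒ "other.html"
(sxml:attr link 'href)                     ; ⇒ "index.html", link is untouched

;; Linear-update variant, applied to freshly allocated data.
(define fresh (list 'b "text"))
(sxml:change-name! fresh 'strong)
;; fresh ⇒ (strong "text")
```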

Function: sxml:add-aux obj aux-node

{sxml.tools} Returns SXML element obj with an auxiliary node aux-node added.

Function: sxml:add-aux! obj aux-node

{sxml.tools} Add an auxiliary node aux-node for an element obj.

Function: sxml:squeeze obj
Function: sxml:squeeze! obj

{sxml.tools} Eliminates empty lists of attributes and aux-lists for given SXML element obj and its descendants ("minimize" it). Returns a minimized and normalized SXML element.

Function: sxml:clean obj

{sxml.tools} Eliminates empty lists of attributes and all aux-lists for given SXML element obj and its descendants. Returns a minimized and normalized SXML element.
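The difference between sxml:squeeze and sxml:clean is that the former keeps non-empty aux-lists while the latter drops every aux-list. A sketch, assuming two hand-written elements bound to the hypothetical names verbose and annotated:

```scheme
(use sxml.tools)

;; Hypothetical element with empty attr-lists and aux-lists at two levels.
(define verbose '(p (@) (@@) "text " (b (@) (@@) "bold")))

(sxml:squeeze verbose)
;; ⇒ (p "text " (b "bold"))

;; Hypothetical element with a non-empty attr-list and aux-list.
(define annotated
  '(p (@ (class "x"))
      (@@ (*NAMESPACES* (svg "http://www.w3.org/2000/svg")))
      "text"))

(sxml:squeeze annotated)
;; ⇒ same structure as annotated, since neither list is empty

(sxml:clean annotated)
;; ⇒ (p (@ (class "x")) "text")
```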


12.61.4 SXPath auxiliary utilities

These are convenience utilities to extend SXPath functionalities.

Function: sxml:add-parents obj . top-ptr

{sxml.tools} Returns an SXML nodeset with a ’parent pointer’ added. A parent pointer is an aux node of the form (*PARENT* thunk), where thunk returns the parent element.

Function: sxml:node-parent rootnode

{sxml.tools} Returns a fast ’node-parent’ function, i.e. a function of one argument (an SXML element) which returns its parent node using the *PARENT* pointer in the aux-list. '*TOP-PTR* may be used as a pointer to the root node. It returns an empty list when applied to the root node.

Function: sxml:lookup id index

{sxml.tools} Lookup an element using its ID.


12.61.5 SXML to markup conversion

Procedures to generate XML or HTML marked up text from SXML. For more advanced conversion, see the SXML serializer (sxml.serializer - Serializing XML and HTML from SXML).

Function: sxml:clean-feed . fragments

{sxml.tools} Filter the ’fragments’. The fragments are a list of strings, characters, numbers, thunks, #f, and other fragments. The function traverses the tree depth-first, and returns a list of strings, characters and executed thunks, and ignores #f and '().

If all the meaningful fragments are strings, then (apply string-append ... ) to a result of this function will return its string-value.

It may be considered as a variant of Oleg Kiselyov’s SRV:send-reply: While SRV:send-reply displays fragments, this function returns the list of meaningful fragments and filter out the garbage.
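A minimal sketch of the flattening behaviour, using only strings, #f and empty lists as fragments. The fragment values and the name cleaned are made up for illustration.

```scheme
(use sxml.tools)

;; Nested fragments with some "garbage" (#f and an empty list) mixed in.
(define cleaned
  (sxml:clean-feed "<p>" #f '("Hello" () (", " "world")) "</p>"))
;; cleaned ⇒ ("<p>" "Hello" ", " "world" "</p>")

;; All meaningful fragments are strings, so they can be concatenated.
(apply string-append cleaned)
;; ⇒ "<p>Hello, world</p>"
```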

Function: sxml:attr->xml attr :optional namespace-prefix-assig
Function: sxml:attr->html attr

{sxml.tools} Creates the XML/HTML markup for attributes. Return a tree of strings, which can be passed to tree->string or write-tree (see text.tree - Lazy text construction).

The argument attr is a two-element list, (attr-name attr-value), where attr-name is a symbol. Attr-value can be any value, and is converted to a string with x->string.

The optional namespace-prefix-assig argument is used to replace the namespace prefix in the attribute name with its alias. See sxml:sxml->xml below for the details.

There’s a subtle difference between the two. If the value is an empty string, sxml:attr->html omits the attribute value, while sxml:attr->xml doesn’t.

(use text.tree)

(tree->string (sxml:attr->xml '(name foo)))
 ⇒ " name=\"foo\""

(tree->string (sxml:attr->html '(value "He said, \"Oops.\"")))
 ⇒ " value=\"He said, &quot;Oops.&quot;\""

(tree->string (sxml:attr->xml '(download "")))
 ⇒ " download=\"\""

(tree->string (sxml:attr->html '(download "")))
 ⇒ " download"

Note: This procedure is slightly modified from the original sxml-tools to escape special characters in the attribute value, and to recognize namespace prefixes.

Function: sxml:string->xml string
Function: sxml:string->html string

{sxml.tools} Return a tree of strings where characters <, >, &, and " in string are replaced by character entity references. With sxml:string->xml, ' is also replaced.

The returned string tree can be passed to tree->string or write-tree (see text.tree - Lazy text construction).

See also html-escape-string in text.html-lite - Simple HTML document construction.
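A short sketch of the escaping described above, converting the returned tree to a string with tree->string. The input strings are made up for illustration.

```scheme
(use sxml.tools)
(use text.tree)

(tree->string (sxml:string->xml "a < b & c"))
;; ⇒ "a &lt; b &amp; c"

;; The apostrophe is only escaped by the XML variant.
(tree->string (sxml:string->xml "it's"))
;; ⇒ "it&apos;s"
(tree->string (sxml:string->html "it's"))
;; ⇒ "it's"
```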

Function: sxml:sxml->xml sxml :optional namespace-prefix-assig
Function: sxml:sxml->html sxml

{sxml.tools} Converts SXML to XML and HTML, respectively. The return value is a tree of strings to be passed to tree->string or write-tree (see text.tree - Lazy text construction).

sxml:sxml->xml can take an optional namespace-prefix-assig, which is the form ((alias-symbol . namespace-string) …). It is the same format as the namespace-prefix-assig argument of ssax:xml->sxml (see SSAX Highest-level parsers - XML to SXML). It has two effects:

  • It adds xmlns attributes to the top node of sxml.
  • If a node name or an attribute name has a prefix string that matches namespace-string, it is replaced with alias-symbol. (If its prefix string does not match any of namespace-prefix-assig, the prefix is left as is.)

Note that sxml:sxml->html does not take namespace-prefix-assig, for HTML doesn’t have namespace prefix.

Other differences between XML and HTML version are:

  • Handling of empty-valued attributes (see sxml:attr->xml/sxml:attr->html above.)
  • Characters replaced with character entity references (see sxml:string->xml/sxml:string->html above.)
  • Non-terminated tags. In HTML, some tags do not have end tags. In XML, elements with empty content are emitted as self-terminating tags.

(tree->string (sxml:sxml->xml '(div (@ (foo "") (bar "'")))))
  ⇒ "<div foo=\"\" bar=\"&apos;\"/>"
(tree->string (sxml:sxml->html '(div (@ (foo "") (bar "'")))))
  ⇒ "<div foo bar=\"'\"/>"

(tree->string (sxml:sxml->xml '(img (@ (src "foo.png")))))
  ⇒ "<img src=\"foo.png\"/>"
(tree->string (sxml:sxml->html '(img (@ (src "foo.png")))))
  ⇒ "<img src=\"foo.png\">"

Function: sxml:non-terminated-html-tag? tag

{sxml.tools} This predicate yields #t for "non-terminated" HTML 4.0 tags.
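One way to use the predicate is as a condition when deciding how to close an empty HTML element by hand. The helper empty-element-close below is hypothetical, written only for this sketch, and is not part of sxml.tools. It treats the result of the predicate as a generalized boolean.

```scheme
(use sxml.tools)

;; Hypothetical helper: choose the closing markup for an empty HTML element.
(define (empty-element-close tag)
  (if (sxml:non-terminated-html-tag? tag)
      ">"                                              ; e.g. <br>
      (string-append "></" (symbol->string tag) ">"))) ; e.g. <div></div>

(empty-element-close 'br)   ; ⇒ ">"
(empty-element-close 'div)  ; ⇒ "></div>"
```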

