!DOCTYPE = html
html {
head {
meta(http-equiv = content-type
content = 'text/html; charset=UTF-8')
title = libxmq
link(type = text/css
href = resources/style.css
rel = stylesheet)
link(type = text/css
href = resources/xmq.css
rel = stylesheet)
script(type = text/javascript
src = resources/code.js) = ''
meta(name = viewport
content = width=device-width
initial-scale = 1.0)
}
body(onload = 'checkBackgroundSetting();')
{
h1 = 'XMQ - a new language for xml/html (+json)'
i = ( 'by Fredrik Öhrström (last updated '
&DATE;
') oehrstroem@gmail.com' )
br
form(method = post
enctype = multipart/form-data
action = view)
{
button(type = button
title = 'Set functional cookie background=dark'
onClick = 'goDark();') = 'Dark Mode'
' '
button(type = button
title = 'Set functional cookie background=light'
onClick = 'goLight();') = 'Light Mode'
' '
input(name = uploadFile
id = uploadFile
type = file
onchange = 'form.submit()'
hidden)
button(type = button
title = 'Max 200KB, logs cleared after 30 days.'
onclick = '''document.getElementById('uploadFile').click()''') = 'Upload XML/HTML/JSON'
}
br
p = 'Sometimes it seems like new developers are gravitating towards JSON and other
key=value based languages for data storage, leaving XML behind. This is somewhat
understandable since the visual appearance of XML (when used for data storage)
can be verbose, to say the least. The discombobulated handling of whitespace does not help either.
It would be sad to see XML fade away simply because of its visual appearance.'
p = 'But what if there is a way to render XML as a key value language and fix the whitespace
at the same time? This would be the same XML as before, just view/edited in a different way.
XMQ is such a language, it is XML just with Q instead of L.'
p = 'The XMQ format is easier for a human to read and write than XML/HTML yet it
captures exactly the XML/HTML content. It can always be safely pretty printed
without introducing significant whitespace. The file-suffix HTMQ is used
when working with HTML in the XMQ format. There is even a reasonable mapping
between JSON and XMQ.'
p = 'We can use XML for markup of texts. We can use XMQ for data storage.
We can always convert between them since they are in fact the same thing.'
div {
'What does it look like? Click on the examples: '
br
a(href = resources/index_htmq.html) = '[this page as htmq]'
' '
a(href = resources/pom_example.html) = [pom.xml]
' '
a(href = resources/rss_example.html) = [rss.xml]
' '
a(href = resources/svg_example.html) = [svg]
' '
a(href = resources/ibmpcjr_cart.html) = '[mame cart driver]'
' '
a(href = resources/android_layout_main.html) = '[Android layout]'
' '
a(href = resources/jabber.html) = [jabber]
' '
a(href = resources/docx_example.html) = [docx]
' '
a(href = resources/odt_example.html) = [odt]
' '
a(href = resources/json_example.html) = [json]
' '
a(href = resources/xsd_example.html) = '[xsd schema definition]'
' '
a(href = resources/xslt_example.html) = '[xslt transform]'
' '
a(href = resources/soap_response.html) = '[soap response]'
' '
a(href = resources/java_pojo.html) = '[java pojo jackson]'
' '
a(href = resources/saml_idp_metadata.html) = '[saml idp metadata]'
' '
a(href = resources/saml_authn_response.html) = '[saml authn_response]'
' '
a(href = resources/SEQLXML25.html) = '[WIPO ST.26_sequence listing]'
}
&SHIPORDER_XMQ;
' '
pre(class = 'xmq xmq_light w40') = ( &SHIPORDER_XML;
)
p {
'You can use the standalone '
a(href = https://github.com/libxmq/xmq/releases/latest)
{
i = xmq
}
' tool to convert between xmq/xml/htmq/html/json or include '
a(href = https://github.com/libxmq/xmq/releases/latest) = xmq.h/xmq.c
' to use xmq directly from your program or in the future link to a prebuilt libxmq.
With the xmq tool you can syntax highlight and pretty print XMQ (and thus indirectly XML and HTML).
You can convert between XMQ/HTMQ and XML/HTML/JSON, you can apply XSLT transformations, replace
entities with other XMQ/XML/JSON or quoted content, write raw text output and more...'
}
div {
a(href = https://github.com/libxmq/xmq/releases/latest) = [Download]
' '
a(href = https://github.com/libxmq/xmq) = [Github]
' '
a(href = xmq.pdf) = [Grammar]
' '
a(href = https://github.com/orgs/libxmq/discussions) = [Forum/Discussions]
br
}
pre = '''########## Pretty print as xmq using colors on terminal
xmq pom.xml
xmq data.json
xmq index.html
########## Convert to xmq and store in file
xmq pom.xml > pom.xmq
xmq data.json > data.xmq
########## Use the built in pager.
xmq pom.xml pa
cat data.json | xmq pa
xmq index.html delete //script delete //style pa
curl -s 'https://dummyjson.com/todos?limit=20' | xmq pa
########## View in your default browser.
xmq docbook.xml br
xmq data.json br
curl https://slashdot.org | xmq delete //script delete //style br
########## Convert xmq to xml,html or json
xmq pom.xmq to-xml > pom.xml
xmq data.xmq to-json > data.json
xmq index.htmq to-html > index.html'''
h2 = Background
p = 'XML can be human readable/editable if it is used for markup of longer
human language texts, ie books, articles and other documents etc. In
these cases the xml-tags represent a minor part of the whole xml-file.'
p = 'However XML is often used for data storage and
configuration files (eg pom.xml). In such files the
xml-tags represent a major part of the whole
xml-file. This makes the data storage and config files
hard to read and edit directly by hand. Today, the tags
are a major part of html files as well, which is one
reason why html files are hard to read and edit.'
p = 'The other reason why XML/HTML is hard to edit by hand is because they
have a complicated approach to dealing with significant
whitespace. The truth is that you cannot always pretty print the
XML/HTML code as you would like since it might introduce significant
white space. In fact proper pretty printing even requires a full
understanding of the DTD/XSD/CSS!'
h2 = Solution
p = 'XMQ solves the verbosity of tags by using braces to avoid
closing xml-tags and parentheses to surround the
attributes. XMQ solves the whitespace confusion by
requiring all intended whitespace to be quoted. There are a lot
of details, of course, but let us begin with an example.'
p {
'Download/build the xmq executable for your platform, then
download: '
a(href = resources/shiporder.xmq) = shiporder.xmq
a(href = resources/shiporder.xml) = shiporder.xml
' (Note that the xml is manually pretty printed.)'
}
p {
'To convert the XML to XMQ and pretty print with colors:'
br
span(class = code) = 'xmq shiporder.xml'
}
p {
'Use the page command if the source file is large:'
br
span(class = code) = 'xmq shiporder.xml page'
br
'Press q or esc to exit pager. You can shorten the command to just pa.'
}
p {
'Use the browse command to start your default browser to view the content:'
br
span(class = code) = 'xmq shiporder.xml browse'
br
'You can shorten the command to just br. Set the environement variable
XMQ_BG=dark or XMQ_BG=light to background color or XMQ_BG=mono for no color.'
}
p {
'Use shell redirection to store the XMQ output in a file:'
br
span(class = code) = 'xmq shiporder.xml to-xmq > test.xmq'
br
'(The to-xmq command is the default command and can be left out here.)'
}
p {
'To convert the XMQ to XML:'
br
span(class = code) = 'xmq shiporder.xmq to-xml > test.xml'
}
&SHIPORDER_XMQ;
' '
pre(class = 'xmq xmq_light w40') = ( &SHIPORDER_XML;
)
p = 'The hierarchical style should look familiar, but note:'
ul {
li = 'XMQ files are always UTF8 encoded.'
li = 'Safe values after = can be stored as plain text (see 889923 container), no quoting needed!'
li = '''Unsafe values (after =) with newlines, whitespace or ( ) { } ' " or leading = & // /* must be quoted.'''
li = 'Two single quotes always mean the empty string (see sailing).'
li = 'In multiline quotes, the incidental indentation is removed (see address).'
li = ''''
Quotes containing quotes are quoted using n+1 single quotes (see coord).
Note that two quotes are reserved for the empty string. You will therefore see a single
quote ' or three quotes ''' or more quotes.
''''
li = 'Single line comments use // and multi line comments use /* */.'
li = 'Comments containing comments are commented using n+1 slashes (eg ///* *///).'
}
p = 'This means that you can quote any block of text with enough single quotes and you
can comment any block of text with enough slashes.'
p = 'The incidental indentation removal and n+1 quotes ideas originated in the expert group
for JEP 378 Java Text blocks and was a collaborative effort led by Jim Laskey and Brian Goetz.
The seed to the idea to separate desired whitespace from incidental(accidental)
was planted by Kevin Bourrillion and the idea for n+1 quotes came from John Rose.'
h2 = 'XMQ as a configuration file'
p {
'XMQ permits multiple root nodes which means that if you use XMQ as your software config file format
then the first iteration of your config file can be as simple as this:'
a(href = resources/config.xmq) = config.xmq
}
&CONFIG_XMQ;
p = 'Every xmq file can be printed in compact form on a single line where whitespace between tokens is minimized.'
&CONFIG_COMPACT_XMQ;
p = 'The only permitted whitespace between tokens are space (ascii 32) and newlines (ascii 10 or 13 or 13 10).
All other whitespace (including tabs) must be quoted. Let us take a look at shiporder in compact form.
(Any linebreaks below are due to your browser.)'
&SHIPORDER_COMPACT_XMQ;
p {
'You can see character entities like '
xmqE =
' for newlines and compound values like '
xmqEK = address
'='
xmqCP = '('
xmqQ = ( '
'...'
' )
xmqE =
xmqQ = ( '
'...'
' )
xmqCP = ')'
' which normally is a multiline quote but where escaped newlines are intermingled with quotes to create the compact form.'
}
p = 'You can read the config file using a simple C api. (You can also use the full libxml2 api if you like and future
programming languages APIs will be coming.)'
pre(class = 'xmq xmq_light w40')
{
'XMQDoc *doc = '
xmqQ = xmqNewDoc
'();
ok = '
xmqQ = xmqParseFile
'(doc, "config.xmq, "myconf"); assert(ok);
server = '
xmqQ = xmqGetString
'(doc, NULL, "/myconf/server");
port = '
xmqQ = xmqGetInt
'(doc, NULL, "/myconf/port");
file = '
xmqQ = xmqGetString
'(doc, NULL, "/myconf/file");
cron = '
xmqQ = xmqGetString
'(doc, NULL, "/myconf/cron");'
}
p {
'As you can see, XMQ can be trivial, which is nice for your first config file, but when your program
grows in complexity, so can your config file. You do not have to convert your config
file to xml, but if you want to then you can supply your implicit root (in this case myconf):'
br
span(class = code) = 'xmq --root=myconf config.xmq to-xml > config.xml'
}
pre(class = 'xmq xmq_light w80') = &CONFIG_XML;
h2 = 'Web pages and whitespace'
p {
'Now let us try some htmq/html:'
a(href = resources/welcome_traveller.htmq) = welcome_traveller.htmq
a(href = resources/welcome_traveller.html) = welcome_traveller.html
' (Note that the html is manually pretty printed.)'
}
p {
span(class = code) = 'xmq welcome_traveller.htmq pager'
}
&WELCOME_TRAVELLER_HTMQ;
' '
pre(class = 'xmq xmq_light w80') = &WELCOME_TRAVELLER_HTML;
ul {
li = '''Text that does not immediately follow an equal sign = is called a standalone quote
(see 'Rest here ...' and 'p until ...') and must always be quoted.
If you do not quote them, they will be interpreted as elements (see html body a h1).'''
li = 'XMQ pretty printing is straightforward whereas the html line breaks are weird to prevent spaces inside the word sleep.'
li = 'XMQ entities like ∇ (∇) must be outside of the quotes.'
li = 'In the xmq it is obvious that there is exactly a single space between Say and the nabla.'
}
p {
'If you convert from htmq to html:'
br
span(class = code) = 'xmq welcome_traveller.htmq to-html'
br
'Then you will see that xmq does not pretty print since it wants to preserve the xmq whitespace
exactly as it was written. (Any linebreaks below are due to your browser.)
Since you know exactly what whitespace you are feeding the browser (html) and other tools (xml)
it will be easier to control their behaviour.'
}
pre(class = 'xmq xmq_light w40') = &WELCOME_TRAVELLER_NOPP_HTML;
p {
'If you convert from the original manually pretty printed html above to htmq:'
span(class = code) = 'xmq welcome_traveller.html to-htmq'
' then you will see that the tool xmq by default trims
whitespace using its own heuristic. It keeps the
original linebreaks but removes incidental indentation
and leading/ending whitespace if the leader/ender contain
newlines. This is the same rule xmq uses for trimming multiline quotes
and comments. This heuristic usually works well but might in some
situation remove significant spaces from XML sources which
were not written with this in mind.'
}
&WELCOME_TRAVELLER_BACK_HTMQ;
p {
'You can also see that the ∇ was replaced with the actual ∇. This happened because
it was character entity, which is just another kind of quote.
If you want to preserve all whitespace and restore the html entities then do:'
br
span(class = code) = 'xmq --trim=none welcome_traveller.html to-htmq --escape-non-7bit'
}
&WELCOME_TRAVELLER_BACK_NOTRIM_HTMQ;
p = 'As you can see there is quite a lot of whitespace in xml/html, which might or might not be
significant/ignorable depending on your css and other settings. If you really want this whitespace then
xmq will make it obvious.'
h2 = 'Compact XMQ with multiline comments'
p {
'The opposite of xmq pretty printing is xmq compact printing with no indentation and no newlines: '
br
span(class = code) = 'xmq welcome_traveller.htmq to-htmq --compact'
}
&WELCOME_TRAVELLER_HTMQ_COMPACT;
p {
'Even multiline comments can be printed as compact XMQ since */* means a newline.
This is not possible with XML since there is no standardized way to escape newlines inside html/xml comments.'
a(href = resources/multi.xmq) = multi.xmq
}
&MULTI_XMQ;
p {
span(class = code) = 'xmq multi.xmq to-xmq --compact'
}
&MULTI_COMPACT_XMQ;
' '
p {
'Note that, for historical reasons, xml/html does not permit two or more consecutive dashes inside a comment! This is
quite a showstopper if you just want to comment out some large part of your document. As you
can see two dashes are permitted in xmq-comments and the xmq tool works around
this problem when converting to xml/html by adding a very specific char (␐) in such a way
there are no two consecutive dashes in the xml. When loading from such xml, the char (␐) is
instead removed to restore the two dashes.'
span(class = code) = 'xmq multi.xmq to-xml'
}
pre(class = 'xmq xmq_light w80') = &MULTI_XML;
h2 = 'Quoting and entities inside attributes'
p = '''For elements, a key=value is syntactic sugar for: key{'value'}
For attributes, you can only write: key=value
In the example below all content is: 123
If the xmq tool detects that all children of an element are either text or entities,
then it will present the element as a key value pair.'''
&SYNTACTIC_SUGAR_XMQ;
p = 'XMQ is designed with the assumption that we rarely need significant leading/ending whitespace/quotes.
However sometimes you have to have that. For an element value, you can express such whitespace in
different ways:'
&CORNERS_XMQ;
p {
'You can see the magenta colored parentheses '
xmqCP = '( )'
' after the equal = sign. This is a compound value which can only consist of quotes and entities.
Compound values are mandatory for the attribute values that need multiple quotes/entities since braces
{ } cannot be used inside an attribute value.'
}
&COMPOUND_XMQ;
h2 = 'Viewing large html pages'
p = 'The xmq tool is useful to decode large html pages. Let us assume that you downloaded a large html page: index.html'
pre(class = 'xmq xmq_light w40') = 'xmq index.html delete /html/head delete //style delete //script pager'
p = 'This command will delete the head node and all style and script nodes, before using the pager to show you the htmq.
The argument to delete is an xpath expression.'
p {
'There are other commands to modify the xmq. In particular you can see how this web page is constructed by
replacing entities with text files or with xmq files. '
a(href = resources/index_htmq.html) = index.htmq
}
h2 = JSON
a(id = json)
p {
'We can use the xmq tool to convert shiporder to json:'
span(class = code) = 'xmq shiporder.xmq to-json | jq .'
' You can see that the xml element name is folded as the key "_":"shiporder" and
attributes are folded as children prefixed with underscores.'
}
pre(class = 'xmq xmq_light w40') = &SHIPORDER_JSON;
p = 'If an XMQ value is a valid JSON number, true, false or null, then it is converted to
the proper JSON value (no quotes) see "id":889923 . If you want to force a number to a JSON string,
then add the S attribute: speed(S)=123 will translate into "speed":"123" .'
p {
'We can also use the xmq tool to parse json. '
br
span(class = code) = '''curl -s 'https://dummyjson.com/todos?skip=4&limit=2' | xmq'''
' or '
br
span(class = code) = 'xmq todos.json'
a(href = resources/todos.xmq) = todos.xmq
a(href = resources/todos.json) = todos.json
}
p = 'A limitation of JSON is painfully visible as underline element names.
The Javascript objects lack types which means that the JSON objects lacks type information
and this results in the corresponding XMQ element get the anonymous name/type underscore _.
If the JSON object contains a child _ with a name, then xmq will use this as the key instead,
this enables proper back-forth conversion between XMQ and JSON.'
&TODOS_XMQ;
' '
pre(class = 'xmq xmq_light w80') = ( &TODOS_JSON;
)
p = 'XMQ can read and generate json. The workflow is intended to be: 1) convert json to xmq 2) work on the data as xmq
3) possibly write back to json. Some clean xml files (like pom.files) can be converted to clean json,
worked on in json and then converted back to xmq. But if you convert generic xml/html with standalone text nodes
and attributes to json, then the result is probably rather confusing since key values must be unique in json.'
p = 'The xmq tool solves this by detecting non-unique element names and suffixing them with [0], [1], [2] etc.'
&SIMPLE_PAGE_HTMQ;
' '
pre(class = 'xmq xmq_light w40') = ( &SIMPLE_PAGE_JSON;
)
h2 = 'XSLT transforms'
p {
'XSLT transforms can be terribly(horribly) complicated when you have to deal with whitespace.
Again the XMQ format solves this part of the XSLT complexity because the whitespace in XMQ is explicit.
Let us convert the curled JSON directly into an HTML page using xmq and an xslq transform.'
a(href = resources/todos.xslq) = todos.xslq
a(href = resources/todos.xslt) = todos.xslt
br
span(class = code) = 'xmq todos.json transform todos.xslq to-html > todos.html'
}
&TODOS_XSLQ;
p {
'And this is the generated HTMQ. '
a(href = resources/todos.html) = todos.html
}
&TODOS_HTMQ;
p {
'We can generate whitespace exact text output and still have
a pretty printed and readable xslq-transform. The to-text commad
will output only text nodes and substituted entities.'
br
span(class = code) = 'xmq todos.json transform todosframed.xslq to-text'
br
a(href = resources/todos.xslq) = todosframed.xslq
}
&TODOSFRAMED_XSLQ;
p {
'And this is the generated text file. '
a(href = resources/todos.text) = todos.text
}
pre(class = 'xmq xmq_light w80 text') = &TODOS_TEXT;
h2 = 'DTD:s and XSD:s'
p = 'The inline DTD is the value assigned to the !DOCTYPE.
There are no changes the the DTD language.'
&DTD_EXAMPLE_XMQ;
pre(class = 'xmq xmq_light w40') = &DTD_EXAMPLE_XML;
p = 'Since XSD:s are normal xml they are rendered as XMQ in the same way as other XML.'
&XSD_EXAMPLE_XSQ;
h2 = Conclusions
p = 'With XMQ the assumed dichotomy between mark-up languages (like XML) and key-value store languages (like JSON)
has been (perhaps surprisingly) removed. We can now use XML for mark-up situations and XMQ for key-value
store situations. They are interchangeable and all the years of effort going into XSLT/XSD and other
tools can still be used with XMQ.'
p = 'There are of course still bugs to fix in xmq and improvements to how the specification works.
Please let me know if you find bugs or other improvements.'
}
}