XMIR, a Quick Tour
Last updated at: 17.04.2025
XMIR is a dialect of XML,
which we use to represent a parsed
EO object. It is a pretty simple format,
which has a few
important tricks that I share below in this blog post. You may
also want to check our schema:
XMIR.xsd
(it is also rendered in HTML,
which may be more readable for some of you).
Consider this simple EO object that prints "Hello, world!"
:
# App.
[] > app
[x] > foo
QQ.io.stdout > @
QQ.txt.sprintf *1
"Hello, %s\n"
x
foo > @
"world!"
If we parse it using the EoSyntax
class from eo-parser,
we will get this XMIR (or very similar):
<object
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
dob="2024-12-27T11:00:08"
ms="98"
revision="27abe8b"
time="2025-04-17T09:32:04.455112Z"
version="0.56.0"
xsi:noNamespaceSchemaLocation="https://www.eolang.org/xsd/XMIR-0.56.0.xsd">
<listing># App.
[] > app
[x] > foo
QQ.io.stdout > @
QQ.txt.sprintf *1
"Hello, %s\n"
x
foo > @
"world!"
</listing>
<o line="2" name="app" pos="0">
<o line="3" name="foo" pos="2">
<o base="∅" line="3" name="x" pos="3"/>
<o base=".stdout" line="4" name="@" pos="9">
<o base=".io" line="4" pos="6">
<o base="QQ" line="4" pos="4"/>
</o>
<o base=".sprintf" line="5" pos="12">
<o base=".txt" line="5" pos="8">
<o base="QQ" line="5" pos="6"/>
</o>
<o base="string" line="6" pos="8">48-65-6C-6C-6F-2C-20-25-73-0A</o>
<o base="tuple" line="7" pos="8">
<o base=".empty">
<o base="tuple"/>
</o>
<o base="x" line="7" pos="10"/>
</o>
</o>
</o>
</o>
<o base="foo" line="8" name="@" pos="2">
<o base="string" line="9" pos="4">77-6F-72-6C-64-21</o>
</o>
</o>
</object>
The <object>
is the root element, it will always be there, with
a few mandatory attributes:
ms
is how much time in milliseconds it took to parse the object and generate this XMIR file,time
is the time in ISO 8601 format when the file was generated,version
is the version of the parser.
The <listing>
element contains the source code of the EO object,
which was parsed, without any modifications, “as is.”
Errors and Warnings
The <errors>
element may have a list of problems discovered by the
parser or any other optimizers, as <error>
elements. If there are no
errors, the <errors>
element should not exist in <object>
.
For example, it may look like this:
<object>
[...]
<errors>
<error severity="warning" line="3">There is an extra bracket</error>
<error severity="error" line="12">The object 'x' is not found</error>
[...]
</errors>
</object>
The errors with the warning
severity may more or less safely be ignored. The
errors with the error
severity will lead to failures in further compilation
and processing. There could also be elements with the critical
severity,
which must stop the processing of the document immediately.
Sheets
The <sheets>
element contains a list of all
post-processors that were applied to the document after is parsing.
We process our XMIR documents using dozens of XSL stylesheets. That’s why
the name of the XML element. You may find something like this over there:
<object>
[...]
<sheets>
<sheet>move-voids-up</sheet>
<sheet>const-to-dataized</sheet>
<sheet>stars-to-tuples</sheet>
<sheet>wrap-method-calls</sheet>
[...]
</sheets>
</object>
The names you see in the <sheet>
elements are the names of the files.
For example, wrap-method-calls
represents the
[wrap-method-calls.xsl
] file
in the objectionary/eo GitHub repository.
If no XSL stylesheets are applied to XMIR, the <sheets>
element should not exist
in <object>
.
Metas
There may be an optional element <metas>
with a list of <meta>
elements.
For example, if my source code would have this meta at the 3rd
line of the source file:
+alias foo com.example.foo
We would see the following in the XMIR:
<object>
[...]
<metas>
<meta line="3">
<head>alias</head>
<tail>foo Q.com.example.foo</tail>
<part>foo</part>
<part>Q.com.example.foo</part>
</meta>
[...]
</metas>
</object>
Each <meta>
element contains parts of the meta. The <head>
contains everything that goes after the +
until the first space.
The <tail>
contains everything after the first space. There could
be a number of <part>
elements, each of which containing the parts
of the <tail>
separated by spaces.
Objects
The <object>
element must contain only one <o/>
element which represents an
object being parsed. The <o/>
element may have a few optional attributes:
line
andpos
are the number of the line where the object was found by the parser and the position in the line;name
is the name of the object, if the object has it;base
may refer to object formation that is being copied;as
is the name of the attribute which current object is bound to during the application
There could be no other attributes.
Special cases
- The
<o/>
elements that have nested<o>
element withname
which value isλ
are atoms. Atoms must not havebase
attribute: ```xml
2. The `<o/>` elements with `base` attribute which value is `∅` are **void** attributes.
Void attributes also must have `name` attribute:
```xml
<o name="foo">
<o name="bar" base="∅"/>
</o>
- Data literals found in the source code are presented with nested
<o/>
XML elements that contain text. Only elements withbase
attribute equal toQ.org.eolang.bytes
may contain nested<o>
element with text.
<o base="Q.org.eolang.bytes" line="6" pos="8">
<o>48-65-6C-6C-6F-2C-20-25-73-0A</o>
</o>
- The
name
attribute of<o/>
element may be auto generated by EO parser. In such case it’s look like: ```xml
Such `name` consists of several parts:
- char `a` (ascii 97) that stands for "auto-generated"
- char `🌵` that is just a pretty character prohibited by EO grammar
- number `104` which is joined line and position of the place where
the object is found.
Such names are unique through entire XMIR.
5. If object is bound to a specific attribute not by name but by position, the
`as` attribute may look like:
```xml
<o base="Q.org.eolang.number" as="α2"/>
Here the first character is α
(alpha), the number 2
is the position of the
attribute.
This description of XMIR is not complete. If you want me to explain something else in more details, please post a message below this blog post and I will add the content.