.. index:: encoder

Encoders
========

This section gives an overview of encoders, details on the encoders
that ship with libxo, and documentation for developers of future
encoders.

Overview
--------

The libxo library contains software to generate four "built-in"
formats: text, XML, JSON, and HTML.  These formats are common and
useful, but there are other common and useful formats that users will
want, and including them all in the libxo software would be difficult
and cumbersome.

To allow support for additional encodings, libxo includes a
"pluggable" extension mechanism for dynamically loading new encoders.
libxo-based applications can automatically use any installed encoder.

Use the "encoder=XXX" option to access encoders.  The following
example uses the "cbor" encoder, saving the output into a file::

    df --libxo encoder=cbor > df-output.cbor

Encoders can support specific options that can be accessed by
following the encoder name with a colon (':') or a plus sign ('+') and
one or more options, separated by the same character::

    df --libxo encoder=csv+path=filesystem+leaf=name+no-header
    df --libxo encoder=csv:path=filesystem:leaf=name:no-header

These examples instruct libxo to load the "csv" encoder and pass the
following options::

   path=filesystem
   leaf=name
   no-header

Each of these options is interpreted by the encoder, and all such
option names and semantics are specific to the particular encoder.
Refer to the intended encoder for documentation on its options.
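
The option syntax can be illustrated with a small parser.  The
following is a minimal sketch, not libxo's actual parsing code; the
function `split_encoder_spec` and its interface are hypothetical.  It
takes the text that follows "encoder=", treats the first ':' or '+' as
the separator, and splits the remaining options on that same
character::

  #include <string.h>

  /*
   * Hypothetical helper (not part of libxo): split the string
   * "name<sep>opt<sep>opt" in place, where <sep> is the first ':' or
   * '+' found.  Sets *namep to the encoder name and returns the
   * number of options stored in opts[] (at most max_opts).
   */
  static int
  split_encoder_spec (char *spec, char **namep, char **opts, int max_opts)
  {
      int count = 0;
      char *cp = strpbrk(spec, ":+");   /* First separator, if any */

      *namep = spec;
      if (cp == NULL)
          return 0;                     /* Name only, no options */

      char sep = *cp;                   /* Later options use this one */
      *cp++ = '\0';                     /* Terminate the encoder name */

      while (*cp != '\0' && count < max_opts) {
          opts[count++] = cp;
          cp = strchr(cp, sep);
          if (cp == NULL)
              break;                    /* Last option reached */
          *cp++ = '\0';                 /* Terminate this option */
      }

      return count;
  }

Called on a writable copy of "csv+path=filesystem+leaf=name+no-header",
this sketch yields the name "csv" and three options.  Because the
separator is fixed by its first occurrence, the string "csv:a+b"
yields the single option "a+b" in this sketch.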

The string "@" can be used in place of the string "encoder="::

    df --libxo @csv:no-header

.. _csv_encoder:

CSV - Comma Separated Values
----------------------------

libxo ships with a custom encoder for "CSV" files, a common format for
comma separated values.  The output of the CSV encoder can be loaded
directly into spreadsheets or similar applications.

A standard for CSV files is provided in :RFC:`4180`, but since the
format predates that standard by decades, there are many minor
differences in CSV file consumers and their expectations.  The CSV
encoder has a number of options to tailor output to those
expectations.

Consider the following XML::

  % list-items --libxo xml,pretty
  <top>
    <data test="value">
      <item test2="value2">
        <sku test3="value3" key="key">GRO-000-415</sku>
        <name key="key">gum</name>
        <sold>1412</sold>
        <in-stock>54</in-stock>
        <on-order>10</on-order>
      </item>
      <item>
        <sku test3="value3" key="key">HRD-000-212</sku>
        <name key="key">rope</name>
        <sold>85</sold>
        <in-stock>4</in-stock>
        <on-order>2</on-order>
      </item>
      <item>
        <sku test3="value3" key="key">HRD-000-517</sku>
        <name key="key">ladder</name>
        <sold>0</sold>
        <in-stock>2</in-stock>
        <on-order>1</on-order>
      </item>
    </data>
  </top>

This output is a list of `instances` (named "item"), each containing a
set of `leafs` ("sku", "name", etc).

The CSV encoder will emit the leaf values in this output as `fields`
inside a CSV `record`, which is a line containing a set of
comma-separated values::

  % list-items --libxo encoder=csv
  sku,name,sold,in-stock,on-order
  GRO-000-415,gum,1412,54,10
  HRD-000-212,rope,85,4,2
  HRD-000-517,ladder,0,2,1

Be aware that since the CSV encoder looks for data instances, when
used with :ref:`xo`, the `--instance` option will be needed::

  % xo --libxo encoder=csv --instance foo 'The {:product} is {:status}\n' stereo "in route"
  product,status
  stereo,in route

.. _csv_path:

The `path` Option
~~~~~~~~~~~~~~~~~

By default, the CSV encoder will attempt to emit any list instance
generated by the application.  In some cases, this may be
unacceptable, and a specific list may be desired.

Use the "path" option to limit the processing of output to a specific
hierarchy.  The path should be one or more names of containers or
lists.

For example, if the "list-items" application generates other lists,
the user can give "path=top/data/item" as a path::

  % list-items --libxo encoder=csv:path=top/data/item
  sku,name,sold,in-stock,on-order
  GRO-000-415,gum,1412,54,10
  HRD-000-212,rope,85,4,2
  HRD-000-517,ladder,0,2,1

Paths are "relative", meaning they need not be a complete set
of names to the list.  This means that "path=item" may be sufficient
for the above example.

.. _csv_leafs:

The `leafs` Option
~~~~~~~~~~~~~~~~~~

The CSV encoding requires that all lines of output have the same
number of fields with the same order.  In contrast, XML and JSON allow
any order (though libxo forces key leafs to appear before other
leafs).

To maintain a consistent set of fields inside the CSV file, the same
set of leafs must be selected from each list item.  By default, the
CSV encoder records the set of leafs that appear in the first list
instance it processes, and extracts only those leafs from future
instances.  If the first instance is missing a leaf that is desired by
the consumer, the "leafs" option can be used to ensure that an empty
value is recorded for instances that lack a particular leaf.

The "leafs" option can also be used to exclude leafs, limiting the
output to only those leafs provided.

In addition, the order of the output fields follows the order in which
the leafs are listed.  "leafs=one.two" and "leafs=two.one" give
distinct output.

So the "leafs" option can be used to expand, limit, and order the set
of leafs.
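
The selection rules can be sketched in a few lines of C.  This is an
illustration under stated assumptions, not the CSV encoder's
implementation: the `leaf_t` type and the `format_record` function are
hypothetical, and quoting is ignored.  Fields follow the order of the
selected names, a missing leaf yields an empty field, and leafs that
were not selected are dropped::

  #include <stdio.h>
  #include <string.h>

  /* Hypothetical representation of one leaf in a list instance */
  typedef struct leaf_s {
      const char *name;
      const char *value;
  } leaf_t;

  /*
   * Format one CSV record into buf, one field per selected name, in
   * the order given.  Returns the record length, or -1 if buf is too
   * small.
   */
  static int
  format_record (char *buf, size_t bufsiz,
                 const char **selected, int nselected,
                 const leaf_t *leafs, int nleafs)
  {
      size_t len = 0;

      if (bufsiz == 0)
          return -1;
      buf[0] = '\0';

      for (int i = 0; i < nselected; i++) {
          const char *value = "";   /* Default for a missing leaf */

          for (int j = 0; j < nleafs; j++) {
              if (strcmp(selected[i], leafs[j].name) == 0) {
                  value = leafs[j].value;
                  break;
              }
          }

          /* Every field after the first is preceded by a comma */
          int rc = snprintf(buf + len, bufsiz - len, "%s%s",
                            (i > 0) ? "," : "", value);
          if (rc < 0 || (size_t) rc >= bufsiz - len)
              return -1;
          len += (size_t) rc;
      }

      return (int) len;
  }

Given an instance holding "sku", "name", and "on-order", selecting
"on-order", "sku", and a leaf the instance lacks produces the two
selected values followed by an empty third field; "name" is omitted.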

The value of the leafs option should be one or more leaf names,
separated by a period (".")::

  % list-items --libxo encoder=csv:leafs=sku.on-order
  sku,on-order
  GRO-000-415,10
  HRD-000-212,2
  HRD-000-517,1
  % list-items --libxo encoder=csv:leafs=on-order.sku
  on-order,sku
  10,GRO-000-415
  2,HRD-000-212
  1,HRD-000-517

Note that libxo uses terminology from YANG (:RFC:`7950`), the data
modeling language for NETCONF (:RFC:`6241`), which uses "leafs" as the
plural form of "leaf".  libxo follows that convention.

.. _csv_no_header:

The `no-header` Option
~~~~~~~~~~~~~~~~~~~~~~

CSV files typically begin with a line that defines the fields included
in that file, in an attempt to make the contents self-defining::

    sku,name,sold,in-stock,on-order
    GRO-000-415,gum,1412,54,10
    HRD-000-212,rope,85,4,2
    HRD-000-517,ladder,0,2,1

There is no reliable mechanism for determining whether this header
line is included, so the consumer must make an assumption.

The CSV encoder defaults to producing the header line, but the
"no-header" option can be included to avoid the header line.

.. _csv_no_quotes:

The `no-quotes` Option
~~~~~~~~~~~~~~~~~~~~~~

:RFC:`4180` specifies that fields containing spaces should be quoted, but
many CSV consumers do not handle quotes.  The "no-quotes" option
instructs the CSV encoder to avoid the use of quotes.
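
As a rough illustration of what quoting involves, the sketch below
wraps a field in double quotes when it contains a space, comma, double
quote, carriage return, or newline, and doubles any embedded double
quotes as :RFC:`4180` describes.  The function `quote_field` is
hypothetical, and the exact set of characters that triggers quoting in
the CSV encoder is an assumption here::

  #include <string.h>

  /*
   * Hypothetical helper: copy a field into buf, quoting it when
   * needed.  If no_quotes is non-zero, the field is copied unchanged.
   * Returns 0 on success, or -1 if buf is too small.
   */
  static int
  quote_field (char *buf, size_t bufsiz, const char *field, int no_quotes)
  {
      int quote = !no_quotes && strpbrk(field, " ,\"\r\n") != NULL;
      size_t need = strlen(field) + 1;      /* Field plus NUL */

      if (quote) {
          need += 2;                        /* Surrounding quotes */
          for (const char *cp = field; *cp != '\0'; cp++)
              if (*cp == '"')
                  need += 1;                /* Each quote is doubled */
      }

      if (need > bufsiz)
          return -1;

      char *out = buf;
      if (quote)
          *out++ = '"';
      for (const char *cp = field; *cp != '\0'; cp++) {
          if (quote && *cp == '"')
              *out++ = '"';                 /* Escape by doubling */
          *out++ = *cp;
      }
      if (quote)
          *out++ = '"';
      *out = '\0';

      return 0;
  }

With quoting enabled, the field "in route" is emitted inside double
quotes; with `no_quotes` set, it is emitted as is, which suits
consumers that split lines on commas without interpreting quotes.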

.. _csv_dos:

The `dos` Option
~~~~~~~~~~~~~~~~

:RFC:`4180` defines the end-of-line marker as a carriage return
followed by a newline.  This `CRLF` convention dates from the distant
past, but its use was anchored in the 1980s by the `DOS` operating
system.

The CSV encoder defaults to using the standard Unix end-of-line
marker, a simple newline.  Use the "dos" option to use the `CRLF`
convention.
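
The difference between the two conventions amounts to the bytes that
end each record.  A minimal sketch, using a hypothetical
`terminate_record` helper rather than anything from libxo::

  #include <stdio.h>

  /*
   * Hypothetical helper: copy a record into buf and end it with CRLF
   * when dos is non-zero, or with a bare newline otherwise.  Returns
   * the number of bytes written, or -1 if buf is too small.
   */
  static int
  terminate_record (char *buf, size_t bufsiz, const char *record, int dos)
  {
      int rc = snprintf(buf, bufsiz, "%s%s", record, dos ? "\r\n" : "\n");

      return (rc < 0 || (size_t) rc >= bufsiz) ? -1 : rc;
  }

In this sketch, each record is one byte longer when `dos` is set.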

The Encoder API
---------------

The encoder API consists of three distinct phases:

- loading the encoder
- initializing the encoder
- feeding operations to the encoder

To load the encoder, libxo will open a shared library named::

   ${prefix}/lib/libxo/encoder/${name}.enc

This file is typically a symbolic link to a dynamic library, suitable
for `dlopen()`.  libxo looks for a symbol called
`xo_encoder_library_init` inside that library and calls it with the
arguments defined in the header file "xo_encoder.h".  This function
should look as follows::

  int
  xo_encoder_library_init (XO_ENCODER_INIT_ARGS)
  {
      arg->xei_version = XO_ENCODER_VERSION;
      arg->xei_handler = test_handler;
  
      return 0;
  }

Several features here allow for future compatibility: the macro
XO_ENCODER_INIT_ARGS allows the arguments to this function to change over
time, and the XO_ENCODER_VERSION allows the library to tell libxo
which version of the API it was compiled with.
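
The loading phase can be pictured with the standard `dlopen()` and
`dlsym()` calls.  The following is only a sketch of the idea, not
libxo's loader: the helper names `encoder_path` and
`find_encoder_init` are hypothetical, and the symbol is returned as an
untyped pointer because the real argument types live in
"xo_encoder.h".  On some systems the program must be linked with
"-ldl"::

  #include <dlfcn.h>
  #include <stddef.h>
  #include <stdio.h>

  /* Build "${prefix}/lib/libxo/encoder/${name}.enc"; 0 on success */
  static int
  encoder_path (char *buf, size_t bufsiz, const char *prefix,
                const char *name)
  {
      int rc = snprintf(buf, bufsiz, "%s/lib/libxo/encoder/%s.enc",
                        prefix, name);

      return (rc < 0 || (size_t) rc >= bufsiz) ? -1 : 0;
  }

  /*
   * Open the library and look up the init symbol.  On success,
   * returns the symbol address and stores the dlopen() handle in
   * *handlep so the caller can dlclose() it later.  Returns NULL if
   * the library or the symbol cannot be found.
   */
  static void *
  find_encoder_init (const char *path, void **handlep)
  {
      void *dlp = dlopen(path, RTLD_NOW);
      if (dlp == NULL)
          return NULL;

      void *sym = dlsym(dlp, "xo_encoder_library_init");
      if (sym == NULL) {
          dlclose(dlp);             /* Not an encoder library */
          return NULL;
      }

      *handlep = dlp;
      return sym;
  }

A missing file and a library that lacks the init symbol are treated
alike here: both return NULL, so a caller could report that the
encoder is unavailable.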

The function placed in xei_handler should have the signature::

  static int
  test_handler (XO_ENCODER_HANDLER_ARGS)
  {
       ...

This function will be called with the "op" codes defined in
"xo_encoder.h".  Each op code represents a distinct event in the libxo
processing model.  For example, XO_OP_OPEN_CONTAINER tells the encoder
that a new container has been opened, and the encoder can behave in an
appropriate manner.
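
The event-driven style of a handler can be sketched with a `switch`
on the op code.  A real handler receives the arguments defined by
XO_ENCODER_HANDLER_ARGS and the op codes from "xo_encoder.h"; the
enum, the state structure, and the function below are hypothetical
stand-ins so that the example compiles without libxo::

  #include <stdio.h>

  /* Stand-ins for op codes that really come from "xo_encoder.h" */
  typedef enum {
      DEMO_OP_OPEN_CONTAINER,
      DEMO_OP_CLOSE_CONTAINER,
      DEMO_OP_STRING
  } demo_op_t;

  /* Hypothetical per-handle state kept by the encoder */
  typedef struct demo_state_s {
      int ds_depth;                 /* Number of open containers */
      int ds_leafs;                 /* Number of leafs seen so far */
  } demo_state_t;

  /* Handle one event; returns 0 on success, -1 on error */
  static int
  demo_handler (demo_state_t *dsp, demo_op_t op,
                const char *name, const char *value)
  {
      switch (op) {
      case DEMO_OP_OPEN_CONTAINER:
          printf("%*sopen %s\n", dsp->ds_depth * 2, "", name);
          dsp->ds_depth += 1;
          break;

      case DEMO_OP_CLOSE_CONTAINER:
          if (dsp->ds_depth == 0)
              return -1;            /* Close without matching open */
          dsp->ds_depth -= 1;
          printf("%*sclose %s\n", dsp->ds_depth * 2, "", name);
          break;

      case DEMO_OP_STRING:
          printf("%*s%s=%s\n", dsp->ds_depth * 2, "", name, value);
          dsp->ds_leafs += 1;
          break;

      default:
          return -1;                /* Unknown op code */
      }

      return 0;
  }

Each call handles exactly one event, so any context the encoder needs
between events, such as the current nesting depth, has to be kept in
state that persists across calls.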