Training courses

Kernel and Embedded Linux

Bootlin training courses

Embedded Linux, kernel,
Yocto Project, Buildroot, real-time,
graphics, boot time, debugging...

Bootlin logo

Elixir Cross Referencer

.. Copyright (C) 2015-2020 Free Software Foundation, Inc.
   Originally contributed by David Malcolm <dmalcolm@redhat.com>

   This is free software: you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see
   <http://www.gnu.org/licenses/>.

Tutorial part 5: Implementing an Ahead-of-Time compiler
-------------------------------------------------------

If you have a pre-existing language frontend that's compatible with
libgccjit's license, it's possible to hook it up to libgccjit as a
backend.  In the previous example we showed
how to do that for in-memory JIT-compilation, but libgccjit can also
compile code directly to a file, allowing you to implement a more
traditional ahead-of-time compiler ("JIT" is something of a misnomer
for this use-case).

The essential difference is to compile the context using
:c:func:`gcc_jit_context_compile_to_file` rather than
:c:func:`gcc_jit_context_compile`.

The "brainf" language
*********************

In this example we use libgccjit to construct an ahead-of-time compiler
for an esoteric programming language that we shall refer to as "brainf".

brainf scripts operate on an array of bytes, with a notional data pointer
within the array.

brainf is hard for humans to read, but it's trivial to write a parser for
it, as there is no lexing; just a stream of bytes.  The operations are:

====================== =============================
Character              Meaning
====================== =============================
``>``                  ``idx += 1``
``<``                  ``idx -= 1``
``+``                  ``data[idx] += 1``
``-``                  ``data[idx] -= 1``
``.``                  ``output (data[idx])``
``,``                  ``data[idx] = input ()``
``[``                  loop until ``data[idx] == 0``
``]``                  end of loop
Anything else          ignored
====================== =============================

Unlike the previous example, we'll implement an ahead-of-time compiler,
which reads ``.bf`` scripts and outputs executables (though it would
be trivial to have it run them JIT-compiled in-process).

Here's what a simple ``.bf`` script looks like:

   .. literalinclude:: ../examples/emit-alphabet.bf
    :lines: 1-

.. note::

   This example makes use of whitespace and comments for legibility, but
   could have been written as::

     ++++++++++++++++++++++++++
     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
     [>.+<-]

   It's not a particularly useful language, except for providing
   compiler-writers with a test case that's easy to parse.  The point
   is that you can use :c:func:`gcc_jit_context_compile_to_file`
   to use libgccjit as a backend for a pre-existing language frontend
   (provided that the pre-existing frontend is compatible with libgccjit's
   license).

Converting a brainf script to libgccjit IR
******************************************

As before we write simple code to populate a :c:type:`gcc_jit_context *`.

   .. literalinclude:: ../examples/tut05-bf.c
    :start-after: #define MAX_OPEN_PARENS 16
    :end-before: /* Entrypoint to the compiler.  */
    :language: c

Compiling a context to a file
*****************************

Unlike the previous tutorial, this time we'll compile the context
directly to an executable, using :c:func:`gcc_jit_context_compile_to_file`:

.. code-block:: c

    gcc_jit_context_compile_to_file (ctxt,
                                     GCC_JIT_OUTPUT_KIND_EXECUTABLE,
                                     output_file);

Here's the top-level of the compiler, which is what actually calls into
:c:func:`gcc_jit_context_compile_to_file`:

 .. literalinclude:: ../examples/tut05-bf.c
    :start-after: /* Entrypoint to the compiler.  */
    :end-before: /* Use the built compiler to compile the example to an executable:
    :language: c

Note how once the context is populated you could trivially instead compile
it to memory using :c:func:`gcc_jit_context_compile` and run it in-process
as in the previous tutorial.

To create an executable, we need to export a ``main`` function.  Here's
how to create one from the JIT API:

 .. literalinclude:: ../examples/tut05-bf.c
    :start-after: #include "libgccjit.h"
    :end-before: #define MAX_OPEN_PARENS 16
    :language: c

.. note::

   The above implementation ignores ``argc`` and ``argv``, but you could
   make use of them by exposing ``param_argc`` and ``param_argv`` to the
   caller.

Upon compiling this C code, we obtain a bf-to-machine-code compiler;
let's call it ``bfc``:

.. code-block:: console

  $ gcc \
      tut05-bf.c \
      -o bfc \
      -lgccjit

We can now use ``bfc`` to compile .bf files into machine code executables:

.. code-block:: console

  $ ./bfc \
       emit-alphabet.bf \
       a.out

which we can run directly:

.. code-block:: console

  $ ./a.out
  ABCDEFGHIJKLMNOPQRSTUVWXYZ

Success!

We can also inspect the generated executable using standard tools:

.. code-block:: console

  $ objdump -d a.out |less

which shows that libgccjit has managed to optimize the function
somewhat (for example, the runs of 26 and 65 increment operations
have become integer constants 0x1a and 0x41):

.. code-block:: console

  0000000000400620 <main>:
    400620:     80 3d 39 0a 20 00 00    cmpb   $0x0,0x200a39(%rip)        # 601060 <data
    400627:     74 07                   je     400630 <main
    400629:     eb fe                   jmp    400629 <main+0x9>
    40062b:     0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    400630:     48 83 ec 08             sub    $0x8,%rsp
    400634:     0f b6 05 26 0a 20 00    movzbl 0x200a26(%rip),%eax        # 601061 <data_cells+0x1>
    40063b:     c6 05 1e 0a 20 00 1a    movb   $0x1a,0x200a1e(%rip)       # 601060 <data_cells>
    400642:     8d 78 41                lea    0x41(%rax),%edi
    400645:     40 88 3d 15 0a 20 00    mov    %dil,0x200a15(%rip)        # 601061 <data_cells+0x1>
    40064c:     0f 1f 40 00             nopl   0x0(%rax)
    400650:     40 0f b6 ff             movzbl %dil,%edi
    400654:     e8 87 fe ff ff          callq  4004e0 <putchar@plt>
    400659:     0f b6 05 01 0a 20 00    movzbl 0x200a01(%rip),%eax        # 601061 <data_cells+0x1>
    400660:     80 2d f9 09 20 00 01    subb   $0x1,0x2009f9(%rip)        # 601060 <data_cells>
    400667:     8d 78 01                lea    0x1(%rax),%edi
    40066a:     40 88 3d f0 09 20 00    mov    %dil,0x2009f0(%rip)        # 601061 <data_cells+0x1>
    400671:     75 dd                   jne    400650 <main+0x30>
    400673:     31 c0                   xor    %eax,%eax
    400675:     48 83 c4 08             add    $0x8,%rsp
    400679:     c3                      retq
    40067a:     66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

We also set up debugging information (via
:c:func:`gcc_jit_context_new_location` and
:c:macro:`GCC_JIT_BOOL_OPTION_DEBUGINFO`), so it's possible to use ``gdb``
to singlestep through the generated binary and inspect the internal
state ``idx`` and ``data_cells``:

.. code-block:: console

  (gdb) break main
  Breakpoint 1 at 0x400790
  (gdb) run
  Starting program: a.out

  Breakpoint 1, 0x0000000000400790 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x0000000000400797 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x00000000004007a0 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13     >
  (gdb) n
  6     ++++++++++++++++++++++++++
  (gdb) n
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) p idx
  $1 = 1
  (gdb) p data_cells
  $2 = "\032", '\000' <repeats 29998 times>
  (gdb) p data_cells[0]
  $3 = 26 '\032'
  (gdb) p data_cells[1]
  $4 = 0 '\000'
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13     >


Other forms of ahead-of-time-compilation
****************************************

The above demonstrates compiling a :c:type:`gcc_jit_context *` directly
to an executable.  It's also possible to compile it to an object file,
and to a dynamic library.  See the documentation of
:c:func:`gcc_jit_context_compile_to_file` for more information.