Part 1: Kea Migration Assistant support
=======================================
Files:
------
- data.h (tailq list and element type declarations)
- data.c (element type code)
- keama.h (DHCP declarations)
- keama.c (main() code)
- json.c (JSON parser)
- option.c (option tables and code)
- keama.8 (man page)
The code heavily uses tailq lists, i.e. doubled linked lists with
a pointer to the last (tail) element.
The element structure mimics the Kea Element class with a few differences:
- no smart pointers
- extra fields to handle declaration kind, skip and comments
- maps are implemented as lists with an extra key field so the order
of insertion is kept and duplicates are possible
- strings are length + content (vs C strings)
There is no attempt to avoid memory leaks.
The skip flag is printed as '//' at the beginning of lines. It is set
when something cannot be converted and the issue counter (returned
by the keama command) incremented.
Part 2: ISC DHCP lexer organization
===================================
Files:
-----
- dhctoken.h (from includes, enum dhcp_token definition)
- conflex.c (from common, lexical analyzer code)
Tokens (dhcp_token enum): characters are set to their ASCII value,
others are >= 256 without real organization (e.g. END_OF_FILE is 607).
The state is in a parse structure named "cfile". There is one per file
and a few routine save it in order to do a backtrack on a larger
set than the usual lookahead.
The largest function is intern() which recognizes keywords with
a switch on the first character and a tree of if strcasecmp's.
Standard routines:
-----------------
enum dhcp_token
next_token(const char **rval, unsigned *rlen, struct parse *cfile);
and
enum dhcp_token
peek_token(const char **rval, unsigned *rlen, struct parse *cfile);
rval: if not null the content of the token is put in it
rlen: if not null the length of the token is put in it
cfile: lexer context
return: the integer value of the token
Changes:
-------
Added LBRACKET '[' and RBRACKET ']' tokens for JSON parser
(switch on dhcp_token type).
Added comments to collect ISC DHCP # comments, element stack to follow
declaration hierarchy, and issue counter to struct parse.
Moved the parse_warn (renamed into parse_error and made fatal) routine
from conflex.c to keama.c
Part 3: ISC DHCP parser organization
====================================
Files:
-----
- confparse.c (from server)
for the server in parse_statement())
- parse.c (from common)
4 classes: parameters, declarations, executable statements and expressions.
the original code parses config and lease files, I kept only the first
at the exception of parse_binding_value().
entry point
|
V
conf_file_parse
|
V
conf_file_subparse <- read_conf_file (for include)
until END_OF_FILE call
|
V
parse_statement
parse parameters and declarations
switch on token and call parse_xxx_declaration routines
on default or DHCPv6 token in DHCPv4 mode call parse_executable_statement
and put the result under the "statement" key
|
V
parse_executable_statement
According to comments the grammar is:
conf-file :== parameters declarations END_OF_FILE
parameters :== <nil> | parameter | parameters parameter
declarations :== <nil> | declaration | declarations declaration
statement :== parameter | declaration
parameter :== DEFAULT_LEASE_TIME lease_time
| MAX_LEASE_TIME lease_time
| DYNAMIC_BOOTP_LEASE_CUTOFF date
| DYNAMIC_BOOTP_LEASE_LENGTH lease_time
| BOOT_UNKNOWN_CLIENTS boolean
| ONE_LEASE_PER_CLIENT boolean
| GET_LEASE_HOSTNAMES boolean
| USE_HOST_DECL_NAME boolean
| NEXT_SERVER ip-addr-or-hostname SEMI
| option_parameter
| SERVER-IDENTIFIER ip-addr-or-hostname SEMI
| FILENAME string-parameter
| SERVER_NAME string-parameter
| hardware-parameter
| fixed-address-parameter
| ALLOW allow-deny-keyword
| DENY allow-deny-keyword
| USE_LEASE_ADDR_FOR_DEFAULT_ROUTE boolean
| AUTHORITATIVE
| NOT AUTHORITATIVE
declaration :== host-declaration
| group-declaration
| shared-network-declaration
| subnet-declaration
| VENDOR_CLASS class-declaration
| USER_CLASS class-declaration
| RANGE address-range-declaration
Typically declarations use { } and are associated with a group
(changed to a type) in ROOT_GROUP (global), HOST_DECL, SHARED_NET_DECL,
SUBNET_DECL, CLASS_DECL, GROUP_DECL and POOL_DECL.
ROOT: parent = TOPLEVEL, children = everythig but not POOL
HOST: parent = ROOT, GROUP, warn on SHARED or SUBNET, children = none
SHARED_NET: parent = ROOT, GROUP, children = HOST (warn), SUBNET, POOL4
SUBNET: parent = ROOT, GROUP, SHARED, children = HOST (warn), POOL
CLASS: parent = ROOT, GROUP, children = none
GROUP: parent = ROOT, SHARED, children = anything but not POOL
POOL: parent = SHARED4, SUBNET, warn on others, children = none
isc_boolean_t
parse_statement(struct parse *cfile, int type, isc_boolean_t declaration);
cfile: parser context
type: declaration type
declaration and return: declaration or parameter
On the common side:
executable-statements :== executable-statement executable-statements |
executable-statement
executable-statement :==
IF if-statement |
ADD class-name SEMI |
BREAK SEMI |
OPTION option-parameter SEMI |
SUPERSEDE option-parameter SEMI |
PREPEND option-parameter SEMI |
APPEND option-parameter SEMI
isc_boolean_t
parse_executable_statement(struct element *result,
struct parse *cfile, isc_boolean_t *lose,
enum expression_context case_context,
isc_boolean_t direct);
result: map element where to put the statement
cfile: parser context
lose: set to ISC_TRUE on failure
case_context: expression context
direct: called directly by parse_statement so can execute config statements
return: success
parse_executable_statement
switch on keywords (far more than in the comments)
on default with an identifier try a config option, on number or name
call parse_expression for a function call
|
V
parse_expression
expressions are divided into boolean, data (string) and numeric expressions
boolean_expression :== CHECK STRING |
NOT boolean-expression |
data-expression EQUAL data-expression |
data-expression BANG EQUAL data-expression |
data-expression REGEX_MATCH data-expression |
boolean-expression AND boolean-expression |
boolean-expression OR boolean-expression
EXISTS OPTION-NAME
data_expression :== SUBSTRING LPAREN data-expression COMMA
numeric-expression COMMA
numeric-expression RPAREN |
CONCAT LPAREN data-expression COMMA
data-expression RPAREN
SUFFIX LPAREN data_expression COMMA
numeric-expression RPAREN |
LCASE LPAREN data_expression RPAREN |
UCASE LPAREN data_expression RPAREN |
OPTION option_name |
HARDWARE |
PACKET LPAREN numeric-expression COMMA
numeric-expression RPAREN |
V6RELAY LPAREN numeric-expression COMMA
data-expression RPAREN |
STRING |
colon_separated_hex_list
numeric-expression :== EXTRACT_INT LPAREN data-expression
COMMA number RPAREN |
NUMBER
parse_boolean_expression, parse_data_expression and parse_numeric_expression
calls parse_expression and check its result
parse_expression itself is divided into parse_non_binary and internal
handling of binary operators
isc_boolean_t
parse_non_binary(struct element *expr, struct parse *cfile,
isc_boolean_t *lose, enum expression_context context)
isc_boolean_t
parse_expression(struct element *expr, struct parse *cfile,
isc_boolean_t *lose, enum expression_context context,
struct element *lhs, enum expr_op binop)
expr: map element where to put the result
cfile: parser context
lose: set to ISC_TRUE on failure
context: expression context
lhs: NULL or left hand side
binop: expr_none or binary operation
return: success
parse_non_binary
switch on unary and nullary operator keywords
on default try a variable reference or a function call
parse_expression
call parse_non_binary to get the right hand side
switch on binary operator keywords to get the next operation
with one side if expr_none return else get the second hand
handle operator precedence, can call itself
return a map entry with the operator name as the key, and
left and right expression branches
Part 4: Expression processing
=============================
Files:
------
- print.c (new)
- eval.c (new)
- reduce.c (new)
Print:
------
const char *
print_expression(struct element *expr, isc_boolean_t *lose);
const char *
print_boolean_expression(struct element *expr, isc_boolean_t *lose);
const char *
print_data_expression(struct element *expr, isc_boolean_t *lose);
const char *
print_numeric_expression(struct element *expr, isc_boolean_t *lose);
expr: expression to print
lose: failure (??? in output) flag
return: the text representing the expression
Eval:
-----
struct element *
eval_expression(struct element *expr, isc_boolean_t *modifiedp);
struct element *
eval_boolean_expression(struct element *expr, isc_boolean_t *modifiedp);
struct element *
eval_data_expression(struct element *expr, isc_boolean_t *modifiedp);
struct element *
eval_numeric_expression(struct element *expr, isc_boolean_t *modifiedp);
expr: expression to evaluate
modifiedp: a different element was returned (still false for updates
inside a map)
return: the evaluated element (can have been updated for a map or a list,
or can be a fully different element)
Evaluation is at parsing time so it is mainly a constant propagation.
(no beta reduction for instance)
Reduce:
-------
struct element *
reduce_boolean_expression(struct element *expr);
struct element *
reduce_data_expression(struct element *expr);
struct element *
reduce_numeric_expression(struct element *expr);
expr: expression to reduce
return: NULL or the reduced expression as a Kea eval string
reducing works for a limited (but interesting) set of expressions which
can be converted to kea evaluatebool and for literals.
Part 5: Specific issues
=======================
Reservations:
-------------
ISC DHCP host declarations are global, Kea reservations were per subnet
only until 1.5.
It is possible to use the fixed address but:
- it is possible to finish with orphan reservations, i.e.
reservations with an address which match no subnets
- a reservation can have no fixed address. In this case the MA puts
the reservation in the last declared subnet.
- a reservation can have more than one fixed address and these
addresses can belong to different subnets. Current code pushes
IPv4 extra addresses in a commented extra-ip-addresses but
it is legal feature for IPv6.
- it is not easy to use prefix6
The use of groups in host declarations is unclear.
ISC DHCP UID is mapped to client-id, host-identifier to flex-id
Host reservation identifiers are generated on first use.
Groups:
-------
TODO: search missing parameters from the Kea syntax.
(will be done in the third pass)
Shared-Networks:
----------------
Waiting for the feature to be supported by Kea.
Currently at the end of a shared network declaration:
- if there is no subnets it is a fatal error
- if there is one subnet the shared-network is squeezed
- if there are more than one subnet the shared-network is commented
TODO (useful only with Kea support for shared networks): combine permit /
deny classes (e.g. create negation) and pop filters to subnets when
there is one pool.
Vendor-Classes and User-Classes:
--------------------------------
ISC DHCP code is inconsistent: in particular before setting the
super-class "tname" to "implicit-vendor-class" / "implicit-user-class"
it allocates a buffer for data but does not copy the lexical value
"val" into it... So I removed support.
Classes:
--------
Only pure client-classes are supported by kea.
Dynamic/deleted stuff is not supported but does it make sense?
To spawn classes is not supported.
Match class selector is converted to Kea eval test when the corresponding
expression can be reduced. Fortunately it seems to be the common case!
Lease limit is not supported.
Subclasses:
-----------
Understood how it works:
- (super) class defined with a MATCH <data-expression> (vs.
MATCH IF <boolean-expression>)
- subclasses defined by <superclass-name> <data-literal> which
are equivalent to
MATCH IF <superclass-data-expression> EQUAL <data-literal>
So subclasses are convertible when the data expression can be reduced.
Cf https://kb.isc.org/article/AA-01092/202/OMAPI-support-for-classes-and-subclasses.html
which BTW suggests the management API could manage classes...
Hardware Addresses:
-------------------
Kea supports only Ethernet.
Pools:
------
All permissions are not supported by Kea at the exception of class members
but in a very different way so not convertible.
Mixed DHCPv6 address and prefix pools are not supported, perhaps in this
case the pool should be duplicated into pool and pd-pool instances?
The bootp stuff was ifdef's as bootp is obsolete.
Temporary (aka IA_TA) is commented ny the MA.
ISC DHCP supports interval ranges for prefix6. Kea has a different
and IMHO more powerful model.
Pool6 permissions are not supported.
Failover:
---------
Display a warning on the first use.
Interfaces:
-----------
Referenced interface names are pushed to an interfaces-config but it is
very (too!) easy to finish with a Kea config without any interface.
Hostnames:
----------
ISC DHCP does dynamic resolution in parse_ip_addr_or_hostname.
Static (at conversion time) resolution to one address is done by
the MA for fixed-address. Resolution is considered as painful
there are better (and safer) ways to do this. The -r (resolve)
command line parameter controls the at-conversion-time resolution.
Note only the first address is returned.
TODO: check the multiple address comment is correctly taken
(need a known host resolving in a stable set of addresses)
Options:
--------
Some options are known only in ISC DHCP (almost fixed), a few only by Kea.
Formats are supposed to be the same, the only known exception
(DHCPv4 domain-search) was fixed by #5087.
For option spaces DHCPv4 vendor-encapsulated-options (code 43, in general
associated to vendor-class-identifier code 60) uses a dedicated feature
which had no equivalent in Kea (fixed).
Option definitions are convertible with a few exception:
- no support in Kea for an array of records (mainly by the lack
of a corresponding syntax). BTW there is no known use too.
- no support in Kea for an array at the end of a record (fixed)
All unsupported option declarations are set to full binary (X).
- X format means ASCII or hexa:
* standard options are in general mapped to binary
* new options are mapped to string with format x (vs x)
* when a string got hexadecimal data a warning in added in comments
suggesting to switch to plain binary.
- ISC DHCP use quotes for a domain-list but not for a domain-name,
this is no very coherent and makes domain-list different than
domain-name array.
Each time an option data has a format which is not convertible than
a CSV false binary data is produced.
We have no example in ISC DHCP, Kea or standard but it is possible
than an option defined as a fixed sized record followed by
(encapsulated) suboptions bugs (it already bugs toElement).
For operations on options ISC DHCP has supersede, send, append,
prepend, default (set if not yet present), Kea puts them in code order
with a few built-in exceptions.
To finish there is the way to enforce Kea to add an option in a response
is pretty different and can't be automatically translated (cf Kea #250).
Duplicates:
-----------
Many things in ISC DHCP can be duplicated:
- options can be redefined
- same host identifier used twice
- same fixed address used in tow different hosts
etc.
Kea is far more strict and IMHO it is a good thing. Now the MA does
no particular check and multiple definitions work only for classes
(because it is the way the ISC DHCP parse works).
If we have Docsis space options, they are standard in Kea so they
will conflict.
Dynamic DNS:
------------
Details are very different so the MA maps only basic parameters
at the global scope.
Expressions:
------------
ISC DHCP expressions are typed: boolean, numeric, and data aka string.
The default for a literal is to be a string so literal numbers are
interpreted in hexadecimal (for a strange consequence look at
https://kb.isc.org/article/AA-00334/56/Do-the-list-of-parameters-in-the-dhcp-parameter-request-list-need-to-be-in-hex.html ).
String literals are converted to string elements, hexadecimal literals
are converted to const-data maps.
TODO reduce more hexa aka const-data
As booleans are not data there is no way to fix this:
/tmp/bool line 9: Expecting a data expression.
option ip-forwarding = foo = foo;
^
Cf Kea #247
The tautology 'foo = foo' is not a data expression so is rejected by
both the MA and dhcpd (BTW the role of the MA is not to fix ISC DHCP
shortcomings so it does what it is expected to do here).
Note this does not work too:
option ip-forwarding = true;
because "true" is not a keyword and it is converted into a variable
reference... And I expect ISC DHCP makes this true a false at runtime
because the variable "true" is not defined by default.
Reduced expressions are pretty printed to allow an extra check.
Hardware for DHCPv4 is expansed into a concatenation of hw-type and
hw-address, this allows to simplify expression where only one is used.
Variables:
----------
ISC DHCP has a notion of variables in a scope where the scope can be
a lexical scope in the config or a scope in a function body
(ISC DHCP has even an unused "let" statement).
There is a variant of bindings for lease files using types and able
to recognize booleans and numbers. Unfortunately this is very specific...
TODO:
- global host reservations
- class like if statement
- add more tests for classes in pools and class generation