/* $NetBSD: TODO.modules,v 1.24 2021/08/09 20:49:08 andvar Exp $ */
Some notes on the limitations of our current (as of 7.99.35) module
subsystem. This list was triggered by an Email exchange between
christos and pgoyette.
1. Built-in drivers can't depend on modularized drivers (an attempt is
made to load the modularized drivers as built-ins).
The assumption is that dependencies are loaded before those
modules which depend on them. At load time, a module's
undefined global symbols are resolved; if any symbols can't
be resolved, the load fails. Similarly, if a module is
included in (built-into) the kernel, all of its symbols must
be resolvable by the linker, otherwise the link fails.
There are ways around this (such as having the parent
module's initialization command recursively call the module
load code), but they're often gross hacks.
Another alternative (which is used by ppp) is to provide a
"registration" mechanism for the "child" modules, and then when
the need for a specific child module is encountered, use
module_autoload() to load the child module. Of course, this
requires that the parent module know about all potentially
loadable children.
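A minimal userland sketch of such a registration scheme follows. It
is not the actual ppp code: the slot table, child_register(),
child_lookup(), the "example_comp_*" module names and demo_autoload()
are all hypothetical, and the loader is passed in as a function
pointer so the sketch does not depend on the kernel's
module_autoload().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*child_handler_t)(int);
typedef int (*autoload_fn_t)(const char *modname);

struct child_slot {
	const char *key;		/* feature the parent may need */
	const char *modname;		/* module expected to provide it */
	child_handler_t handler;	/* NULL until the child registers */
};

/* The parent must list every potentially loadable child up front. */
static struct child_slot child_slots[] = {
	{ "comp_a", "example_comp_a", NULL },	/* hypothetical names */
	{ "comp_b", "example_comp_b", NULL },
};
#define NSLOTS (sizeof(child_slots) / sizeof(child_slots[0]))

static struct child_slot *
child_find(const char *key)
{
	for (size_t i = 0; i < NSLOTS; i++)
		if (strcmp(child_slots[i].key, key) == 0)
			return &child_slots[i];
	return NULL;
}

/* Called from a child's initialization code. Returns 0 on success. */
int
child_register(const char *key, child_handler_t handler)
{
	struct child_slot *s = child_find(key);

	if (s == NULL || s->handler != NULL)
		return -1;
	s->handler = handler;
	return 0;
}

/* Return the handler for key, requesting an autoload on first use. */
child_handler_t
child_lookup(const char *key, autoload_fn_t autoload)
{
	struct child_slot *s;

	assert(key != NULL && autoload != NULL);
	s = child_find(key);
	if (s == NULL)
		return NULL;		/* parent does not know this child */
	if (s->handler == NULL && autoload(s->modname) != 0)
		return NULL;		/* load failed */
	return s->handler;
}

/* Demo stand-ins: a child handler and a fake loader that "loads" it. */
static int
comp_a_handler(int x)
{
	return x + 1;
}

int
demo_autoload(const char *modname)
{
	if (strcmp(modname, "example_comp_a") == 0)
		return child_register("comp_a", comp_a_handler);
	return -1;			/* no such module */
}
```

The fixed table is what makes the limitation visible: a child that
is not listed in child_slots[] can never be found, no matter what
is installed on disk.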
2. Currently, config(1) has no way to "no define" drivers.
XXX: I don't think this is true anymore. I think we can
undefine drivers now, see MODULAR in amd64, which does
no ath* and no select sppp*
3. It is not always obvious by their names which drivers/options
correspond to which modules.
4. Right now critical drivers that would need to be pre-loaded (ffs,
exec_elf64) are still built-in so that we don't need to alter the boot
blocks to boot.
This was a conscious decision by core@ some years ago. It is
not a requirement that ffs or exec_* be built-in. The only
requirement is that the root file-system's module must be
available when the module subsystem is initialized, in order
to load other modules. This can be accomplished by having the
boot loader "push" the module at boot time. (It used to do
this in all cases; currently the "push" only occurs if the
booted filesystem is not ffs.)
5. Not all parent bus drivers are capable of rescan, so some drivers
just have to be built-in.
6. Many (most?) drivers are not yet modularized.
7. There are currently no provisions for autoconfig to figure out which
modules are needed, and thus to load the required modules.
In the "normal" built-in world, autoconfigure can only ask
existing drivers if they're willing to manage (ie, attach) a
device. Removing the built-in drivers tends to limit the
availability of possible managers. There's currently no
mechanism for identifying and loading drivers based on what
devices might be found.
8. Even for existing modules, there are "surprise" dependencies with
code that has not yet been modularized.
For example, even though the bpf code has been modularized,
there is some shared code in bpf_filter.c which is needed by
both ipfilter and ppp. ipf is already modularized, but ppp
is not. Thus, even though bpf_filter is modular, it MUST be
included as a built-in module if you also have ppp in your
configuration.
Another example is the sysmon_taskq module. It is required by
other parts of the sysmon subsystem, including the
"sysmon_power" module. Unfortunately, even though the
sysmon_power code is modularized, it is referenced by the
acpi code which has not been modularized. Therefore, if your
configuration has acpi, then you must build the "sysmon_power"
module into the kernel. And therefore you also need to
have "sysmon_taskq" and "sysmon" built-in since "sysmon_power"
references them.
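The knock-on effect can be sketched as a transitive closure over a
requirements table. This is only an illustration: the table below
copies the relationships stated in the paragraph above (the real
dependency lists may differ), and struct modreq, mark_builtin() and
is_builtin() are hypothetical names, not part of config(1) or the
module subsystem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAXNEEDS 4

struct modreq {
	const char *name;
	const char *needs[MAXNEEDS];	/* NULL-terminated */
	bool builtin;
};

/* Relationships as described in the text; illustrative only. */
static struct modreq modreqs[] = {
	{ "sysmon",       { NULL }, false },
	{ "sysmon_taskq", { NULL }, false },
	{ "sysmon_power", { "sysmon", "sysmon_taskq", NULL }, false },
	{ "bpf_filter",   { NULL }, false },
};
#define NMODREQ (sizeof(modreqs) / sizeof(modreqs[0]))

static struct modreq *
modreq_find(const char *name)
{
	for (size_t i = 0; i < NMODREQ; i++)
		if (strcmp(modreqs[i].name, name) == 0)
			return &modreqs[i];
	return NULL;
}

/*
 * Mark a module as built-in, together with everything it needs.
 * Returns the number of modules newly marked.
 */
int
mark_builtin(const char *name)
{
	struct modreq *m;
	int marked;

	assert(name != NULL);
	m = modreq_find(name);
	if (m == NULL || m->builtin)
		return 0;
	m->builtin = true;
	marked = 1;
	for (size_t i = 0; i < MAXNEEDS && m->needs[i] != NULL; i++)
		marked += mark_builtin(m->needs[i]);
	return marked;
}

bool
is_builtin(const char *name)
{
	const struct modreq *m = modreq_find(name);

	return m != NULL && m->builtin;
}
```

Calling mark_builtin("sysmon_power"), as a non-modular acpi would
effectively force, marks three modules, while unrelated entries
such as bpf_filter stay loadable.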
9. As a corollary to #8 above, having dependencies on modules from code
which has not been modularized makes it extremely difficult to test
the module code adequately. Testing of module code should include
both testing-as-a-built-in module and testing-as-a-loaded-module, and
all dependencies need to be identified.
10. The current /stand/$ARCH/$VERSION/modules/ hierarchy won't scale as
we get more and more modules. There are hundreds of potential device
driver modules.
11. There currently isn't any good way to handle attachment-specific
modules. The build infrastructure (ie, sys/modules/Makefile) doesn't
readily lend itself to bus-specific modules irrespective of $ARCH,
and maintaining distrib/sets/lists/modules/* is awkward at best.
Furthermore, devices such as ld(4), which can attach to a large set
of parent devices, need to be modified. The parent devices need to
provide a common attribute (for example, ld_bus), and the ld driver
should attach to that attribute rather than to each parent. But
currently, config(1) doesn't handle this - it doesn't allow an
attribute to be used as the device tree's pseudo-root. The current
directory structure where driver foo is split between ic/foo.c
and bus1/foo_bus1.c ... busn/foo_busn.c is annoying. It would be
better to switch to the FreeBSD model which puts all the driver
files in one directory.
12. Item #11 gets even murkier when a particular parent can provide more
than one attribute.
13. It seems that we might want some additional sets-lists "attributes"
to control contents of distributions. As an example, many of our
architectures have PCI bus capabilities, but not all. It is rather
painful to need to maintain individual architectures' modules/md_*
sets lists, especially when we already have to conditionalize the
build of the modules based on architecture. If we had a single
"attribute" for PCI-bus-capable, the same attribute could be used to
select which modules to build and which modules from modules/mi to
include in the release. (This is not limited to PCI; we recently
encountered similar issues with the spkr aka spkr_synth module.)
14. As has been pointed out more than once, the current method of storing
modules in a version-specific subdirectory of /stand is sub-optimal
and leads to much difficulty and/or confusion. A better mechanism of
associating a kernel and its modules needs to be developed. Some
have suggested having a top-level directory (say, /netbsd) with a
kernel and its modules at /netbsd/kernel and /netbsd/modules/...
Whatever new mechanism we arrive at will probably require changes to
installation procedures and bootstrap code, and will need to handle
both the new and old mechanisms for compatibility.
One additional option mentioned is to be able to specify, at boot
loader time, an alternate value for the os-release portion of the
default module path, i.e. /stand/$MACHINE/$ALT-RELEASE/modules/
The following statement regarding this issue was previously issued
by the "core" group:
Date: Fri, 27 Jul 2012 08:02:56 +0200
From: <redacted>
To: <redacted>
Subject: Core statement on directory naming for kernel modules
The core group would also like to see the following changes in
the near future:
Implementation of the scheme described by Luke Mewburn in
<http://mail-index.NetBSD.org/current-users/2009/05/10/msg009372.html>
to allow a kernel and its modules to be kept together.
Changes to config(1) to extend the existing notion of whether or not
an option is built-in to the kernel, to three states: built-in, not
built-in but loadable as a module, entirely excluded and not even
loadable as a module.
15. The existing config(5) framework provides an excellent mechanism
for managing the content of kernels. Unfortunately, this mechanism
does not apply for modules, and instead we need to manually manage
a list of files to include in the module, the set of compiler
definitions with which to build those files, and also the set of
other modules on which a module depends. We really need a common
mechanism to define and build modules, whether they are included as
"built-in" modules or as separately-loadable modules.
(From John Nemeth) Some sort of mechanism for a (driver) module
to declare the list of vendor/product/other tuples that it can
handle would be nice. Perhaps this would go in the module's .plist
file? (See #17 below.) Then drivers that scan for children might
be able to search the modules directory for an "appropriate" module
for each child, and auto-load.
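As a rough sketch of what such a declaration could feed, the lookup
below maps a vendor/product pair to a module name. Everything here
is assumed for illustration: the IDs, the "example_*" module names,
the PRODUCT_ANY wildcard and module_for_device() are hypothetical,
and the table is hard-coded rather than read from any .plist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PRODUCT_ANY 0xffff	/* hypothetical wildcard value */

struct devmatch {
	uint16_t vendor;
	uint16_t product;
	const char *modname;
};

/* Exact entries come before wildcard entries; first match wins. */
static const struct devmatch match_table[] = {
	{ 0x1234, 0x0001, "example_nic" },
	{ 0x1234, PRODUCT_ANY, "example_generic" },
	{ 0xabcd, 0x0042, "example_disk" },
};
#define NMATCH (sizeof(match_table) / sizeof(match_table[0]))

/* Return the name of a module claiming the device, or NULL. */
const char *
module_for_device(uint16_t vendor, uint16_t product)
{
	for (size_t i = 0; i < NMATCH; i++) {
		const struct devmatch *dm = &match_table[i];

		assert(dm->modname != NULL);	/* table sanity check */
		if (dm->vendor != vendor)
			continue;
		if (dm->product == product || dm->product == PRODUCT_ANY)
			return dm->modname;
	}
	return NULL;
}
```

A bus driver that finds no attached driver for a child could pass
the returned name to the autoload code; a NULL result means no
installed module has declared interest in the device.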
16. PR kern/52821 exposes another limitation of config(1) WRT modules.
Here, an explicit device attachment is required, because we cannot
rely on all kernel configs to contain the attribute at which the
modular driver wants to attach. Unfortunately, the explicit
attachment causes conflicts with built-in drivers. (See the PR for
more details.)
17. (From John Nemeth) It would be potentially useful if a "push" from
the bootloader could also load-and-push a module's .plist (if it
exists).
18. (From John Nemeth) Some sort of schema for a module to declare the
options (or other things?) that the module understands. This could
result in a module-options editor to manipulate the .plist.
19. (From John Nemeth) Currently, the order of module initialization is
based on module classes and declared dependencies. It might be
useful to have additional classes (or sub-classes) with additional
invocations of module_class_init(), and it might be useful to have a
non-dependency mechanism to provide "IF module-A and module-B are
BOTH present, module-A needs to be initialized before module-B".
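One way to picture the non-dependency ordering is a rule list that
only applies when both modules named in a rule are present. The
sketch below is an assumption-laden illustration, not a proposal
for the real module_class_init() path: the enum, struct order_rule,
init_order() and demo_rank() are hypothetical, and ordinary
declared dependencies and module classes are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* MOD_B has the lowest index so that the rule visibly reorders it. */
enum { MOD_B, MOD_A, MOD_C, NMOD };

struct order_rule {
	int first;	/* must be initialized before ... */
	int second;	/* ... this one, if both are present */
};

/*
 * Fill order[] with the indices of the present modules such that every
 * rule whose two modules are both present is honoured. Returns the
 * number of modules ordered, or -1 if the applicable rules form a cycle.
 */
int
init_order(const bool present[NMOD], const struct order_rule *rules,
    size_t nrules, int order[NMOD])
{
	bool done[NMOD] = { false };
	int count = 0, total = 0;

	for (int m = 0; m < NMOD; m++)
		if (present[m])
			total++;

	while (count < total) {
		bool progress = false;

		for (int m = 0; m < NMOD; m++) {
			bool blocked = false;

			if (!present[m] || done[m])
				continue;
			for (size_t r = 0; r < nrules; r++) {
				int f = rules[r].first;

				if (rules[r].second == m && present[f] &&
				    !done[f]) {
					blocked = true;
					break;
				}
			}
			if (!blocked) {
				assert(count < NMOD);
				done[m] = true;
				order[count++] = m;
				progress = true;
			}
		}
		if (!progress)
			return -1;	/* cycle among applicable rules */
	}
	return count;
}

/* Demo: position of mod in the init order, or -1 if it is absent. */
int
demo_rank(int mod, bool a_present, bool b_present)
{
	static const struct order_rule rules[] = { { MOD_A, MOD_B } };
	bool present[NMOD] = { false };
	int order[NMOD];
	int n;

	present[MOD_A] = a_present;
	present[MOD_B] = b_present;
	present[MOD_C] = true;
	n = init_order(present, rules, 1, order);
	for (int i = 0; i < n; i++)
		if (order[i] == mod)
			return i;
	return -1;
}
```

Unlike a declared dependency, the rule never causes module-A to be
loaded: when module-A is absent, module-B is simply initialized in
its usual place.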
20. (Long-ago memory rises to the surface) Note that currently there is
nothing that requires a module's name to correspond in any way with
the name of the file from which the module is loaded. Thus, it is
possible to attempt to access device /dev/x, discover that there is
no such device so we autoload /stand/.../x/x.kmod and initialize
the module loaded, even if the loaded module is for some other
device entirely!
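A check along the following lines could catch the mismatch after an
autoload. It is only a sketch: kmod_path_matches_name() is a
hypothetical helper, it assumes the ".../x/x.kmod" layout mentioned
above, and it takes the module's self-reported name as a plain
string rather than reading it from the loaded module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if path ends in "<name>/<name>.kmod", i.e. the file the
 * module came from agrees with the name the module reports for itself.
 */
bool
kmod_path_matches_name(const char *path, const char *name)
{
	size_t plen, nlen, need;
	const char *tail;

	assert(path != NULL && name != NULL);
	plen = strlen(path);
	nlen = strlen(name);
	need = 2 * nlen + 1 + 5;	/* name + '/' + name + ".kmod" */

	if (nlen == 0 || plen < need)
		return false;
	tail = path + plen - need;
	/* The directory component must be exactly <name>, not a suffix. */
	if (tail != path && tail[-1] != '/')
		return false;
	return strncmp(tail, name, nlen) == 0 &&
	    tail[nlen] == '/' &&
	    strncmp(tail + nlen + 1, name, nlen) == 0 &&
	    strcmp(tail + 2 * nlen + 1, ".kmod") == 0;
}
```

An autoload path could refuse to initialize (or could unload) a
module for which this returns false; whether explicitly loaded
modules should be held to the same rule is a separate question.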
21. We currently do not support "weak" symbols in the in-kernel linker.
It would take some serious thought to get such support right. For
example, consider module A with a weak reference to symbol S which
is defined in module B. If module B is loaded first, and then
module A, the symbol gets resolved. But if module A is loaded first,
the symbol won't be resolved. If we subsequently load module B, we
would have to "go back" and re-run the linker for module A.
Additional difficulties arise when the module which defines the
weak symbol gets unloaded. Then, you would need to re-run the
linker and _unresolve_ the weak symbol which is no longer defined.
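The bookkeeping involved can be modelled in a few lines of userland
C. This is a toy under stated assumptions, not the in-kernel
linker: each module has at most one defined symbol and one weak
reference, struct wmod and the wmod_*() functions are hypothetical,
and nothing here deals with code that is already executing through
a resolved pointer when the definer goes away.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct wmod {
	const char *name;
	const char *defines;		/* symbol defined, or NULL */
	const char *weak_ref;		/* weak reference, or NULL */
	int loaded;
	const struct wmod *resolved_to;	/* provider of weak_ref, or NULL */
};

/* Module A weakly references S; module B defines S. */
static struct wmod wmods[] = {
	{ "A", NULL, "S", 0, NULL },
	{ "B", "S", NULL, 0, NULL },
};
#define NWMOD (sizeof(wmods) / sizeof(wmods[0]))

static struct wmod *
wmod_find(const char *name)
{
	for (size_t i = 0; i < NWMOD; i++)
		if (strcmp(wmods[i].name, name) == 0)
			return &wmods[i];
	return NULL;
}

/* Re-run weak resolution for every loaded module ("go back"). */
static void
wmod_relink(void)
{
	for (size_t i = 0; i < NWMOD; i++) {
		struct wmod *m = &wmods[i];

		m->resolved_to = NULL;		/* also un-resolves */
		if (!m->loaded || m->weak_ref == NULL)
			continue;
		for (size_t j = 0; j < NWMOD; j++) {
			const struct wmod *p = &wmods[j];

			if (p->loaded && p->defines != NULL &&
			    strcmp(p->defines, m->weak_ref) == 0) {
				m->resolved_to = p;
				break;
			}
		}
	}
}

/* Load (loaded != 0) or unload a module, then relink everything. */
void
wmod_set_loaded(const char *name, int loaded)
{
	struct wmod *m = wmod_find(name);

	assert(m != NULL);
	m->loaded = loaded;
	wmod_relink();
}

int
wmod_is_resolved(const char *name)
{
	const struct wmod *m = wmod_find(name);

	return m != NULL && m->resolved_to != NULL;
}
```

Relinking every loaded module after each load and unload makes the
result independent of load order, which is exactly the cost the
note above worries about.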
22. A fairly large number of modules still require a maximum warning
level of WARNS=3 due to signed-vs-unsigned integer comparisons. We
really ought to clean these up. (I haven't looked at them in any
detail, but I have to wonder how code that compiles cleanly in a
normal kernel has these issues when compiled in a module, when both
are done with WARNS=5).
23. The current process of "load all the emulation/exec modules in case
one of them might handle the image currently being exec'd" isn't
really cool. (See sys/kern/kern_exec.c?) It ends up auto-loading
a whole bunch of modules, involving file-system access, just to have
most of the modules getting unloaded a few seconds later. We don't
have any way to identify which module is needed for which image (ie,
we can't determine that an image needs compat_linux vs some other
module).
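For the image format alone, the leading bytes of the file already
narrow things down, as the sketch below shows. It is a partial
illustration only: exec_module_for_header() is hypothetical, the
module names other than exec_elf64 are assumed, and it deliberately
does not attempt the hard part described above, namely telling
which emulation (such as compat_linux) an ELF image needs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Guess which exec module handles an image from its first bytes.
 * Returns a module name, or NULL if the format is not recognized.
 */
const char *
exec_module_for_header(const unsigned char *hdr, size_t len)
{
	assert(hdr != NULL);

	/* ELF: magic is 0x7f 'E' 'L' 'F'; byte 4 is the file class. */
	if (len >= 5 && memcmp(hdr, "\177ELF", 4) == 0) {
		if (hdr[4] == 1)		/* ELFCLASS32 */
			return "exec_elf32";	/* assumed module name */
		if (hdr[4] == 2)		/* ELFCLASS64 */
			return "exec_elf64";
		return NULL;
	}
	/* Interpreter scripts start with "#!". */
	if (len >= 2 && hdr[0] == '#' && hdr[1] == '!')
		return "exec_script";		/* assumed module name */
	return NULL;
}
```

Even this much would avoid loading every exec module for a plain
script, but the emulation question would still need some other
source of information about the image.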
24. Details are no longer remembered, but there are some issues with
building xen-variant modules (on amd64, and likely i386). In some
cases, wrong headers are included (because a XEN-related #define
is missing), but even if you add the definition some headers get
included in the wrong order. One particular fallout from this is
the inability to have a compat version of x86_64 cpu-microcode
module. PR port-xen/53130
This was likely fixed by Chuck Silvers's change of 2020-07-04, which
removed the differences between the xen and non-xen module ABIs.
As of 2021-05-28 the cpu-microcode functionality has once again
been enabled for i386 and amd64 compat_60 modules.