.\" $NetBSD: KERNEL_LOCK.9,v 1.2 2022/02/15 22:58:25 wiz Exp $
.\"
.\" Copyright (c) 2022 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd February 13, 2022
.Dt KERNEL_LOCK 9
.Os
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh NAME
.Nm KERNEL_LOCK
.Nd compatibility with legacy uniprocessor code
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh SYNOPSIS
.In sys/systm.h
.\"
.Ft void
.Fn KERNEL_LOCK "int nlocks" "struct lwp *l"
.Ft void
.Fn KERNEL_UNLOCK_ONE "struct lwp *l"
.Ft void
.Fn KERNEL_UNLOCK_ALL "struct lwp *l" "int *nlocksp"
.Ft void
.Fn KERNEL_UNLOCK_LAST "struct lwp *l"
.Ft bool
.Fn KERNEL_LOCKED_P
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh DESCRIPTION
The
.Nm
facility serves to gradually transition software from the kernel's
legacy uniprocessor execution model, where the kernel runs on only a
single CPU and never in parallel on multiple CPUs, to a multiprocessor
system.
.Pp
.Sy New code should not use Nm .
.Nm
is meant only for gradual transition of
.Nx
to natively MP-safe code, which uses
.Xr mutex 9
or other
.Xr locking 9
facilities to synchronize between threads and interrupt handlers.
Use of
.Nm
hurts system performance and responsiveness.
This man page exists only to document the legacy API in order to make
it easier to transition away from.
.Pp
The kernel lock, sometimes known as the
.Sq giant lock
or
.Sq big lock ,
is a recursive exclusive spin-lock that can be held by a CPU at any
interrupt priority level and is dropped while sleeping.
This means:
.Bl -tag -width "held by a CPU"
.It recursive
If a CPU already holds the kernel lock, it can be acquired again and
again, as long as it is released an equal number of times.
.It exclusive
Only one CPU at a time can hold the kernel lock.
.It spin-lock
When one CPU holds the kernel lock and another CPU wants to hold it,
the second CPU
.Sq spins ,
i.e., repeatedly executes instructions to see if the kernel lock is
available yet, until the first CPU releases it.
During this time, no other threads can run on the spinning CPU.
.Pp
This means holding the kernel lock for long periods of time, such as
during nontrivial computation, must be avoided.
Under
.Dv LOCKDEBUG
kernels, holding the kernel lock for too long can lead to
.Sq spinout
crashes.
.It held by a CPU
The kernel lock is held by a CPU, not by a process, kthread, LWP, or
interrupt handler.
It may be shared by a kthread LWP and several softint LWPs at the same
time, for example, if the softints interrupted the thread on a CPU.
.It any interrupt priority level
The kernel lock
.Em does not
block interrupts; subsystems running with the kernel lock use
.Xr spl 9
to synchronize with interrupt handlers.
.Pp
Interrupt handlers that are not marked MP-safe are always run with the
kernel lock.
If the interrupt arrives on a CPU where the kernel lock is already
held, it is simply taken again recursively on interrupt entry and
released to its original recursion depth on interrupt exit.
.It dropped while sleeping
Any time the kernel sleeps to let other threads run, for any reason
including
.Xr tsleep 9
or
.Xr condvar 9
or even adaptive
.Xr mutex 9
locks, it releases the kernel lock before going to sleep and then
reacquires it afterward.
.Pp
This means, for instance, that although data structures accessed only
under the kernel lock won't be changed before the sleep, they may be
changed by another thread during the sleep.
For example, the following code fragment may crash on an assertion
failure because
.Xr mutex_enter 9
may sleep, which releases the kernel lock and allows another thread to
acquire it and change the global variable
.Dv x :
.Bd -literal
KERNEL_LOCK(1, NULL);
x = 42;
mutex_enter(...);
...
mutex_exit(...);
KASSERT(x == 42);
KERNEL_UNLOCK_ONE(NULL);
.Ed
.Pp
This means simply introducing calls to
.Xr mutex_enter 9
and
.Xr mutex_exit 9
can break kernel-locked assumptions.
Subsystems need to be consistently converted from
.Nm
and
.Xr spl 9
to
.Xr mutex 9 ,
.Xr condvar 9 ,
etc.; mixing
.Xr mutex 9
and
.Nm
usually doesn't work.
.El
.Pp
Holding the kernel lock
.Em does not
prevent other code from running on other CPUs at the same time.
It only prevents other
.Em kernel-locked
code from running on other CPUs at the same time.
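.Pp
As a minimal sketch of the conversion described above, the
kernel-locked example could be rewritten so that
.Dv x
is protected by its own
.Xr mutex 9
instead of the kernel lock.
The names
.Dv x_lock ,
.Fn x_init ,
and
.Fn x_update
are hypothetical and serve only for illustration, and the sketch
assumes that every other access to
.Dv x
also takes
.Dv x_lock :
.Bd -literal
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mutex.h>

static kmutex_t x_lock;		/* hypothetical: protects x */
static int x;

void
x_init(void)
{

	/* Adaptive mutex, not taken from interrupt context. */
	mutex_init(&x_lock, MUTEX_DEFAULT, IPL_NONE);
}

void
x_update(void)
{

	mutex_enter(&x_lock);
	x = 42;
	/*
	 * Unlike the kernel lock, x_lock is not dropped implicitly
	 * if this thread sleeps, so x is stable until mutex_exit.
	 */
	KASSERT(x == 42);
	mutex_exit(&x_lock);
}
.Ed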
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh FUNCTIONS
.Bl -tag -width abcd
.It Fn KERNEL_LOCK nlocks l
Acquire
.Fa nlocks
recursive levels of kernel lock.
.Pp
If the kernel lock is already held by another CPU, spins until it can
be acquired by this one.
If the kernel lock is already held by this CPU, records the kernel
lock recursion depth and returns immediately.
.Pp
Most of the time
.Fa nlocks
is 1, but code that deliberately releases all levels of the kernel lock
held by the current CPU in order to sleep, and later reacquires the same
number of levels, will pass a value of
.Fa nlocks
obtained from
.Fn KERNEL_UNLOCK_ALL .
.It Fn KERNEL_UNLOCK_ONE l
Release one level of the kernel lock.
Equivalent to
.Fo KERNEL_UNLOCK
.Li 1 ,
.Fa l ,
.Dv NULL
.Fc .
.It Fn KERNEL_UNLOCK_ALL l nlocksp
Store the kernel lock recursion depth at
.Fa nlocksp
and release all recursive levels of the kernel lock.
.Pp
This is often used inside logic implementing sleep, around a call to
.Xr mi_switch 9 ,
so that the same number of recursive kernel locks can be reacquired
afterward once the thread is reawoken:
.Bd -literal
int nlocks;
KERNEL_UNLOCK_ALL(l, &nlocks);
... mi_switch(l) ...
KERNEL_LOCK(nlocks, l);
.Ed
.It Fn KERNEL_UNLOCK_LAST l
Release the kernel lock, which must be held at exactly one level.
.Pp
This is normally used at the end of a non-MP-safe thread, which was
known to have started with exactly one level of the kernel lock, and is
now about to exit.
.It Fn KERNEL_LOCKED_P
True if the kernel lock is held.
.Pp
To be used only in diagnostic assertions with
.Xr KASSERT 9 .
.El
.Pp
The legacy argument
.Fa l
must be
.Dv NULL
or
.Dv curlwp ,
which mean the same thing.
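.Pp
For illustration only, one plausible use of these functions is in
code that does not otherwise hold the kernel lock but must call into
a subsystem that still relies on it.
The functions
.Fn legacy_frob
and
.Fn frob_from_mpsafe_context
are hypothetical:
.Bd -literal
#include <sys/param.h>
#include <sys/systm.h>

void legacy_frob(void);		/* hypothetical, needs kernel lock */

void
frob_from_mpsafe_context(void)
{

	/* Take one level of the kernel lock around the legacy call. */
	KERNEL_LOCK(1, NULL);
	legacy_frob();
	KERNEL_UNLOCK_ONE(NULL);
}

void
legacy_frob(void)
{

	/* Diagnostic check only; see KERNEL_LOCKED_P above. */
	KASSERT(KERNEL_LOCKED_P());
	/* Access state protected by the kernel lock here. */
}
.Ed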
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh NOTES
Some
.Nx
kernel abstractions execute caller-specified functions with the kernel
lock held by default, for compatibility with legacy code, but can be
explicitly instructed
.Em not
to hold the kernel lock by passing an MP-safe flag:
.Bl -bullet
.It
.Xr callout 9 ,
.Dv CALLOUT_MPSAFE
.It
.Xr kfilter_register 9
and
.Xr knote 9 ,
.Dv FILTEROPS_MPSAFE
.It
.Xr kthread 9 ,
.Dv KTHREAD_MPSAFE
.It
.Xr pci_intr 9 ,
.Dv PCI_INTR_MPSAFE
.It
.Xr scsipi 9 ,
.Dv SCSIPI_ADAPT_MPSAFE
.It
.Xr softint 9 ,
.Dv SOFTINT_MPSAFE
.It
.Xr usbdi 9
pipes,
.Dv USBD_MPSAFE
.It
.Xr usbdi 9
tasks,
.Dv USB_TASKQ_MPSAFE
.It
.Xr vnode 9 ,
.Dv VV_MPSAFE
.It
.Xr workqueue 9 ,
.Dv WQ_MPSAFE
.El
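.Pp
As a sketch of how such a flag is passed, the following hypothetical
driver fragment sets up a
.Xr callout 9
and a
.Xr kthread 9
that both run without the kernel lock.
The names
.Vt struct example_softc ,
.Fn example_tick ,
.Fn example_thread ,
and
.Fn example_attach
are assumptions made for illustration; the handlers are then
responsible for their own synchronization:
.Bd -literal
#include <sys/param.h>
#include <sys/callout.h>
#include <sys/kthread.h>

struct example_softc {
	callout_t	sc_ch;
	struct lwp	*sc_lwp;
};

static void
example_tick(void *arg)
{
	/* Runs without the kernel lock; use mutex(9) as needed. */
}

static void
example_thread(void *arg)
{
	/* Starts without the kernel lock; use mutex(9) as needed. */
	kthread_exit(0);
}

static int
example_attach(struct example_softc *sc)
{

	callout_init(&sc->sc_ch, CALLOUT_MPSAFE);
	callout_setfunc(&sc->sc_ch, example_tick, sc);

	return kthread_create(PRI_NONE, KTHREAD_MPSAFE, NULL,
	    example_thread, sc, &sc->sc_lwp, "example");
}
.Ed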
.Pp
The following
.Nx
subsystems are still kernel-locked and need re-engineering to take
advantage of parallelism on multiprocessor systems:
.Bl -bullet
.It
.Xr ata 4 ,
.Xr atapi 4 ,
.Xr wd 4
.It
.Xr video 4
.It
.Xr autoconf 9
.It
most of the network stack by default, unless the option
.Dv NET_MPSAFE
is enabled
.It
.No ...
.El
.Pp
All interrupt handlers at
.Dv IPL_VM
or lower
.Pq see Xr spl 9
run with the kernel lock on most ports.
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh SEE ALSO
.Xr locking 9 ,
.Xr mutex 9 ,
.Xr spl 9