.\" $NetBSD: membar_ops.3,v 1.10 2022/04/09 23:32:52 riastradh Exp $
.\"
.\" Copyright (c) 2007, 2008 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Jason R. Thorpe.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd March 30, 2022
.Dt MEMBAR_OPS 3
.Os
.Sh NAME
.Nm membar_ops ,
.Nm membar_acquire ,
.Nm membar_release ,
.Nm membar_producer ,
.Nm membar_consumer ,
.Nm membar_datadep_consumer ,
.Nm membar_sync
.Nd memory ordering barriers
.\" .Sh LIBRARY
.\" .Lb libc
.Sh SYNOPSIS
.In sys/atomic.h
.\"
.Ft void
.Fn membar_acquire "void"
.Ft void
.Fn membar_release "void"
.Ft void
.Fn membar_producer "void"
.Ft void
.Fn membar_consumer "void"
.Ft void
.Fn membar_datadep_consumer "void"
.Ft void
.Fn membar_sync "void"
.Sh DESCRIPTION
The
.Nm
family of functions prevent reordering of memory operations, as needed
for synchronization in multiprocessor execution environments that have
relaxed load and store order.
.Pp
In general, memory barriers must come in pairs \(em a barrier on one
CPU, such as
.Fn membar_release ,
must pair with a barrier on another CPU, such as
.Fn membar_acquire ,
in order to synchronize anything between the two CPUs.
Code using
.Nm
should generally be annotated with comments identifying how they are
paired.
.Pp
.Nm
affect only operations on regular memory, not on device
memory; see
.Xr bus_space 9
and
.Xr bus_dma 9
for machine-independent interfaces to handling device memory and DMA
operations for device drivers.
.Pp
Unlike C11,
.Em all
memory operations \(em that is, all loads and stores on regular
memory \(em are affected by
.Nm ,
not just C11 atomic operations on
.Vt _Atomic\^ Ns -qualified
objects.
.Bl -tag -width abcd
.It Fn membar_acquire
Any load preceding
.Fn membar_acquire
will happen before all memory operations following it.
.Pp
A load followed by a
.Fn membar_acquire
implies a
.Em load-acquire
operation in the language of C11.
.Fn membar_acquire
should only be used after atomic read/modify/write, such as
.Xr atomic_cas_uint 3 .
For regular loads, instead of
.Li "x = *p; membar_acquire()" ,
you should use
.Li "x = atomic_load_acquire(p)" .
.Pp
.Fn membar_acquire
is typically used in code that implements locking primitives to ensure
that a lock protects its data, and is typically paired with
.Fn membar_release ;
see below for an example.
.It Fn membar_release
All memory operations preceding
.Fn membar_release
will happen before any store that follows it.
.Pp
A
.Fn membar_release
followed by a store implies a
.Em store-release
operation in the language of C11.
.Fn membar_release
should only be used before atomic read/modify/write, such as
.Xr atomic_inc_uint 3 .
For regular stores, instead of
.Li "membar_release(); *p = x" ,
you should use
.Li "atomic_store_release(p, x)" .
.Pp
.Fn membar_release
is typically paired with
.Fn membar_acquire ,
and is typically used in code that implements locking or reference
counting primitives.
Releasing a lock or reference count should use
.Fn membar_release ,
and acquiring a lock or handling an object after draining references
should use
.Fn membar_acquire ,
so that whatever happened before releasing will also have happened
before acquiring.
For example:
.Bd -literal -offset abcdefgh
/* thread A -- release a reference */
obj->state.mumblefrotz = 42;
KASSERT(valid(&obj->state));
membar_release();
atomic_dec_uint(&obj->refcnt);
/*
* thread B -- busy-wait until last reference is released,
* then lock it by setting refcnt to UINT_MAX
*/
while (atomic_cas_uint(&obj->refcnt, 0, -1) != 0)
continue;
membar_acquire();
KASSERT(valid(&obj->state));
obj->state.mumblefrotz--;
.Ed
.Pp
In this example,
.Em if
the load in
.Fn atomic_cas_uint
in thread B witnesses the store in
.Fn atomic_dec_uint
in thread A setting the reference count to zero,
.Em then
everything in thread A before the
.Fn membar_release
is guaranteed to happen before everything in thread B after the
.Fn membar_acquire ,
as if the machine had sequentially executed:
.Bd -literal -offset abcdefgh
obj->state.mumblefrotz = 42; /* from thread A */
KASSERT(valid(&obj->state));
\&...
KASSERT(valid(&obj->state)); /* from thread B */
obj->state.mumblefrotz--;
.Ed
.Pp
.Fn membar_release
followed by a store, serving as a
.Em store-release
operation, may also be paired with a subsequent load followed by
.Fn membar_acquire ,
serving as the corresponding
.Em load-acquire
operation.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_acquire 9
instead in that situation, unless the store is an atomic
read/modify/write which requires a separate
.Fn membar_release .
.It Fn membar_producer
All stores preceding
.Fn membar_producer
will happen before any stores following it.
.Pp
.Fn membar_producer
has no analogue in C11.
.Pp
.Fn membar_producer
is typically used in code that produces data for read-only consumers
which use
.Fn membar_consumer ,
such as
.Sq seqlocked
snapshots of statistics; see below for an example.
.It Fn membar_consumer
All loads preceding
.Fn membar_consumer
will complete before any loads after it.
.Pp
.Fn membar_consumer
has no analogue in C11.
.Pp
.Fn membar_consumer
is typically used in code that reads data from producers which use
.Fn membar_producer ,
such as
.Sq seqlocked
snapshots of statistics.
For example:
.Bd -literal
struct {
/* version number and in-progress bit */
unsigned seq;
/* read-only statistics, too large for atomic load */
unsigned foo;
int bar;
uint64_t baz;
} stats;
/* producer (must be serialized, e.g. with mutex(9)) */
stats->seq |= 1; /* mark update in progress */
membar_producer();
stats->foo = count_foo();
stats->bar = measure_bar();
stats->baz = enumerate_baz();
membar_producer();
stats->seq++; /* bump version number */
/* consumer (in parallel w/ producer, other consumers) */
restart:
while ((seq = stats->seq) & 1) /* wait for update */
SPINLOCK_BACKOFF_HOOK;
membar_consumer();
foo = stats->foo; /* read out a candidate snapshot */
bar = stats->bar;
baz = stats->baz;
membar_consumer();
if (seq != stats->seq) /* try again if version changed */
goto restart;
.Ed
.It Fn membar_datadep_consumer
Same as
.Fn membar_consumer ,
but limited to loads of addresses dependent on prior loads, or
.Sq data-dependent
loads:
.Bd -literal -offset indent
int **pp, *p, v;
p = *pp;
membar_datadep_consumer();
v = *p;
consume(v);
.Ed
.Pp
.Fn membar_datadep_consumer
is typically paired with
.Fn membar_release
by code that initializes an object before publishing it.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_consume 9
instead, to avoid obscure edge cases in case the consumer is not
read-only.
.Pp
.Fn membar_datadep_consumer
does not guarantee ordering of loads in branches, or
.Sq control-dependent
loads \(em you must use
.Fn membar_consumer
instead:
.Bd -literal -offset indent
int *ok, *p, v;
if (*ok) {
membar_consumer();
v = *p;
consume(v);
}
.Ed
.Pp
Most CPUs do not reorder data-dependent loads (i.e., most CPUs
guarantee that cached values are not stale in that case), so
.Fn membar_datadep_consumer
is a no-op on those CPUs.
.It Fn membar_sync
All memory operations preceding
.Fn membar_sync
will happen before any memory operations following it.
.Pp
.Fn membar_sync
is a sequential consistency acquire/release barrier, analogous to
.Li "atomic_thread_fence(memory_order_seq_cst)"
in C11.
.Pp
.Fn membar_sync
is typically paired with
.Fn membar_sync .
.Pp
.Fn membar_sync
is typically not needed except in exotic synchronization schemes like
Dekker's algorithm that require store-before-load ordering.
If you are tempted to reach for it, see if there is another way to do
what you're trying to do first.
.El
.Sh DEPRECATED MEMORY BARRIERS
The following memory barriers are deprecated.
They were imported from Solaris, which describes them as providing
ordering relative to
.Sq lock acquisition ,
but the documentation in
.Nx
disagreed with the implementation and use on the semantics.
.Bl -tag -width abcd
.It Fn membar_enter
Originally documented as store-before-load/store, this was instead
implemented as load-before-load/store on some platforms, which is what
essentially all uses relied on.
Now this is implemented as an alias for
.Fn membar_sync
everywhere, meaning a full load/store-before-load/store sequential
consistency barrier, in order to guarantee what the documentation
claimed
.Em and
what the implementation actually did.
.Pp
New code should use
.Fn membar_acquire
for load-before-load/store ordering, which is what most uses need, or
.Fn membar_sync
for store-before-load/store ordering, which typically only appears in
exotic synchronization schemes like Dekker's algorithm.
.It Fn membar_exit
Alias for
.Fn membar_release .
This was originally meant to be paired with
.Fn membar_enter .
.Pp
New code should use
.Fn membar_release
instead.
.El
.Sh SEE ALSO
.Xr atomic_ops 3 ,
.Xr atomic_loadstore 9 ,
.Xr bus_dma 9 ,
.Xr bus_space 9
.Sh HISTORY
The
.Nm membar_ops
functions first appeared in
.Nx 5.0 .
.Pp
The data-dependent load barrier,
.Fn membar_datadep_consumer ,
first appeared in
.Nx 7.0 .
.Pp
The
.Fn membar_acquire
and
.Fn membar_release
functions first appeared, and the
.Fn membar_enter
and
.Fn membar_exit
functions were deprecated, in
.Nx 10.0 .