.\" $NetBSD: pktqueue.9,v 1.3 2022/09/05 16:42:59 wiz Exp $
.\"
.\" Copyright (c) 2022 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Jason R. Thorpe.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd September 3, 2022
.Dt PKTQUEUE 9
.Os
.Sh NAME
.Nm pktqueue ,
.Nm pktq_create ,
.Nm pktq_destroy ,
.Nm pktq_enqueue ,
.Nm pktq_dequeue ,
.Nm pktq_barrier ,
.Nm pktq_ifdetach ,
.Nm pktq_flush ,
.Nm pktq_set_maxlen ,
.Nm pktq_rps_hash ,
.Nm pktq_sysctl_setup ,
.Nm sysctl_pktq_rps_hash_handler
.Nd Lockless network protocol input queues with integrated ISR scheduling
.Sh SYNOPSIS
.In net/pktqueue.h
.Ft pktqueue_t *
.Fn pktq_create "size_t maxlen" "void (*intrh)(void *)" "void *arg"
.Ft void
.Fn pktq_destroy "pktqueue_t *pq"
.Ft bool
.Fn pktq_enqueue "pktqueue_t *pq" "struct mbuf *m" "u_int hash"
.Ft struct mbuf *
.Fn pktq_dequeue "pktqueue_t *pq"
.Ft void
.Fn pktq_barrier "pktqueue_t *pq"
.Ft void
.Fn pktq_ifdetach "void"
.Ft void
.Fn pktq_flush "pktqueue_t *pq"
.Ft int
.Fn pktq_set_maxlen "pktqueue_t *pq" "size_t maxlen"
.Ft uint32_t
.Fn pktq_rps_hash "const pktq_rps_hash_func_t *funcp" "const struct mbuf *m"
.Ft int
.Fn pktq_sysctl_setup "pktqueue_t *pq" "struct sysctllog **clog" \
"const struct sysctlnode *parent_node" "int qid"
.Ft int
.Fn sysctl_pktq_rps_hash_handler "SYSCTLFN_ARGS"
.Sh DESCRIPTION
The
.Nm
functions provide a lockless network protocol input queue interface
with integrated software interrupt scheduling and support for
receiver-side packet steering
.Pq RPS .
The implementation is based around per-CPU producer-consumer queues;
multiple CPUs may enqueue packets into a CPU's queue, but only the
owning CPU will dequeue packets from any given queue.
.Sh FUNCTIONS
.Bl -tag -width compact
.It Fn pktq_create "maxlen" "intrh" "arg"
Create a packet queue that can store at most
.Fa maxlen
packets at one time.
.Fa maxlen
must not exceed
.Dv PCQ_MAXLEN .
The software interrupt handler
.Fa intrh
with argument
.Fa arg
will be established at
.Dv SOFTINT_NET .
.It Fn pktq_destroy "pq"
Destroy the specified packet queue.
The caller is responsible for ensuring that no packets remain in the queue
prior to calling this function.
.It Fn pktq_enqueue "pq" "m" "hash"
Enqueue the packet
.Fa m
in the specified packet queue.
The modulus of
.Fa hash
and the number of CPUs will be computed and used to select the per-CPU
queue where the packet will be stored, and thus upon which CPU the packet
will be processed.
Once the CPU selection has been made and the packet placed in the queue,
the software interrupt associated with packet queue will be scheduled.
.It Fn pktq_dequeue "pq"
Dequeue a packet from the current CPU's queue.
If no packets remain,
.Dv NULL
is returned.
This function must be called with kernel preemption disabled, which is always
the case when running in a software interrupt handler.
.It Fn pktq_barrier "pq"
Wait for a grace period when all packets enqueued at the moment of
calling this function will have been dequeued.
.It Fn pktq_ifdetach
This function is called when a network interface is detached from the
system.
It performs a
.Fn pktq_barrier
operation on all packet queues.
.It Fn pktq_flush "pq"
This function removes and frees all packets in the specified queue.
The caller is responsible for ensuring that no calls to
.Fn pktq_enqueue
or
.Fn pktq_dequeue
run concurrently with
.Fn pktq_flush .
.It Fn pktq_set_maxlen "pq" "maxlen"
Sets the maximum queue length to the value
.Fa maxlen .
If the new value of
.Fa maxlen
is smaller than the previous value, then this routine may block until
all packets that were previously in the packet queue can be re-enqueued.
.It Fn pktq_rps_hash "funcp" "m"
Calculates the RPS hash for the packet
.Fa m
using the hash function referenced by
.Fa funcp .
The available hash functions are
.Dq zero
.Pq always returns 0 ,
.Dq curcpu
.Pq returns the index of the current CPU ,
.Dq toeplitz
.Pq Toeplitz hash of an IPv4 or IPv6 packet ,
and
.Dq toeplitz-othercpus
.Po
same as
.Dq toeplitz
but always hashes to a CPU other than the current CPU
.Pc .
A default hash routine is provided by the global variable
.Dv pktq_rps_hash_default .
The default routine is guaranteed to be safe to use for any network protocol.
The behavior of
.Dq toeplitz
and
.Dq toeplitz-othercpus
is undefined if used with protocols other than IPv4 or IPv6.
.It Fn pktq_sysctl_setup "pq" "clog" "parent_node" "qid"
This function registers standard sysctl handlers for
.Fa pq
at the parent sysctl node
.Fa parent_node .
.Fa qid
allows the caller to specify the node ID at which to attach to
.Fa parent_node ;
use
.Dv CTL_CREATE
to dynamically assign a node ID.
The
.Fa clog
argument is passed directly to
.Fn sysctl_createv .
.It Fn sysctl_pktq_rps_hash_handler
This function provides a way for the user to select the preferred
RPS hash function to be used by a caller of
.Fn pktq_rps_hash
via
.Xr sysctl 8 .
When calling
.Fn sysctl_createv
to create the sysctl node, pass
.Fn sysctl_pktq_rps_hash_handler
as the
.Fa func
argument and the pointer to the RPS hash function reference variable
as the
.Fa newp
argument
.Po
cast to
.Sq void *
.Pc .
.El
.Sh CODE REFERENCES
The
.Nm
interface is implemented within the file
.Pa net/pktqueue.c .
.Pp
An example of how to use
.Fn pktq_rps_hash
can be found in the
.Fn ether_input
function.
An example of how to use
.Fn sysctl_pktq_rps_hash_handler
can be found in the
.Fn ether_sysctl_setup
function.
Both reside within the file
.Pa net/if_ethersubr.c .
.Pp
An example of how to use
.Fn pktq_sysctl_setup
can be found in the
.Fn sysctl_net_inet_ip_setup
function within the file
.Pa netinet/ip_input.c .
.Sh NOTES
The
.Fa maxlen
argument provided to
.Fn pktq_create
specifies the maximum number of packets that can be stored in each
per-CPU queue.
This means that, in theory, the maximum number of packets that can be
enqueued is
.Sq maxlen * ncpu .
However, as a practical matter, the number will be smaller because the
distribution of packets across the per-CPU queues is not perfectly uniform.
.Pp
Calls to
.Fn pktq_set_maxlen
may result in re-ordering the delivery of packets currently in
the queue.
.\" .Sh EXAMPLES
.Sh SEE ALSO
.Xr sysctl 8 ,
.Xr kpreempt 9 ,
.Xr pcq 9 ,
.Xr softintr 9 ,
.Xr sysctl 9
.Sh HISTORY
The
.Nm
interface first appeared in
.Nx 7.0 .