.\" $NetBSD: tprof.8,v 1.30 2023/04/18 00:21:23 gutteridge Exp $
.\"
.\" Copyright (c)2011 YAMAMOTO Takashi,
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd April 17, 2023
.Dt TPROF 8
.Os
.Sh NAME
.Nm tprof
.Nd record tprof profiling samples
.Sh SYNOPSIS
.Nm
.Ar op
.Op Ar arguments
.Sh DESCRIPTION
The
.Nm
tool can be used to monitor hardware events
.Tn ( PMC Ns s )
during the execution of
certain commands.
.Pp
The
.Nm
utility makes the kernel driver start profiling,
executes the specified command,
keeps recording samples from the kernel driver until the command finishes,
and reports statistics to the standard error.
.Pp
The
.Xr tprof 4
pseudo driver and a suitable backend should be loaded beforehand.
.Pp
The
.Nm
utility accepts the following operations.
The first argument,
.Ar op ,
specifies the action to take.
Valid actions are:
.Bl -tag -width Cm
.
.It Cm list
.
Display the following information:
.Bl -bullet -compact
.It
a list of performance counter events available on the system
.It
the maximum number of counters that can be used simultaneously
.It
the default counter for the
.Cm monitor
and
.Cm top
commands
.El
.
.It Cm monitor Xo
.Op Fl e Ar name\| Ns Oo Cm \&: Ns Ar option\^ Oc Ns Oo Cm \&, Ns Ar scale\^ Oc
.Op Fl e Ar ...
.Op Fl o Ar outfile
.Ar command
.Xc
.
Monitor the execution of
.Ar command .
The
.Ar name
specifies the event to count; it must be taken from the list of
available events.
.Ar option
specifies the source of the event; it must be a combination of
.Cm u
(userland) and
.Cm k
(kernel).
If omitted, it is assumed that both are specified.
Multiple
.Fl e
arguments can be specified.
If no
.Fl e
argument is specified, the CPU's default counter is used.
.Pp
.Ar scale
specifies either how many times slower than the cycle counter the event
increases, or the number of events to count until overflow.
By default, the counter reset value used for profiling is calculated from the
speed of the cycle counter, but for events that increase slowly this value may
be too large to collect enough samples.
For example, for an event that increases about 1000 times slower than
the cycle counter, specify
.Ql -e event,1000 .
If
.Ql -e event,=200
is specified instead, a sample is taken every time the counter increases by 200.
.Pp
The collected samples are written into the file
.Ar outfile
if specified.
The default is
.Pa tprof.out .
.
.It Cm count Xo
.Fl e Ar name\| Ns Op Cm \&: Ns Ar option
.Op Fl e Ar ...
.Op Fl i Ar interval
.Ar command
.Xc
.
Same as
.Cm monitor ,
but does not do any profiling;
it only outputs the counters every
.Ar interval
seconds.
.
.It Cm analyze Xo
.Op Fl CkLPs
.Op Fl p Ar pid
.Ar file
.Xc
.
Analyze the samples produced by a previous run of
.Nm ,
stored in
.Ar file ,
and generate a plain text representation of them.
.Bl -tag -width Fl
.It Fl C
Don't distinguish CPUs.
All samples are treated as if their CPU number were 0.
.It Fl k
Kernel only.
Ignore samples for userland code.
.It Fl L
Don't distinguish LWPs.
All samples are treated as if their LWP ID were 0.
.It Fl P
Don't distinguish processes.
All samples are treated as if their PID were 0.
.It Fl p Ar pid
Process only samples for the process with PID
.Ar pid
and ignore the rest.
.It Fl s
Per symbol.
Aggregate samples by symbol rather than by individual address.
.El
.
.It Cm top Xo
.Op Fl acu
.Op Fl e Ar name\| Ns Oo Cm \&, Ns Ar scale\^ Oc
.Op Fl e Ar ...
.Op Fl i Ar interval
.Xc
.
Display profiling results in real time.
.Ar name
specifies the name of the event to count.
.Bl -tag -width Fl
.It Fl a
Start in accumulation mode.
The display is updated every
.Ar interval
seconds, but the values are cumulative.
.It Fl c
Show the delta of the event counters.
.It Fl i Ar interval
Set the update interval in seconds.
The default value is 1.
.It Fl u
Userland processes are also included in the profiling.
.El
.Pp
While
.Nm
.Cm top
is running, it accepts commands from the terminal.
These commands are currently recognized:
.Bl -tag -width Ic
.It Aq Ic a
Toggle accumulation mode.
.It Aq Ic c
Show or hide the event counters.
.It Aq Ic q
Quit
.Nm .
.It Aq Ic z
Clear accumulated data.
.El
.El
.Sh EXAMPLES
The following command profiles the system for 20 seconds and writes the
samples into the file
.Pa myfile.out .
.Pp
.Dl # tprof monitor -e llc-misses:k -o myfile.out sleep 20
.Pp
The following command displays the results of the sampling.
.Pp
.Dl # tprof analyze myfile.out
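.Pp
The following sketches use only the operations and flags described in this
manual page.
They assume that the
.Ql llc-misses
event is reported by
.Cm list
on the running system; the file name
.Pa myfile.out
and the interval values are arbitrary choices made for illustration.
.Pp
List the available events, the number of usable counters, and the default
counter:
.Pp
.Dl # tprof list
.Pp
Print the kernel-side value of a counter every 2 seconds while a command
runs, without collecting samples:
.Pp
.Dl # tprof count -e llc-misses:k -i 2 sleep 20
.Pp
Profile the event on the assumption that it increases about 1000 times
slower than the cycle counter:
.Pp
.Dl # tprof monitor -e llc-misses,1000 -o myfile.out sleep 20
.Pp
Analyze only the kernel samples in that file, per symbol:
.Pp
.Dl # tprof analyze -k -s myfile.out
.Pp
Watch the event in real time, starting in accumulation mode and updating the
display every 5 seconds:
.Pp
.Dl # tprof top -a -e llc-misses -i 5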
.Sh SUPPORT
The following CPU models are supported:
.Bl -hyphen -compact -offset indent
.It
ARMv7
.It
ARMv8
.It
x86 AMD Family 10h
.It
x86 AMD Family 15h
.It
x86 AMD Family 17h
.It
x86 AMD Family 19h
.It
x86 Intel Generic (all Intel CPUs)
.It
x86 Intel Skylake, Kabylake and Cometlake
.It
x86 Intel Silvermont/Airmont
.It
x86 Intel Goldmont
.It
x86 Intel Goldmont Plus
.El
.Sh DIAGNOSTICS
The
.Nm
utility reports the following statistics about the activities of the
.Xr tprof 4
pseudo driver.
.Bl -tag -width dropbuf_samples
.It sample
The number of samples collected and prepared for userland consumption.
.It overflow
The number of samples dropped because the per-CPU buffer was full.
.It buf
The number of buffers successfully prepared for userland consumption.
.It emptybuf
The number of buffers which have been dropped because they were empty.
.It dropbuf
The number of buffers dropped because the number of buffers kept in the kernel
exceeds the limit.
.It dropbuf_samples
The number of samples dropped because the buffers containing the samples
were dropped.
.El
.Sh SEE ALSO
.Xr tprof 4
.Sh AUTHORS
.An -nosplit
The
.Nm
utility was written by
.An YAMAMOTO Takashi .
It was revamped by
.An Maxime Villard
in 2018, and by
.An Ryo Shimizu
in 2022.
.Sh CAVEATS
The contents and representation of recorded samples are undocumented and
will likely change in future releases of
.Nx
in an incompatible way.