trace: Update documentation

Revamp the documentation for the new features, including a description of the new features and documentation for the trace command. Signed-off-by: Simon Glass <sjg@chromium.org>
2023-01-15 14:16:01 -07:00 · 2023-01-15 14:16:01 -07:00 · e46b72436c
commit e46b72436c
parent 9cea4797ae
9 changed files with 382 additions and 64 deletions
--- a/.gitattributes
+++ b/.gitattributes
@ -4,3 +4,4 @@
 *.bmp binary
 *.ttf binary
 *.gz binary
+*.png binary
--- a/doc/develop/pics/flamegraph.png
+++ b/doc/develop/pics/flamegraph.png
--- a/doc/develop/pics/flamegraph_timing.png
+++ b/doc/develop/pics/flamegraph_timing.png
--- a/doc/develop/pics/flamegraph_zoom.png
+++ b/doc/develop/pics/flamegraph_zoom.png
--- a/doc/develop/pics/kernelshark.png
+++ b/doc/develop/pics/kernelshark.png
--- a/doc/develop/pics/kernelshark_fg.png
+++ b/doc/develop/pics/kernelshark_fg.png
--- a/doc/develop/trace.rst
+++ b/doc/develop/trace.rst
@ -15,12 +15,16 @@ Overview
 The trace feature uses GCC's instrument-functions feature to trace all
 function entry/exit points. These are then recorded in a memory buffer.
 The memory buffer can be saved to the host over a network link using
-tftpput or by writing to an attached memory device such as MMC.
+tftpput or by writing to an attached storage device such as MMC.

 On the host, the file is first converted with a tool called 'proftool',
 which extracts useful information from it. The resulting trace output
 resembles that emitted by Linux's ftrace feature, so can be visually
-displayed by pytimechart.
+displayed by kernelshark (see kernelshark_) and used with
+'trace-cmd report' (see trace_cmd_).
+
+It is also possible to produce a flame graph for use with flamegraph.pl
+(see flamegraph_pl_).


 Quick-start using Sandbox
@ -30,7 +34,7 @@ Sandbox is a build of U-Boot that can run under Linux so it is a convenient
 way of trying out tracing before you use it on your actual board. To do
 this, follow these steps:

-Add the following to config/sandbox_defconfig
+Add the following to `config/sandbox_defconfig`:

 .. code-block:: c

@ -75,7 +79,7 @@ a trace:
             16 maximum observed call depth
             15 call depth limit
          1,275,767 calls not traced due to depth
-    =>trace calls 0 e00000
+    =>trace calls 1000000 e00000
    Call list dumped to 00000000, size 0xae0a40
    =>print
    baudrate=115200
@ -87,7 +91,7 @@ a trace:
    stdout=serial

    Environment size: 117/8188 bytes
-    =>host save host 0 trace 0 ${profoffset}
+    =>host save hostfs - 1000000 trace ${profoffset}
    11405888 bytes written in 10 ms (1.1 GiB/s)
    =>reset

@ -96,18 +100,107 @@ Then run proftool to convert the trace information to ftrace format

 .. code-block:: console

-    $ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt
+    $ ./sandbox/tools/proftool -m sandbox/System.map -t trace dump-ftrace >trace.dat

-Finally run pytimechart to display it
+Finally run kernelshark to display it (note it only works with `.dat` files!):

 .. code-block:: console

-    $ pytimechart trace.txt
+    $ kernelshark trace.dat

-Using this tool you can zoom and pan across the trace, with the function
-calls on the left and little marks representing the start and end of each
+Using this tool you can view the trace records and see the timestamp for each
 function.

+.. image:: pics/kernelshark.png
+  :width: 800
+  :alt: Kernelshark showing function-trace records
+
+
+To see the records on the console, use trace-cmd:
+
+.. code-block:: console
+
+    $ trace-cmd report trace.dat | less
+    cpus=1
+          u-boot-1     [000]     3.116364: function:             initf_malloc
+          u-boot-1     [000]     3.116375: function:             initf_malloc
+          u-boot-1     [000]     3.116386: function:             initf_bootstage
+          u-boot-1     [000]     3.116396: function:                bootstage_init
+          u-boot-1     [000]     3.116408: function:                   malloc
+          u-boot-1     [000]     3.116418: function:                      malloc_simple
+          u-boot-1     [000]     3.116429: function:                         alloc_simple
+          u-boot-1     [000]     3.116441: function:                         alloc_simple
+          u-boot-1     [000]     3.116449: function:                      malloc_simple
+          u-boot-1     [000]     3.116457: function:                   malloc
+
+Note that `pytimechart` is obsolete so cannot be used anymore.
+
+There is a -f option available to select a function graph:
+
+.. code-block:: console
+
+    $ ./sandbox/tools/proftool -m sandbox/System.map -t trace -f funcgraph dump-ftrace >trace.dat
+
+Again, you can use kernelshark or trace-cmd to look at the output. In this case
+you will see the time taken by each function shown against its exit record.
+
+.. image:: pics/kernelshark_fg.png
+  :width: 800
+  :alt: Kernelshark showing function-graph records
+
+.. code-block:: console
+
+    $ trace-cmd report trace.dat | less
+    cpus=1
+              u-boot-1     [000]     3.116364: funcgraph_entry:        0.011 us   |    initf_malloc();
+              u-boot-1     [000]     3.116386: funcgraph_entry:                   |    initf_bootstage() {
+              u-boot-1     [000]     3.116396: funcgraph_entry:                   |      bootstage_init() {
+              u-boot-1     [000]     3.116408: funcgraph_entry:                   |        malloc() {
+              u-boot-1     [000]     3.116418: funcgraph_entry:                   |          malloc_simple() {
+              u-boot-1     [000]     3.116429: funcgraph_entry:        0.012 us   |            alloc_simple();
+              u-boot-1     [000]     3.116449: funcgraph_exit:         0.031 us   |            }
+              u-boot-1     [000]     3.116457: funcgraph_exit:         0.049 us   |          }
+              u-boot-1     [000]     3.116466: funcgraph_entry:        0.063 us   |        memset();
+              u-boot-1     [000]     3.116539: funcgraph_exit:         0.143 us   |        }
+
+Flame graph
+-----------
+
+Some simple flame graph options are available as well, using the dump-flamegraph
+command:
+
+.. code-block:: console
+
+    $ ./sandbox/tools/proftool -m sandbox/System.map -t trace dump-flamegraph >trace.fg
+    $ flamegraph.pl trace.fg >trace.svg
+
+You can load the .svg file into a viewer. If you use Chrome (and some other
+programs) you can click around and zoom in and out.
+
+.. image:: pics/flamegraph.png
+  :width: 800
+  :alt: Chrome showing the flamegraph.pl output
+
+.. image:: pics/flamegraph_zoom.png
+  :width: 800
+  :alt: Chrome showing zooming into the flamegraph.pl output
+
+
+A timing variant is also available, which gives an idea of how much time is
+spend in each call stack:
+
+.. code-block:: console
+
+    $ ./sandbox/tools/proftool -m sandbox/System.map -t trace dump-flamegraph -f timing >trace.fg
+    $ flamegraph.pl trace.fg >trace.svg
+
+Note that trace collection does slow down execution so the timings will be
+inflated. They should be used to guide optimisation. For accurate boot timings,
+use bootstage.
+
+.. image:: pics/flamegraph_timing.png
+  :width: 800
+  :alt: Chrome showing flamegraph.pl output with timing

 CONFIG Options
 --------------
@ -138,6 +231,11 @@ CONFIG_TRACE_EARLY_SIZE
 CONFIG_TRACE_EARLY_ADDR
    Address of early trace buffer

+CONFIG_TRACE_CALL_DEPTH_LIMIT
+    Sets the limit on trace call-depth. For a broad view, 10 is typically
+    sufficient. Setting this too large creates enormous traces and distorts
+    the overall timing considerable.
+

 Building U-Boot with Tracing Enabled
 ------------------------------------
@ -148,6 +246,26 @@ instrumenting from the command line instead of having to change board
 config files.


+Board requirements
+------------------
+
+Trace data collection relies on a microsecond timer, accessed through
+`timer_get_us()`. So the first thing you should do is make sure that
+this produces sensible results for your board. Suitable sources for
+this timer include high resolution timers, PWMs or profile timers if
+available. Most modern SOCs have a suitable timer for this.
+
+See `add_ftrace()` for where `timer_get_us()` is called. The `notrace`
+attribute must be used on each function called by `timer_get_us()` since
+recursive calls to `add_ftrace()` will cause a fault::
+
+   trace: recursion detected, disabling
+
+You cannot use driver model to obtain the microsecond timer, since tracing
+may be enabled before driver model is set up. Instead, provide a low-level
+function which accesses the timer, setting it up if needed.
+
+
 Collecting Trace Data
 ---------------------

@ -155,21 +273,22 @@ When you run U-Boot on your board it will collect trace data up to the
 limit of the trace buffer size you have specified. Once that is exhausted
 no more data will be collected.

-Collecting trace data has an affect on execution time/performance. You
+Collecting trace data affects execution time and performance. You
 will notice this particularly with trivial functions - the overhead of
 recording their execution may even exceed their normal execution time.
 In practice this doesn't matter much so long as you are aware of the
 effect. Once you have done your optimizations, turn off tracing before
-doing end-to-end timing.
+doing end-to-end timing using bootstage.

 The best time to start tracing is right at the beginning of U-Boot. The
 best time to stop tracing is right at the end. In practice it is hard
 to achieve these ideals.

-This implementation enables tracing early in board_init_f(). This means
+This implementation enables tracing early in `board_init_r()`, or
+`board_init_f()` when `TRACE_EARLY` is enabled. This means
 that it captures most of the board init process, missing only the
 early architecture-specific init. However, it also misses the entire
-SPL stage if there is one.
+SPL stage if there is one. At present tracing is not supported in SPL.

 U-Boot typically ends with a 'bootm' command which loads and runs an
 OS. There is useful trace data in the execution of that bootm
@ -179,39 +298,11 @@ the OS. In practical terms, U-Boot runs the 'fakegocmd' environment
 variable at this point. This variable should have a short script which
 collects the trace data and writes it somewhere.

-Trace data collection relies on a microsecond timer, accessed through
-timer_get_us(). So the first think you should do is make sure that
-this produces sensible results for your board. Suitable sources for
-this timer include high resolution timers, PWMs or profile timers if
-available. Most modern SOCs have a suitable timer for this. Make sure
-that you mark this timer (and anything it calls) with
-notrace so that the trace library can
-use it without causing an infinite loop.
+Controlling the trace
+---------------------

-
-Commands
--------
-
-The trace command has variable sub-commands:
-
-stats
-    Display tracing statistics
-
-pause
-    Pause tracing
-
-resume
-    Resume tracing
-
-funclist [<addr> <size>]
-    Dump a list of functions into the buffer
-
-calls  [<addr> <size>]
-    Dump function call trace into buffer
-
-If the address and size are not given, these are obtained from environment
-variables (see below). In any case the environment variables are updated
-after the command runs.
+U-Boot provides a command-line interface to the trace system for controlling
+tracing and accessing the trace data. See :doc:`../usage/cmd/trace`.


 Environment Variables
@ -264,39 +355,94 @@ a trace log to address 10000000 and sends it to a host machine using
 TFTP. After this, U-Boot will boot the OS normally, albeit a little
 later.

+For a filesystem you may do something like::

-Converting Trace Output Data
----------------------------
+    trace calls 10000000 1000000;
+    save mmc 1:1 10000000 /trace ${profoffset}
+
+The trace buffer format is internal to the trace system. It consists of a
+header, a call count for each function site, followed by a list of trace
+records, once for each function call.
+
+
+Converting Trace Output Data (proftool)
+---------------------------------------

 The trace output data is kept in a binary format which is not documented
-here. To convert it into something useful, you can use proftool.
+here. See the `trace.h` header file if you are interested. To convert it into
+something useful, you can use proftool.

 This tool must be given the U-Boot map file and the trace data received
-from running that U-Boot. It produces a text output file.
+from running that U-Boot. It produces a binary output file.

-Options
+It is also possible to provide a configuration file to indicate which functions
+should be included or dropped during conversion. This file consists of lines
+like::
+
+   include-func <regex>
+   exclude-func <regex>
+
+where <regex> is a regular expression matched against function names. It
+allows some functions to be dropped from the trace when producing ftrace
+records.
+
+Options:
+
+-c <config_file>
+    Specify the optional configuration file, to control which functions are
+    included in the output.
+
+-f <format>
+    Specifies the format to use (see below)

 -m <map_file>
-    Specify U-Boot map file
+    Specify U-Boot map file (`System.map`)

-p <trace_file>
-    Specify profile/trace file
+-o <output file>
+    Specify the output filename
+
+-t <trace_file>
+    Specify trace file, the data saved from U-Boot
+
+-v <0-4>
+    Specify the verbosity, where 0 is the minimum and 4 is for debugging.

 Commands:

-dump-ftrace
-    Write a text dump of the file in Linux ftrace format to stdout
+dump-ftrace:
+    Write a binary dump of the file in Linux ftrace format. Two options are
+    available:

+    function
+        write function-call records (caller/callee)
+
+    funcgraph
+        write function entry/exit records (graph)
+
+    This format can be used with kernelshark_ and trace_cmd_.
+
+dump-flamegraph
+    Write a list of stack records useful for producing a flame graph. Two
+    options are available:
+
+    calls
+        create a flamegraph of stack frames
+
+    timing
+        create a flamegraph of microseconds for each stack frame
+
+    This format can be used with flamegraph_pl_.

 Viewing the Trace Data
 ----------------------

-You can use pytimechart for this (sudo apt-get pytimechart might work on
-your Debian-style machine, and use your favourite search engine to obtain
-documentation). It expects the file to have a .txt extension. The program
-has terse user interface but is very convenient for viewing U-Boot
-profile information.
+You can use kernelshark_ for a GUI, but note that version 2.0.x was broken. If
+you have that version you could try building it from source.

+The file must have a .dat extension or it is ignored. The program has terse
+user interface but is very convenient for viewing U-Boot profile information.
+
+Also available is trace_cmd_ which provides a command-line interface.

 Workflow Suggestions
 --------------------
@ -329,7 +475,9 @@ There are a few parameters in the code that you may want to consider.
 There is a function call depth limit (set to 15 by default). When the
 stack depth goes above this then no tracing information is recorded.
 The maximum depth reached is recorded and displayed by the 'trace stats'
-command.
+command. While it might be tempting to set the depth limit quite high, this
+can dramatically increase the size of the trace output as well as the execution
+time.


 Future Work
@ -346,5 +494,10 @@ Some other features that might be useful:
 - Compression of trace information


-Simon Glass <sjg@chromium.org>
-April 2013
+.. sectionauthor:: Simon Glass <sjg@chromium.org>
+.. April 2013
+.. Updated January 2023
+
+.. _kernelshark: https://kernelshark.org/
+.. _trace_cmd: https://www.trace-cmd.org/
+.. _flamegraph_pl: https://github.com/brendangregg/FlameGraph/blob/master/flamegraph.pl
--- a/doc/usage/cmd/trace.rst
+++ b/doc/usage/cmd/trace.rst
@ -0,0 +1,163 @@
+.. SPDX-License-Identifier: GPL-2.0+:
+
+trace command
+=============
+
+Synopis
+-------
+
+::
+
+    trace stats
+    trace pause
+    trace resume
+    trace funclist [<addr> <size>]
+    trace calls [<addr> <size>]
+
+Description
+-----------
+
+The *trace* command is used to control the U-Boot tracing system. It allows
+tracing to be paused and resumed, shows statistics for traces and provides a
+way to dump out the trace information.
+
+
+trace stats
+~~~~~~~~~~~
+
+This display tracing statistics, as follows:
+
+function sites
+    Functions are binned as a way of reducing the amount of space needed to
+    hold all the function information. This is controlled by FUNC_SITE_SIZE in
+    the trace.h header file. The usual value is 4, which provides the finest
+    granularity (assuming a minimum instruction size of 4 bytes) which means
+    that every function can be resolved individually.
+
+function calls
+    Total number of function calls, including those which were not traced due
+    to buffer space. This count does not include functions which exceeded the
+    depth limit.
+
+untracked function calls
+    Total number of function calls which did not appear in the U-Boot image.
+    This can happen if a function is called outside the normal code area.
+
+traced function calls
+    Total number of function calls which were actually traced, i.e. are included
+    in the recorded trace data.
+
+dropped due to overflow
+    If the trace buffer was exhausted then this shows the number of records that
+    were dropped. Try reducing the depth limit or expanding the buffer size.
+
+maximum observed call depth
+    Maximum observed call depth while tracing.
+
+calls not traced due to depth
+    Counts the number of function calls that were not recorded because they
+    exceeded the maximum call depth.
+
+max function calls
+    Maximum number of function calls which can be recorded in the trace buffer,
+    given its size. Once `function calls` hits this value, recording stops.
+
+trace buffer
+    Address of trace buffer
+
+call records
+    Address of first trace record. This is near the start of the trace buffer,
+    after the function-call counts.
+
+
+trace pause
+~~~~~~~~~~~
+
+Pauses tracing, so that no more data is added to the trace buffer.
+
+
+trace resume
+~~~~~~~~~~~~
+
+Resumes tracing, so that new function calls are added to the trace buffer if
+there is sufficient space.
+
+
+trace funclist [<addr> <size>]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Dumps a list of functions into the provided buffer. The file uses a format
+specific to U-Boot: a header, following by the function offset and call count.
+
+If the address and size are not given, these are obtained from
+:ref:`develop/trace:environment variables`. In any case the environment
+variables are updated after the command runs.
+
+The resulting data should be written out to the host, e.g. using Ethernet or
+a filesystem. There are no tools provided to read this sdata.
+
+
+trace calls [<addr> <size>]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Dumps a list of function calls into the provided buffer. The file uses a format
+specific to U-Boot: a header, following by the list of calls. The proftool
+tool can be used to convert this information ready for further analysis.
+
+
+Example
+-------
+
+::
+
+    => trace stats
+            269,252 function sites
+         38,025,059 function calls
+                  3 untracked function calls
+          7,382,690 traced function calls
+                 17 maximum observed call depth
+                 15 call depth limit
+         68,667,432 calls not traced due to depth
+         22,190,112 max function calls
+
+    trace buffer 6c000000 call records 6c20de78
+    => trace resume
+    => trace pause
+
+This shows that resuming the trace causes the buffer to overflow::
+
+    => trace stats
+            269,252 function sites
+         49,573,694 function calls
+                  3 untracked function calls
+         22,190,112 traced function calls (8289848 dropped due to overflow)
+                 17 maximum observed call depth
+                 15 call depth limit
+         68,667,432 calls not traced due to depth
+         22,190,112 max function calls
+
+    trace buffer 6c000000 call records 6c20de78
+    => trace funcs 30000000 0x100000
+    Function trace dumped to 30000000, size 0x1e70
+
+This shows collecting and writing out the result trace data:
+
+::
+    => trace calls 20000000 0x10000000
+    Call list dumped to 20000000, size 0xfdf21a0
+    => save mmc 1:1 20000000 /trace ${profoffset}
+    File System is consistent
+    file found, deleting
+    update journal finished
+    File System is consistent
+    update journal finished
+    266281376 bytes written in 18584 ms (13.7 MiB/s)
+
+From here you can use proftool to convert it:
+
+.. code-block:: bash
+
+    tools/proftool -m System.map -t trace -o asc.fg dump-ftrace
+
+
+.. _`ACPI specification`: https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf
--- a/doc/usage/index.rst
+++ b/doc/usage/index.rst
@ -84,6 +84,7 @@ Shell commands
   cmd/sm
   cmd/temperature
   cmd/tftpput
+   cmd/trace
   cmd/true
   cmd/ums
   cmd/ut