mirror of
				https://github.com/smaeul/u-boot.git
				synced 2025-10-26 09:38:14 +00:00 
			
		
		
		
	At present there are Kconfig options for tracing, but sandbox uses plain #defines to set them. Correct this and make the tracing command default to enabled so that this is not needed. Signed-off-by: Simon Glass <sjg@chromium.org>
		
			
				
	
	
		
			351 lines
		
	
	
		
			10 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
			351 lines
		
	
	
		
			10 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
| .. SPDX-License-Identifier: GPL-2.0+
 | |
| .. Copyright (c) 2013 The Chromium OS Authors.
 | |
| 
 | |
| Tracing in U-Boot
 | |
| =================
 | |
| 
 | |
| U-Boot supports a simple tracing feature which allows a record of execution
 | |
| to be collected and sent to a host machine for analysis. At present the
 | |
| main use for this is to profile boot time.
 | |
| 
 | |
| 
 | |
| Overview
 | |
| --------
 | |
| 
 | |
| The trace feature uses GCC's instrument-functions feature to trace all
 | |
| function entry/exit points. These are then recorded in a memory buffer.
 | |
| The memory buffer can be saved to the host over a network link using
 | |
| tftpput or by writing to an attached memory device such as MMC.
 | |
| 
 | |
| On the host, the file is first converted with a tool called 'proftool',
 | |
| which extracts useful information from it. The resulting trace output
 | |
| resembles that emitted by Linux's ftrace feature, so can be visually
 | |
| displayed by pytimechart.
 | |
| 
 | |
| 
 | |
| Quick-start using Sandbox
 | |
| -------------------------
 | |
| 
 | |
| Sandbox is a build of U-Boot that can run under Linux so it is a convenient
 | |
| way of trying out tracing before you use it on your actual board. To do
 | |
| this, follow these steps:
 | |
| 
 | |
| Add the following to config/sandbox_defconfig
 | |
| 
 | |
| .. code-block:: c
 | |
| 
 | |
|     CONFIG_TRACE=y
 | |
| 
 | |
| Build sandbox U-Boot with tracing enabled:
 | |
| 
 | |
| .. code-block:: console
 | |
| 
 | |
|     $ make FTRACE=1 O=sandbox sandbox_config
 | |
|     $ make FTRACE=1 O=sandbox
 | |
| 
 | |
| Run sandbox, wait for a bit of trace information to appear, and then capture
 | |
| a trace:
 | |
| 
 | |
| .. code-block:: console
 | |
| 
 | |
|     $ ./sandbox/u-boot
 | |
| 
 | |
|     U-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24)
 | |
| 
 | |
|     DRAM:  128 MiB
 | |
|     trace: enabled
 | |
|     Using default environment
 | |
| 
 | |
|     In:    serial
 | |
|     Out:   serial
 | |
|     Err:   serial
 | |
|     =>trace stats
 | |
|         671,406 function sites
 | |
|          69,712 function calls
 | |
|               0 untracked function calls
 | |
|          73,373 traced function calls
 | |
|              16 maximum observed call depth
 | |
|              15 call depth limit
 | |
|          66,491 calls not traced due to depth
 | |
|     =>trace stats
 | |
|         671,406 function sites
 | |
|       1,279,450 function calls
 | |
|               0 untracked function calls
 | |
|         950,490 traced function calls (333217 dropped due to overflow)
 | |
|              16 maximum observed call depth
 | |
|              15 call depth limit
 | |
|           1,275,767 calls not traced due to depth
 | |
|     =>trace calls 0 e00000
 | |
|     Call list dumped to 00000000, size 0xae0a40
 | |
|     =>print
 | |
|     baudrate=115200
 | |
|     profbase=0
 | |
|     profoffset=ae0a40
 | |
|     profsize=e00000
 | |
|     stderr=serial
 | |
|     stdin=serial
 | |
|     stdout=serial
 | |
| 
 | |
|     Environment size: 117/8188 bytes
 | |
|     =>host save host 0 trace 0 ${profoffset}
 | |
|     11405888 bytes written in 10 ms (1.1 GiB/s)
 | |
|     =>reset
 | |
| 
 | |
| 
 | |
| Then run proftool to convert the trace information to ftrace format
 | |
| 
 | |
| .. code-block:: console
 | |
| 
 | |
|     $ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt
 | |
| 
 | |
| Finally run pytimechart to display it
 | |
| 
 | |
| .. code-block:: console
 | |
| 
 | |
|     $ pytimechart trace.txt
 | |
| 
 | |
| Using this tool you can zoom and pan across the trace, with the function
 | |
| calls on the left and little marks representing the start and end of each
 | |
| function.
 | |
| 
 | |
| 
 | |
| CONFIG Options
 | |
| --------------
 | |
| 
 | |
| CONFIG_TRACE
 | |
|     Enables the trace feature in U-Boot.
 | |
| 
 | |
| CONFIG_CMD_TRACE
 | |
|     Enables the trace command.
 | |
| 
 | |
| CONFIG_TRACE_BUFFER_SIZE
 | |
|     Size of trace buffer to allocate for U-Boot. This buffer is
 | |
|     used after relocation, as a place to put function tracing
 | |
|     information. The address of the buffer is determined by
 | |
|     the relocation code.
 | |
| 
 | |
| CONFIG_TRACE_EARLY
 | |
|     Define this to start tracing early, before relocation.
 | |
| 
 | |
| CONFIG_TRACE_EARLY_SIZE
 | |
|     Size of 'early' trace buffer. Before U-Boot has relocated
 | |
|     it doesn't have a proper trace buffer. On many boards
 | |
|     you can define an area of memory to use for the trace
 | |
|     buffer until the 'real' trace buffer is available after
 | |
|     relocation. The contents of this buffer are then copied to
 | |
|     the real buffer.
 | |
| 
 | |
| CONFIG_TRACE_EARLY_ADDR
 | |
|     Address of early trace buffer
 | |
| 
 | |
| 
 | |
| Building U-Boot with Tracing Enabled
 | |
| ------------------------------------
 | |
| 
 | |
| Pass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code.
 | |
| This is kept as a separate option so that it is easy to enable/disable
 | |
| instrumenting from the command line instead of having to change board
 | |
| config files.
 | |
| 
 | |
| 
 | |
| Collecting Trace Data
 | |
| ---------------------
 | |
| 
 | |
| When you run U-Boot on your board it will collect trace data up to the
 | |
| limit of the trace buffer size you have specified. Once that is exhausted
 | |
| no more data will be collected.
 | |
| 
 | |
| Collecting trace data has an affect on execution time/performance. You
 | |
| will notice this particularly with trivial functions - the overhead of
 | |
| recording their execution may even exceed their normal execution time.
 | |
| In practice this doesn't matter much so long as you are aware of the
 | |
| effect. Once you have done your optimizations, turn off tracing before
 | |
| doing end-to-end timing.
 | |
| 
 | |
| The best time to start tracing is right at the beginning of U-Boot. The
 | |
| best time to stop tracing is right at the end. In practice it is hard
 | |
| to achieve these ideals.
 | |
| 
 | |
| This implementation enables tracing early in board_init_f(). This means
 | |
| that it captures most of the board init process, missing only the
 | |
| early architecture-specific init. However, it also misses the entire
 | |
| SPL stage if there is one.
 | |
| 
 | |
| U-Boot typically ends with a 'bootm' command which loads and runs an
 | |
| OS. There is useful trace data in the execution of that bootm
 | |
| command. Therefore this implementation provides a way to collect trace
 | |
| data after bootm has finished processing, but just before it jumps to
 | |
| the OS. In practical terms, U-Boot runs the 'fakegocmd' environment
 | |
| variable at this point. This variable should have a short script which
 | |
| collects the trace data and writes it somewhere.
 | |
| 
 | |
| Trace data collection relies on a microsecond timer, accessed through
 | |
| timer_get_us(). So the first think you should do is make sure that
 | |
| this produces sensible results for your board. Suitable sources for
 | |
| this timer include high resolution timers, PWMs or profile timers if
 | |
| available. Most modern SOCs have a suitable timer for this. Make sure
 | |
| that you mark this timer (and anything it calls) with
 | |
| __attribute__((no_instrument_function)) so that the trace library can
 | |
| use it without causing an infinite loop.
 | |
| 
 | |
| 
 | |
| Commands
 | |
| --------
 | |
| 
 | |
| The trace command has variable sub-commands:
 | |
| 
 | |
| stats
 | |
|     Display tracing statistics
 | |
| 
 | |
| pause
 | |
|     Pause tracing
 | |
| 
 | |
| resume
 | |
|     Resume tracing
 | |
| 
 | |
| funclist [<addr> <size>]
 | |
|     Dump a list of functions into the buffer
 | |
| 
 | |
| calls  [<addr> <size>]
 | |
|     Dump function call trace into buffer
 | |
| 
 | |
| If the address and size are not given, these are obtained from environment
 | |
| variables (see below). In any case the environment variables are updated
 | |
| after the command runs.
 | |
| 
 | |
| 
 | |
| Environment Variables
 | |
| ---------------------
 | |
| 
 | |
| The following are used:
 | |
| 
 | |
| profbase
 | |
|     Base address of trace output buffer
 | |
| 
 | |
| profoffset
 | |
|     Offset of first unwritten byte in trace output buffer
 | |
| 
 | |
| profsize
 | |
|     Size of trace output buffer
 | |
| 
 | |
| All of these are set by the 'trace calls' command.
 | |
| 
 | |
| These variables keep track of the amount of data written to the trace
 | |
| output buffer by the 'trace' command. The trace commands which write data
 | |
| to the output buffer can use these to specify the buffer to write to, and
 | |
| update profoffset each time. This allows successive commands to append data
 | |
| to the same buffer, for example::
 | |
| 
 | |
|     => trace funclist 10000 e00000
 | |
|     => trace calls
 | |
| 
 | |
| (the latter command appends more data to the buffer).
 | |
| 
 | |
| 
 | |
| fakegocmd
 | |
|     Specifies commands to run just before booting the OS. This
 | |
|     is a useful time to write the trace data to the host for
 | |
|     processing.
 | |
| 
 | |
| 
 | |
| Writing Out Trace Data
 | |
| ----------------------
 | |
| 
 | |
| Once the trace data is in an output buffer in memory there are various ways
 | |
| to transmit it to the host. Notably you can use tftput to send the data
 | |
| over a network link::
 | |
| 
 | |
|     fakegocmd=trace pause; usb start; set autoload n; bootp;
 | |
|     trace calls 10000000 1000000;
 | |
|     tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls
 | |
| 
 | |
| This starts up USB (to talk to an attached USB Ethernet dongle), writes
 | |
| a trace log to address 10000000 and sends it to a host machine using
 | |
| TFTP. After this, U-Boot will boot the OS normally, albeit a little
 | |
| later.
 | |
| 
 | |
| 
 | |
| Converting Trace Output Data
 | |
| ----------------------------
 | |
| 
 | |
| The trace output data is kept in a binary format which is not documented
 | |
| here. To convert it into something useful, you can use proftool.
 | |
| 
 | |
| This tool must be given the U-Boot map file and the trace data received
 | |
| from running that U-Boot. It produces a text output file.
 | |
| 
 | |
| Options
 | |
| 
 | |
| -m <map_file>
 | |
|     Specify U-Boot map file
 | |
| 
 | |
| -p <trace_file>
 | |
|     Specify profile/trace file
 | |
| 
 | |
| Commands:
 | |
| 
 | |
| dump-ftrace
 | |
|     Write a text dump of the file in Linux ftrace format to stdout
 | |
| 
 | |
| 
 | |
| Viewing the Trace Data
 | |
| ----------------------
 | |
| 
 | |
| You can use pytimechart for this (sudo apt-get pytimechart might work on
 | |
| your Debian-style machine, and use your favourite search engine to obtain
 | |
| documentation). It expects the file to have a .txt extension. The program
 | |
| has terse user interface but is very convenient for viewing U-Boot
 | |
| profile information.
 | |
| 
 | |
| 
 | |
| Workflow Suggestions
 | |
| --------------------
 | |
| 
 | |
| The following suggestions may be helpful if you are trying to reduce boot
 | |
| time:
 | |
| 
 | |
| 1. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get
 | |
|    you are helpful overall snapshot of the boot time.
 | |
| 
 | |
| 2. Build U-Boot with tracing and run it. Note the difference in boot time
 | |
|    (it is common for tracing to add 10% to the time)
 | |
| 
 | |
| 3. Collect the trace information as described above. Use this to find where
 | |
|    all the time is being spent.
 | |
| 
 | |
| 4. Take a look at that code and see if you can optimize it. Perhaps it is
 | |
|    possible to speed up the initialization of a device, or remove an unused
 | |
|    feature.
 | |
| 
 | |
| 5. Rebuild, run and collect again. Compare your results.
 | |
| 
 | |
| 6. Keep going until you run out of steam, or your boot is fast enough.
 | |
| 
 | |
| 
 | |
| Configuring Trace
 | |
| -----------------
 | |
| 
 | |
| There are a few parameters in the code that you may want to consider.
 | |
| There is a function call depth limit (set to 15 by default). When the
 | |
| stack depth goes above this then no tracing information is recorded.
 | |
| The maximum depth reached is recorded and displayed by the 'trace stats'
 | |
| command.
 | |
| 
 | |
| 
 | |
| Future Work
 | |
| -----------
 | |
| 
 | |
| Tracing could be a little tidier in some areas, for example providing
 | |
| run-time configuration options for trace.
 | |
| 
 | |
| Some other features that might be useful:
 | |
| 
 | |
| - Trace filter to select which functions are recorded
 | |
| - Sample-based profiling using a timer interrupt
 | |
| - Better control over trace depth
 | |
| - Compression of trace information
 | |
| 
 | |
| 
 | |
| Simon Glass <sjg@chromium.org>
 | |
| April 2013
 |