|
| 1 | +=========================== |
| 2 | +Static Stack Usage Analysis |
| 3 | +=========================== |
| 4 | + |
| 5 | +Overview |
| 6 | +======== |
| 7 | + |
| 8 | +``tools/stackusage.py`` performs static stack usage analysis by reading |
| 9 | +DWARF ``.debug_frame`` data from an ELF file. It extracts per-function |
| 10 | +stack sizes from CFA (Canonical Frame Address) offsets and optionally |
| 11 | +builds a call graph via disassembly to compute worst-case total stack |
| 12 | +depth. |
| 13 | + |
| 14 | +- **Self** – stack bytes used by the function itself (max CFA offset). |
| 15 | +- **Total** – worst-case stack depth through the deepest call chain |
| 16 | + (self + callees). A marker prefix flags uncertain values. |
| 17 | + |
| 18 | +Dependencies |
| 19 | +============ |
| 20 | + |
| 21 | +The tool invokes standard toolchain binaries: |
| 22 | + |
| 23 | +- **readelf** – symbol table and DWARF frame info |
| 24 | +- **objdump** – disassembly for call graph analysis |
| 25 | +- **addr2line** – source file and line resolution |
| 26 | + |
| 27 | +Both GNU and LLVM toolchains are supported. Use ``-p`` to set the |
| 28 | +toolchain prefix (e.g. ``-p arm-none-eabi-`` for GCC, |
| 29 | +``-p llvm-`` for LLVM). |
| 30 | + |
| 31 | +The ELF must contain DWARF debug info (``-g`` or ``-gdwarf``). |
| 32 | +No special Kconfig option is needed. |
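
As a minimal sketch of how a prefix such as ``arm-none-eabi-`` or ``llvm-`` maps to tool names, the snippet below prepends the prefix to each binary name and looks it up on ``PATH``. The helper ``find_tools`` is hypothetical and not part of ``tools/stackusage.py``; it assumes the prefix is simply concatenated with the tool name, which is the usual toolchain naming convention.

.. code-block:: python

    import shutil

    # The three binaries listed above.
    TOOLS = ("readelf", "objdump", "addr2line")


    def find_tools(prefix=""):
        """Map each tool name to its resolved path, or None if not on PATH.

        Hypothetical helper: assumes the binary name is prefix + tool name.
        """
        return {name: shutil.which(prefix + name) for name in TOOLS}


    if __name__ == "__main__":
        for name, path in find_tools("arm-none-eabi-").items():
            print(f"{name}: {path or 'not found'}")

Running a check like this before the analysis can show whether a missing tool, rather than missing debug info, is the cause of a failure.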
| 33 | + |
| 34 | +Usage |
| 35 | +===== |
| 36 | + |
| 37 | +Analyze a native ELF (no prefix needed):: |
| 38 | + |
| 39 | + python3 tools/stackusage.py nuttx |
| 40 | + |
| 41 | +Cross-compiled ELF with GCC toolchain:: |
| 42 | + |
| 43 | + python3 tools/stackusage.py -p arm-none-eabi- nuttx |
| 44 | + |
| 45 | +Cross-compiled ELF with LLVM toolchain:: |
| 46 | + |
| 47 | + python3 tools/stackusage.py -p llvm- nuttx |
| 48 | + |
| 49 | +Show top 20 functions:: |
| 50 | + |
| 51 | + python3 tools/stackusage.py -p arm-none-eabi- -n 20 nuttx |
| 52 | + |
| 53 | +Assume a recursion depth of 10 for recursive cycles:: |
| 54 | + |
| 55 | + python3 tools/stackusage.py -p arm-none-eabi- -r 10 nuttx |
| 56 | + |
| 57 | +Command Line Options |
| 58 | +==================== |
| 59 | + |
| 60 | +.. code-block:: text |
| 61 | + |
| 62 | + positional arguments: |
| 63 | + elf path to ELF file with DWARF debug info |
| 64 | + |
| 65 | + options: |
| 66 | + -p, --prefix PREFIX toolchain prefix (e.g. arm-none-eabi- or llvm-) |
| 67 | + -n, --rank N show top N functions (default: 0 = all) |
| 68 | + -r, --recursion-depth N |
| 69 | + assumed recursion depth (default: 0) |
| 70 | + |
| 71 | +Text Output |
| 72 | +=========== |
| 73 | + |
| 74 | +The default output is an aligned table. Each function's deepest |
| 75 | +backtrace is shown with one frame per row. The ``Self`` column shows |
| 76 | +each frame's own stack cost. The ``Backtrace`` column shows the |
| 77 | +function name followed by its code size in parentheses (when available |
| 78 | +from the symbol table), e.g. ``main(128)``. The entry point of each |
| 79 | +call chain is suffixed with ``~``. |
| 80 | + |
| 81 | +Example (``nucleo-f429zi:trace``, ``-n 3``):: |
| 82 | + |
| 83 | + Total Self Backtrace File:Line |
| 84 | + ----- ---- --------------------------- ------------------------------------------- |
| 85 | + @2344 56 telnetd_main(236)~ apps/system/telnetd/telnetd.c:42 |
| 86 | + ^24 nsh_telnetmain(128) apps/nshlib/nsh_telnetd.c:48 |
| 87 | + ^48 nsh_session(400) apps/nshlib/nsh_session.c:73 |
| 88 | + ... |
| 89 | + @224 nsh_parse_cmdparm(1024) apps/nshlib/nsh_parse.c:2362 |
| 90 | + @96 nsh_execute(512) apps/nshlib/nsh_parse.c:510 |
| 91 | + ^56 nsh_builtin(320) apps/nshlib/nsh_builtin.c:76 |
| 92 | + 88 exec_builtin(256) apps/builtin/exec_builtin.c:61 |
| 93 | + ... |
| 94 | + ^64 file_vopen(192) nuttx/fs/vfs/fs_open.c:124 |
| 95 | + ... |
| 96 | + @2328 16 sh_main(64)~ apps/system/nsh/sh_main.c:40 |
| 97 | + 16 nsh_system_ctty(96) apps/nshlib/nsh_system.c:105 |
| 98 | + ^32 nsh_system_(160) apps/nshlib/nsh_system.c:41 |
| 99 | + ^48 nsh_session(400) apps/nshlib/nsh_session.c:73 |
| 100 | + ... |
| 101 | + @2312 24 nsh_main(80)~ apps/system/nsh/nsh_main.c:54 |
| 102 | + ^24 nsh_consolemain(48) apps/nshlib/nsh_consolemain.c:65 |
| 103 | + ^48 nsh_session(400) apps/nshlib/nsh_session.c:73 |
| 104 | + ... |
| 105 | + |
| 106 | +Markers prefixed to values in the Total and Self columns indicate the most |
| 107 | +significant reason for uncertainty; ``~`` is a suffix on the function name: |
| 108 | + |
| 109 | +======= ========================================== |
| 110 | +Marker Meaning |
| 111 | +======= ========================================== |
| 112 | +``~`` entry point of the call chain (suffix) |
| 113 | +``?`` no DWARF data (self counted as zero) |
| 114 | +``*`` dynamic stack (alloca or VLA) |
| 115 | +``@`` recursion detected |
| 116 | +``^`` indirect call (function pointer) |
| 117 | +======= ========================================== |
| 118 | + |
| 119 | +Uncertainty Reasons |
| 120 | +=================== |
| 121 | + |
| 122 | +====================================== ========================================= |
| 123 | +Reason Description |
| 124 | +====================================== ========================================= |
| 125 | +recursion: A->B->...->A Recursive cycle detected. Use ``-r N`` |
| 126 | + to estimate. |
| 127 | +indirect call (function pointer) Callee unknown at compile time. |
| 128 | +no DWARF data No ``.debug_frame`` entry; self counted |
| 129 | + as zero. |
| 130 | +dynamic stack (alloca/VLA) Function uses ``alloca()`` or |
| 131 | + variable-length arrays; self is a |
| 132 | + minimum. |
| 133 | +====================================== ========================================= |
| 134 | + |
| 135 | +Uncertainty propagates upward: if any callee in the deepest path is |
| 136 | +uncertain, the caller is also marked uncertain. |
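
The propagation rule can be sketched as a depth-first walk over the call graph. This is an illustration under stated assumptions, not the implementation in ``tools/stackusage.py``: the function ``total_stack`` and its inputs (``calls``, ``self_bytes``, ``flagged``) are hypothetical names, and a recursive back-edge is treated as contributing zero bytes while marking the path uncertain.

.. code-block:: python

    def total_stack(func, calls, self_bytes, flagged, active=frozenset()):
        """Return (total_bytes, is_uncertain) for the deepest chain from func.

        calls      -- dict: function name -> list of callee names
        self_bytes -- dict: function name -> self stack size in bytes
        flagged    -- set of functions whose own size is uncertain
        """
        if func in active:
            # Back-edge of a recursive cycle: zero bytes, but uncertain.
            return 0, True
        active = active | {func}
        # Pick the deepest callee chain; on a tie, prefer the uncertain one.
        deepest = max(
            (total_stack(c, calls, self_bytes, flagged, active)
             for c in calls.get(func, ())),
            default=(0, False),
        )
        total = self_bytes.get(func, 0) + deepest[0]
        return total, (func in flagged) or deepest[1]


    if __name__ == "__main__":
        calls = {"main": ["parse", "log"], "parse": ["exec"]}
        self_bytes = {"main": 16, "parse": 224, "exec": 96, "log": 64}
        # "exec" lies on the deepest path, so "main" inherits its uncertainty.
        print(total_stack("main", calls, self_bytes, flagged={"exec"}))

In this sketch only the deepest path decides the marker: flagging ``log`` instead of ``exec`` leaves ``main`` unmarked, because ``log`` is not part of the deepest chain.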
| 137 | + |
| 138 | +Recursion Depth Estimation |
| 139 | +========================== |
| 140 | + |
| 141 | +By default (``-r 0``), recursive back-edges contribute zero stack. |
| 142 | +With ``-r N`` (N > 0), the tool estimates the stack used by a cycle as:: |
| 143 | + |
| 144 | + cycle_body_cost × N |
| 145 | + |
| 146 | +For example, ``A(64) -> B(32) -> A`` (self sizes in bytes in parentheses):: |
| 147 | + |
| 148 | + cycle_body_cost = 64 + 32 = 96 |
| 149 | + -r 10 → 96 × 10 = 960 bytes |
| 150 | + |
| 151 | +The result is still marked uncertain, because ``N`` is an assumed depth. |
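
The arithmetic can be reproduced with a few lines of Python. This is only a sketch of the formula stated in this section; ``recursion_estimate`` is a hypothetical name, not a function from ``tools/stackusage.py``.

.. code-block:: python

    def recursion_estimate(cycle_self_sizes, depth):
        """Estimate stack bytes for a recursive cycle: cycle_body_cost * depth.

        cycle_self_sizes -- self stack sizes of the functions in the cycle
        depth            -- assumed recursion depth (the value given to -r)
        """
        if depth <= 0:
            # Default behaviour (-r 0): back-edges contribute zero stack.
            return 0
        cycle_body_cost = sum(cycle_self_sizes)
        return cycle_body_cost * depth


    if __name__ == "__main__":
        # A(64) -> B(32) -> A with -r 10
        print(recursion_estimate([64, 32], 10))  # 960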
| 152 | + |
| 153 | +Supported Architectures |
| 154 | +======================= |
| 155 | + |
| 156 | +The tool works with any architecture that the toolchain's ``readelf``, |
| 157 | +``objdump``, and ``addr2line`` can process. This includes |
| 158 | +ARM, AArch64, x86, x86_64, MIPS, RISC-V, Xtensa, PowerPC, SPARC, |
| 159 | +TriCore, SuperH, and others. |