OpenCSD - CoreSight Trace Decode Library  1.3.3
1 HOWTO - using the library with perf {#howto_perf}
2 ===================================
3 
4 @brief Using command line perf and OpenCSD to collect and decode trace.
5 
6 This HOWTO explains how to use the perf command line tools and the openCSD
7 library to collect and extract program flow traces generated by the
8 CoreSight IP blocks on a Linux system. The examples have been generated using
9 an aarch64 Juno-r0 platform.
10 
11 
12 On Target Trace Acquisition - Perf Record
13 -----------------------------------------
14 
15 Compile the perf tool from the same kernel source code version you are using with:
16 
17  make -C tools/perf
18 
19 This will yield a `perf` executable that will support CoreSight trace collection.
20 
21 *Note:* If traces are to be decoded **off** target, there is no need to download
22 and compile the openCSD library (on the target).
23 
24 If you are instead planning to use perf to record and decode the trace on the target,
25 compile the perf tool linking against the openCSD library, in the following way:
26 
27  make -C tools/perf VF=1 CORESIGHT=1
28 
29 Further information on the required build environment and options is given later
30 in the section **Off Target Perf Tools Compilation**.
31 
32 Before launching a trace run, a sink that will collect trace data needs to be
33 identified. All CoreSight blocks identified by the framework are registered in
34 sysfs:
35 
36 
37  linaro@linaro-nano:~$ ls /sys/bus/coresight/devices/
38  etm0 etm2 etm4 etm6 funnel0 funnel2 funnel4 stm0 tmc_etr0
39  etm1 etm3 etm5 etm7 funnel1 funnel3 replicator0 tmc_etf0
40 
41 
42 CoreSight blocks are listed in the device tree for a specific system and
43 discovered at boot time. Since tracers can be linked to more than one sink,
44 the sink that will receive trace data needs to be identified and given as an
45 option on the perf command line. Once a sink has been identified, trace collection
46 can start. An easy and yet interesting example is the `uname` command:
47 
48  linaro@linaro-nano:~/kernel$ ./tools/perf/perf record -e cs_etm/@tmc_etr0/ --per-thread uname
49 
50 This will generate a `perf.data` file where execution has been traced for both
51 user and kernel space. To narrow the field to either user or kernel space the
52 `u` and `k` options can be specified. For example the following will limit
53 traces to user space:
54 
55 
56  linaro@linaro-nano:~/kernel$ ./tools/perf/perf record -vvv -e cs_etm/@tmc_etr0/u --per-thread uname
57  Problems setting modules path maps, continuing anyway...
58  -----------------------------------------------------------
59  perf_event_attr:
60  type 8
61  size 112
62  { sample_period, sample_freq } 1
63  sample_type IP|TID|IDENTIFIER
64  read_format ID
65  disabled 1
66  exclude_kernel 1
67  exclude_hv 1
68  enable_on_exec 1
69  sample_id_all 1
70  ------------------------------------------------------------
71  sys_perf_event_open: pid 11375 cpu -1 group_fd -1 flags 0x8
72  ------------------------------------------------------------
73  perf_event_attr:
74  type 1
75  size 112
76  config 0x9
77  { sample_period, sample_freq } 1
78  sample_type IP|TID|IDENTIFIER
79  read_format ID
80  disabled 1
81  exclude_kernel 1
82  exclude_hv 1
83  mmap 1
84  comm 1
85  enable_on_exec 1
86  task 1
87  sample_id_all 1
88  mmap2 1
89  comm_exec 1
90  ------------------------------------------------------------
91  sys_perf_event_open: pid 11375 cpu -1 group_fd -1 flags 0x8
92  mmap size 266240B
93  AUX area mmap length 131072
94  perf event ring buffer mmapped per thread
95  Synthesizing auxtrace information
96  Linux
97  auxtrace idx 0 old 0 head 0x11ea0 diff 0x11ea0
98  [ perf record: Woken up 1 times to write data ]
99  overlapping maps:
100  7f99daf000-7f99db0000 0 [vdso]
101  7f99d84000-7f99db3000 0 /lib/aarch64-linux-gnu/ld-2.21.so
102  7f99d84000-7f99daf000 0 /lib/aarch64-linux-gnu/ld-2.21.so
103  7f99db0000-7f99db3000 0 /lib/aarch64-linux-gnu/ld-2.21.so
104  failed to write feature 8
105  failed to write feature 9
106  failed to write feature 14
107  [ perf record: Captured and wrote 0.072 MB perf.data ]
108 
109  linaro@linaro-nano:~/kernel$ ls -l ~/.debug/ perf.data
110  -rw------- 1 linaro linaro 77888 Mar 2 20:41 perf.data
111 
112  /home/linaro/.debug/:
113  total 16
114  drwxr-xr-x 2 linaro linaro 4096 Mar 2 20:40 [kernel.kallsyms]
115  drwxr-xr-x 2 linaro linaro 4096 Mar 2 20:40 [vdso]
116  drwxr-xr-x 3 linaro linaro 4096 Mar 2 20:40 bin
117  drwxr-xr-x 3 linaro linaro 4096 Mar 2 20:40 lib
118 
119 Trace data filtering
120 --------------------
121 The amount of trace generated by CoreSight tracers is staggering, even for
122 the simplest trace scenario. Reducing trace generation to specific areas
123 of interest is desirable to save trace buffer space and avoid getting lost in
124 the trace data that isn't relevant. Supplementing the 'k' and 'u' options
125 described above is the notion of address filters.
126 
127 On CoreSight two types of address filter have been implemented - address range
128 and start/stop filter:
129 
130 **Address range filters:**
131 With address range filters traces are generated if the instruction pointer
132 falls within the specified range. Any work done by the CPU outside of that
133 range will not be traced. Address range filters can be specified for both
134 user and kernel space sessions:
135 
136  perf record -e cs_etm/@tmc_etr0/k --filter 'filter 0xffffff8008562d0c/0x48' --per-thread uname
137 
138  perf record -e cs_etm/@tmc_etr0/u --filter 'filter 0x72c/0x40@/opt/lib/libcstest.so.1.0' --per-thread ./main
139 
140 When dealing with kernel space trace, addresses are typically taken from the
141 'System.map' file. In user space, addresses are relocatable and can be
142 extracted from an objdump output:
143 
144  $ aarch64-linux-gnu-objdump -d libcstest.so.1.0
145  ...
146  ...
147  000000000000072c <coresight_test1>: <------------ Beginning of traces
148  72c: d10083ff sub sp, sp, #0x20
149  730: b9000fe0 str w0, [sp,#12]
150  734: b9001fff str wzr, [sp,#28]
151  738: 14000007 b 754 <coresight_test1+0x28>
152  73c: b9400fe0 ldr w0, [sp,#12]
153  740: 11000800 add w0, w0, #0x2
154  744: b9000fe0 str w0, [sp,#12]
155  748: b9401fe0 ldr w0, [sp,#28]
156  74c: 11000400 add w0, w0, #0x1
157  750: b9001fe0 str w0, [sp,#28]
158  754: b9401fe0 ldr w0, [sp,#28]
159  758: 7100101f cmp w0, #0x4
160  75c: 54ffff0d b.le 73c <coresight_test1+0x10>
161  760: b9400fe0 ldr w0, [sp,#12]
162  764: 910083ff add sp, sp, #0x20
163  768: d65f03c0 ret
164  ...
165  ...
166 
167 Following the address, the number of bytes is specified and, if tracing in user
168 space, the full path to the binary (or library) being traced.
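
The byte count in a range filter is the span from the first instruction up to and including the last one. The sketch below works through that arithmetic for the `coresight_test1` listing above. It assumes fixed-size 4-byte A64 instructions and small user space offsets; `range_filter` is a hypothetical helper, not something perf provides.

```shell
# Hypothetical helper (not part of perf): build an address range filter
# string from the first and last instruction addresses of a function.
# Assumes fixed-size 4-byte A64 instructions, as in the objdump output above.
range_filter() {
    start=$1          # address of the first instruction, e.g. 0x72c
    last=$2           # address of the last instruction, e.g. 0x768 (the ret)
    object=${3:-}     # optional path to the binary or library (user space)

    # Size in bytes: the range has to cover the last instruction itself.
    size=$(( $last + 4 - $start ))

    if [ -n "$object" ]; then
        printf 'filter 0x%x/0x%x@%s\n' "$(( $start ))" "$size" "$object"
    else
        printf 'filter 0x%x/0x%x\n' "$(( $start ))" "$size"
    fi
}
```

With the addresses from the listing, `range_filter 0x72c 0x768 /opt/lib/libcstest.so.1.0` prints `filter 0x72c/0x40@/opt/lib/libcstest.so.1.0`, which is the filter used in the user space example.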
169 
170 **Start/Stop filters:**
171 With start/stop filters, traces are generated when the instruction pointer is
172 equal to the start address. Conversely, traces stop being generated when the
173 instruction pointer is equal to the stop address. Anything that happens between
174 these two events is traced:
175 
176  perf record -e cs_etm/@tmc_etr0/k --filter 'start 0xffffff800856bc50,stop 0xffffff800856bcb0' --per-thread uname
177 
178  perf record -vvv -e cs_etm/@tmc_etr0/u --filter 'start 0x72c@/opt/lib/libcstest.so.1.0, \
179  stop 0x40082c@/home/linaro/main' \
180  --per-thread ./main
181 
182 **Limitation on address filters:**
183 The only limitations on address filters are the number of address comparators
184 found on an implementation and the mutual exclusion between range and
185 start/stop filters. As such the following example would _not_ work:
186 
187  perf record -e cs_etm/@tmc_etr0/k --filter 'start 0xffffff800856bc50,stop 0xffffff800856bcb0, \ // start/stop
188  filter 0x72c/0x40@/opt/lib/libcstest.so.1.0' \ // address range
189  --per-thread uname
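
A filter string can be checked for this mutual exclusion before it is handed to `perf record`. The sketch below is a hypothetical pre-flight check, not a perf feature. It assumes entries are comma separated and start with `filter`, `start` or `stop`, as in the examples above, and it does not check the number of available comparators.

```shell
# Hypothetical pre-flight check (not part of perf): reject a --filter
# string that mixes address range ("filter") and start/stop entries.
check_filter_kinds() {
    spec=$1
    has_range=0
    has_startstop=0

    # Entries are comma separated; look at the first word of each one.
    old_ifs=$IFS
    IFS=,
    for entry in $spec; do
        # Strip the blanks that may follow a comma.
        entry=${entry#"${entry%%[! ]*}"}
        case $entry in
            filter\ *)        has_range=1 ;;
            start\ *|stop\ *) has_startstop=1 ;;
        esac
    done
    IFS=$old_ifs

    if [ "$has_range" -eq 1 ] && [ "$has_startstop" -eq 1 ]; then
        echo "error: range and start/stop filters cannot be combined" >&2
        return 1
    fi
    return 0
}
```

The function returns non-zero for the combined example above and zero for either of the two filter kinds on its own.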
190 
191 Additional Trace Options
192 ------------------------
193 Additional options can be used during trace collection that add information to the captured trace.
194 
195 - Timestamps: These packets are added to the trace streams to allow correlation of different sources where tools support this.
196 - Cycle Counts: These packets are added to get a count of cycles for blocks of executed instructions. Adding cycle counts will considerably increase the amount of generated trace.
197 The relationship between cycle counts and executed instructions differs according to the trace protocol.
198 For example, the ETMv4 protocol will emit counts for groups of instructions according to a minimum count threshold.
199 Presently this threshold is fixed at 256 cycles for `perf record`.
200 
201 Command line options in `perf record` to use these features are part of the options for the `cs_etm` event:
202 
203  perf record -e cs_etm/timestamp,cycacc,@tmc_etr0/ --per-thread uname
204 
205 In the current version, `perf record` and `perf script` do not use this additional information.
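
The event strings used so far share one pattern: optional comma-separated options, the sink name prefixed with `@`, then an optional `u` or `k` modifier. As a rough sketch, a small helper can assemble them. The name `cs_etm_event` is hypothetical, and the helper does not validate option names.

```shell
# Hypothetical helper (not part of perf): assemble a cs_etm event string.
#   $1  sink name, e.g. tmc_etr0
#   $2  optional modifier: u (user space), k (kernel space), empty for both
#   $3  optional comma-separated options, e.g. timestamp,cycacc
cs_etm_event() {
    sink=$1
    modifier=${2:-}
    options=${3:-}

    case $modifier in
        ''|u|k) ;;
        *) echo "error: modifier must be u, k or empty" >&2; return 1 ;;
    esac

    # Options come first, followed by the sink prefixed with '@'.
    printf 'cs_etm/%s@%s/%s\n' "${options:+$options,}" "$sink" "$modifier"
}
```

For example, `perf record -e "$(cs_etm_event tmc_etr0 u)" --per-thread uname` expands to the user space command shown earlier.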
206 
207 The cs_etm perf event
208 ---------------------
209 
210 System information for this perf pmu event can be found at:
211 
212  /sys/devices/cs_etm
213 
214 This contains the internal format of the parameters described above:
215 
216  root@linaro-developer:~# ls /sys/devices/cs_etm/format
217  contextid cycacc retstack sinkid timestamp
218 
219 and names of registered sinks:
220 
221  root@linaro-developer:~# ls /sys/devices/cs_etm/sinks
222  tmc_etf0 tmc_etr0 tpiu0
223 
224 Note: The `sinkid` parameter is there to document the usage of a 32-bit internal parameter to
225 pass the sink name used in the cs_etm/@sink/ command to the kernel drivers. It can be used
226 directly as cs_etm/sinkid=<hash_value>/ but this is not recommended as the values used are
227 considered opaque and subject to changes.
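
Since the registered sinks are visible by name, a script can confirm that a sink exists before starting a capture. This is a minimal sketch; `sink_is_registered` is a hypothetical helper, and the second argument exists only so the check can be pointed at another directory.

```shell
# Hypothetical pre-flight check: confirm that a sink name is registered
# with the cs_etm PMU before passing it to perf record.
#   $1  sink name, e.g. tmc_etr0
#   $2  sinks directory (defaults to the sysfs path shown above)
sink_is_registered() {
    sink=$1
    sinks_dir=${2:-/sys/devices/cs_etm/sinks}

    if [ -e "$sinks_dir/$sink" ]; then
        return 0
    fi
    echo "error: sink '$sink' not found in $sinks_dir" >&2
    return 1
}
```

A capture script could then run `sink_is_registered tmc_etr0 && perf record -e cs_etm/@tmc_etr0/ --per-thread uname`.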
228 
229 On Target Trace Collection
230 --------------------------
231 The entire program flow will have been recorded in the `perf.data` file.
232 Information about libraries and executables is stored under `$HOME/.debug`:
233 
234  linaro@linaro-nano:~/kernel$ tree ~/.debug
235  .debug
236  ├── [kernel.kallsyms]
237  │   └── 0542921808098d591a7acba5a1163e8991897669
238  │   └── kallsyms
239  ├── [vdso]
240  │   └── 551fbbe29579eb63be3178a04c16830b8d449769
241  │   └── vdso
242  ├── bin
243  │   └── uname
244  │   └── ed95e81f97c4471fb2ccc21e356b780eb0c92676
245  │   └── elf
246  └── lib
247  └── aarch64-linux-gnu
248  ├── ld-2.21.so
249  │   └── 94912dc5a1dc8c7ef2c4e4649d4b1639b6ebc8b7
250  │   └── elf
251  └── libc-2.21.so
252  └── 169a143e9c40cfd9d09695333e45fd67743cd2d6
253  └── elf
254 
255  13 directories, 5 files
256  linaro@linaro-nano:~/kernel$
257 
258 
259 All this information needs to be collected in order to successfully decode
260 traces off target:
261 
262  linaro@linaro-nano:~/kernel$ tar czf uname.trace.tgz perf.data ~/.debug
263 
264 
265 Note that the `vmlinux` file should also be added to the bundle if kernel traces
266 have also been collected.
267 
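The bundling step, including the optional `vmlinux`, can be wrapped as follows. This is a sketch under stated assumptions: `bundle_trace` is a hypothetical helper, it is run from the directory holding `perf.data`, and the buildid directory is the default `~/.debug`.

```shell
# Hypothetical helper: bundle a capture for off-target decoding.
#   $1  output archive, e.g. uname.trace.tgz
#   $2  optional vmlinux image, needed only when kernel space was traced
# Run from the directory that contains perf.data; assumes the default
# buildid directory ~/.debug.
bundle_trace() {
    archive=$1
    vmlinux=${2:-}

    if [ ! -f perf.data ]; then
        echo "error: perf.data not found in $(pwd)" >&2
        return 1
    fi

    if [ -n "$vmlinux" ]; then
        tar czf "$archive" perf.data "$HOME/.debug" "$vmlinux"
    else
        tar czf "$archive" perf.data "$HOME/.debug"
    fi
}
```

For a kernel space capture this would be called as `bundle_trace uname.trace.tgz ./vmlinux`.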
268 
269 Off Target OpenCSD Compilation
270 ------------------------------
271 The openCSD library is not part of the perf tools. It is available on
272 [github][1] and needs to be compiled before the perf tools. Check out the
273 required branch/tag version into a local directory.
274 
275  linaro@t430:~/linaro/coresight$ git clone https://github.com/Linaro/OpenCSD.git my-opencsd
276  Cloning into 'my-opencsd'...
277  remote: Counting objects: 2063, done.
278  remote: Total 2063 (delta 0), reused 0 (delta 0), pack-reused 2063
279  Receiving objects: 100% (2063/2063), 2.51 MiB | 1.24 MiB/s, done.
280  Resolving deltas: 100% (1399/1399), done.
281  Checking connectivity... done.
282  linaro@t430:~/linaro/coresight$ ls my-opencsd
283  decoder LICENSE README.md HOWTO.md TODO
284 
285 Once the source code has been acquired, compilation of the openCSD library can
286 take place. For Linux two options are available, LINUX and LINUX64, based on
287 the host's (which has nothing to do with the target) architecture:
288 
289  linaro@t430:~/linaro/coresight/$ cd my-opencsd/decoder/build/linux/
290  linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ ls
291  makefile rctdl_c_api_lib ref_trace_decode_lib
292 
293  linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ make LINUX64=1 DEBUG=1
294  ...
295  ...
296 
297  linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ ls ../../lib/linux64/dbg/
298  libopencsd.a libopencsd_c_api.a libopencsd_c_api.so libopencsd.so
299 
300 From there the header files and libraries need to be installed on the system,
301 something that requires root privileges. The default installation path is
302 /usr/include/opencsd for the header files and /usr/lib/ for the libraries:
303 
304  linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ sudo make install
305  linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ ls -l /usr/include/opencsd
306  total 60
307  drwxr-xr-x 2 root root 4096 Dec 12 10:19 c_api
308  drwxr-xr-x 2 root root 4096 Dec 12 10:19 etmv3
309  drwxr-xr-x 2 root root 4096 Dec 12 10:19 etmv4
310  -rw-r--r-- 1 root root 28049 Dec 12 10:19 ocsd_if_types.h
311  drwxr-xr-x 2 root root 4096 Dec 12 10:19 ptm
312  drwxr-xr-x 2 root root 4096 Dec 12 10:19 stm
313  -rw-r--r-- 1 root root 7264 Dec 12 10:19 trc_gen_elem_types.h
314  -rw-r--r-- 1 root root 3972 Dec 12 10:19 trc_pkt_types.h
315 
316  linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ ls -l /usr/lib/libopencsd*
317  -rw-r--r-- 1 root root 598720 Dec 12 10:19 /usr/lib/libopencsd_c_api.so
318  -rw-r--r-- 1 root root 4692200 Dec 12 10:19 /usr/lib/libopencsd.so
319 
320 A "clean_install" target is also available so that openCSD installed files can
321 be removed from a system. Going forward the goal is to have the openCSD library
322 packaged as a Debian or RPM archive so that it can be installed from a
323 distribution without having to be compiled.
324 
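After `make install`, the presence of the files listed above can be verified with a short check. This is a sketch only: `check_opencsd_install` is a hypothetical helper, and it tests three representative files rather than the full installation.

```shell
# Hypothetical check: confirm that representative openCSD files from the
# listings above are present under an installation prefix.
#   $1  installation prefix (defaults to /usr)
check_opencsd_install() {
    prefix=${1:-/usr}
    missing=0

    for f in \
        "$prefix/include/opencsd/ocsd_if_types.h" \
        "$prefix/lib/libopencsd.so" \
        "$prefix/lib/libopencsd_c_api.so"
    do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_opencsd_install` with no argument checks the default `/usr` locations and returns non-zero if any of the files is absent.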
325 
326 Off Target Perf Tools Compilation
327 ---------------------------------
328 
329 As mentioned above the openCSD library is not part of the perf tools' code base
330 and needs to be installed on a system prior to compilation. Information about
331 the status of the openCSD library on a system is given at compile time by the
332 perf tools build script:
333 
334  linaro@t430:~/linaro/linux-kernel$ make CORESIGHT=1 VF=1 -C tools/perf
335  Auto-detecting system features:
336  ... dwarf: [ on ]
337  ... dwarf_getlocations: [ on ]
338  ... glibc: [ on ]
339  ... gtk2: [ on ]
340  ... libaudit: [ on ]
341  ... libbfd: [ OFF ]
342  ... libelf: [ on ]
343  ... libnuma: [ OFF ]
344  ... numa_num_possible_cpus: [ OFF ]
345  ... libperl: [ on ]
346  ... libpython: [ on ]
347  ... libslang: [ on ]
348  ... libcrypto: [ on ]
349  ... libunwind: [ OFF ]
350  ... libdw-dwarf-unwind: [ on ]
351  ... zlib: [ on ]
352  ... lzma: [ OFF ]
353  ... get_cpuid: [ on ]
354  ... bpf: [ on ]
355  ... libopencsd: [ on ] <-------
356 
357 
358 At the end of the compilation a new perf binary is available in `tools/perf/`:
359 
360  linaro@t430:~/linaro/linux-kernel$ ldd tools/perf/perf
361  linux-vdso.so.1 => (0x00007fff135db000)
362  libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f15f9176000)
363  librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f15f8f6e000)
364  libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f15f8c64000)
365  libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f15f8a60000)
366  libopencsd_c_api.so => /usr/lib/libopencsd_c_api.so (0x00007f15f884e000) <-------
367  libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f15f8635000)
368  libdw.so.1 => /usr/lib/x86_64-linux-gnu/libdw.so.1 (0x00007f15f83ec000)
369  libaudit.so.1 => /lib/x86_64-linux-gnu/libaudit.so.1 (0x00007f15f81c5000)
370  libslang.so.2 => /lib/x86_64-linux-gnu/libslang.so.2 (0x00007f15f7e38000)
371  libperl.so.5.22 => /usr/lib/x86_64-linux-gnu/libperl.so.5.22 (0x00007f15f7a5d000)
372  libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f15f7693000)
373  libpython2.7.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0 (0x00007f15f7104000)
374  libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f15f6eea000)
375  /lib64/ld-linux-x86-64.so.2 (0x0000559b88038000)
376  libopencsd.so => /usr/lib/libopencsd.so (0x00007f15f6c62000) <-------
377  libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f15f68df000)
378  libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f15f66c9000)
379  liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f15f64a6000)
380  libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f15f6296000)
381  libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f15f605e000)
382  libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f15f5e5a000)
383 
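The two marked lines in the `ldd` output are what confirm that perf was linked against openCSD. A hedged sketch of automating that check follows; `linked_opencsd_libs` is a hypothetical helper that reads `ldd` output on standard input.

```shell
# Hypothetical check: list the openCSD libraries a perf binary links
# against. Reads `ldd` output on stdin and succeeds only if both
# libopencsd and libopencsd_c_api are resolved to a path.
linked_opencsd_libs() {
    awk '
        $2 == "=>" && $3 != "not" {
            if ($1 ~ /^libopencsd\.so/)       { core = 1; print $1, $3 }
            if ($1 ~ /^libopencsd_c_api\.so/) { capi = 1; print $1, $3 }
        }
        END { exit (core && capi) ? 0 : 1 }
    '
}
```

Typical use is `ldd tools/perf/perf | linked_opencsd_libs`, which prints each openCSD library with its resolved path.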
384 
385 Additional debug output from the decoder can be compiled in by setting the
386 `CSTRACE_RAW` environment variable. Setting this to `packed` gets trace frame
387 output as follows:
388 
389  Frame Data; Index 576; RAW_PACKED; d6 d6 d6 d6 d6 d6 d6 d6 fc fb d6 d6 d6 d6 e0 7f
390  Frame Data; Index 576; ID_DATA[0x14]; d7 d6 d7 d6 d7 d6 d7 d6 fd fb d7 d6 d7 d6 e0
391 
392 Setting it to any other value removes the RAW_PACKED lines.
393 
394 Working with an alternate version of the openCSD library
395 --------------------------------------------------------
396 When compiling the perf tools it is possible to reference another version of
397 the openCSD library than the one installed on the system. This is useful when
398 working with multiple development trees or when wanting to keep system
399 libraries intact. Two environment variables are available to tell the perf tools
400 build script where to get the header files and libraries, namely CSINCLUDES and
401 CSLIBS:
402 
403  linaro@t430:~/linaro/linux-kernel$ export CSINCLUDES=~/linaro/coresight/my-opencsd/decoder/include/
404  linaro@t430:~/linaro/linux-kernel$ export CSLIBS=~/linaro/coresight/my-opencsd/decoder/lib/builddir/
405  linaro@t430:~/linaro/linux-kernel$ make CORESIGHT=1 VF=1 -C tools/perf
406 
407 This will have the effect of compiling and linking against the provided library.
408 Since the system's openCSD library is in the loader's search path, the
409 LD_LIBRARY_PATH environment variable needs to be set so that the alternate library is loaded instead:
410 
411  linaro@t430:~/linaro/linux-kernel$ export LD_LIBRARY_PATH=$CSLIBS
412 
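The three variables can be set together from the root of an openCSD checkout. This is a sketch only: `use_local_opencsd` is a hypothetical helper, and the `lib/builddir` path simply mirrors the example above, so adjust it to the actual build output directory.

```shell
# Hypothetical helper: point the perf build and the loader at a local
# openCSD tree instead of the system-wide installation.
#   $1  root of the openCSD checkout, e.g. ~/linaro/coresight/my-opencsd
# The lib/builddir path mirrors the example above; adjust as needed.
use_local_opencsd() {
    root=$1

    CSINCLUDES=$root/decoder/include/
    CSLIBS=$root/decoder/lib/builddir/
    # Prepend so that the local library takes precedence at run time.
    LD_LIBRARY_PATH=$CSLIBS${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
    export CSINCLUDES CSLIBS LD_LIBRARY_PATH
}
```

Unlike the plain `export LD_LIBRARY_PATH=$CSLIBS` above, this variant prepends to any existing value rather than replacing it, which keeps other entries in the search path usable.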
413 
414 Trace Decoding with Perf Report
415 -------------------------------
416 Before working with custom traces it is suggested to use a trace bundle that
417 is known to be working properly. A sample bundle has been made available
418 here [2]. Trace bundles can be extracted anywhere and have no dependencies on
419 where the perf tools and openCSD library have been compiled.
420 
421  linaro@t430:~/linaro/coresight$ mkdir sept20
422  linaro@t430:~/linaro/coresight$ cd sept20
423  linaro@t430:~/linaro/coresight/sept20$ wget http://people.linaro.org/~mathieu.poirier/openCSD/uname.v4.user.sept20.tgz
424  linaro@t430:~/linaro/coresight/sept20$ md5sum uname.v4.user.sept20.tgz
425  f53f11d687ce72bdbe9de2e67e960ec6 uname.v4.user.sept20.tgz
426  linaro@t430:~/linaro/coresight/sept20$ tar xf uname.v4.user.sept20.tgz
427  linaro@t430:~/linaro/coresight/sept20$ ls -la
428  total 1312
429  drwxrwxr-x 3 linaro linaro 4096 Mar 3 10:26 .
430  drwxrwxr-x 5 linaro linaro 4096 Mar 3 10:13 ..
431  drwxr-xr-x 7 linaro linaro 4096 Feb 24 12:21 .debug
432  -rw------- 1 linaro linaro 78016 Feb 24 12:21 perf.data
433  -rw-rw-r-- 1 linaro linaro 1245881 Feb 24 12:25 uname.v4.user.sept20.tgz
434 
435 Perf expects the files related to the trace capture (`perf.data`) to be located in the `buildid` directory.
436 By default this is under `~/.debug`. Alternatively the default `buildid` directory can be changed
437 using the command:
438 
439  perf config --system buildid.dir=/my/own/buildid/dir
440 
441 This example will remove the current `~/.debug` directory to be sure everything is clean.
442 
443  linaro@t430:~/linaro/coresight/sept20$ rm -rf ~/.debug
444  linaro@t430:~/linaro/coresight/sept20$ cp -dpR .debug ~/
445  linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf report --stdio
446 
447  # To display the perf.data header info, please use --header/--header-only options.
448  #
449  #
450  # Total Lost Samples: 0
451  #
452  # Samples: 0 of event 'cs_etm//u'
453  # Event count (approx.): 0
454  #
455  # Children Self Command Shared Object Symbol
456  # ........ ........ ....... ............. ......
457  #
458 
459 
460  # Samples: 0 of event 'dummy:u'
461  # Event count (approx.): 0
462  #
463  # Children Self Command Shared Object Symbol
464  # ........ ........ ....... ............. ......
465  #
466 
467 
468  # Samples: 115K of event 'instructions:u'
469  # Event count (approx.): 522009
470  #
471  # Children Self Command Shared Object Symbol
472  # ........ ........ ....... ................ ......................
473  #
474  4.13% 4.13% uname libc-2.21.so [.] 0x0000000000078758
475  3.81% 3.81% uname libc-2.21.so [.] 0x0000000000078e50
476  2.06% 2.06% uname libc-2.21.so [.] 0x00000000000fcaf4
477  1.65% 1.65% uname libc-2.21.so [.] 0x00000000000fcae4
478  1.59% 1.59% uname ld-2.21.so [.] 0x000000000000a7f4
479  1.50% 1.50% uname libc-2.21.so [.] 0x0000000000078e40
480  1.43% 1.43% uname libc-2.21.so [.] 0x00000000000fcac4
481  1.31% 1.31% uname libc-2.21.so [.] 0x000000000002f0c0
482  1.26% 1.26% uname ld-2.21.so [.] 0x0000000000016888
483  1.24% 1.24% uname libc-2.21.so [.] 0x0000000000078e7c
484  1.24% 1.24% uname libc-2.21.so [.] 0x00000000000fcab8
485  ...
486 
487 A dump of the received trace packets can be obtained using the command:
488 
489  mjl@ubuntu-vbox:./perf-opencsd-master/coresight/tools/perf/perf report --stdio --dump
490 
491 This results in a large amount of data, with the trace looking like:
492 
493  0x618 [0x30]: PERF_RECORD_AUXTRACE size: 0x11ef0 offset: 0 ref: 0x4d881c1f13216016 idx: 0 tid: 15244 cpu: -1
494 
495  . ... CoreSight ETM Trace data: size 73456 bytes
496 
497  0: I_ASYNC : Alignment Synchronisation.
498  12: I_TRACE_INFO : Trace Info.
499  17: I_TRACE_ON : Trace On.
500  18: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000007F89F24D80; Ctxt: AArch64,EL0, NS;
501  28: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
502  29: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
503  30: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
504  32: I_ATOM_F6 : Atom format 6.; EEEEN
505  33: I_ATOM_F1 : Atom format 1.; E
506  34: I_EXCEPT : Exception.; Data Fault; Ret Addr Follows;
507  36: I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000007F89F2832C;
508  45: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0xFFFFFFC000083400; Ctxt: AArch64,EL1, NS;
509  56: I_TRACE_ON : Trace On.
510  57: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000007F89F2832C; Ctxt: AArch64,EL0, NS;
511  68: I_ATOM_F3 : Atom format 3.; NEE
512  69: I_ATOM_F3 : Atom format 3.; NEN
513  70: I_ATOM_F3 : Atom format 3.; NNE
514  71: I_ATOM_F5 : Atom format 5.; ENENE
515  72: I_ATOM_F5 : Atom format 5.; NENEN
516  73: I_ATOM_F5 : Atom format 5.; ENENE
517  74: I_ATOM_F5 : Atom format 5.; NENEN
518  75: I_ATOM_F5 : Atom format 5.; ENENE
519  76: I_ATOM_F3 : Atom format 3.; NNE
520  77: I_ATOM_F3 : Atom format 3.; NNE
521  78: I_ATOM_F3 : Atom format 3.; NNE
522  80: I_ATOM_F3 : Atom format 3.; NNE
523  81: I_ATOM_F3 : Atom format 3.; ENN
524  82: I_EXCEPT : Exception.; Data Fault; Ret Addr Follows;
525  84: I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000007F89F283F0;
526  93: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0xFFFFFFC000083400; Ctxt: AArch64,EL1, NS;
527  104: I_TRACE_ON : Trace On.
528  105: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000007F89F283F0; Ctxt: AArch64,EL0, NS;
529  116: I_ATOM_F5 : Atom format 5.; NNNNN
530  117: I_ATOM_F5 : Atom format 5.; NNNNN
531 
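Because the dump is large, a per-type packet count is often a quicker first look than reading it line by line. The sketch below assumes packet lines have the form `<index>: I_<TYPE> : <description>`, as in the excerpt above; `count_packets` is a hypothetical helper.

```shell
# Hypothetical helper: count ETMv4 packets by type in a dump.
# Reads `perf report --stdio --dump` output on stdin and assumes packet
# lines look like "<index>: I_<TYPE> : <description>", as shown above.
count_packets() {
    awk '
        $1 ~ /^[0-9]+:$/ && $2 ~ /^I_/ { count[$2]++ }
        END { for (type in count) print count[type], type }
    ' | sort -k1,1nr -k2,2
}
```

It can be fed directly from perf, for instance `perf report --stdio --dump | count_packets`, and prints one line per packet type with the most frequent first.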
532 
533 Trace Decoding with Perf Script
534 -------------------------------
535 Working with perf scripts needs more command line options but yields
536 interesting results.
537 
538  linaro@t430:~/linaro/coresight/sept20$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-master/tools/perf/
539  linaro@t430:~/linaro/coresight/sept20$ export SCRIPT_PATH=$EXEC_PATH/scripts/python/
540  linaro@t430:~/linaro/coresight/sept20$ export XTOOL_PATH=/your/aarch64/toolchain/path/bin/
541  linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump
542 
543  7f89f24d80: 910003e0 mov x0, sp
544  7f89f24d84: 94000d53 bl 7f89f282d0 <free@plt+0x3790>
545  7f89f282d0: d11203ff sub sp, sp, #0x480
546  7f89f282d4: a9ba7bfd stp x29, x30, [sp,#-96]!
547  7f89f282d8: 910003fd mov x29, sp
548  7f89f282dc: a90363f7 stp x23, x24, [sp,#48]
549  7f89f282e0: 9101e3b7 add x23, x29, #0x78
550  7f89f282e4: a90573fb stp x27, x28, [sp,#80]
551  7f89f282e8: a90153f3 stp x19, x20, [sp,#16]
552  7f89f282ec: aa0003fb mov x27, x0
553  7f89f282f0: 910a82e1 add x1, x23, #0x2a0
554  7f89f282f4: a9025bf5 stp x21, x22, [sp,#32]
555  7f89f282f8: a9046bf9 stp x25, x26, [sp,#64]
556  7f89f282fc: 910102e0 add x0, x23, #0x40
557  7f89f28300: f800841f str xzr, [x0],#8
558  7f89f28304: eb01001f cmp x0, x1
559  7f89f28308: 54ffffc1 b.ne 7f89f28300 <free@plt+0x37c0>
560  7f89f28300: f800841f str xzr, [x0],#8
561  7f89f28304: eb01001f cmp x0, x1
562  7f89f28308: 54ffffc1 b.ne 7f89f28300 <free@plt+0x37c0>
563  7f89f28300: f800841f str xzr, [x0],#8
564  7f89f28304: eb01001f cmp x0, x1
565  7f89f28308: 54ffffc1 b.ne 7f89f28300 <free@plt+0x37c0>
566 
567 Kernel Trace Decoding
568 ---------------------
569 
570 When dealing with kernel space traces the vmlinux file has to be communicated
571 explicitly to perf using the "--vmlinux" command line option:
572 
573  linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf report --stdio --vmlinux=./vmlinux
574  ...
575  ...
576  linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf script --vmlinux=./vmlinux
577 
578 When using scripts things get a little more convoluted. Using the same example
579 as above but for kernel traces, the command line becomes:
580 
581  linaro@t430:~/linaro/coresight/sept20$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-master/tools/perf/
582  linaro@t430:~/linaro/coresight/sept20$ export SCRIPT_PATH=$EXEC_PATH/scripts/python/
583  linaro@t430:~/linaro/coresight/sept20$ export XTOOL_PATH=/your/aarch64/toolchain/path/bin/
584  linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf --exec-path=${EXEC_PATH} script \
585  --vmlinux=./vmlinux \
586  --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- \
587  -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump \
588  -k ./vmlinux
589  ...
590  ...
591 
592 The option "--vmlinux=./vmlinux" is interpreted by the "perf script" command
593 the same way it is for "perf report". The option "-k ./vmlinux" is dependent
594 on the script being executed and is unrelated to "--vmlinux", though it
595 is highly advised to keep them synchronized.
596 
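One way to keep the two options synchronized is to take the `vmlinux` path once and pass it to both. The wrapper below is a sketch, not part of the perf tools: `perf_kernel_disasm` and the `PERF` override are hypothetical names, and `EXEC_PATH`, `SCRIPT_PATH` and `XTOOL_PATH` are expected to be set as in the example above.

```shell
# Hypothetical wrapper: run the disassembly script on kernel traces,
# taking the vmlinux path once so that --vmlinux and -k cannot diverge.
#   $1  path to vmlinux
# Relies on EXEC_PATH, SCRIPT_PATH and XTOOL_PATH being set as above.
# PERF is an assumed override for the perf binary to run.
perf_kernel_disasm() {
    vmlinux=$1
    perf_bin=${PERF:-perf}

    "$perf_bin" --exec-path="$EXEC_PATH" script \
        --vmlinux="$vmlinux" \
        --script=python:"$SCRIPT_PATH"/cs-trace-disasm.py -- \
        -d "$XTOOL_PATH"/aarch64-linux-gnu-objdump \
        -k "$vmlinux"
}
```

Setting `PERF=../perf-opencsd-master/tools/perf/perf` before calling `perf_kernel_disasm ./vmlinux` reproduces the command line shown above.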
597 
598 Perf Test Environment Scripts
599 -----------------------------
600 
601 The decoder library comes with a number of `bash` scripts that ease the setting up of the
602 offline build and test environment for perf, and executing tests.
603 
604 These scripts can be found in
605 
606  decoder/tests/perf-test-scripts
607 
608 There are three scripts provided:
609 
610 - `perf-setup-env.bash` : this sets up all the environment variables mentioned above.
611 - `perf-test-report.bash` : this runs `perf report` - using the environment setup by `perf-setup-env.bash`
612 - `perf-test-script.bash` : this runs `perf script` - using the environment setup by `perf-setup-env.bash`
613 
614 Use as follows:
615 
616 1. Prior to building perf, edit `perf-setup-env.bash` to conform to your environment. There are four lines at the top of the file that will require editing.
617 
618 2. Execute the script using the command:
619 
620  source perf-setup-env.bash
621 
622  This will set up a perf execute environment for using the perf report and script commands.
623 
624  Alternatively use the command:
625 
626  source perf-setup-env.bash buildenv
627 
628  This will add in the build environment variables mentioned in the sections on building above alongside the
629  environment used by the `perf-test...` scripts to run the tests.
630 
631 3. Build perf as described above.
632 4. Follow the instructions for downloading the test capture, or create a capture from your target.
633 5. Copy the `perf-test...` scripts into the capture data directory, the one that contains `perf.data`.
634 
635 6. The scripts can now be run. No options are required for the default operation, but any command line options will be added to the perf report / perf script command line.
636 
637 e.g.
638 
639  ./perf-test-report.bash --dump
640 
641 will add the --dump option to the end of the command line and run
642 
643  ${PERF_EXEC_PATH}/perf report --stdio --dump
644 
645 
646 Generating coverage files for Feedback Directed Optimization: AutoFDO
647 ---------------------------------------------------------------------
648 
649 See autofdo.md (@ref AutoFDO) for details and scripts.
650 
651 
652 The Linaro CoreSight Team
653 -------------------------
654 - Mike Leach
655 - Mathieu Poirier
656 
657 
658 One Last Thing
659 --------------
660 We welcome help on this project. If you would like to add features or help
661 improve the way things work, we want to hear from you.
662 
663 Best regards,
664 *The Linaro CoreSight Team*
665 
666 --------------------------------------
667 [1]: https://github.com/Linaro/OpenCSD
668 
669 [2]: http://people.linaro.org/~mathieu.poirier/openCSD/uname.v4.user.sept20.tgz