diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-08-14 09:22:11 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-08-14 09:22:11 -0700 |
| commit | 96f86ff08332d88defd35c330fc6dae219b9e264 (patch) | |
| tree | 1122e9ac32022cc7c5a365c2ab6ddf116fbd3b15 /tools/perf/Documentation/perf-c2c.txt | |
| parent | d785610f052d7456497cdec2a2406f6d4b16569f (diff) | |
| parent | 7391db6459388d47d657aad633cb55fc04a8d4fb (diff) | |
| download | linux-96f86ff08332d88defd35c330fc6dae219b9e264.tar.gz linux-96f86ff08332d88defd35c330fc6dae219b9e264.tar.bz2 linux-96f86ff08332d88defd35c330fc6dae219b9e264.zip | |
Merge tag 'perf-tools-fixes-for-v6.0-2022-08-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull more perf tool updates from Arnaldo Carvalho de Melo:
- 'perf c2c' now supports ARM64, adjust its output to cope with
differences with what is in x86_64. Now go find false sharing on
ARM64 (at least Neoverse) as well!
- Refactor the JSON processing, making the output more compact and thus
reducing the size of the resulting perf binary
- Improvements for 'perf offcpu' profiling, including tracking child
processes
- Update Intel JSON metrics and events files for broadwellde,
broadwellx, cascadelakex, haswellx, icelakex, ivytown, jaketown,
knightslanding, sapphirerapids, skylakex and snowridgex
- Add 'perf stat' JSON output and a 'perf test' entry for it
- Ignore memfd and anonymous mmap events if jitdump present
- Refactor 'perf test' shell tests allowing subdirs
- Fix an error handling path in 'parse_perf_probe_command()'
- Fixes for the guest Intel PT tracing patchkit in the 1st batch of
this merge window
- Print debuginfod queries if -v option is used, to explain delays in
processing when debuginfo servers are enabled to fetch DSOs with
richer symbol tables
- Improve error message for 'perf record -p not_existing_pid'
- Fix openssl and libbpf feature detection
- Add PMU pai_crypto event description for IBM z16 on 'perf list'
- Fix typos and duplicated words on comments in various places
* tag 'perf-tools-fixes-for-v6.0-2022-08-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (81 commits)
perf test: Refactor shell tests allowing subdirs
perf vendor events: Update events for snowridgex
perf vendor events: Update events and metrics for skylakex
perf vendor events: Update metrics for sapphirerapids
perf vendor events: Update events for knightslanding
perf vendor events: Update metrics for jaketown
perf vendor events: Update metrics for ivytown
perf vendor events: Update events and metrics for icelakex
perf vendor events: Update events and metrics for haswellx
perf vendor events: Update events and metrics for cascadelakex
perf vendor events: Update events and metrics for broadwellx
perf vendor events: Update metrics for broadwellde
perf jevents: Fold strings optimization
perf jevents: Compress the pmu_events_table
perf metrics: Copy entire pmu_event in find metric
perf pmu-events: Hide the pmu_events
perf pmu-events: Don't assume pmu_event is an array
perf pmu-events: Move test events/metrics to JSON
perf test: Use full metric resolution
perf pmu-events: Hide pmu_events_map
...
Diffstat (limited to 'tools/perf/Documentation/perf-c2c.txt')
| -rw-r--r-- | tools/perf/Documentation/perf-c2c.txt | 31 |
1 files changed, 24 insertions, 7 deletions
diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt index 6f69173731aa..f1f7ae6b08d1 100644 --- a/tools/perf/Documentation/perf-c2c.txt +++ b/tools/perf/Documentation/perf-c2c.txt @@ -109,7 +109,9 @@ REPORT OPTIONS -d:: --display:: - Switch to HITM type (rmt, lcl) to display and sort on. Total HITMs as default. + Switch to HITM type (rmt, lcl) or peer snooping type (peer) to display + and sort on. Total HITMs (tot) as default, except Arm64 uses peer mode + as default. --stitch-lbr:: Show callgraph with stitched LBRs, which may have more complete @@ -174,12 +176,18 @@ For each cacheline in the 1) list we display following data: Cacheline - cacheline address (hex number) - Rmt/Lcl Hitm + Rmt/Lcl Hitm (Display with HITM types) - cacheline percentage of all Remote/Local HITM accesses - LLC Load Hitm - Total, LclHitm, RmtHitm + Peer Snoop (Display with peer type) + - cacheline percentage of all peer accesses + + LLC Load Hitm - Total, LclHitm, RmtHitm (For display with HITM types) - count of Total/Local/Remote load HITMs + Load Peer - Total, Local, Remote (For display with peer type) + - count of Total/Local/Remote load from peer cache or DRAM + Total records - sum of all cachelines accesses @@ -201,16 +209,21 @@ For each cacheline in the 1) list we display following data: - count of LLC load accesses, includes LLC hits and LLC HITMs RMT Load Hit - RmtHit, RmtHitm - - count of remote load accesses, includes remote hits and remote HITMs + - count of remote load accesses, includes remote hits and remote HITMs; + on Arm neoverse cores, RmtHit is used to account remote accesses, + includes remote DRAM or any upward cache level in remote node Load Dram - Lcl, Rmt - count of local and remote DRAM accesses For each offset in the 2) list we display following data: - HITM - Rmt, Lcl + HITM - Rmt, Lcl (Display with HITM types) - % of Remote/Local HITM accesses for given offset within cacheline + Peer Snoop - Rmt, Lcl (Display with peer type) + - % of Remote/Local peer accesses for given offset within cacheline + Store Refs - L1 Hit, L1 Miss, N/A - % of store accesses that hit L1, missed L1 and N/A (no available) memory level for given offset within cacheline @@ -227,9 +240,12 @@ For each offset in the 2) list we display following data: Code address - code address responsible for the accesses - cycles - rmt hitm, lcl hitm, load + cycles - rmt hitm, lcl hitm, load (Display with HITM types) - sum of cycles for given accesses - Remote/Local HITM and generic load + cycles - rmt peer, lcl peer, load (Display with peer type) + - sum of cycles for given accesses - Remote/Local peer load and generic load + cpu cnt - number of cpus that participated on the access @@ -251,7 +267,8 @@ The 'Node' field displays nodes that accesses given cacheline offset. Its output comes in 3 flavors: - node IDs separated by ',' - node IDs with stats for each ID, in following format: - Node{cpus %hitms %stores} + Node{cpus %hitms %stores} (Display with HITM types) + Node{cpus %peers %stores} (Display with peer type) - node IDs with list of affected CPUs in following format: Node{cpu list} |
