Sep. 26th, 2011

izard: (Default)
A big part of my job for the last 6 years has been quickly understanding the performance characteristics of big, unfamiliar software. So in this post I skip debuggers, compilers, and tools focused on optimizing compute kernels (IACA).

90% of the customers I work with use Linux, and I have to investigate system performance and scalability as well as application performance.

Intel VTune was the tool for a while. Then there was also PTU for some time. The statistical call graph and uncore performance counters were useful. Uncore counter support is still most convenient in PTU.

I could have used oprofile, but it lacked many features I need, and the only extra feature it has (support for AMD's counters) was never relevant to me.

Then VTune was redesigned nearly from scratch and renamed.

Intel VTune Amplifier XE (formerly VTune) is far more convenient and easier to use than the old one. It also has a statistical call graph and threading visualization, which together help to get a quick picture of how a complex multithreaded app behaves.

When the Linux kernel is involved, VTune (old and new) still does the trick. There were a bunch of other tools that did not seem very useful to me (systemtap, oprofile, etc.).

Until they introduced perf in kernel/tools. That one is perfect: at last there is an in-kernel tool that is easy to use, gets meaningful performance data in a comprehensible format, is flexible and extensible, and is well documented (the latter is sooo rare for kernel subsystems!).
