Sep. 26th, 2011

izard: (Default)
A big part of my job over the last 6 years has been quickly understanding the performance characteristics of big, unfamiliar software. So in this post I skip debuggers, compilers, and tools focused on optimizing compute kernels (IACA).

90% of the customers I work with use Linux, and I have to investigate system performance and scalability as well as the application's performance.

Intel VTune was the tool for a while. Then there was also PTU for some time. Its statistical call graph and uncore performance counters were useful. Uncore counter support is still most convenient in PTU.

I could have used oprofile, but it lacked many features I needed, and the only extra feature it had (support for AMD's counters) was never relevant to me.

Then VTune was redesigned nearly from scratch and renamed.

Intel VTune Amplifier XE (formerly VTune) is far more convenient and easier to use than the old one. It has a statistical call graph and threading visualization, which together help to get a quick picture of how a complex multithreaded app behaves.

When the Linux kernel is involved, VTune (old and new) still does the trick. There were a bunch of other tools that did not seem very useful to me (systemtap, oprofile, etc.).

Until they introduced perf in kernel/tools. That one is perfect: at last there is an in-kernel tool that is easy to use, gets meaningful performance data in a comprehensible format, is flexible and extensible, and is well documented (the latter is sooo rare for kernel subsystems!).
