This is an (incomplete) list of some of the stuff we want to look at doing. If you're interested in hacking on any of these, please contact the list first for some pointers and/or read HACKING and doc/CodingStyle. 0.9 release ----------- finish testing especially of anon region stuff 1.0 release ----------- (this is a minimal selection of stuff I think we need) o amd64 32 bit build needs a sys32_lookup_dcookie() translator in the kernel o op_bfd.cpp:get_linenr() see FIXME about the use of ibfd/dbfd, needs test. o decide on -m tgid semantics for anon regions o if ev67 is not fixed, back it out o opimport needs a man page o add call graph option to oprof_start Later ----- o remove 2.95/2.2 support so we can use boost multi index container in symbol/sample container o consider if we can improve anon mapping growing support [moz@lambent pp]$ ./opreport -lf lib-image:/lib/tls/libc-2.3.2.so /bin/bash | grep vfprintf 14 0.1301 6 0.0102 /lib/tls/libc-2.3.2.so vfprintf [moz@lambent pp]$ ./opreport -lf lib-image:/lib/tls/libc-2.3.2.so /usr/bin/vim | grep vfprintf 176 2.0927 349 1.2552 /lib/tls/libc-2.3.2.so vfprintf [moz@lambent pp]$ ./opreport -lf lib-image:/lib/tls/libc-2.3.2.so { image:/bin/bash } { image:/usr/bin/vim } | grep vfprintf 176 10.9657 +++ 349 7.8888 +++ vfprintf 14 --- --- 6 --- --- vfprintf it seems them as two separate symbols but can we remove the app_name from rough_less and still be able to walk the two lists? even if we could, it would still go wrong when we're profiling multiple apps o Java stuff?? o with opreport -c I can get "warning: /no-vmlinux could not be found.". Should be smarter ? o opreport -c gives weird output for an image with no symbols: samples % symbol name 15965 100.000 (no symbols) 253 100.000 (no symbols) 15965 98.4400 (no symbols) 253 1.5600 (no symbols) [self] o consider tagging opreport -c entries with a number like gprof o --details for opreport -c, or diff?? o should [self] entries be ommitted if 0 ?? o stress test opreport -c: compile a Big Application w/o frame pointer and look how driver and opreport -c react. o oparchive could fix up {kern} paths with -p (what about diff between archive and current though?) o can say more in opcontrol --status o consider a sort option for diff % o opannotate is silent about symbols missing debug info o opcontrol --reset should avoid to reload the module if it's unloaded o oprofiled.log now contains various statistics about lost sample etc. from the driver. Post profile tools must parse that and warn eventually, warning must include a proposed work around. User need this: if nothing seems wrong people are unlikely to get a look in oprofiled.log (I ran oprofile on 2.6.1 2 weeks before noticing at 30000 I lost a lot of samples, the profile seemed ok du to the randomization of lost samples). As developper we need that too, actually we have no clear idea of the behavior on different arch, NUMA etc. Not perfect because if the profiler is running the oprofiled.log will show those warning only after the first alarm signal, I think we must dump the statistics information after each opcontrol --dump to avoid that. o odb_insert() can fail on ftruncate or mremap() in db_manage.c but we don't try to recover gracefully. o output column shortname headers for opreport -l o is relative_to_absolute_path guaranteeing a trailing '/' documented ? o move oprofiled.log to OP_SAMPLE_DIR/current ? o --buffer-size is useless on 2.5 without tuning of watershed o pp tools must handle samples count overflow (marked as (unsigned)-1) o the way we show kernel modules in 2.5 is not very obvious - "/oprofile" o oparchive will be more usefull with a --root= options to allow profiling on a small box, nfs mount / to another box and transfer sample file and binary on a bigger box for analysis. There is also a problem in oparchive you can use session: to get the right path to samples files but oprofiled.log and abi files path are hardcoded to /var/lib/oprofile. o callgraph patch: better way to skip ignored backtrace ? o zwane problem with wrong text offset showed an interesting problem: if op_bfd.cpp get any symbol below text offset for vmlinux or a module then profile_t::sample_range() return a pair of iterator with first > second and we throw o lib-image: and image: behavior depend on --separate=, if --separate=library opreport "lib-image:*libc*" --merge=lib works but not opreport "image:*libc*" --merge=lib whilst the behavior is reversed if --separate==none. Must we take care ? o dependencies between profile_container.h symbol_container.h and sample_container.h become more and more ugly, I needed to include them in a specific order in some source (still true??) o add event aliases for common things like icache misses, we must start to think about metrics including simple like event alias mapped to two or more events and intepreted specially by user space tools like using the ratio of samples; more tricky will be to select an event used as call count (no cg on it) and used to emulate the call count field in gprof. I think this is a after 1.0 thing but event aliases must be specified in a way allowing such extension o do we need an opreport like opreport -c (showing caller/callee at binary boundary not symbols) ? o we should notice an opcontrol config change (--separate etc.) and auto-restart the daemon if necessary (Run) o we can add lots more unit tests yet o Itanium event constraints are not implemented o side-by-side opreport output (--compare - needs UI spec) ??? o GUI still has a physical-counter interface, should have a general one like opcontrol --event o I think we should have the ability to have *fixed* width headers, e.g. : vma samples cum. samples % cum. % symbol name image name app name 0804c350 64582 64582 35.0757 35.0757 odb_insert /usr/loc...in/oprofiled /usr/local/oprofile-pp/bin/oprofiled Note the ellipsis o should we make the sighup handler re-read counter config and re-start profiling too ? o improve --smart-demangle o allow user to add it's own pattern in user.pat, document it. o hard code ${typename} regular definition to remove all current limitations (difficult, perhaps after 1.0 ?). o oprof_start dialog size is too small initially o oprof_start key movement through events doesn't change help text o i18n. We need a good formatter, and also remember format_percent() o opannotate --source --output-dir=~moz/op/ /usr/bin/oprofiled will fail because the ~ is not expanded (no space around it) (popt bug I say) o cpu names instead of numbers in 2.4 module/ ? o remove 1 and 2 magic numbers for oprof_ready o adapt Anton's patch for handling non-symbolled libraries ? (nowaday C++ anon namespace symbol are static, 3.4 iirc, so with recent distro we are more likely to get problems with a "fallback to dynamic symbols" approch) o use standard C integer type int32_t int16_t etc. o event multiplexing for real o randomizing of reset value o XML output o profile the NMI handler code Documentation ------------- o the docs should mention the default event for each arch somewhere o more discussion of problematic code needs to go in the "interpreting" section. o document gcc 2.95 and linenr info problems especially for inline functions o finish the internals manual General checks to make ---------------------- o rgrep FIXME o valgrind (--show-reachable=yes --leak-check=yes) o audit to track unnecessary include <> o gcc 3.0/3.x compile o Qt2/3 check, no Qt check o verify builds (modversions, kernel versions, athlon etc.). I have the necessary stuff to check kernel versions/configurations on PIII core (Phil) o use nm and a little script to track unused function o test it to hell and back o compile all C++ programs with STL_port and test them (gcc 3.4 contain a debug mode too but std::string iterator are not checked) o There is probably place of post profile tools where looking at errno will give better error messages.