Performance issues in large applications arise only in particular
scenarios under heavy load conditions. It is therefore
difficult to catch them during testing and they easily escape
into production. This necessitates the design of a common
and efficient instrumentation strategy that profiles the flow
of objects during an execution. Designing a strategy
that generates precise profiles with low overhead
is non-trivial because of the sheer number of objects created and accessed,
and of paths they traverse, in an execution.
We have designed and implemented an instrumentation
tool that efficiently generates object flow
profiles for Java programs, without requiring any modifications
to the underlying virtual machine. An object flow profile for an
allocation site is a graph in which the root node represents the allocation site,
the other nodes represent the definitions of the corresponding object
reference, the edges represent the flows, and the weight on each
edge is the number of objects taking that flow.
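
To make the definition above concrete, the following is a minimal sketch of how one object flow profile could be held in memory. It is not the tool's actual data structure: the class name ObjectFlowGraph, its methods, and the string labels used for the allocation site and definitions are all hypothetical, chosen only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory form of one object flow profile (illustrative names only). */
final class ObjectFlowGraph {
    // Root node: the allocation site this profile belongs to.
    private final String allocationSite;
    // Edge weights, stored as source -> (target -> number of objects taking that flow).
    private final Map<String, Map<String, Long>> weights = new HashMap<>();

    ObjectFlowGraph(String allocationSite) {
        this.allocationSite = allocationSite;
    }

    String root() {
        return allocationSite;
    }

    /**
     * Records one object that was created at the allocation site and then
     * flowed through the given definitions of its reference, in order.
     */
    void recordFlow(List<String> definitions) {
        String from = allocationSite;
        for (String to : definitions) {
            weights.computeIfAbsent(from, k -> new HashMap<>()).merge(to, 1L, Long::sum);
            from = to;
        }
    }

    /** Returns the number of objects recorded on the edge from -> to (0 if none). */
    long weight(String from, String to) {
        Map<String, Long> out = weights.getOrDefault(from, Collections.<String, Long>emptyMap());
        return out.getOrDefault(to, 0L);
    }
}
```

Recording two objects from the same site, one flowing through definitions x then y and the other through x then z, gives the edge from the root to x a weight of 2 and each of the edges out of x a weight of 1.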
The application that needs to be profiled forms the input to the object flow profiler.
The output is a set of object flow graphs (and profiles) corresponding
to different allocation sites. Our tool first constructs a novel hybrid flow graph (HFG)
that captures the control and data dependences between the relevant nodes in a method,
and statically encodes this graph using Ball-Larus numbering.
It then instruments the program using this encoding. The program, upon execution, generates
path profiles. Finally, the HFG profiles are transformed in an
offline phase to obtain the object flow profiles.
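
For readers unfamiliar with Ball-Larus numbering, the sketch below shows the classic scheme on a plain acyclic graph; how the tool applies it to the HFG is not described here, so this is only an illustration of the general idea. Each edge gets an integer increment such that the sum of increments along any entry-to-exit path is a unique identifier between 0 and the number of paths minus one. The class BallLarusNumbering and its methods are hypothetical names, and the sketch assumes an acyclic graph with at most one edge between any pair of nodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative Ball-Larus path numbering on a DAG (hypothetical class, not the tool's code). */
final class BallLarusNumbering {
    private final Map<String, List<String>> successors = new HashMap<>();
    private final Map<String, Long> numPaths = new HashMap<>();
    private final Map<String, Long> edgeValue = new HashMap<>();

    void addEdge(String from, String to) {
        successors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /**
     * Computes the number of paths from node to any exit and, as a side
     * effect, assigns an increment to every edge reachable from node.
     * Assumes the graph is acyclic.
     */
    long number(String node) {
        Long cached = numPaths.get(node);
        if (cached != null) {
            return cached;
        }
        List<String> succs = successors.getOrDefault(node, new ArrayList<String>());
        // An exit node (no successors) contributes exactly one path.
        long total = succs.isEmpty() ? 1L : 0L;
        for (String s : succs) {
            // The increment of an edge is the number of paths through earlier successors.
            edgeValue.put(node + "->" + s, total);
            total += number(s);
        }
        numPaths.put(node, total);
        return total;
    }

    /** Sums the edge increments along a path; number(entry) must have been called first. */
    long pathId(List<String> path) {
        long id = 0L;
        for (int i = 0; i + 1 < path.size(); i++) {
            id += edgeValue.get(path.get(i) + "->" + path.get(i + 1));
        }
        return id;
    }
}
```

At run time, instrumentation based on such an encoding only needs to add the increment of each traversed edge to a counter and record the final sum when the path ends, which is what keeps path profiling cheap compared with logging every node visited.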
We have validated the efficacy of our tool by
applying it to Java programs. The results demonstrate the
scalability of our profiler, which can handle 0.2M to 0.55B
object accesses with an average runtime overhead of 8x. We have
also demonstrated the effectiveness of the generated profiles
by implementing a client analysis that consumes the profiles
to detect performance bugs. The analysis detects 38 performance
bugs which, when refactored, yield significant
improvements in running time (up to 30%).
The VM image and usage instructions are available here
Efficient Flow Profiling for Detecting Performance Bugs pdf
Rashmi Mudduluru firstname.lastname@example.org