Hierarchical CPU call Profiler

#include <CPUProfiler.h>

Classes
class	ResolvedEvent

Public Types
typedef unsigned	EventId

Public Member Functions
EventId	BeginEvent (const char eventLiteral[])

void	EndEvent (EventId eventId)

void	EndFrame ()

std::vector< ResolvedEvent >	CalculateResolvedEvents () const

Detailed Description

Hierarchical CPU call Profiler

This is a lightweight profiler that gives a reasonably accurate profile of CPU events.

Profiling works by registering begin and end events associated with labels. We interpret these hierarchically, like a call stack. We expect "socks and shoes" behaviour for these events (i.e., first on, last off): an event that begins inside another event must end before the enclosing one does.
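The nesting rule can be illustrated with a minimal sketch. SketchProfiler below is a hypothetical stand-in, not the real HierarchicalCPUProfiler: it records no timings and only tracks the stack of open events, so that ending anything other than the innermost event trips an assert.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, NOT the real HierarchicalCPUProfiler. It only tracks
// which events are currently open, so the nesting rule can be checked.
class SketchProfiler
{
public:
    typedef unsigned EventId;

    EventId BeginEvent(const char eventLiteral[])
    {
        _labels.push_back(eventLiteral);
        EventId id = static_cast<EventId>(_labels.size() - 1);
        _open.push_back(id);
        return id;
    }

    void EndEvent(EventId eventId)
    {
        // "First on, last off": only the innermost open event may be ended.
        assert(!_open.empty() && _open.back() == eventId);
        _open.pop_back();
    }

    std::size_t OpenDepth() const { return _open.size(); }

private:
    std::vector<const char*> _labels;   // label pointer for each event id
    std::vector<EventId> _open;         // ids of events begun but not yet ended
};

// Opens two nested events and returns the deepest nesting level reached.
inline std::size_t NestedEventsExample()
{
    SketchProfiler profiler;
    auto frame = profiler.BeginEvent("RenderFrame");
    auto shadows = profiler.BeginEvent("Shadows");   // child of RenderFrame
    std::size_t deepest = profiler.OpenDepth();
    profiler.EndEvent(shadows);                      // inner event ends first
    profiler.EndEvent(frame);                        // then the enclosing one
    return deepest;
}
```

Swapping the two EndEvent calls in NestedEventsExample would violate the expected ordering and fire the assert in a debug build.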

We use a pointer to a string literal to test for equality between profiler labels. This turns out to be incredibly convenient (but it relies on the compiler's string pooling setting).

When calling BeginEvent or EndEvent, you should use a static literal string (or, at least, some pointer that will permanently point to a valid string). Normally this should look like:

HierarchicalCPUProfiler& profiler = ...;


auto id = profiler.BeginEvent("RenderFrame");
RenderFrame(); // (something that takes time)
profiler.EndEvent(id);

Above, the string literal ("RenderFrame") will evaluate to the same pointer wherever it appears in the code (provided string pooling is enabled). So we can compare those pointers when we want to check events for equivalence.
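As a rough illustration of label comparison by pointer, consider the sketch below. The names kRenderFrameLabel, SameLabel and CopyIsDifferentLabel are hypothetical and not part of the profiler. The C++ standard leaves it unspecified whether two identical literals share storage, which is why the profiler depends on string pooling; routing a label through one named constant, as assumed here, sidesteps that dependency.

```cpp
#include <cstring>
#include <string>

// Hypothetical label constant: every use refers to the same array, so the
// pointer is identical everywhere regardless of compiler pooling settings.
static const char kRenderFrameLabel[] = "RenderFrame";

// Labels are compared by address only; the characters are never read.
inline bool SameLabel(const char* lhs, const char* rhs)
{
    return lhs == rhs;
}

// A copy with identical text lives at a different address, so a
// pointer-based comparison treats it as a different label.
inline bool CopyIsDifferentLabel()
{
    std::string copy = kRenderFrameLabel;
    bool sameText = std::strcmp(kRenderFrameLabel, copy.c_str()) == 0;
    return sameText && !SameLabel(kRenderFrameLabel, copy.c_str());
}
```

This is also why a label built at runtime (for example in a std::string) is a poor fit: even if the text matches, the pointer will not, and the pointer must stay valid for as long as the profiler holds it.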

Above, EndEvent() could be implemented to take the return value from BeginEvent, or it could take the same literal. Using the same literal might be slightly more efficient. However, I've decided to use an id, because this enables us to query the cost of a specific event. Querying by a string label would give us the cost of every event using the same label, but the id allows us to target a specific instance.
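Because EndEvent() takes the id returned by BeginEvent(), the pair can be wrapped in a scope guard so that the two calls always match. The sketch below is an assumption about how client code might do this; ScopedEvent and CountingProfiler are hypothetical names, and the guard relies only on the BeginEvent, EndEvent and EventId members listed above.

```cpp
// Minimal stand-in used only to exercise the guard. NOT the real profiler.
struct CountingProfiler
{
    typedef unsigned EventId;
    unsigned begun = 0;
    unsigned ended = 0;
    EventId BeginEvent(const char[]) { return begun++; }
    void EndEvent(EventId) { ++ended; }
};

// Hypothetical RAII guard: begins an event on construction and ends that
// same event (by id) when the guard goes out of scope.
template <typename Profiler>
class ScopedEvent
{
public:
    ScopedEvent(Profiler& profiler, const char eventLiteral[])
    : _profiler(profiler), _id(profiler.BeginEvent(eventLiteral)) {}

    ~ScopedEvent() { _profiler.EndEvent(_id); }

    // Copying would end the same event twice, so it is disabled.
    ScopedEvent(const ScopedEvent&) = delete;
    ScopedEvent& operator=(const ScopedEvent&) = delete;

private:
    Profiler& _profiler;
    typename Profiler::EventId _id;
};

// Returns true when exactly one begin/end pair was issued.
inline bool ScopedEventExample()
{
    CountingProfiler profiler;
    {
        ScopedEvent<CountingProfiler> event(profiler, "RenderFrame");
        // ... work being timed ...
    }   // EndEvent is called here, with the id from BeginEvent
    return profiler.begun == 1 && profiler.ended == 1;
}
```

Since destructors run in reverse order of construction, nested guards end their events in first on, last off order automatically.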

The profiler will minimize its cost during profiling, even if that means that interpreting the results afterwards is a little more expensive.

This has 2 advantages:

We want to avoid distorting the profile while we're profiling. If the profiler itself changes the cost of functions, the results won't be perfectly accurate. Similarly, if the performance of the CPU relative to the GPU changes, it can change the profile. So we want to avoid this at all costs!
We can profile many items. Sometimes we want to create a profile label in the middle of a loop, or some frequently called function. This is only possible if the profiler overhead is really low. So we need to limit the overhead to the absolute minimum.
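One way to picture the "cheap to record, more expensive to interpret" trade-off is the sketch below. It is not the actual implementation: DeferredRecorder, Resolve and the record layout are hypothetical. Each call only appends a small record to a preallocated buffer, and the hierarchy and durations are reconstructed in a later pass.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch, NOT the real profiler internals.
class DeferredRecorder
{
public:
    typedef unsigned EventId;
    struct Resolved { const char* label; std::int64_t nanoseconds; unsigned depth; };

    explicit DeferredRecorder(std::size_t capacity) { _records.reserve(capacity); }

    // Recording is a timestamp read plus an append; nothing is interpreted.
    EventId BeginEvent(const char eventLiteral[])
    {
        _records.push_back(Record{eventLiteral, Now()});
        return static_cast<EventId>(_records.size() - 1);
    }

    void EndEvent(EventId)
    {
        _records.push_back(Record{nullptr, Now()});   // null label marks an end
    }

    // The expensive part runs after profiling: replay the records with a
    // stack to recover nesting depth and duration of each event.
    // Events that were never ended keep their start time in 'nanoseconds'.
    std::vector<Resolved> Resolve() const
    {
        std::vector<Resolved> result;
        std::vector<std::size_t> open;   // indices into result, innermost last
        for (const Record& r : _records) {
            if (r.label) {
                result.push_back(Resolved{r.label, r.time, static_cast<unsigned>(open.size())});
                open.push_back(result.size() - 1);
            } else if (!open.empty()) {
                Resolved& e = result[open.back()];
                e.nanoseconds = r.time - e.nanoseconds;   // end minus start
                open.pop_back();
            }
        }
        return result;
    }

private:
    struct Record { const char* label; std::int64_t time; };

    static std::int64_t Now()
    {
        using namespace std::chrono;
        return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
    }

    std::vector<Record> _records;
};

// Records one event nested inside another, then resolves both.
inline std::vector<DeferredRecorder::Resolved> DeferredRecorderExample()
{
    DeferredRecorder recorder(16);
    auto outer = recorder.BeginEvent("Outer");
    auto inner = recorder.BeginEvent("Inner");
    recorder.EndEvent(inner);
    recorder.EndEvent(outer);
    return recorder.Resolve();
}
```

The buffer is reserved up front so that, in this sketch, the hot path avoids allocation until the capacity is exceeded.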

The overhead is small enough that even a runtime condition to enable or disable profiling would be expensive by comparison. So profiling can only be disabled at compile time.
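A compile-time switch could look roughly like the sketch below. The macro SKETCH_PROFILING_ENABLED and all type names are hypothetical; the real build flag is not named in this document. When the switch is off, calls resolve to empty inline functions that an optimizing compiler can typically remove, so no runtime branch is needed.

```cpp
// Hypothetical switch: the real build flag is not named in this document.
#if !defined(SKETCH_PROFILING_ENABLED)
    #define SKETCH_PROFILING_ENABLED 1
#endif

// Stand-in that counts events, selected when profiling is compiled in.
struct EnabledProfiler
{
    typedef unsigned EventId;
    unsigned begun = 0;
    EventId BeginEvent(const char[]) { return begun++; }
    void EndEvent(EventId) {}
};

// Same interface with empty bodies, selected when profiling is compiled out.
struct DisabledProfiler
{
    typedef unsigned EventId;
    EventId BeginEvent(const char[]) { return 0; }
    void EndEvent(EventId) {}
};

// The choice is made by the preprocessor, so there is no runtime check.
#if SKETCH_PROFILING_ENABLED
    typedef EnabledProfiler ActiveProfiler;
#else
    typedef DisabledProfiler ActiveProfiler;
#endif

// Client code is written once and is identical in both configurations.
inline unsigned SumWithProfiling(ActiveProfiler& profiler)
{
    auto id = profiler.BeginEvent("Sum");
    unsigned sum = 0;
    for (unsigned i = 0; i < 10; ++i)
        sum += i;
    profiler.EndEvent(id);
    return sum;
}
```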

This is intended to be used on a single thread. When profiling multiple threads, use multiple instances.

I've written variations of this class so many times! But this one is open-source. It's forever!

The documentation for this class was generated from the following files:

Utility/Profiling/CPUProfiler.h
Utility/Profiling/CPUProfiler.cpp
