Benchmarking is a process, supported by a set of software tools, for measuring the relative performance of an object by running a number of standard tests and trials against it.[0] A common metric for the programs we benchmark, which are written with different APIs such as OpenCL and CUDA and are expected to run on different architectures, is execution time. However, the total execution time of a program is not definitive, because it includes the time spent on code initialization, memory allocation, and similar overhead. We are only interested in the time the GPU or the CPU needs to carry out the calculations. Moreover, programs that run on the GPU have other common metrics, such as kernel metrics. For these reasons, we used two benchmarking tools:

Benchmarking Tools

Simple Stopwatch

This is a simple stopwatch implementation written in C. It uses the following structure:

 typedef struct stopwatch_s
 {
   double start;
   double stop;
   double time_elapsed;
 } stopwatch_t;

and defines the following functions:

 void start_stopwatch(stopwatch_t *);
 void stop_stopwatch(stopwatch_t *);

It allows us to measure the time spent on specific parts of the code.
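
The bodies of the two functions are not shown above, so the following is only a minimal sketch of how they could be implemented, not the original implementation. It assumes a POSIX system, reads the monotonic clock with `clock_gettime`, and stores all times in seconds; the unit, the helper `now_seconds`, and the demonstration function `time_sum_loop` are assumptions introduced for illustration.

```c
#define _POSIX_C_SOURCE 199309L  /* expose clock_gettime on POSIX systems */
#include <time.h>

typedef struct stopwatch_s
{
  double start;
  double stop;
  double time_elapsed;
} stopwatch_t;

/* Hypothetical helper: current monotonic time in seconds (assumed unit). */
static double now_seconds(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Record the moment at which timing begins. */
void start_stopwatch(stopwatch_t *sw)
{
  sw->start = now_seconds();
}

/* Record the moment at which timing ends and store the difference. */
void stop_stopwatch(stopwatch_t *sw)
{
  sw->stop = now_seconds();
  sw->time_elapsed = sw->stop - sw->start;
}

/* Hypothetical workload: times only the summation loop, not the setup
   around it. Writes the sum of 0..n-1 to *result and returns the
   elapsed time in seconds. */
double time_sum_loop(long n, double *result)
{
  stopwatch_t sw;
  volatile double acc = 0.0;  /* volatile keeps the loop from being optimized away */

  start_stopwatch(&sw);
  for (long i = 0; i < n; ++i)
    acc += (double)i;
  stop_stopwatch(&sw);

  *result = acc;
  return sw.time_elapsed;
}
```

Placing the start and stop calls directly around the region of interest keeps initialization and allocation out of the measurement. When the timed region launches a CUDA kernel, the host should wait for the device (for example with `cudaDeviceSynchronize`) before stopping the stopwatch, because kernel launches return before the kernel has finished.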

Nvidia Compute Visual Profiler

This is a cross-platform profiling tool whose features include, but are not limited to, the following:[1]

- Create a profile based on:
  - Kernel occupancy
  - Instruction throughput
  - Memory access characteristics
- Generate charts and graphs based on results
- Compare results across multiple sessions
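
To make the first metric in the list above concrete: kernel occupancy is commonly defined as the ratio of warps active on a multiprocessor to the maximum number of warps that multiprocessor can hold. The sketch below only illustrates that ratio and is not how the profiler computes it. The function names are hypothetical, and the maximum warp count depends on the GPU architecture, so it has to be supplied by the caller.

```c
/* Number of warps needed for one thread block; a CUDA warp has 32 threads. */
int warps_per_block(int threads_per_block)
{
  return (threads_per_block + 31) / 32;
}

/* Occupancy = active warps per multiprocessor / maximum warps per multiprocessor.
   Invalid inputs are treated as zero occupancy (an assumption of this sketch). */
double kernel_occupancy(int active_warps_per_sm, int max_warps_per_sm)
{
  if (max_warps_per_sm <= 0 || active_warps_per_sm < 0)
    return 0.0;
  if (active_warps_per_sm > max_warps_per_sm)
    active_warps_per_sm = max_warps_per_sm;  /* cannot exceed the hardware limit */
  return (double)active_warps_per_sm / (double)max_warps_per_sm;
}
```

For example, with illustrative numbers, a multiprocessor that supports 48 warps and has 24 of them active is at an occupancy of 0.5.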

CUDA Programming/BenchmarkingTools
