From the translator:
Most of my friends use chrono or, in particularly advanced cases, ctime to measure time in all kinds of C++ benchmarks. For benchmarking, however, it is much more useful to measure CPU time. I recently came across an article about cross-platform measurement of CPU time and decided to share it here, hopefully raising the quality of local benchmarks somewhat.
P.S. When the article says "today" or "now", it means "at the time the article was published", which, if I am not mistaken, was March 2012. Neither I nor the author can guarantee that this is still the case.
P.P.S. At the time of publication, the original is unavailable, but a copy is stored in the Yandex cache.
The API functions that report the processor time used by a process differ between operating systems: Windows, Linux, OSX, BSD, Solaris, and other UNIX-like systems. This article provides a cross-platform function that returns the CPU time of the process and explains which functions each OS supports.
CPU time increases while the process runs and consumes CPU cycles. During I/O operations, thread locks, and other operations that suspend the process, CPU time does not increase until the process starts using the CPU again.
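The difference between elapsed time and CPU time can be illustrated with a minimal sketch. It assumes a POSIX system on which clock( ) reports processor time; the helper names cpuSecondsWhileSleeping and cpuSecondsWhileSpinning are made up for this illustration and are not part of the article's code.

```c
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: CPU seconds charged to the process while it sleeps. */
double cpuSecondsWhileSleeping( unsigned int seconds )
{
    clock_t before = clock( );
    sleep( seconds );                  /* The process is suspended here. */
    clock_t after = clock( );
    return (double)(after - before) / (double)CLOCKS_PER_SEC;
}

/* Hypothetical helper: CPU seconds charged to the process during a busy loop. */
double cpuSecondsWhileSpinning( unsigned long iterations )
{
    volatile unsigned long sink = 0;   /* volatile keeps the loop from being removed */
    unsigned long i;
    clock_t before = clock( );
    for ( i = 0; i < iterations; ++i )
        sink += i;
    clock_t after = clock( );
    return (double)(after - before) / (double)CLOCKS_PER_SEC;
}
```

After a one-second sleep the first helper should report a value close to zero, while the value reported by the second grows with the number of iterations.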
Various tools, such as ps on POSIX systems, Activity Monitor on OSX, and Task Manager on Windows, show the processor time used by processes, but it is often useful to track it from within the process itself. This is especially helpful when benchmarking an algorithm or a small part of a complex program. Although every operating system provides an API for getting CPU time, each has its own subtleties.
The getCPUTime( ) function shown below works on most operating systems (just copy the code or download the getCPUTime.c file). Where necessary, link with librt to get POSIX timers (for example, on AIX, BSD, Cygwin, HP-UX, Linux, and Solaris, but not OSX). Otherwise, the standard libraries are sufficient.
Next, we will discuss in detail all the functions, their subtleties, and the reasons why there are so many #ifdef directives in the code.
    /*
     * Author:  David Robert Nadeau
     * Site:    http://NadeauSoftware.com/
     * License: Creative Commons Attribution 3.0 Unported License
     *          http://creativecommons.org/licenses/by/3.0/deed.en_US
     */
    #if defined(_WIN32)
    #include <Windows.h>

    #elif defined(__unix__) || defined(__unix) || defined(unix) || (defined(__APPLE__) && defined(__MACH__))
    #include <unistd.h>
    #include <sys/resource.h>
    #include <sys/times.h>
    #include <time.h>

    #else
    #error "Unable to define getCPUTime( ) for an unknown OS."
    #endif

    /**
     * Returns the amount of CPU time used by the current process,
     * in seconds, or -1.0 if an error occurred.
     */
    double getCPUTime( )
    {
    #if defined(_WIN32)
        /* Windows -------------------------------------------------- */
        FILETIME createTime;
        FILETIME exitTime;
        FILETIME kernelTime;
        FILETIME userTime;
        /* Both Win32 calls return nonzero on success and zero on failure. */
        if ( GetProcessTimes( GetCurrentProcess( ),
            &createTime, &exitTime, &kernelTime, &userTime ) != 0 )
        {
            SYSTEMTIME userSystemTime;
            if ( FileTimeToSystemTime( &userTime, &userSystemTime ) != 0 )
                return (double)userSystemTime.wHour * 3600.0 +
                    (double)userSystemTime.wMinute * 60.0 +
                    (double)userSystemTime.wSecond +
                    (double)userSystemTime.wMilliseconds / 1000.0;
        }

    #elif defined(__unix__) || defined(__unix) || defined(unix) || (defined(__APPLE__) && defined(__MACH__))
        /* AIX, BSD, Cygwin, HP-UX, Linux, OSX, and Solaris --------- */

    #if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0)
        /* Prefer high-res POSIX timers, when available. */
        {
            clockid_t id;
            struct timespec ts;
    #if _POSIX_CPUTIME > 0
            /* Clock ids vary by OS. Query the id, if possible. */
            /* clock_getcpuclockid( ) returns 0 on success, an error number otherwise. */
            if ( clock_getcpuclockid( 0, &id ) != 0 )
    #endif
    #if defined(CLOCK_PROCESS_CPUTIME_ID)
                /* Use known clock id for AIX, Linux, or Solaris. */
                id = CLOCK_PROCESS_CPUTIME_ID;
    #elif defined(CLOCK_VIRTUAL)
                /* Use known clock id for BSD or HP-UX. */
                id = CLOCK_VIRTUAL;
    #else
                id = (clockid_t)-1;
    #endif
            if ( id != (clockid_t)-1 && clock_gettime( id, &ts ) != -1 )
                return (double)ts.tv_sec +
                    (double)ts.tv_nsec / 1000000000.0;
        }
    #endif

    #if defined(RUSAGE_SELF)
        {
            struct rusage rusage;
            if ( getrusage( RUSAGE_SELF, &rusage ) != -1 )
                return (double)rusage.ru_utime.tv_sec +
                    (double)rusage.ru_utime.tv_usec / 1000000.0;
        }
    #endif

    #if defined(_SC_CLK_TCK)
        {
            const double ticks = (double)sysconf( _SC_CLK_TCK );
            struct tms tms;
            if ( times( &tms ) != (clock_t)-1 )
                return (double)tms.tms_utime / ticks;
        }
    #endif

    #if defined(CLOCKS_PER_SEC)
        {
            clock_t cl = clock( );
            if ( cl != (clock_t)-1 )
                return (double)cl / (double)CLOCKS_PER_SEC;
        }
    #endif

    #endif

        return -1;      /* Failed. */
    }
To measure the processor time used by an algorithm, call getCPUTime( ) before and after running it and report the difference. Do not assume that the value returned by a single call is meaningful on its own.

    double startTime, endTime;

    startTime = getCPUTime( );
    ...
    endTime = getCPUTime( );

    fprintf( stderr, "CPU time used = %lf\n", (endTime - startTime) );
Each OS provides one or more ways to get CPU time. However, some methods are more accurate than others.
OS | clock | clock_gettime | GetProcessTimes | getrusage | times |
---|---|---|---|---|---|
AIX | yes | yes | | yes | yes |
BSD | yes | yes | | yes | yes |
HP-UX | yes | yes | | yes | yes |
Linux | yes | yes | | yes | yes |
OSX | yes | | | yes | yes |
Solaris | yes | yes | | yes | yes |
Windows | | | yes | | |
Each of these methods is detailed below.
On Windows and Cygwin (a UNIX-like environment and command-line interface for Windows), the GetProcessTimes( ) function fills FILETIME structures with the processor time used by the process, and the FileTimeToSystemTime( ) function converts a FILETIME structure into a SYSTEMTIME structure containing a usable time value.

    typedef struct _SYSTEMTIME
    {
        WORD wYear;
        WORD wMonth;
        WORD wDayOfWeek;
        WORD wDay;
        WORD wHour;
        WORD wMinute;
        WORD wSecond;
        WORD wMilliseconds;
    } SYSTEMTIME, *PSYSTEMTIME;

Availability of GetProcessTimes( ): Cygwin, Windows XP and later.
Getting CPU time:
    #include <Windows.h>
    ...

    FILETIME createTime;
    FILETIME exitTime;
    FILETIME kernelTime;
    FILETIME userTime;
    /* Both Win32 calls return nonzero on success and zero on failure. */
    if ( GetProcessTimes( GetCurrentProcess( ),
        &createTime, &exitTime, &kernelTime, &userTime ) != 0 )
    {
        SYSTEMTIME userSystemTime;
        if ( FileTimeToSystemTime( &userTime, &userSystemTime ) != 0 )
            return (double)userSystemTime.wHour * 3600.0 +
                (double)userSystemTime.wMinute * 60.0 +
                (double)userSystemTime.wSecond +
                (double)userSystemTime.wMilliseconds / 1000.0;
    }
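The SYSTEMTIME conversion above reads only the hour, minute, second, and millisecond fields, so it covers CPU times of less than 24 hours. A FILETIME value counts 100-nanosecond intervals, split into a low and a high 32-bit half (dwLowDateTime and dwHighDateTime), so one possible alternative is to combine the halves into a 64-bit integer and scale it directly. The sketch below is an illustration, not the author's method; the helper name fileTimePartsToSeconds is hypothetical, and it takes the two halves as plain integers so that it compiles without <Windows.h>.

```c
#include <stdint.h>

/* Hypothetical helper: converts the two 32-bit halves of a FILETIME
 * duration (units of 100 nanoseconds) to seconds. */
double fileTimePartsToSeconds( uint32_t low, uint32_t high )
{
    uint64_t ticks = ( (uint64_t)high << 32 ) | (uint64_t)low;
    return (double)ticks / 10000000.0;   /* 10,000,000 ticks per second */
}
```

On Windows it could be called as fileTimePartsToSeconds( userTime.dwLowDateTime, userTime.dwHighDateTime ), and likewise for kernelTime if system time is wanted as well.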
On most POSIX-compatible OSs, clock_gettime( ) (see the AIX, BSD, HP-UX, Linux, and Solaris manuals) provides the most accurate CPU time. The function's first argument selects a "clock id", and the second is a timespec structure that is filled with the processor time used, in seconds and nanoseconds. On most OSs, the program must be linked with librt.
However, there are several subtleties that make it difficult to use this function in cross-platform code:
- The function is an optional part of the POSIX standard and is available only if _POSIX_TIMERS is defined in <unistd.h> with a value greater than 0. Currently, AIX, BSD, HP-UX, Linux, and Solaris support this function, but OSX does not.
- The timespec structure filled in by the clock_gettime( ) function can store time in nanoseconds, but the resolution of the clock differs across OSs and systems. The clock_getres( ) function returns the resolution of a clock if you need it. This function, again, is an optional part of the POSIX standard, available only if _POSIX_TIMERS is greater than zero. Currently, AIX, BSD, HP-UX, Linux, and Solaris provide this function, but it does not work on Solaris.
- The POSIX standard defines names for several standard clock ids, including CLOCK_PROCESS_CPUTIME_ID for the processor time of the process. However, today BSD and HP-UX do not have this id and define their own id, CLOCK_VIRTUAL, for processor time instead. To confuse matters further, Solaris defines both, but uses CLOCK_VIRTUAL for a thread's processor time, not the process's.

OS | Which id to use |
---|---|
AIX | CLOCK_PROCESS_CPUTIME_ID |
BSD | CLOCK_VIRTUAL |
HP-UX | CLOCK_VIRTUAL |
Linux | CLOCK_PROCESS_CPUTIME_ID |
Solaris | CLOCK_PROCESS_CPUTIME_ID |

- Instead of one of the constants above, the clock_getcpuclockid( ) function can be used to query the clock id of a process. However, this function is another optional part of the POSIX standard, available only if _POSIX_CPUTIME is greater than 0. Today, only AIX and Linux provide this function, but the Linux include files do not define _POSIX_CPUTIME, and the function returns unreliable results that are incompatible with POSIX.
- The clock_gettime( ) function may be implemented using the processor's time register. On multiprocessor systems, individual processors may have slightly different notions of time, so the function may return incorrect values if the process migrates from one processor to another. On Linux, and only on Linux, this can be detected if clock_getcpuclockid( ) returns a non-POSIX error and sets errno to ENOENT. However, as noted above, clock_getcpuclockid( ) is unreliable on Linux.

In practice, because of all these subtleties, using clock_gettime( ) requires a lot of #ifdef checks and the ability to fall back to another function if it does not work.
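The feature-test macros mentioned above can be checked at compile time. The following is a minimal sketch, assuming a UNIX-like system that provides <unistd.h>; the helper names hasPosixTimers and hasPosixCpuTime are hypothetical and exist only for this illustration.

```c
#include <unistd.h>

/* Hypothetical helper: 1 if <unistd.h> advertises the POSIX timers
 * option (needed for clock_gettime and clock_getres), otherwise 0. */
int hasPosixTimers( void )
{
#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if <unistd.h> advertises the process CPU-time
 * clocks option (needed for clock_getcpuclockid), otherwise 0. */
int hasPosixCpuTime( void )
{
#if defined(_POSIX_CPUTIME) && (_POSIX_CPUTIME > 0)
    return 1;
#else
    return 0;
#endif
}
```

The results depend on the platform and its headers, so a program would typically use such checks only to decide which fallback to compile in.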
Availability of clock_gettime( ): AIX, BSD, Cygwin, HP-UX, Linux, and Solaris. However, the clock id on BSD and HP-UX is non-standard.
Availability of clock_getres( ): AIX, BSD, Cygwin, HP-UX, and Linux; it does not work on Solaris.
Availability of clock_getcpuclockid( ): AIX and Cygwin; unreliable on Linux.
Getting CPU time:
    #include <unistd.h>
    #include <time.h>
    ...

    #if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0)
        clockid_t id;
        struct timespec ts;
    #if _POSIX_CPUTIME > 0
        /* Clock ids vary by OS. Query the id, if possible. */
        /* clock_getcpuclockid( ) returns 0 on success, an error number otherwise. */
        if ( clock_getcpuclockid( 0, &id ) != 0 )
    #endif
    #if defined(CLOCK_PROCESS_CPUTIME_ID)
            /* Use known clock id for AIX, Linux, or Solaris. */
            id = CLOCK_PROCESS_CPUTIME_ID;
    #elif defined(CLOCK_VIRTUAL)
            /* Use known clock id for BSD or HP-UX. */
            id = CLOCK_VIRTUAL;
    #else
            id = (clockid_t)-1;
    #endif
        if ( id != (clockid_t)-1 && clock_gettime( id, &ts ) != -1 )
            return (double)ts.tv_sec +
                (double)ts.tv_nsec / 1000000000.0;
    #endif
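Because the resolution of the clock varies between systems, it can be useful to query it with clock_getres( ) before trusting small differences. The sketch below assumes a system that defines CLOCK_PROCESS_CPUTIME_ID (so not BSD or HP-UX as described in this article); the helper name cpuClockResolution is hypothetical.

```c
#include <unistd.h>
#include <time.h>

/* Hypothetical helper: resolution of the process CPU-time clock in
 * seconds, or -1.0 if the clock is unavailable or the query fails. */
double cpuClockResolution( void )
{
#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0) && defined(CLOCK_PROCESS_CPUTIME_ID)
    struct timespec res;
    if ( clock_getres( CLOCK_PROCESS_CPUTIME_ID, &res ) == 0 )
        return (double)res.tv_sec +
            (double)res.tv_nsec / 1000000000.0;
#endif
    return -1.0;
}
```

A measured interval that is not much larger than the reported resolution should be treated with caution. On older systems the program may need to be linked with librt, as noted above.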
On all UNIX-like operating systems, the getrusage( ) function is the most reliable way to get the processor time used by the current process. The function fills a rusage structure with times in seconds and microseconds. The ru_utime field contains the time spent in user mode, and the ru_stime field contains the time spent in system mode on behalf of the process.
Note: Before 64-bit support became widespread, some operating systems defined a getrusage( ) function that returned 32-bit values and a getrusage64( ) function that returned 64-bit values. Today, getrusage( ) returns 64-bit values, and getrusage64( ) is obsolete.
Availability of getrusage( ): AIX, BSD, Cygwin, HP-UX, Linux, OSX, and Solaris.
Getting CPU time:
    #include <sys/resource.h>
    #include <sys/times.h>
    ...

    struct rusage rusage;
    if ( getrusage( RUSAGE_SELF, &rusage ) != -1 )
        return (double)rusage.ru_utime.tv_sec +
            (double)rusage.ru_utime.tv_usec / 1000000.0;
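The snippet above returns only the user-mode time. If the time spent in system mode on behalf of the process should be counted as well, the two fields can be added together. This is a minimal sketch under that assumption; the helper names timevalToSeconds and getTotalCPUTime are hypothetical and not part of the article's code.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: converts a struct timeval to seconds. */
static double timevalToSeconds( struct timeval tv )
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1000000.0;
}

/* Hypothetical helper: user plus system CPU time of the current
 * process in seconds, or -1.0 if getrusage( ) fails. */
double getTotalCPUTime( void )
{
    struct rusage usage;
    if ( getrusage( RUSAGE_SELF, &usage ) != 0 )
        return -1.0;
    return timevalToSeconds( usage.ru_utime ) +
        timevalToSeconds( usage.ru_stime );
}
```

Whether to include system time depends on the benchmark: code that makes many system calls can spend a noticeable share of its CPU time in system mode.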
On all UNIX-like operating systems, the obsolete times( ) function fills a tms structure with processor time in ticks, and the sysconf( ) function returns the number of ticks per second. The tms_utime field contains the time spent in user mode, and the tms_stime field contains the time spent in system mode on behalf of the process.
Warning: The older sysconf( ) argument CLK_TCK is obsolete and may not be supported on some operating systems. Even where it is available, sysconf( ) usually does not work when called with it. Use _SC_CLK_TCK instead.
Availability of times( ): AIX, BSD, Cygwin, HP-UX, Linux, OSX, and Solaris.
Getting CPU time:
    #include <unistd.h>
    #include <sys/times.h>
    ...

    const double ticks = (double)sysconf( _SC_CLK_TCK );
    struct tms tms;
    if ( times( &tms ) != (clock_t)-1 )
        return (double)tms.tms_utime / ticks;
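Since sysconf( ) returns -1 when it fails, the division by the tick rate benefits from a guard. The sketch below shows one way to add it, and also sums user and system ticks; the helper names ticksToSeconds and getTotalCPUTimeFromTimes are hypothetical.

```c
#include <unistd.h>
#include <sys/times.h>

/* Hypothetical helper: converts a tick count to seconds, or returns
 * -1.0 if the tick rate is not a positive number. */
double ticksToSeconds( clock_t ticks, long ticksPerSecond )
{
    if ( ticksPerSecond <= 0 )
        return -1.0;
    return (double)ticks / (double)ticksPerSecond;
}

/* Hypothetical helper: user plus system CPU time of the current
 * process in seconds, or -1.0 on error. */
double getTotalCPUTimeFromTimes( void )
{
    struct tms tms;
    long ticksPerSecond = sysconf( _SC_CLK_TCK );
    if ( times( &tms ) == (clock_t)-1 )
        return -1.0;
    return ticksToSeconds( tms.tms_utime + tms.tms_stime, ticksPerSecond );
}
```

For example, 250 ticks at 100 ticks per second correspond to 2.5 seconds.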
On all UNIX-like operating systems, the very old clock( ) function returns the processor time of the process in ticks, and the CLOCKS_PER_SEC macro gives the number of ticks per second.
Note: The returned processor time includes the time spent in user mode and in system mode on behalf of the process.
Note: Although CLOCKS_PER_SEC was originally meant to be a processor-dependent value, the Single UNIX Specification and the POSIX standard require CLOCKS_PER_SEC to have a fixed value of 1,000,000 (ISO C89 and C99 define the macro but leave its value to the implementation), which limits the resolution of the function to microseconds. Most operating systems comply with these standards, but FreeBSD, Cygwin, and older OSX versions use non-standard values.
Caution: On AIX and Solaris, the value returned by clock( ) includes the CPU time of the current process AND of any terminated child process for which the parent has called one of the wait( ), system( ), or pclose( ) functions.
Warning: On Windows, the clock( ) function is supported, but it returns real (wall-clock) time rather than processor time.
Availability of clock( ): AIX, BSD, Cygwin, HP-UX, Linux, OSX, and Solaris.
Getting CPU time:
    #include <time.h>
    ...

    clock_t cl = clock( );
    if ( cl != (clock_t)-1 )
        return (double)cl / (double)CLOCKS_PER_SEC;
There are other OS-specific ways to get CPU time. On Linux, Solaris, and some BSDs, you can parse /proc/[pid]/stat to get process statistics. On OSX, the private API function proc_pidtaskinfo( ) in libproc returns information about the process. There are also open libraries, such as libproc, procps, and Sigar.
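As an illustration of the /proc approach, the sketch below extracts the user and system times from the text of a stat file. It assumes the Linux layout, in which fields 14 and 15 are utime and stime measured in clock ticks and field 2 is the command name in parentheses; other systems lay out /proc differently. The helper name parseStatCpuTicks is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads utime and stime (in clock ticks) from the
 * contents of a Linux /proc/[pid]/stat file.
 * Returns 0 on success, -1 if the text cannot be parsed. */
int parseStatCpuTicks( const char *statText,
    unsigned long *utime, unsigned long *stime )
{
    /* The command name may contain spaces or parentheses,
     * so resume parsing after the last ')'. */
    const char *afterName = strrchr( statText, ')' );
    if ( afterName == NULL )
        return -1;

    /* Skip fields 3 to 13, then read field 14 (utime) and 15 (stime). */
    if ( sscanf( afterName + 1,
        " %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %lu %lu",
        utime, stime ) != 2 )
        return -1;
    return 0;
}
```

A caller would read /proc/self/stat into a buffer, pass it to the helper, and divide the tick counts by sysconf( _SC_CLK_TCK ) to get seconds.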
On UNIX, there are several utilities for displaying the processor time of a process, including ps, top, mpstat, and others. You can also use the time utility to display the time spent on a command.
On Windows, you can use Task Manager to monitor CPU usage.
On OSX, you can use Activity Monitor to monitor CPU usage. The Instruments Profiling Utility bundled with Xcode can monitor CPU usage as well as many other things.
Compiler-predefined macros, tested with #ifdef, select the OS-specific code. Some of these methods are used in this article to detect Windows, OSX, and UNIX variants.
Source: https://habr.com/ru/post/282301/