CPU usage / CPU cycles of a process relative to the maximum CPU frequency

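Problem description:

Currently I monitor several processes with Python's psutil module and retrieve the CPU usage as a percentage based on execution_time/total_time. The problem with this is dynamic voltage and frequency scaling (DVFS, ACPI P-states, cpufreq, etc.). The lower the current CPU frequency, the longer the process needs to execute and the higher the reported CPU usage. In contrast, I need a fixed CPU usage relative to the maximum performance of the CPU.

To avoid repeatedly rescaling by the constantly changing "current frequency", one approach is to use the CPU cycles consumed by the process directly. In principle, this can be done with perf_event.h in C or with perf on the Linux command line. Unfortunately, I could not find a Python module that provides similar functionality (based on either of the two).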
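The difference between the two metrics can be illustrated with a few made-up numbers. This is only a sketch: the workload size, the 3 GHz maximum, and both function names are assumptions chosen for illustration, not part of psutil or perf.

```python
def time_based_usage(busy_seconds, interval_seconds):
    """CPU usage as execution_time / total_time."""
    return busy_seconds / interval_seconds


def cycle_based_usage(cycles, interval_seconds, max_frequency_hz):
    """CPU usage relative to the cycles available at the maximum frequency."""
    return cycles / (max_frequency_hz * interval_seconds)


# Hypothetical workload: 1.5e9 cycles of work per 1 s interval, 3 GHz maximum.
WORK_CYCLES = 1.5e9
MAX_FREQUENCY_HZ = 3.0e9

for current_frequency_hz in (3.0e9, 1.5e9):
    # Time the workload keeps the CPU busy at the current clock.
    busy = WORK_CYCLES / current_frequency_hz
    print(current_frequency_hz,
          time_based_usage(busy, 1.0),
          cycle_based_usage(WORK_CYCLES, 1.0, MAX_FREQUENCY_HZ))
```

The time-based value doubles when the clock is halved, while the cycle-based value stays the same, which is the property asked for here.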

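Answer

Thanks to BlackJack's comments:

What about implementing it in C as a shared library and using it via ctypes in Python?

A library call has less overhead. A subprocess call starts a whole external process and passes the result as a string through a pipe every time the value is needed. A shared library is loaded once into the current process and passes the result in memory.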

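I implemented it as a shared library. The code is compiled into a shared library with the following two commands:

$ gcc -c -fPIC cpucycles.c -o cpucycles.o
$ gcc -shared -Wl,-soname,libcpucycles.so.1 -o libcpucycles.so.1.0.1 cpucycles.o

The source code of the library, cpucycles.c, is (mostly based on the example in the perf_event_open man page):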

#include <stdlib.h> 
#include <unistd.h> 
#include <string.h> 
#include <sys/ioctl.h> 
#include <linux/perf_event.h> 
#include <asm/unistd.h> 

static long 
perf_event_open(struct perf_event_attr *hw_event, pid_t pid, 
       int cpu, int group_fd, unsigned long flags) 
{ 
    int ret; 

    ret = syscall(__NR_perf_event_open, hw_event, pid, cpu, 
        group_fd, flags); 
    return ret; 
} 

long long 
cpu_cycles(unsigned int microseconds, 
      pid_t pid, 
      int cpu, 
      int exclude_user, 
      int exclude_kernel, 
      int exclude_hv, 
      int exclude_idle) 
{ 
    struct perf_event_attr pe; 
    long long count; 
    int fd; 

    memset(&pe, 0, sizeof(struct perf_event_attr)); 
    pe.type = PERF_TYPE_HARDWARE; 
    pe.size = sizeof(struct perf_event_attr); 
    pe.config = PERF_COUNT_HW_CPU_CYCLES; 
    pe.disabled = 1; 
    pe.exclude_user = exclude_user; 
    pe.exclude_kernel = exclude_kernel; 
    pe.exclude_hv = exclude_hv; 
    pe.exclude_idle = exclude_idle; 

    fd = perf_event_open(&pe, pid, cpu, -1, 0); 
    if (fd == -1) { 
     return -1; 
    } 
    ioctl(fd, PERF_EVENT_IOC_RESET, 0); 
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0); 
    usleep(microseconds); 
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0); 
    if (read(fd, &count, sizeof(long long)) != sizeof(long long)) { 
     count = -1; /* report a failed read the same way as a failed open */
    } 

    close(fd); 
    return count; 
}

Finally, the library can be used from Python in cpucycles.py:

import ctypes 
import os 

cdll = ctypes.cdll.LoadLibrary(os.path.join(os.path.dirname(__file__), "libcpucycles.so.1.0.1")) 
cdll.cpu_cycles.argtypes = (ctypes.c_uint, ctypes.c_int, ctypes.c_int, 
          ctypes.c_int, ctypes.c_int, ctypes.c_int, 
          ctypes.c_int) 
cdll.cpu_cycles.restype = ctypes.c_longlong 

def cpu_cycles(duration=1.0, pid=0, cpu=-1, 
       exclude_user=False, exclude_kernel=False, 
       exclude_hv=True, exclude_idle=True): 
    """ 
    See man page of perf_event_open for all the parameters. 

    :param duration: duration of counting cpu_cycles [seconds] 
    :type duration: float 
    :returns: cpu-cycle count of pid 
    :rtype: int 
    """ 
    count = cdll.cpu_cycles(int(duration*1000000), pid, cpu, 
          exclude_user, exclude_kernel, 
          exclude_hv, exclude_idle) 
    if count < 0: 
       raise OSError("cpu_cycles(pid={}, duration={}) from {} exited with code {}.".format(
        pid, duration, cdll._name, count)) 

    return count
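To turn the returned count into a usage figure relative to the maximum frequency, it still has to be divided by the maximum frequency times the counting duration. A minimal sketch, assuming the cpufreq sysfs file for cpu0 is present and representative of all cores (it stores the value in kHz); the helper names are made up for illustration:

```python
# Assumption: cpufreq exposes the hardware maximum of cpu0 here, in kHz.
CPUINFO_MAX_FREQ = "/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq"


def parse_sysfs_frequency_hz(text):
    """Convert the kHz value found in a cpufreq sysfs file to Hz."""
    return int(text.strip()) * 1000


def read_max_frequency_hz(path=CPUINFO_MAX_FREQ):
    """Read the maximum CPU frequency in Hz from sysfs."""
    with open(path) as sysfs_file:
        return parse_sysfs_frequency_hz(sysfs_file.read())


def usage_relative_to_max(cycles, duration, max_frequency_hz):
    """Fraction of the cycles one core could deliver at maximum frequency."""
    return cycles / (max_frequency_hz * duration)
```

With the wrapper above this would be used roughly as usage_relative_to_max(cpu_cycles(duration=1.0, pid=some_pid), 1.0, read_max_frequency_hz()), where some_pid is a placeholder. For a multi-threaded process the result can exceed 1.0.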

Answer

In the end, I did it by reading the CPU cycles with the perf command-line tool and wrapping that in Python (simplified code):

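import subprocess 
maximum_cpu_frequency = 3e9 
cpu_percent = [] 
while True: # some stop criteria 
    try: 
        # pid is the ID of the monitored process; subprocess needs it as a string. 
        output = subprocess.check_output( 
            ["perf", "stat", "-e", "cycles", "-p", str(pid), 
             "-x", ",", "sleep", "1"], 
            stderr=subprocess.STDOUT).decode() 
        # With -x, the first comma-separated field is the cycle count. 
        cpu_percent.append(int(output.split(",")[0]) / maximum_cpu_frequency) 
    except ValueError: 
        # The first field was not a number (e.g. nothing was counted). 
        cpu_percent.append(0.0)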

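Unfortunately, this is inaccurate because of the imprecise sleep command, and highly inefficient because a new perf process is spawned for every sample.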
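The parsing step can be pulled out into a small function so that lines without a numeric count are handled in one place. This is only a sketch: it assumes that perf's -x output puts the count in the first field and the event name in a later field, which may differ between perf versions, and the function name and the sample lines in the usage note are invented for illustration.

```python
def parse_cycle_count(output):
    """Return the cycle count from `perf stat -x ,` output, or None.

    Assumed layout per line: <count>,<other fields...>,cycles,...
    """
    for line in output.splitlines():
        fields = line.split(",")
        # Only look at the line that reports the cycles event.
        if any(field.startswith("cycles") for field in fields[1:]):
            try:
                return int(fields[0])
            except ValueError:
                # e.g. "<not counted>" instead of a number
                return None
    return None
```

For example, parse_cycle_count("1234567,,cycles,1000123456,100.00,,") gives 1234567, while a line starting with "<not counted>" gives None, which the caller can map to 0.0 as in the loop above.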

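What about implementing it in C as a shared library and using it via 'ctypes' in Python? – BlackJack

Does a library call introduce less overhead than a process call? If so, this might be a more efficient approach. Unfortunately, it requires more implementation work in C, which I am not very familiar with. – Chickenmarkus

A library call has less overhead. A subprocess call starts a whole external process and passes the result as a string through a pipe every time the value is needed. A shared library is loaded _once_ into the current process and passes the result in memory. – BlackJack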
