linux-stable/include/uapi/linux/acct.h
Dr. Thomas Orgis 0e0af57e0e taskstats: version 12 with thread group and exe info
The task exit struct needs some crucial information to be able to provide
an enhanced version of process and thread accounting.  This change
provides:

1. ac_tgid in additon to ac_pid
2. thread group execution walltime in ac_tgetime
3. flag AGROUP in ac_flag to indicate the last task
   in a thread group / process
4. device ID and inode of task's /proc/self/exe in
   ac_exe_dev and ac_exe_inode
5. tools/accounting/procacct as demonstrator

When a task exits, taskstats are reported to userspace including the
task's pid and ppid, but without the id of the thread group this task is
part of.  Without the tgid, the stats of single tasks cannot be correlated
to each other as a thread group (process).

The taskstats documentation suggests that on process exit a data set
consisting of accumulated stats for the whole group is produced.  But such
an additional set of stats is only produced for actually multithreaded
processes, not groups that had only one thread, and also those stats only
contain data about delay accounting and not the more basic information
about CPU and memory resource usage.  Adding the AGROUP flag to be set
when the last task of a group exited enables determination of process end
also for single-threaded processes.

My applicaton basically does enhanced process accounting with summed
cputime, biggest maxrss, tasks per process.  The data is not available
with the traditional BSD process accounting (which is not designed to be
extensible) and the taskstats interface allows more efficient on-the-fly
grouping and summing of the stats, anyway, without intermediate disk
writes.

Furthermore, I do carry statistics on which exact program binary is used
how often with associated resources, getting a picture on how important
which parts of a collection of installed scientific software in different
versions are, and how well they put load on the machine.  This is enabled
by providing information on /proc/self/exe for each task.  I assume the
two 64-bit fields for device ID and inode are more appropriate than the
possibly large resolved path to keep the data volume down.

Add the tgid to the stats to complete task identification, the flag AGROUP
to mark the last task of a group, the group wallclock time, and
inode-based identification of the associated executable file.

Add tools/accounting/procacct.c as a simplified fork of getdelays.c to
demonstrate process and thread accounting.

[thomas.orgis@uni-hamburg.de: fix version number in comment]
  Link: https://lkml.kernel.org/r/20220405003601.7a5f6008@plasteblaster
Link: https://lkml.kernel.org/r/20220331004106.64e5616b@plasteblaster
Signed-off-by: Dr. Thomas Orgis <thomas.orgis@uni-hamburg.de>
Reviewed-by: Ismael Luceno <ismael@iodev.co.uk>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29 14:38:03 -07:00

128 lines
4 KiB
C

/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
* BSD Process Accounting for Linux - Definitions
*
* Author: Marco van Wieringen (mvw@planets.elm.net)
*
* This header file contains the definitions needed to implement
* BSD-style process accounting. The kernel accounting code and all
* user-level programs that try to do something useful with the
* process accounting log must include this file.
*
* Copyright (C) 1995 - 1997 Marco van Wieringen - ELM Consultancy B.V.
*
*/
#ifndef _UAPI_LINUX_ACCT_H
#define _UAPI_LINUX_ACCT_H
#include <linux/types.h>
#include <asm/param.h>
#include <asm/byteorder.h>
/*
* comp_t is a 16-bit "floating" point number with a 3-bit base 8
* exponent and a 13-bit fraction.
* comp2_t is 24-bit with 5-bit base 2 exponent and 20 bit fraction
* (leading 1 not stored).
* See linux/kernel/acct.c for the specific encoding systems used.
*/
typedef __u16 comp_t;
typedef __u32 comp2_t;
/*
* accounting file record
*
* This structure contains all of the information written out to the
* process accounting file whenever a process exits.
*/
#define ACCT_COMM 16
struct acct
{
char ac_flag; /* Flags */
char ac_version; /* Always set to ACCT_VERSION */
/* for binary compatibility back until 2.0 */
__u16 ac_uid16; /* LSB of Real User ID */
__u16 ac_gid16; /* LSB of Real Group ID */
__u16 ac_tty; /* Control Terminal */
/* __u32 range means times from 1970 to 2106 */
__u32 ac_btime; /* Process Creation Time */
comp_t ac_utime; /* User Time */
comp_t ac_stime; /* System Time */
comp_t ac_etime; /* Elapsed Time */
comp_t ac_mem; /* Average Memory Usage */
comp_t ac_io; /* Chars Transferred */
comp_t ac_rw; /* Blocks Read or Written */
comp_t ac_minflt; /* Minor Pagefaults */
comp_t ac_majflt; /* Major Pagefaults */
comp_t ac_swaps; /* Number of Swaps */
/* m68k had no padding here. */
#if !defined(CONFIG_M68K) || !defined(__KERNEL__)
__u16 ac_ahz; /* AHZ */
#endif
__u32 ac_exitcode; /* Exitcode */
char ac_comm[ACCT_COMM + 1]; /* Command Name */
__u8 ac_etime_hi; /* Elapsed Time MSB */
__u16 ac_etime_lo; /* Elapsed Time LSB */
__u32 ac_uid; /* Real User ID */
__u32 ac_gid; /* Real Group ID */
};
struct acct_v3
{
char ac_flag; /* Flags */
char ac_version; /* Always set to ACCT_VERSION */
__u16 ac_tty; /* Control Terminal */
__u32 ac_exitcode; /* Exitcode */
__u32 ac_uid; /* Real User ID */
__u32 ac_gid; /* Real Group ID */
__u32 ac_pid; /* Process ID */
__u32 ac_ppid; /* Parent Process ID */
/* __u32 range means times from 1970 to 2106 */
__u32 ac_btime; /* Process Creation Time */
#ifdef __KERNEL__
__u32 ac_etime; /* Elapsed Time */
#else
float ac_etime; /* Elapsed Time */
#endif
comp_t ac_utime; /* User Time */
comp_t ac_stime; /* System Time */
comp_t ac_mem; /* Average Memory Usage */
comp_t ac_io; /* Chars Transferred */
comp_t ac_rw; /* Blocks Read or Written */
comp_t ac_minflt; /* Minor Pagefaults */
comp_t ac_majflt; /* Major Pagefaults */
comp_t ac_swaps; /* Number of Swaps */
char ac_comm[ACCT_COMM]; /* Command Name */
};
/*
* accounting flags
*/
/* bit set when the process/task ... */
#define AFORK 0x01 /* ... executed fork, but did not exec */
#define ASU 0x02 /* ... used super-user privileges */
#define ACOMPAT 0x04 /* ... used compatibility mode (VAX only not used) */
#define ACORE 0x08 /* ... dumped core */
#define AXSIG 0x10 /* ... was killed by a signal */
#define AGROUP 0x20 /* ... was the last task of the process (task group) */
#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
#define ACCT_BYTEORDER 0x80 /* accounting file is big endian */
#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
#define ACCT_BYTEORDER 0x00 /* accounting file is little endian */
#else
#error unspecified endianness
#endif
#ifndef __KERNEL__
#define ACCT_VERSION 2
#define AHZ (HZ)
#endif /* __KERNEL */
#endif /* _UAPI_LINUX_ACCT_H */