linux-stable/kernel
Frederic Weisbecker 62188451f0 cputime: Avoid multiplication overflow on utime scaling
We scale stime, utime values based on rtime (sum_exec_runtime
converted to jiffies). During scaling we multiple rtime * utime,
which seems to be fine, since both values are converted to u64,
but it's not.

Let assume HZ is 1000 - 1ms tick. Process consist of 64 threads,
run for 1 day, threads utilize 100% cpu on user space. Machine
has 64 cpus.

Process rtime = utime will be 64 * 24 * 60 * 60 * 1000 jiffies,
which is 0x149970000. Multiplication rtime * utime result is
0x1a855771100000000, which can not be covered in 64 bits.

Result of overflow is stall of utime values visible in user
space (prev_utime in kernel), even if application still consume
lot of CPU time.

A solution to solve this is to perform the multiplication on
stime instead of utime. It's easy to grow the utime value fast
with a CPU bound thread in userspace for example. Now we assume
that doing so with stime is much harder. In most cases a task
shouldn't ever spend much time in kernel space as it tends to
sleep waiting for jobs completion when they take long to
achieve. IO is the typical example of that.

Hence scaling the cputime by performing the multiplication on
stime instead of utime should considerably reduce the chances of
an overflow on most workloads.

This is largely inspired by a patch from Stanislaw Gruszka:
http://lkml.kernel.org/r/20130107113144.GA7544@redhat.com

Inspired-by: Stanislaw Gruszka <sgruszka@redhat.com>
Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1359217182-25184-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-27 14:04:44 +01:00
..
debug module: add new state MODULE_STATE_UNFORMED. 2013-01-12 13:27:05 +10:30
events Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-17 15:44:47 -08:00
gcov
irq irq: tsk->comm is an array 2012-12-18 15:02:11 -08:00
power Merge branch 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-12-12 08:18:24 -08:00
sched cputime: Avoid multiplication overflow on utime scaling 2013-01-27 14:04:44 +01:00
time Merge tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm 2012-12-13 15:31:08 -08:00
trace ftrace: Be first to run code modification on modules 2013-01-21 13:21:50 -05:00
.gitignore
acct.c vfs: make path_openat take a struct filename pointer 2012-10-12 20:15:09 -04:00
async.c async: fix __lowest_in_progress() 2013-01-22 16:21:24 -08:00
audit.c kernel/audit.c: avoid negative sleep durations 2013-01-11 14:54:56 -08:00
audit.h audit: optimize audit_compare_dname_path 2012-10-12 00:32:02 -04:00
audit_tree.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
audit_watch.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
auditfilter.c audit: fix auditfilter.c kernel-doc warnings 2013-01-10 14:35:23 -08:00
auditsc.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
backtracetest.c
bounds.c
capability.c userns: Teach inode_capable to understand inodes whose uids map to other namespaces. 2012-05-15 14:59:24 -07:00
cgroup.c Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-17 20:58:12 -08:00
cgroup_freezer.c cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free() 2012-11-19 08:13:38 -08:00
compat.c x32: fix sigtimedwait 2012-12-26 01:15:03 -05:00
configs.c
context_tracking.c context_tracking: New context tracking susbsystem 2012-11-30 11:40:07 -08:00
cpu.c Merge branch 'x86-bsp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-12-11 19:56:33 -08:00
cpu_pm.c kernel/cpu_pm.c: fix various typos 2012-05-31 17:49:27 -07:00
cpuset.c cpuset: use N_MEMORY instead N_HIGH_MEMORY 2012-12-12 17:38:32 -08:00
crash_dump.c
cred.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-18 10:55:28 -08:00
delayacct.c
dma.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
elfcore.c
exec_domain.c
exit.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-17 15:44:47 -08:00
extable.c extable: Skip sorting if sorted at build time. 2012-04-19 15:06:55 -07:00
fork.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-01-20 13:58:48 -08:00
freezer.c freezer: change ptrace_stop/do_signal_stop to use freezable_schedule() 2012-10-26 14:27:49 -07:00
futex.c futex: avoid wake_futex() for a PI futex_q 2012-11-26 17:41:24 -08:00
futex_compat.c futex: Mark get_robust_list as deprecated 2012-03-29 11:37:17 +02:00
groups.c userns: Convert in_group_p and in_egroup_p to use kgid_t 2012-05-03 03:29:33 -07:00
hrtimer.c hrtimer: Update hrtimer base offsets each hrtimer_interrupt 2012-07-11 23:34:39 +02:00
hung_task.c hung task debugging: Inject NMI when hung and going to panic 2012-04-25 12:39:25 +02:00
irq_work.c irq_work: fix compile failure on tile from missing include 2012-04-13 13:15:16 -04:00
itimer.c itimer: Use printk_once instead of WARN_ONCE 2012-04-10 11:00:30 +02:00
jump_label.c jump_label: Export jump_label_rate_limit() 2012-08-06 19:00:35 +03:00
kallsyms.c vsprintf: fix %ps on non symbols when using kallsyms 2012-05-29 16:22:32 -07:00
kcmp.c kcmp: include linux/ptrace.h 2012-12-20 17:40:19 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking: Adjust spin lock inlining Kconfig options 2012-09-13 17:56:13 +02:00
Kconfig.preempt locking/kconfig: Simplify INLINE_SPIN_UNLOCK usage 2012-03-23 13:18:57 +01:00
kexec.c kdump: remove unneeded include 2012-10-06 03:05:19 +09:00
kfifo.c [media] kernel:kfifo: export __kfifo_max_r symbol 2012-04-11 18:24:37 -03:00
kmod.c Bury the conditionals from kernel_thread/kernel_execve series 2012-12-19 18:07:38 -05:00
kprobes.c kprobes/x86: Fix to support jprobes on ftrace-based kprobe 2012-09-13 22:52:11 -04:00
ksysfs.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-12-11 18:10:49 -08:00
kthread.c kthread: use N_MEMORY instead N_HIGH_MEMORY 2012-12-12 17:38:33 -08:00
latencytop.c
lglock.c brlocks/lglocks: turn into functions 2012-05-29 23:28:41 -04:00
lockdep.c lockdep: Check if nested lock is actually held 2012-09-13 17:00:44 +02:00
lockdep_internals.h
lockdep_proc.c lockdep: Use KSYM_NAME_LEN'ed buffer for __get_key_name() 2012-10-24 12:39:09 +02:00
lockdep_states.h
Makefile Nothing all that exciting; a new module-from-fd syscall for those who want 2012-12-19 07:55:08 -08:00
modsign_certificate.S MODSIGN: Avoid using .incbin in C source 2012-12-14 13:06:44 +10:30
modsign_pubkey.c keys: use keyring_alloc() to create module signing keyring 2012-12-20 17:40:21 -08:00
module-internal.h MODSIGN: Move the magic string to the end of a module and eliminate the search 2012-10-19 17:30:40 -07:00
module.c module: fix missing module_mutex unlock 2013-01-20 20:22:58 -08:00
module_signing.c MODSIGN: Don't use enum-type bitfields in module signature info block 2012-12-05 11:27:24 +10:30
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
nsproxy.c userns: Implement unshare of the user namespace 2012-11-20 04:18:14 -08:00
padata.c padata: use __this_cpu_read per-cpu helper 2012-12-06 17:16:23 +08:00
panic.c panic: fix a possible deadlock in panic() 2012-07-30 17:25:13 -07:00
params.c params: replace printk(KERN_<LVL>...) with pr_<lvl>(...) 2012-05-04 17:28:18 -07:00
pid.c pidns: Stop pid allocation when init dies 2012-12-25 16:10:05 -08:00
pid_namespace.c pidns: Stop pid allocation when init dies 2012-12-25 16:10:05 -08:00
posix-cpu-timers.c A few /dev/random improvements for the v3.8 merge window. 2012-12-19 20:23:37 -08:00
posix-timers.c
printk.c printk: fix incorrect length from print_time() when seconds > 99999 2013-01-04 16:11:48 -08:00
profile.c propagate name change to comments in kernel source 2012-12-06 10:39:54 +01:00
ptrace.c ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL 2013-01-22 10:08:00 -08:00
range.c
rcu.h rcu: Add a module parameter to force use of expedited RCU primitives 2012-10-23 14:54:08 -07:00
rcupdate.c rcu: Add a module parameter to force use of expedited RCU primitives 2012-10-23 14:54:08 -07:00
rcutiny.c rcu: Fix TINY_RCU rcu_is_cpu_rrupt_from_idle check 2012-11-13 14:08:34 -08:00
rcutiny_plugin.h rcu: Add a module parameter to force use of expedited RCU primitives 2012-10-23 14:54:08 -07:00
rcutorture.c Merge branches 'urgent.2012.10.27a', 'doc.2012.11.16a', 'fixes.2012.11.13a', 'srcu.2012.10.27a', 'stall.2012.11.13a', 'tracing.2012.11.08a' and 'idle.2012.10.24a' into HEAD 2012-11-16 09:59:58 -08:00
rcutree.c context_tracking: New context tracking susbsystem 2012-11-30 11:40:07 -08:00
rcutree.h rcu: Separate accounting of callbacks from callback-free CPUs 2012-11-16 10:05:57 -08:00
rcutree_plugin.h rcu: Separate accounting of callbacks from callback-free CPUs 2012-11-16 10:05:57 -08:00
rcutree_trace.c rcu: Separate accounting of callbacks from callback-free CPUs 2012-11-16 10:05:57 -08:00
relay.c splice: fix racy pipe->buffers uses 2012-06-13 21:16:42 +02:00
res_counter.c res_counter: return amount of charges after res_counter_uncharge() 2012-12-18 15:02:12 -08:00
resource.c kernel/resource.c: fix stack overflow in __reserve_region_with_split() 2012-10-06 03:05:31 +09:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rtmutex_common.h
rwsem.c lockdep, rwsem: provide down_write_nest_lock() 2013-01-11 14:54:55 -08:00
seccomp.c seccomp: Make syscall skipping and nr changes more consistent 2012-10-02 21:14:29 +10:00
semaphore.c semaphore: fix improper comment reference to mutex 2012-04-05 17:15:55 -07:00
signal.c ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL 2013-01-22 10:08:00 -08:00
smp.c smp: Remove ipi_call_lock[_irq]()/ipi_call_unlock[_irq]() 2012-06-05 17:27:14 +02:00
smpboot.c hotplug: Fix UP bug in smpboot hotplug code 2012-08-13 17:01:07 +02:00
smpboot.h smpboot: Provide infrastructure for percpu hotplug threads 2012-08-13 17:01:07 +02:00
softirq.c cputime: Specialize irq vtime hooks 2012-10-29 21:31:32 +01:00
spinlock.c locking/kconfig: Simplify INLINE_SPIN_UNLOCK usage 2012-03-23 13:18:57 +01:00
srcu.c Merge branches 'urgent.2012.10.27a', 'doc.2012.11.16a', 'fixes.2012.11.13a', 'srcu.2012.10.27a', 'stall.2012.11.13a', 'tracing.2012.11.08a' and 'idle.2012.10.24a' into HEAD 2012-11-16 09:59:58 -08:00
stacktrace.c
stop_machine.c
sys.c cputime: Rename thread_group_times to thread_group_cputime_adjusted 2012-11-28 17:07:57 +01:00
sys_ni.c module: add syscall to load module from fd 2012-12-14 13:05:22 +10:30
sysctl.c Automatic NUMA Balancing V11 2012-12-16 15:18:08 -08:00
sysctl_binary.c pidns: Use task_active_pid_ns where appropriate 2012-11-19 05:59:09 -08:00
task_work.c task_work: task_work_add() should not succeed after exit_task_work() 2012-09-13 16:47:34 +02:00
taskstats.c taskstats: cgroupstats_user_cmd() may leak on error 2012-10-06 03:05:31 +09:00
test_kprobes.c
time.c time: Move update_vsyscall definitions to timekeeper_internal.h 2012-09-24 12:38:06 -04:00
timeconst.pl
timer.c timers: Fix endless looping between cascade() and internal_add_timer() 2012-10-09 21:27:14 +02:00
tracepoint.c
tsacct.c userns: Convert taskstats to handle the user and pid namespaces. 2012-09-18 01:01:32 -07:00
uid16.c userns: Convert setting and getting uid and gid system calls to use kuid and kgid 2012-05-03 03:28:41 -07:00
up.c
user-return-notifier.c
user.c proc: Usable inode numbers for the namespace file descriptors. 2012-11-20 04:19:49 -08:00
user_namespace.c userns: Fix typo in description of the limitation of userns_install 2012-12-14 18:36:36 -08:00
utsname.c userns: Require CAP_SYS_ADMIN for most uses of setns. 2012-12-14 16:12:03 -08:00
utsname_sysctl.c
wait.c propagate name change to comments in kernel source 2012-12-06 10:39:54 +01:00
watchdog.c watchdog: Fix disable/enable regression 2012-12-19 12:10:33 -08:00
workqueue.c Merge branch 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2012-12-12 08:15:13 -08:00
workqueue_sched.h