linux-stable/Documentation
Mel Gorman cb2517653f sched/debug: Make schedstats a runtime tunable that is disabled by default
schedstats is very useful during debugging and performance tuning but it
incurs overhead to calculate the stats. As such, even though it can be
disabled at build time, it is often enabled as the information is useful.

This patch adds a kernel command-line and sysctl tunable to enable or
disable schedstats on demand (when it's built in). It is disabled
by default as someone who knows they need it can also learn to enable
it when necessary.

The benefits are dependent on how scheduler-intensive the workload is.
If it is then the patch reduces the number of cycles spent calculating
the stats with a small benefit from reducing the cache footprint of the
scheduler.

These measurements were taken from a 48-core 2-socket
machine with Xeon(R) E5-2670 v3 cpus although they were also tested on a
single socket machine 8-core machine with Intel i7-3770 processors.

netperf-tcp
                           4.5.0-rc1             4.5.0-rc1
                             vanilla          nostats-v3r1
Hmean    64         560.45 (  0.00%)      575.98 (  2.77%)
Hmean    128        766.66 (  0.00%)      795.79 (  3.80%)
Hmean    256        950.51 (  0.00%)      981.50 (  3.26%)
Hmean    1024      1433.25 (  0.00%)     1466.51 (  2.32%)
Hmean    2048      2810.54 (  0.00%)     2879.75 (  2.46%)
Hmean    3312      4618.18 (  0.00%)     4682.09 (  1.38%)
Hmean    4096      5306.42 (  0.00%)     5346.39 (  0.75%)
Hmean    8192     10581.44 (  0.00%)    10698.15 (  1.10%)
Hmean    16384    18857.70 (  0.00%)    18937.61 (  0.42%)

Small gains here, UDP_STREAM showed nothing intresting and neither did
the TCP_RR tests. The gains on the 8-core machine were very similar.

tbench4
                                 4.5.0-rc1             4.5.0-rc1
                                   vanilla          nostats-v3r1
Hmean    mb/sec-1         500.85 (  0.00%)      522.43 (  4.31%)
Hmean    mb/sec-2         984.66 (  0.00%)     1018.19 (  3.41%)
Hmean    mb/sec-4        1827.91 (  0.00%)     1847.78 (  1.09%)
Hmean    mb/sec-8        3561.36 (  0.00%)     3611.28 (  1.40%)
Hmean    mb/sec-16       5824.52 (  0.00%)     5929.03 (  1.79%)
Hmean    mb/sec-32      10943.10 (  0.00%)    10802.83 ( -1.28%)
Hmean    mb/sec-64      15950.81 (  0.00%)    16211.31 (  1.63%)
Hmean    mb/sec-128     15302.17 (  0.00%)    15445.11 (  0.93%)
Hmean    mb/sec-256     14866.18 (  0.00%)    15088.73 (  1.50%)
Hmean    mb/sec-512     15223.31 (  0.00%)    15373.69 (  0.99%)
Hmean    mb/sec-1024    14574.25 (  0.00%)    14598.02 (  0.16%)
Hmean    mb/sec-2048    13569.02 (  0.00%)    13733.86 (  1.21%)
Hmean    mb/sec-3072    12865.98 (  0.00%)    13209.23 (  2.67%)

Small gains of 2-4% at low thread counts and otherwise flat.  The
gains on the 8-core machine were slightly different

tbench4 on 8-core i7-3770 single socket machine
Hmean    mb/sec-1        442.59 (  0.00%)      448.73 (  1.39%)
Hmean    mb/sec-2        796.68 (  0.00%)      794.39 ( -0.29%)
Hmean    mb/sec-4       1322.52 (  0.00%)     1343.66 (  1.60%)
Hmean    mb/sec-8       2611.65 (  0.00%)     2694.86 (  3.19%)
Hmean    mb/sec-16      2537.07 (  0.00%)     2609.34 (  2.85%)
Hmean    mb/sec-32      2506.02 (  0.00%)     2578.18 (  2.88%)
Hmean    mb/sec-64      2511.06 (  0.00%)     2569.16 (  2.31%)
Hmean    mb/sec-128     2313.38 (  0.00%)     2395.50 (  3.55%)
Hmean    mb/sec-256     2110.04 (  0.00%)     2177.45 (  3.19%)
Hmean    mb/sec-512     2072.51 (  0.00%)     2053.97 ( -0.89%)

In constract, this shows a relatively steady 2-3% gain at higher thread
counts. Due to the nature of the patch and the type of workload, it's
not a surprise that the result will depend on the CPU used.

hackbench-pipes
                         4.5.0-rc1             4.5.0-rc1
                           vanilla          nostats-v3r1
Amean    1        0.0637 (  0.00%)      0.0660 ( -3.59%)
Amean    4        0.1229 (  0.00%)      0.1181 (  3.84%)
Amean    7        0.1921 (  0.00%)      0.1911 (  0.52%)
Amean    12       0.3117 (  0.00%)      0.2923 (  6.23%)
Amean    21       0.4050 (  0.00%)      0.3899 (  3.74%)
Amean    30       0.4586 (  0.00%)      0.4433 (  3.33%)
Amean    48       0.5910 (  0.00%)      0.5694 (  3.65%)
Amean    79       0.8663 (  0.00%)      0.8626 (  0.43%)
Amean    110      1.1543 (  0.00%)      1.1517 (  0.22%)
Amean    141      1.4457 (  0.00%)      1.4290 (  1.16%)
Amean    172      1.7090 (  0.00%)      1.6924 (  0.97%)
Amean    192      1.9126 (  0.00%)      1.9089 (  0.19%)

Some small gains and losses and while the variance data is not included,
it's close to the noise. The UMA machine did not show anything particularly
different

pipetest
                             4.5.0-rc1             4.5.0-rc1
                               vanilla          nostats-v2r2
Min         Time        4.13 (  0.00%)        3.99 (  3.39%)
1st-qrtle   Time        4.38 (  0.00%)        4.27 (  2.51%)
2nd-qrtle   Time        4.46 (  0.00%)        4.39 (  1.57%)
3rd-qrtle   Time        4.56 (  0.00%)        4.51 (  1.10%)
Max-90%     Time        4.67 (  0.00%)        4.60 (  1.50%)
Max-93%     Time        4.71 (  0.00%)        4.65 (  1.27%)
Max-95%     Time        4.74 (  0.00%)        4.71 (  0.63%)
Max-99%     Time        4.88 (  0.00%)        4.79 (  1.84%)
Max         Time        4.93 (  0.00%)        4.83 (  2.03%)
Mean        Time        4.48 (  0.00%)        4.39 (  1.91%)
Best99%Mean Time        4.47 (  0.00%)        4.39 (  1.91%)
Best95%Mean Time        4.46 (  0.00%)        4.38 (  1.93%)
Best90%Mean Time        4.45 (  0.00%)        4.36 (  1.98%)
Best50%Mean Time        4.36 (  0.00%)        4.25 (  2.49%)
Best10%Mean Time        4.23 (  0.00%)        4.10 (  3.13%)
Best5%Mean  Time        4.19 (  0.00%)        4.06 (  3.20%)
Best1%Mean  Time        4.13 (  0.00%)        4.00 (  3.39%)

Small improvement and similar gains were seen on the UMA machine.

The gain is small but it stands to reason that doing less work in the
scheduler is a good thing. The downside is that the lack of schedstats and
tracepoints may be surprising to experts doing performance analysis until
they find the existence of the schedstats= parameter or schedstats sysctl.
It will be automatically activated for latencytop and sleep profiling to
alleviate the problem. For tracepoints, there is a simple warning as it's
not safe to activate schedstats in the context when it's known the tracepoint
may be wanted but is unavailable.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1454663316-22048-1-git-send-email-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09 11:54:23 +01:00
..
ABI Initial roundup of 4.5 merge window patches 2016-01-23 18:45:06 -08:00
accounting Documentation-getdelays: Apply a recommendation from "checkpatch.pl" in main() 2015-12-24 07:22:32 -07:00
acpi
aoe
arm ARM: SoC multiplatform code changes for v4.5 2016-01-20 18:03:56 -08:00
arm64 arm64: Documentation: add list of software workarounds for errata 2015-12-11 17:33:21 +00:00
auxdisplay
backlight
blackfin
block A relatively boring cycle in the docs tree. There's a few kernel-doc 2016-01-17 11:55:07 -08:00
blockdev
bus-devices
cdrom
cgroup-v1 cgroup: rename cgroup documentations 2016-01-11 23:14:51 -05:00
cma
connector
console
cpu-freq Documentation: cpufreq: intel_pstate: enhance documentation 2016-01-05 13:47:37 +01:00
cpuidle
cris
crypto
development-process
device-mapper dm verity: add ignore_zero_blocks feature 2015-12-10 10:39:03 -05:00
devicetree Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-01 15:56:08 -08:00
dmaengine Merge branch 'topic/async' into for-linus 2016-01-06 15:17:47 +05:30
DocBook Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2016-01-17 13:40:25 -08:00
driver-model
dvb [media] use https://linuxtv.org for LinuxTV URLs 2015-12-04 10:38:59 -02:00
early-userspace
EDID
extcon
fault-injection net: Add support for CHANGEUPPER notifier error injection 2015-12-03 11:49:23 -05:00
fb
features dma-mapping: always provide the dma_map_ops based implementation 2016-01-20 17:09:18 -08:00
filesystems mm: polish virtual memory accounting 2016-02-03 08:28:43 -08:00
firmware_class
fmc
fpga
frv
gpio Doc: gpio: Fix typos in Documentation/gpio 2015-11-20 16:51:16 -07:00
hid
hwmon hwmon: (pmbus) Add client driver for LTC3815 2015-12-18 08:20:59 -08:00
i2c i2c: i801: add Intel Lewisburg device IDs 2015-11-20 16:22:21 +01:00
ia64
ide
iio iio: Documentation: Add IIO configfs documentation 2015-12-03 18:19:28 +00:00
infiniband IB: remove in-kernel support for memory windows 2015-12-23 14:29:04 -05:00
input
ioctl Doc: ioctl: Fix typos in Documentation/ioctl 2015-11-20 16:52:50 -07:00
isdn
ja_JP Documentation: translations: update linux cross reference link 2016-01-11 18:26:58 -07:00
kbuild
kdump
ko_KR Documentation: translations: update linux cross reference link 2016-01-11 18:26:58 -07:00
laptops
leds Documentation: leds: Add description of brightness setting API 2016-01-04 09:57:31 +01:00
locking
m68k
memory-devices
metag
mic
mips
misc-devices
mmc
mn10300
mtd
namespaces
netlabel
networking net: change tcp_syn_retries documentation 2016-01-20 18:55:08 -08:00
nfc
nios2
nvdimm
nvmem
parisc
PCI
pcmcia
phy
platform
power Merge branches 'pm-pci' and 'pm-core' 2016-01-12 01:10:52 +01:00
powerpc
pps
prctl
pti
ptp
rapidio
RCU documentation: Update RCU requirements based on expedited changes 2015-12-05 12:34:32 -08:00
s390 s390/zcore: remove /sys/kernel/debug/zcore/mem 2015-11-27 09:24:12 +01:00
scheduler
scsi
security keys, trusted: seal with a TPM2 authorization policy 2015-12-20 15:27:13 +02:00
serial
sh
sound
spi spi: tools: move spidev_test metadata 2015-11-30 12:14:12 +00:00
sysctl sched/debug: Make schedstats a runtime tunable that is disabled by default 2016-02-09 11:54:23 +01:00
target
thermal thermal: add description for integral_cutoff unit 2016-01-14 13:29:08 -07:00
timers
tpm
trace x86, tracing, perf: Add trace point for MSR accesses 2015-12-06 12:56:10 +01:00
usb The chipidea changes for v4.5-rc1 2015-12-26 16:59:14 -08:00
vDSO
video4linux [media] media framework: rename pads init function to media_entity_pads_init() 2016-01-11 12:19:03 -02:00
virtual KVM doc: Fix KVM_SMI chapter number 2016-01-26 16:29:59 +01:00
vm Merge branch 'akpm' (patches from Andrew) 2016-01-17 12:58:52 -08:00
w1
watchdog watchdog: Drop pointer to watchdog device from struct watchdog_device 2016-01-11 21:53:59 +01:00
wimax
x86
xtensa
zh_CN [media] media framework: rename pads init function to media_entity_pads_init() 2016-01-11 12:19:03 -02:00
00-INDEX
adding-syscalls.txt
applying-patches.txt
assoc_array.txt
atomic_ops.txt
bad_memory.txt
basic_profiling.txt
bcache.txt
binfmt_misc.txt
braille-console.txt
bt8xxgpio.txt
btmrvl.txt
BUG-HUNTING
bus-virt-phys-mapping.txt
cachetlb.txt
cgroup-v2.txt Documentation: cgroup-v2: add memory.stat::sock description 2016-02-03 08:28:43 -08:00
Changes
circular-buffers.txt
clk.txt
coccinelle.txt
CodeOfConflict
CodingStyle Documentation: fix typo in CodingStyle 2016-01-11 18:18:16 -07:00
cpu-hotplug.txt Documentation: cpu-hotplug: Fix sysfs mount instructions 2015-12-10 11:35:30 -07:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
devices.txt
digsig.txt
DMA-API-HOWTO.txt dma-mapping: always provide the dma_map_ops based implementation 2016-01-20 17:09:18 -08:00
DMA-API.txt DMA-API: fix confusing sentence in Documentation/DMA-API.txt 2016-01-11 18:29:00 -07:00
DMA-attributes.txt
dma-buf-sharing.txt
DMA-ISA-LPC.txt
dontdiff
dynamic-debug-howto.txt
edac.txt EDAC: Remove references to bluesmoke.sourceforge.net 2015-11-26 14:46:06 +01:00
efi-stub.txt
eisa.txt
email-clients.txt
flexible-arrays.txt
futex-requeue-pi.txt
gcov.txt
gdb-kernel-debugging.txt
highuid.txt
HOWTO Documentation: HOWTO: update code cross reference link 2015-12-10 11:19:35 -07:00
hsi.txt
hw_random.txt
hwspinlock.txt
init.txt
initrd.txt
Intel-IOMMU.txt iommu/vt-d: Fix link to Intel IOMMU Specification 2016-01-29 12:32:12 +01:00
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isapnp.txt
java.txt
kasan.txt
kernel-doc-nano-HOWTO.txt
kernel-docs.txt Documentation: translations: update linux cross reference link 2016-01-11 18:26:58 -07:00
kernel-parameters.txt sched/debug: Make schedstats a runtime tunable that is disabled by default 2016-02-09 11:54:23 +01:00
kernel-per-CPU-kthreads.txt irq_poll: make blk-iopoll available outside the block layer 2015-12-11 11:52:24 -08:00
kmemcheck.txt
kmemleak.txt
kobject.txt
kprobes.txt
kref.txt
kselftest.txt
ldm.txt
local_ops.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lzo.txt
magic-number.txt
mailbox.txt
Makefile spi: Move spi code from Documentation to tools 2015-11-23 14:54:01 +00:00
ManagementStyle
md-cluster.txt md-cluster: update the documentation 2016-01-06 11:39:06 +11:00
md.txt
memory-barriers.txt virtio: barrier rework+fixes 2016-01-18 16:44:24 -08:00
memory-hotplug.txt
men-chameleon-bus.txt
module-signing.txt
mono.txt
nommu-mmap.txt
ntb.txt
numastat.txt
oops-tracing.txt
padata.txt
parport-lowlevel.txt
parport.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pinctrl.txt
pnp.txt
preempt-locking.txt
printk-formats.txt printk-formats.txt: remove unimplemented %pT 2016-01-16 11:17:30 -08:00
pwm.txt
ramoops.txt
rbtree.txt
remoteproc.txt
rfkill.txt
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
SAK.txt
SecurityBugs
serial-console.txt
sgi-ioc4.txt
SM501.txt
smsc_ece1099.txt
sparse.txt
stable_api_nonsense.txt
stable_kernel_rules.txt stable_kernel_rules.txt: Remove extra space after Cc: 2015-11-20 16:54:57 -07:00
static-keys.txt
SubmitChecklist
SubmittingDrivers
SubmittingPatches
svga.txt
sysfs-rules.txt
sysrq.txt
this_cpu_ops.txt
ubsan.txt UBSAN: run-time undefined behavior sanity checker 2016-01-20 17:09:18 -08:00
unaligned-memory-access.txt
unicode.txt
unshare.txt
vfio.txt
VGA-softcursor.txt
vgaarbiter.txt
video-output.txt
vme_api.txt
volatile-considered-harmful.txt
workqueue.txt
xillybus.txt
xz.txt
zorro.txt