linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-13 22:25:03 +00:00

Author	SHA1	Message	Date
Anton Blanchard	2bb44d628c	powerpc/kexec: Don't initialise kexec hooks to default handlers There's no need to initialise ppc_md.machine_kexec and ppc_md.machine_kexec_prepare to the default handlers. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:35 +11:00
Anton Blanchard	c1f784e553	powerpc/kdump: Remove ppc_md.machine_crash_shutdown No one uses ppc_md.machine_crash_shutdown, so remove it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:35 +11:00
Anton Blanchard	c94868788c	powerpc/kexec: Remove ppc_md.machine_kexec No one uses ppc_md.machine_kexec, so remove it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:35 +11:00
Anton Blanchard	619b267724	powerpc/kexec: Remove ppc_md.machine_kexec_cleanup No one uses ppc_md.machine_kexec_cleanup, so remove it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:35 +11:00
Anton Blanchard	50266a1f8a	powerpc/kexec: Move all ppc_md kexec function pointers together Move all the kexec handlers together. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:34 +11:00
Tejun Heo	b18ae08dea	powerpc/cell: Use system_wq in cpufreq_spudemand With cmwq, there's no reason to use a separate workqueue in cpufreq_spudemand. Use system_wq instead. The work items are already sync canceled on stop, so it's already guaranteed that no work is running when spu_gov_exit() is entered. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linuxppc-dev@lists.ozlabs.org Cc: Dave Jones <davej@redhat.com> Cc: cpufreq@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:34 +11:00
roel kluin	f32be0c540	powerpc/macintosh: Fix wrong test in fan_{read,write}_reg() Fix error test in fan_{read,write}_reg() Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:34 +11:00
Akinobu Mita	4c4a5cf64b	powerpc/rtas_flash: Use simple_read_from_buffer Simplify read file operation for /proc/powerpc/rtas/* interface by using simple_read_from_buffer. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:34 +11:00
Akinobu Mita	63c3b9d71b	powerpc/spufs: Use simple_write_to_buffer Simplify several write fileoperations for spufs by using simple_write_to_buffer(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:34 +11:00
Steven Rostedt	06ca2188ec	powerpc/ppc32/tracing: Add stack frame to calls of trace_hardirqs_on/off 32-bit variant of the previous patch for 64-bit: << When an interrupt occurs in userspace, we can call trace_hardirqs_on/off() With one level stack. But if we have irqsoff tracing enabled, it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call goes two stack frames up. If this is from user space, then there may not exist a second stack.... >> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:33 +11:00
Steven Rostedt	3cb5f1a3e5	powerpc/ppc64/tracing: Add stack frame to calls of trace_hardirqs_on/off When an interrupt occurs in userspace, we can call trace_hardirqs_on/off() With one level stack. But if we have irqsoff tracing enabled, it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call goes two stack frames up. If this is from user space, then there may not exist a second stack. Add a second stack when calling trace_hardirqs_on/off() otherwise the following oops might occur: Oops: Kernel access of bad area, sig: 11 [#1] PREEMPT SMP NR_CPUS=2 PA Semi PWRficient last sysfs file: /sys/block/sda/size Modules linked in: ohci_hcd ehci_hcd usbcore NIP: c0000000000e1c00 LR: c0000000000034d4 CTR: 000000011012c440 REGS: c00000003e2f3af0 TRAP: 0300 Not tainted (2.6.37-rc6+) MSR: 9000000000001032 <ME,IR,DR> CR: 48044444 XER: 20000000 DAR: 00000001ffb9db50, DSISR: 0000000040000000 TASK = c00000003e1a00a0[2088] 'emacs' THREAD: c00000003e2f0000 CPU: 1 GPR00: 0000000000000001 c00000003e2f3d70 c00000000084e0d0 c0000000008816e8 GPR04: 000000001034c678 000000001032e8f9 0000000010336540 0000000040020000 GPR08: 0000000040020000 00000001ffb9db40 c00000003e2f3e30 0000000060000000 GPR12: 100000000000f032 c00000000fff0280 000000001032e8c9 0000000000000008 GPR16: 00000000105be9c0 00000000105be950 00000000105be9b0 00000000105be950 GPR20: 00000000ffb9dc50 00000000ffb9dbf0 00000000102f0000 00000000102f0000 GPR24: 00000000102e0000 00000000102f0000 0000000010336540 c0000000009ded38 GPR28: 00000000102e0000 c0000000000034d4 c0000000007ccb10 c00000003e2f3d70 NIP [c0000000000e1c00] .trace_hardirqs_off+0xb0/0x1d0 LR [c0000000000034d4] decrementer_common+0xd4/0x100 Call Trace: [c00000003e2f3d70] [c00000003e2f3e30] 0xc00000003e2f3e30 (unreliable) [c00000003e2f3e30] [c0000000000034d4] decrementer_common+0xd4/0x100 Instruction dump: 81690000 7f8b0000 419e0018 f84a0028 60000000 60000000 60000000 e95f0000 80030000 e92a0000 eb6301f8 2f800000 <eb890010> 41fe00dc a06d000a eb1e8050 ---[ end trace 4ec7fd2be9240928 ]--- Reported-by: Joerg Sommer <joerg@alea.gnuu.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:33 +11:00
Michael Ellerman	c0337288ab	powerpc: Ensure the else case of feature sections will fit When we create an alternative feature section, the else case must be the same size or smaller than the body. This is because when we patch the else case in we just overwrite the body, so there must be room. Up to now we just did this by inspection, but it's quite easy to enforce it in the assembler, so we should. The only change is to add the ifgt block, but that effects the alignment of the tabs and so the whole macro is modified. Also add a test, but #if 0 it because we don't want to break the build. Anyone who's modifying the feature macros should enable the test. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-01-21 14:08:33 +11:00
Linus Torvalds	2b1caf6ed7	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: smp: Allow on_each_cpu() to be called while early_boot_irqs_disabled status to init/main.c lockdep: Move early boot local IRQ enable/disable status to init/main.c	2011-01-20 18:30:37 -08:00
Rafael J. Wysocki	d551d81d6a	ACPI / PM: Call suspend_nvs_free() earlier during resume It turns out that some device drivers map pages from the ACPI NVS region during resume using ioremap(), which conflicts with ioremap_cache() used for mapping those pages by the NVS save/restore code in nvs.c. Make the NVS pages mapped by the code in nvs.c be unmapped before device drivers' resume routines run. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 18:30:17 -08:00
Rafael J. Wysocki	2d6d9fd3a5	ACPI: Introduce acpi_os_ioremap() Commit `ca9b600be3` ("ACPI / PM: Make suspend_nvs_save() use acpi_os_map_memory()") attempted to prevent the code in osl.c and nvs.c from using different ioremap() variants by making the latter use acpi_os_map_memory() for mapping the NVS pages. However, that also requires acpi_os_unmap_memory() to be used for unmapping them, which causes synchronize_rcu() to be executed many times in a row unnecessarily and introduces substantial delays during resume on some systems. Instead of using acpi_os_map_memory() for mapping the NVS pages in nvs.c introduce acpi_os_ioremap() calling ioremap_cache() and make the code in both osl.c and nvs.c use it. Reported-by: Jeff Chua <jeff.chua.linux@gmail.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 18:30:17 -08:00
Jeff Layton	99d86c8f1b	cifs: fix up CIFSSMBEcho for unaligned access Make sure that CIFSSMBEcho can handle unaligned fields. Also fix a minor bug that causes this warning: fs/cifs/cifssmb.c: In function 'CIFSSMBEcho': fs/cifs/cifssmb.c:740: warning: large integer implicitly truncated to unsigned type ...WordCount is u8, not __le16, so no need to convert it. This patch should apply cleanly on top of the rest of the patchset to clean up unaligned access. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2011-01-21 02:23:27 +00:00
Steve French	bf67b9be97	Merge branch 'for-next'	2011-01-21 02:19:30 +00:00
Linus Torvalds	8d99641f6c	Merge branch 'akpm' * akpm: kernel/smp.c: consolidate writes in smp_call_function_interrupt() kernel/smp.c: fix smp_call_function_many() SMP race memcg: correctly order reading PCG_USED and pc->mem_cgroup backlight: fix 88pm860x_bl macro collision drivers/leds/ledtrig-gpio.c: make output match input, tighten input checking MAINTAINERS: update Atmel AT91 entry mm: fix truncate_setsize() comment memcg: fix rmdir, force_empty with THP memcg: fix LRU accounting with THP memcg: fix USED bit handling at uncharge in THP memcg: modify accounting function for supporting THP better fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio mm: compaction: prevent division-by-zero during user-requested compaction mm/vmscan.c: remove duplicate include of compaction.h memblock: fix memblock_is_region_memory() thp: keep highpte mapped until it is no longer needed kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT	2011-01-20 17:02:14 -08:00
Milton Miller	225c8e010f	kernel/smp.c: consolidate writes in smp_call_function_interrupt() We have to test the cpu mask in the interrupt handler before checking the refs, otherwise we can start to follow an entry before its deleted and find it partially initailzed for the next trip. Presently we also clear the cpumask bit before executing the called function, which implies getting write access to the line. After the function is called we then decrement refs, and if they go to zero we then unlock the structure. However, this implies getting write access to the call function data before and after another the function is called. If we can assert that no smp_call_function execution function is allowed to enable interrupts, then we can move both writes to after the function is called, hopfully allowing both writes with one cache line bounce. On a 256 thread system with a kernel compiled for 1024 threads, the time to execute testcase in the "smp_call_function_many race" changelog was reduced by about 30-40ms out of about 545 ms. I decided to keep this as WARN because its now a buggy function, even though the stack trace is of no value -- a simple printk would give us the information needed. Raw data: Without patch: ipi_test startup took 1219366ns complete 539819014ns total 541038380ns ipi_test startup took 1695754ns complete 543439872ns total 545135626ns ipi_test startup took 7513568ns complete 539606362ns total 547119930ns ipi_test startup took 13304064ns complete 533898562ns total 547202626ns ipi_test startup took 8668192ns complete 544264074ns total 552932266ns ipi_test startup took 4977626ns complete 548862684ns total 553840310ns ipi_test startup took 2144486ns complete 541292318ns total 543436804ns ipi_test startup took 21245824ns complete 530280180ns total 551526004ns With patch: ipi_test startup took 5961748ns complete 500859628ns total 506821376ns ipi_test startup took 8975996ns complete 495098924ns total 504074920ns ipi_test startup took 19797750ns complete 492204740ns total 512002490ns ipi_test startup took 14824796ns complete 487495878ns total 502320674ns ipi_test startup took 11514882ns complete 494439372ns total 505954254ns ipi_test startup took 8288084ns complete 502570774ns total 510858858ns ipi_test startup took 6789954ns complete 493388112ns total 500178066ns #include <linux/module.h> #include <linux/init.h> #include <linux/sched.h> /* sched clock / #define ITERATIONS 100 static void do_nothing_ipi(void dummy) { } static void do_ipis(struct work_struct *dummy) { int i; for (i = 0; i < ITERATIONS; i++) smp_call_function(do_nothing_ipi, NULL, 1); printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id()); } static struct work_struct work[NR_CPUS]; static int __init testcase_init(void) { int cpu; u64 start, started, done; start = local_clock(); for_each_online_cpu(cpu) { INIT_WORK(&work[cpu], do_ipis); schedule_work_on(cpu, &work[cpu]); } started = local_clock(); for_each_online_cpu(cpu) flush_work(&work[cpu]); done = local_clock(); pr_info("ipi_test startup took %lldns complete %lldns total %lldns\n", started-start, done-started, done-start); return 0; } static void __exit testcase_exit(void) { } module_init(testcase_init) module_exit(testcase_exit) MODULE_LICENSE("GPL"); MODULE_AUTHOR("Anton Blanchard"); Signed-off-by: Milton Miller <miltonm@bga.com> Cc: Anton Blanchard <anton@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
Anton Blanchard	6dc1989995	kernel/smp.c: fix smp_call_function_many() SMP race I noticed a failure where we hit the following WARN_ON in generic_smp_call_function_interrupt: if (!cpumask_test_and_clear_cpu(cpu, data->cpumask)) continue; data->csd.func(data->csd.info); refs = atomic_dec_return(&data->refs); WARN_ON(refs < 0); <------------------------- We atomically tested and cleared our bit in the cpumask, and yet the number of cpus left (ie refs) was 0. How can this be? It turns out commit `54fdade1c3` ("generic-ipi: make struct call_function_data lockless") is at fault. It removes locking from smp_call_function_many and in doing so creates a rather complicated race. The problem comes about because: - The smp_call_function_many interrupt handler walks call_function.queue without any locking. - We reuse a percpu data structure in smp_call_function_many. - We do not wait for any RCU grace period before starting the next smp_call_function_many. Imagine a scenario where CPU A does two smp_call_functions back to back, and CPU B does an smp_call_function in between. We concentrate on how CPU C handles the calls: CPU A CPU B CPU C CPU D smp_call_function smp_call_function_interrupt walks call_function.queue sees data from CPU A on list smp_call_function smp_call_function_interrupt walks call_function.queue sees (stale) CPU A on list smp_call_function int clears last ref on A list_del_rcu, unlock smp_call_function reuses percpu data A data->cpumask sees and clears bit in cpumask might be using old or new fn! decrements refs below 0 set data->refs (too late!) The important thing to note is since the interrupt handler walks a potentially stale call_function.queue without any locking, then another cpu can view the percpu data structure at any time, even when the owner is in the process of initialising it. The following test case hits the WARN_ON 100% of the time on my PowerPC box (having 128 threads does help :) #include <linux/module.h> #include <linux/init.h> #define ITERATIONS 100 static void do_nothing_ipi(void dummy) { } static void do_ipis(struct work_struct dummy) { int i; for (i = 0; i < ITERATIONS; i++) smp_call_function(do_nothing_ipi, NULL, 1); printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id()); } static struct work_struct work[NR_CPUS]; static int __init testcase_init(void) { int cpu; for_each_online_cpu(cpu) { INIT_WORK(&work[cpu], do_ipis); schedule_work_on(cpu, &work[cpu]); } return 0; } static void __exit testcase_exit(void) { } module_init(testcase_init) module_exit(testcase_exit) MODULE_LICENSE("GPL"); MODULE_AUTHOR("Anton Blanchard"); I tried to fix it by ordering the read and the write of ->cpumask and ->refs. In doing so I missed a critical case but Paul McKenney was able to spot my bug thankfully :) To ensure we arent viewing previous iterations the interrupt handler needs to read ->refs then ->cpumask then ->refs _again_. Thanks to Milton Miller and Paul McKenney for helping to debug this issue. [miltonm@bga.com: add WARN_ON and BUG_ON, remove extra read of refs before initial read of mask that doesn't help (also noted by Peter Zijlstra), adjust comments, hopefully clarify scenario ] [miltonm@bga.com: remove excess tests] Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Milton Miller <miltonm@bga.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: <stable@kernel.org> [2.6.32+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
Johannes Weiner	713735b423	memcg: correctly order reading PCG_USED and pc->mem_cgroup The placement of the read-side barrier is confused: the writer first sets pc->mem_cgroup, then PCG_USED. The read-side barrier has to be between testing PCG_USED and reading pc->mem_cgroup. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
Randy Dunlap	2550326ac7	backlight: fix 88pm860x_bl macro collision Fix collision with kernel-supplied #define: drivers/video/backlight/88pm860x_bl.c:24:1: warning: "CURRENT_MASK" redefined arch/x86/include/asm/page_64_types.h:6:1: warning: this is the location of the previous definition Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Haojian Zhuang <haojian.zhuang@marvell.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
Janusz Krzysztofik	cc587ece12	drivers/leds/ledtrig-gpio.c: make output match input, tighten input checking Replicate changes made to drivers/leds/ledtrig-backlight.c. Cc: Paul Mundt <lethal@linux-sh.org> Cc: Richard Purdie <richard.purdie@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
Nicolas Ferre	c1fc8675c9	MAINTAINERS: update Atmel AT91 entry Add two co-maintainers and update the entry with new information. Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Andrew Victor <linux@maxim.org.za> Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
Jan Kara	382e27daa5	mm: fix truncate_setsize() comment Contrary to what the comment says, truncate_setsize() should be called before filesystem truncated blocks. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
KAMEZAWA Hiroyuki	987eba66e0	memcg: fix rmdir, force_empty with THP Now, when THP is enabled, memcg's rmdir() function is broken because move_account() for THP page is not supported. This will cause account leak or -EBUSY issue at rmdir(). This patch fixes the issue by supporting move_account() THP pages. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
KAMEZAWA Hiroyuki	ece35ca810	memcg: fix LRU accounting with THP memory cgroup's LRU stat should take care of size of pages because Transparent Hugepage inserts hugepage into LRU. If this value is the number wrong, memory reclaim will not work well. Note: only head page of THP's huge page is linked into LRU. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
KAMEZAWA Hiroyuki	ca3e021417	memcg: fix USED bit handling at uncharge in THP Now, under THP: at charge: - PageCgroupUsed bit is set to all page_cgroup on a hugepage. ....set to 512 pages. at uncharge - PageCgroupUsed bit is unset on the head page. So, some pages will remain with "Used" bit. This patch fixes that Used bit is set only to the head page. Used bits for tail pages will be set at splitting if necessary. This patch adds this lock order: compound_lock() -> page_cgroup_move_lock(). [akpm@linux-foundation.org: fix warning] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:06 -08:00
KAMEZAWA Hiroyuki	e401f1761c	memcg: modify accounting function for supporting THP better mem_cgroup_charge_statisics() was designed for charging a page but now, we have transparent hugepage. To fix problems (in following patch) it's required to change the function to get the number of pages as its arguments. The new function gets following as argument. - type of page rather than 'pc' - size of page which is accounted. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:05 -08:00
David Dillow	20d9600cb4	fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio When using devices that support max_segments > BIO_MAX_PAGES (256), direct IO tries to allocate a bio with more pages than allowed, which leads to an oops in dio_bio_alloc(). Clamp the request to the supported maximum, and change dio_bio_alloc() to reflect that bio_alloc() will always return a bio when called with __GFP_WAIT and a valid number of vectors. [akpm@linux-foundation.org: remove redundant BUG_ON()] Signed-off-by: David Dillow <dillowda@ornl.gov> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:05 -08:00
Johannes Weiner	82478fb7bc	mm: compaction: prevent division-by-zero during user-requested compaction Up until `3e7d344` ("mm: vmscan: reclaim order-0 and use compaction instead of lumpy reclaim"), compaction skipped calculating the fragmentation index of a zone when compaction was explicitely requested through the procfs knob. However, when compaction_suitable was introduced, it did not come with an extra check for order == -1, set on explicit compaction requests, and passed this order on to the fragmentation index calculation, where it overshifts the number of requested pages, leading to a division by zero. This patch makes sure that order == -1 is recognized as the flag it is rather than passing it along as valid order parameter. [akpm@linux-foundation.org: add comment, per Mel] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:05 -08:00
Jesper Juhl	3305de51bf	mm/vmscan.c: remove duplicate include of compaction.h Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:05 -08:00
Tomi Valkeinen	abb65272a1	memblock: fix memblock_is_region_memory() memblock_is_region_memory() uses reserved memblocks to search for the given region, while it should use the memory memblocks. I encountered the problem with OMAP's framebuffer ram allocation. Normally the ram is allocated dynamically, and this function is not called. However, if we want to pass the framebuffer from the bootloader to the kernel (to retain the boot image), this function is used to check the validity of the kernel parameters for the framebuffer ram area. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@nokia.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:05 -08:00
Johannes Weiner	453c719261	thp: keep highpte mapped until it is no longer needed Two users reported THP-related crashes on 32-bit x86 machines. Their oops reports indicated an invalid pte, and subsequent code inspection showed that the highpte is actually used after unmap. The fix is to unmap the pte only after all operations against it are finished. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Ilya Dryomov <idryomov@gmail.com> Reported-by: werner <w.landgraf@ru.ru> Cc: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Steven Rostedt <rostedt@goodmis.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:05 -08:00
David Rientjes	6a108a14fa	kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option is used to configure any non-standard kernel with a much larger scope than only small devices. This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes references to the option throughout the kernel. A new CONFIG_EMBEDDED option is added that automatically selects CONFIG_EXPERT when enabled and can be used in the future to isolate options that should only be considered for embedded systems (RISC architectures, SLOB, etc). Calling the option "EXPERT" more accurately represents its intention: only expert users who understand the impact of the configuration changes they are making should enable it. Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: David Woodhouse <david.woodhouse@intel.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Greg KH <gregkh@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <axboe@kernel.dk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Robin Holt <holt@sgi.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:05 -08:00
Shan Wei	d18046b3cd	dccp: clean up unused DCCP_STATE_MASK definition Remove unused DCCP_STATE_MASK macro. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 17:01:09 -08:00
Roopa Prabhu	4dce2396b3	enic: Bug Fix: Dont reset ENIC_SET_APPLIED flag on port profile disassociate enic_get_vf_port returns port profile operation status only if ENIC_SET_APPLIED flag is set. A recent rework of enic_set_port_profile added code to reset this flag on disassociate. As a result of which a client calling enic_get_vf_port to get the status of port profile disassociate will always get a return value of ENODATA. This patch renames ENIC_SET_APPLIED to more appropriate ENIC_PORT_REQUEST_APPLIED and reverts back the recent change so that the flag is set both at associate and disassociate of a port profile. Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:59:34 -08:00
Eric Dumazet	f2eda47df4	ipv6: raw: rcu annotations Remove sparse warnings, using a function typedef to be able to use __rcu annotation on mh_filter pointer. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:59:34 -08:00
Eric Dumazet	6193d2be29	neigh: __rcu annotations fix some minor issues and sparse (__rcu) warnings Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:59:34 -08:00
Eric Dumazet	753ea8e962	net: ipv6: sit: fix rcu annotations Fix minor __rcu annotations and remove sparse warnings Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:59:33 -08:00
françois romieu	2ffa007eaa	via-velocity: fix the WOL bug on 1000M full duplex forced mode. The VIA velocity card can't be waken up by WOL tool on 1000M full duplex forced mode. This patch fixes the bug. Signed-off-by: David Lv <DavidLv@viatech.com.cn> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:59:33 -08:00
Eric Dumazet	a2da570d62	net_sched: RCU conversion of stab This patch converts stab qdisc management to RCU, so that we can perform the qdisc_calculate_pkt_len() call before getting qdisc lock. This shortens the lock's held time in __dev_xmit_skb(). This permits more qdiscs to get TCQ_F_CAN_BYPASS status, avoiding lot of cache misses and so reducing latencies. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> CC: Jesper Dangaard Brouer <hawk@diku.dk> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Jamal Hadi Salim <hadi@cyberus.ca> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:59:32 -08:00
Eric Dumazet	fd245a4adb	net_sched: move TCQ_F_THROTTLED flag In commit `3711210576` (net: QDISC_STATE_RUNNING dont need atomic bit ops) I moved QDISC_STATE_RUNNING flag to __state container, located in the cache line containing qdisc lock and often dirtied fields. I now move TCQ_F_THROTTLED bit too, so that we let first cache line read mostly, and shared by all cpus. This should speedup HTB/CBQ for example. Not using test_bit()/__clear_bit()/__test_and_set_bit allows to use an "unsigned int" for __state container, reducing by 8 bytes Qdisc size. Introduce helpers to hide implementation details. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> CC: Jesper Dangaard Brouer <hawk@diku.dk> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Jamal Hadi Salim <hadi@cyberus.ca> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:59:32 -08:00
Eric Dumazet	817fb15dfd	net_sched: sfq: allow divisor to be a parameter SFQ currently uses a 1024 slots hash table, and its internal structure (sfq_sched_data) allocation needs order-1 page on x86_64 Allow tc command to specify a divisor value (hash table size), between 1 and 65536. If no value is provided, assume the 1024 default size. This allows admins to setup smaller (or bigger) SFQ for specific needs. This also brings back sfq_sched_data allocations to order-0 ones, saving 3KB per SFQ qdisc. Jesper uses ~55.000 SFQ in one machine, this patch should free 165 MB of memory. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> CC: Jesper Dangaard Brouer <hawk@diku.dk> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Jamal Hadi Salim <hadi@cyberus.ca> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:59:16 -08:00
Eric Dumazet	3fbd8758b0	net: dev_close_many() is static Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Octavian Purdila <opurdila@ixiacom.com> Reviewed-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:55:30 -08:00
françois romieu	ccd5c8ef24	atl1e: remove private #define. Either unused or duplicates from mii.h. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Jay Cliburn <jcliburn@gmail.com> Cc: Chris Snook <chris.snook@gmail.com> Cc: Jie Yang <jie.yang@atheros.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:50:05 -08:00
françois romieu	34aac66cc2	atl1c: remove private #define. Either unused or duplicates from mii.h. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Jay Cliburn <jcliburn@gmail.com> Cc: Chris Snook <chris.snook@gmail.com> Cc: Jie Yang <jie.yang@atheros.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:50:05 -08:00
Neil Horman	b30532515f	bonding: Ensure that we unshare skbs prior to calling pskb_may_pull Recently reported oops: kernel BUG at net/core/skbuff.c:813! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/virtual/net/bond0/broadcast CPU 8 Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2 ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase scsi_transport_sas dm_mod [last unloaded: microcode] Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2 ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase scsi_transport_sas dm_mod [last unloaded: microcode] Pid: 0, comm: swapper Not tainted 2.6.32-71.el6.x86_64 #1 BladeCenter HS22 -[7870AC1]- RIP: 0010:[<ffffffff81405b16>] [<ffffffff81405b16>] pskb_expand_head+0x36/0x1e0 RSP: 0018:ffff880028303b70 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff880c6458ec80 RCX: 0000000000000020 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880c6458ec80 RBP: ffff880028303bc0 R08: ffffffff818a6180 R09: ffff880c6458ed64 R10: ffff880c622b36c0 R11: 0000000000000400 R12: 0000000000000000 R13: 0000000000000180 R14: ffff880c622b3000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff880028300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000038653452a4 CR3: 0000000001001000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffff8806649c2000, task ffff880c64f16ab0) Stack: ffff880028303bc0 ffffffff8104fff9 000000000000001c 0000000100000000 <0> ffff880000047d80 ffff880c6458ec80 000000000000001c ffff880c6223da00 <0> ffff880c622b3000 0000000000000000 ffff880028303c10 ffffffff81407f7a Call Trace: <IRQ> [<ffffffff8104fff9>] ? __wake_up_common+0x59/0x90 [<ffffffff81407f7a>] __pskb_pull_tail+0x2aa/0x360 [<ffffffffa0244530>] bond_arp_rcv+0x2c0/0x2e0 [bonding] [<ffffffff814a0857>] ? packet_rcv+0x377/0x440 [<ffffffff8140f21b>] netif_receive_skb+0x2db/0x670 [<ffffffff8140f788>] napi_skb_finish+0x58/0x70 [<ffffffff8140fc89>] napi_gro_receive+0x39/0x50 [<ffffffffa01286eb>] ixgbe_clean_rx_irq+0x35b/0x900 [ixgbe] [<ffffffffa01290f6>] ixgbe_clean_rxtx_many+0x136/0x240 [ixgbe] [<ffffffff8140fe53>] net_rx_action+0x103/0x210 [<ffffffff81073bd7>] __do_softirq+0xb7/0x1e0 [<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170 [<ffffffff810142cc>] call_softirq+0x1c/0x30 [<ffffffff81015f35>] do_softirq+0x65/0xa0 [<ffffffff810739d5>] irq_exit+0x85/0x90 [<ffffffff814cf915>] do_IRQ+0x75/0xf0 [<ffffffff81013ad3>] ret_from_intr+0x0/0x11 <EOI> [<ffffffff8101bc01>] ? mwait_idle+0x71/0xd0 [<ffffffff814cd80a>] ? atomic_notifier_call_chain+0x1a/0x20 [<ffffffff81011e96>] cpu_idle+0xb6/0x110 [<ffffffff814c17c8>] start_secondary+0x1fc/0x23f Resulted from bonding driver registering packet handlers via dev_add_pack and then trying to call pskb_may_pull. If another packet handler (like for AF_PACKET sockets) gets called first, the delivered skb will have a user count > 1, which causes pskb_may_pull to BUG halt when it does its skb_shared check. Fix this by calling skb_share_check prior to the may_pull call sites in the bonding driver to clone the skb when needed. Tested by myself and the reported successfully. Signed-off-by: Neil Horman CC: Andy Gospodarek <andy@greyhouse.net> CC: Jay Vosburgh <fubar@us.ibm.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:45:56 -08:00
Dimitris Michailidis	6a3c869a60	cxgb4: fix reported state of interfaces without link Currently tools like ip and ifconfig report incorrect state for cxgb4 interfaces that are up but do not have link and do so until first link establishment. This is because the initial netif_carrier_off call is before register_netdev and it needs to be after to be fully effective. Fix this by moving netif_carrier_off into .ndo_open. Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:45:55 -08:00
Linus Torvalds	fc887b15d9	Merge branch 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 * 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: tty: update MAINTAINERS file due to driver movement tty: move drivers/serial/ to drivers/tty/serial/ tty: move hvc drivers to drivers/tty/hvc/	2011-01-20 16:39:23 -08:00

... 2 3 4 5 6 ...

232474 commits