linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-27 12:57:53 +00:00

Author	SHA1	Message	Date
Rusty Russell	842bbaaa73	[PATCH] Module per-cpu alignment cannot always be met The module code assumes noone will ever ask for a per-cpu area more than SMP_CACHE_BYTES aligned. However, as these cases show, gcc asks sometimes asks for 32-byte alignment for the per-cpu section on a module, and if CONFIG_X86_L1_CACHE_SHIFT is 4, we hit that BUG_ON(). This is obviously an unusual combination, as there have been few reports, but better to warn than die. See: http://www.ussg.iu.edu/hypermail/linux/kernel/0409.0/0768.html And more recently: http://bugs.gentoo.org/show_bug.cgi?id=97006 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-01 21:38:01 -07:00
Ingo Molnar	6cb54819d7	[PATCH] remove sys_set_zone_reclaim() This removes sys_set_zone_reclaim() for now. While i'm sure Martin is trying to solve a real problem, we must not hard-code an incomplete and insufficient approach into a syscall, because syscalls are pretty much for eternity. I am quite strongly convinced that this syscall must not hit v2.6.13 in its current form. Firstly, the syscall lacks basic syscall design: e.g. it allows the global setting of VM policy for unprivileged users. (!) [ Imagine an Oracle installation and a SAP installation on the same NUMA box fighting over the 'optimal' setting for this flag. What will they do? Will they try to set the flag to their own preferred value every second or so? ] Secondly, it was added based on a single datapoint from Martin: http://marc.theaimsgroup.com/?l=linux-mm&m=111763597218177&w=2 where Martin characterizes the numbers the following way: ' Run-to-run variability for "make -j" is huge, so these numbers aren't terribly useful except to see that with reclaim the benchmark still finishes in a reasonable amount of time. ' in other words: the fundamental problem has likely not been solved, only a tendential move into the right direction has been observed, and a handful of numbers were picked out of a set of hugely variable results, without showing the variability data. How much variance is there run-to-run? I'd really suggest to first walk the walk and see what's needed to get stable & predictable kernel compilation numbers on that NUMA box, before adding random syscalls to tune a particular aspect of the VM ... which approach might not even matter once the whole picture has been analyzed and understood! The third, most important point is that the syscall exposes VM tuning internals in a completely unstructured way. What sense does it make to have a _GLOBAL_ per-node setting for 'should we go to another node for reclaim'? If then it might make sense to do this per-app, via numalib or so. The change is minimalistic in that it doesnt remove the syscall and the underlying infrastructure changes, only the user-visible changes. We could perhaps add a CAP_SYS_ADMIN-only sysctl for this hack, a'ka /proc/sys/vm/swappiness, but even that looks quite counterproductive when the generic approach is that we are trying to reduce the number of external factors in the VM balance picture. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-01 10:03:56 -07:00
Andrew Morton	c70f5d6610	[PATCH] revert bogus softirq changes This snuck in with an x86_64 change. Thanks to Richard Purdie <rpurdie@rpsys.net> for spotting it. Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-30 10:49:59 -07:00
Eric W. Biederman	1108bae41e	[PATCH] reboot: remove device_suspend(PMSG_FREEZE) from kernel_kexec If device_suspend(PMSG_FREEZE) is not ready to be called in kernel_restart it is definitely not ready to be called in the even more fickle kernel_kexec. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-29 12:02:09 -07:00
George Anzinger	78fa74a23b	[PATCH] posix timers: fix normalization problem (We found this (after a customer complained) and it is in the kernel.org kernel. Seems that for CLOCK_MONOTONIC absolute timers and clock_nanosleep calls both the request time and wall_to_monotonic are subtracted prior to the normalize resulting in an overflow in the existing normalize test. This causes the result to be shifted ~4 seconds ahead instead of ~2 seconds back in time.) The normalize code in posix-timers.c fails when the tv_nsec member is ~1.2 seconds negative. This can happen on absolute timers (and clock_nanosleeps) requested on CLOCK_MONOTONIC (both the request time and wall_to_monotonic are subtracted resulting in the possibility of a number close to -2 seconds.) This fix uses the set_normalized_timespec() (which does not have an overflow problem) to fix the problem and as a side effect makes the code cleaner. Signed-off-by: George Anzinger <george@mvista.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:46:05 -07:00
Andi Kleen	ed6b676ca8	[PATCH] x86_64: Switch to the interrupt stack when running a softirq in local_bh_enable() This avoids some potential stack overflows with very deep softirq callchains. i386 does this too. TOADD CFI annotation Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:46:02 -07:00
Andrew Morton	e4ff4d7f9d	[PATCH] Avoid device suspend on reboot My fairly ordinary x86 test box gets stuck during reboot on the wait_for_completion() in ide_do_drive_cmd(): Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:46:37 -07:00
Jesper Juhl	77933d7276	[PATCH] clean up inline static vs static inline `gcc -W' likes to complain if the static keyword is not at the beginning of the declaration. This patch fixes all remaining occurrences of "inline static" up with "static inline" in the entire kernel tree (140 occurrences in 47 files). While making this change I came across a few lines with trailing whitespace that I also fixed up, I have also added or removed a blank line or two here and there, but there are no functional changes in the patch. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:26:20 -07:00
Randy Dunlap	e77e17161c	[PATCH] kernel/crash_dump.c: add kerneldoc Add kerneldoc to kernel/crash_dump.c Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:26:06 -07:00
Randy Dunlap	d9fd8a6d44	[PATCH] kernel/cpuset.c: add kerneldoc, fix typos Add kerneldoc to kernel/cpuset.c Fix cpuset typos in init/Kconfig Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:26:06 -07:00
Randy Dunlap	207a7ba8dc	[PATCH] kernel/capability.c: add kerneldoc Add kerneldoc to kernel/capability.c Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:26:06 -07:00
Martin Schwidefsky	951f22d5b1	[PATCH] s390: spin lock retry Split spin lock and r/w lock implementation into a single try which is done inline and an out of line function that repeatedly tries to get the lock before doing the cpu_relax(). Add a system control to set the number of retries before a cpu is yielded. The reason for the spin lock retry is that the diagnose 0x44 that is used to give up the virtual cpu is quite expensive. For spin locks that are held only for a short period of time the costs of the diagnoses outweights the savings for spin locks that are held for a longer timer. The default retry count is 1000. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:26:04 -07:00
George Anzinger	d912d1ff21	[PATCH] itimer fixes Fix the recent off-by-one fix in the itimer code: 1. The repeating timer is figured using the requested time (not +1 as we know where we are in the jiffie). 2. The tests for interval too large are left to the time_val to jiffie code. Signed-off-by: George Anzinger <george@mvista.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:25:51 -07:00
Nigel Cunningham	bba0e4670a	[PATCH] Address BUG: using smp_processor_id() in preemptible [00000001] code This patch fixes a warning in the disable_nonboot_cpus call in kernel/power/smp.c. Signed-off by: Nigel Cunningham <nigel@suspend2.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:25:50 -07:00
Steven Rostedt	d46523ea32	[PATCH] fix MAX_USER_RT_PRIO and MAX_RT_PRIO Here's the patch again to fix the code to handle if the values between MAX_USER_RT_PRIO and MAX_RT_PRIO are different. Without this patch, an SMP system will crash if the values are different. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 15:40:00 -07:00
Andreas Steinmetz	18586e7216	[PATCH] Fix RLIMIT_RTPRIO breakage RLIMIT_RTPRIO is supposed to grant non privileged users the right to use SCHED_FIFO/SCHED_RR scheduling policies with priorites bounded by the RLIMIT_RTPRIO value via sched_setscheduler(). This is usually used by audio users. Unfortunately this is broken in 2.6.13rc3 as you can see in the excerpt from sched_setscheduler below: /* * Allow unprivileged RT tasks to decrease priority: / if (!capable(CAP_SYS_NICE)) { / can't change policy / if (policy != p->policy) return -EPERM; After the above unconditional test which causes sched_setscheduler to fail with no regard to the RLIMIT_RTPRIO value the following check is made: / can't increase priority */ if (policy != SCHED_NORMAL && param->sched_priority > p->rt_priority && param->sched_priority > p->signal->rlim[RLIMIT_RTPRIO].rlim_cur) return -EPERM; Thus I do believe that the RLIMIT_RTPRIO value must be taken into account for the policy check, especially as the RLIMIT_RTPRIO limit is of no use without this change. The attached patch fixes this problem. Signed-off-by: Andreas Steinmetz <ast@domdv.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 15:30:51 -07:00
Eric W. Biederman	fdde86ac50	[PATCH] swpsuspend: Have suspend to disk use factors of sys_reboot The suspend to disk code was a poor copy of the code in sys_reboot now that we have kernel_power_off, kernel_restart and kernel_halt use them instead of poorly duplicating them inline. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 14:35:44 -07:00
Eric W. Biederman	2f048ea81d	[PATCH] Call emergency_reboot from panic We know the system is in trouble so there is no question if this is an emergecy :) Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 14:35:43 -07:00
Eric W. Biederman	ff31977782	[PATCH] Use kernel_power_off in sysrq-o We already do all of the gymnastics to run from process context to call the power off code so call into the power off code cleanly. This especially helps acpi as part of it's shutdown logic should run acpi_shutdown called from device_shutdown which was not being called from here. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 14:35:43 -07:00
Eric W. Biederman	7c9034735e	[PATCH] Add emergency_restart() When the kernel is working well and we want to restart cleanly kernel_restart is the function to use. But in many instances the kernel wants to reboot when thing are expected to be working very badly such as from panic or a software watchdog handler. This patch adds the function emergency_restart() so that callers can be clear what semantics they expect when calling restart. emergency_restart() is expected to be callable from interrupt context and possibly reliable in even more trying circumstances. This is an initial generic implementation for all architectures. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 14:35:41 -07:00
Eric W. Biederman	abcd9e51f5	[PATCH] Make ctrl_alt_del call kernel_restart to get a proper reboot. It is obvious we wanted to call kernel_restart here but since we don't have it the code was expanded inline and hasn't been correct since sometime in 2.4. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 14:35:41 -07:00
Eric W. Biederman	4a00ea1e18	[PATCH] Refactor sys_reboot into reusable parts Because the factors of sys_reboot don't exist people calling into the reboot path duplicate the code badly, leading to inconsistent expectations of code in the reboot path. This patch should is just code motion. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 14:35:41 -07:00
Eric W. Biederman	47f61f397c	[PATCH] Add missing device_suspsend(PMSG_FREEZE) calls. In the recent addition of device_suspend calls into sys_reboot two code paths were missed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 14:35:41 -07:00
Robert Love	0399cb08c5	[PATCH] inotify: move sysctl This moves the inotify sysctl knobs to "/proc/sys/fs/inotify" from "/proc/sys/fs". Also some related cleanup. Signed-off-by: Robert Love <rml@novell.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-13 11:09:31 -07:00
Robert Love	0eeca28300	[PATCH] inotify inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a system call that returns a fd, not SIGIO. You get a single fd, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure), Gamin (a FAM replacement), and other projects. See Documentation/filesystems/inotify.txt. Signed-off-by: Robert Love <rml@novell.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-12 20:38:38 -07:00
Linus Torvalds	3f603ed319	Merge master.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-2.6	2005-07-12 16:04:50 -07:00
Hugh Dickins	3b6bfcdb11	[PATCH] lower VM_DONTCOPY total_vm dup_mmap of a VM_DONTCOPY vma forgot to lower the child's total_vm. (But no way does this account for the recent report of total_vm seen too low.) Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-12 16:00:58 -07:00
Andrew Morton	d53d9f16ea	[PATCH] name_to_dev_t warning fix kernel/power/disk.c needs a declaration of name_to_dev_t() in scope. mount.h seems like an appropriate choice. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-12 16:00:58 -07:00
Len Brown	5028770a42	[ACPI] merge acpi-2.6.12 branch into latest Linux 2.6.13-rc... Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-12 17:21:56 -04:00
David Shaohua Li	5ae947ecc9	[ACPI] Suspend to RAM fix Free some RAM before entering S3 so that upon resume we can be sure early allocations will succeed. http://bugzilla.kernel.org/show_bug.cgi?id=3469 Signed-off-by: David Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-11 23:21:54 -04:00
Alexey Starikovskiy	e2a5b420f7	[ACPI] ACPI poweroff fix Register an "acpi" system device to be notified of shutdown preparation. This depends on CONFIG_PM http://bugzilla.kernel.org/show_bug.cgi?id=4041 Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-11 23:20:49 -04:00
Ingo Molnar	5bbcfd9000	[PATCH] cond_resched(): fix bogus might_sleep() warning The BKS might be reacquired before we have dropped PREEMPT_ACTIVE, which could trigger a second could trigger a second cond_resched() call. Bug found by Hirofumi Ogawa. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:47 -07:00
Christoph Lameter	6c036527a6	[PATCH] mostly_read data section Add a new section called ".data.read_mostly" for data items that are read frequently and rarely written to like cpumaps etc. If these maps are placed in the .data section then these frequenly read items may end up in cachelines with data is is frequently updated. In that case all processors in an SMP system must needlessly reload the cachelines again and again containing elements of those frequently used variables. The ability to share these cachelines will allow each cpu in an SMP system to keep local copies of those shared cachelines thereby optimizing performance. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com> Signed-off-by: Christoph Lameter <christoph@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:46 -07:00
Pavel Machek	1322ad4151	[PATCH] pm: clean up process.c freezeable() already tests for TRACED/STOPPED processes, no need to do it twice. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:43 -07:00
Pavel Machek	47b724f3fe	[PATCH] swsusp: fix error handling Fix error handling and whitespace in swsusp.c. swsusp_free() was called when there was nothing allocating, leading to oops. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:43 -07:00
Pavel Machek	3efa147ad7	[PATCH] pm: Fix resume from initrd Move device name resolution code around so that it is not called from resume-from-initrd. name_to_dev_t may be unavailable at that point. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:43 -07:00
Rusty Lynch	6772926bef	[PATCH] kprobes: fix namespace problem and sparc64 build The following renames arch_init, a kprobes function for performing any architecture specific initialization, to arch_init_kprobes in order to cleanup the namespace. Also, this patch adds arch_init_kprobes to sparc64 to fix the sparc64 kprobes build from the last return probe patch. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-05 19:19:00 -07:00
Alan Cox	200803dfe4	[PATCH] irqpoll Anyone reporting a stuck IRQ should try these options. Its effectiveness varies we've found in the Fedora case. Quite a few systems with misdescribed IRQ routing just work when you use irqpoll. It also fixes up the VIA systems although thats now fixed with the VIA quirk (which we could just make default as its what Redmond OS does but Linus didn't like it historically). A small number of systems have jammed IRQ sources or misdescribes that cause an IRQ that we have no handler registered anywhere for. In those cases it doesn't help. Signed-off-by: Alan Cox <number6@the-village.bc.nu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-28 21:20:35 -07:00
Oleg Nesterov	f01b1b0baa	[PATCH] ITIMER_REAL: fix possible deadlock and race As Steven Rostedt pointed out, there are 2 problems with ITIMER_REAL timers. 1. do_setitimer() does not call del_timer_sync() in case when the timer is not pending (it_real_value() returns 0). This is wrong, the timer may still be running, and it can rearm itself. 2. It calls del_timer_sync() with tsk->sighand->siglock held. This is deadlockable, because timer's handler needs this lock too. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-28 21:20:30 -07:00
Luca Falavigna	47f176fdaf	[PATCH] Using msleep() instead of HZ Use msleep() in a few places. Signed-off-by: Luca Falavigna <dktrkranz@gmail.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-28 21:20:29 -07:00
Ingo Molnar	f340c0d1a3	[PATCH] Tweak idle thread setup semantics This patch tweaks idle thread setup semantics a bit: instead of setting NEED_RESCHED in init_idle(), we do an explicit schedule() before calling into cpu_idle(). This patch, while having no negative side-effects, enables wider use of cond_resched()s. (which might happen in the stock kernel too, but it's particulary important for voluntary-preempt) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-28 14:56:51 -07:00
Alexey Dobriyan	314b6a4d80	[PATCH] kexec: fix sparse warnings Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-28 14:53:40 -07:00
Rusty Lynch	802eae7c80	[PATCH] Return probe redesign: architecture independent changes The following is the second version of the function return probe patches I sent out earlier this week. Changes since my last submission include: * Fix in ppc64 code removing an unneeded call to re-enable preemption * Fix a build problem in ia64 when kprobes was turned off * Added another BUG_ON check to each of the architecture trampoline handlers My initial patch description ==> From my experiences with adding return probes to x86_64 and ia64, and the feedback on LKML to those patches, I think we can simplify the design for return probes. The following patch tweaks the original design such that: * Instead of storing the stack address in the return probe instance, the task pointer is stored. This gives us all we need in order to: - find the correct return probe instance when we enter the trampoline (even if we are recursing) - find all left-over return probe instances when the task is going away This has the side effect of simplifying the implementation since more work can be done in kernel/kprobes.c since architecture specific knowledge of the stack layout is no longer required. Specifically, we no longer have: - arch_get_kprobe_task() - arch_kprobe_flush_task() - get_rp_inst_tsk() - get_rp_inst() - trampoline_post_handler() <see next bullet> * Instead of splitting the return probe handling and cleanup logic across the pre and post trampoline handlers, all the work is pushed into the pre function (trampoline_probe_handler), and then we skip single stepping the original function. In this case the original instruction to be single stepped was just a NOP, and we can do without the extra interruption. The new flow of events to having a return probe handler execute when a target function exits is: * At system initialization time, a kprobe is inserted at the beginning of kretprobe_trampoline. kernel/kprobes.c use to handle this on it's own, but ia64 needed to do this a little differently (i.e. a function pointer is really a pointer to a structure containing the instruction pointer and a global pointer), so I added the notion of arch_init(), so that kernel/kprobes.c:init_kprobes() now allows architecture specific initialization by calling arch_init() before exiting. Each architecture now registers a kprobe on it's own trampoline function. * register_kretprobe() will insert a kprobe at the beginning of the targeted function with the kprobe pre_handler set to arch_prepare_kretprobe (still no change) * When the target function is entered, the kprobe is fired, calling arch_prepare_kretprobe (still no change) * In arch_prepare_kretprobe() we try to get a free instance and if one is available then we fill out the instance with a pointer to the return probe, the original return address, and a pointer to the task structure (instead of the stack address.) Just like before we change the return address to the trampoline function and mark the instance as used. If multiple return probes are registered for a given target function, then arch_prepare_kretprobe() will get called multiple times for the same task (since our kprobe implementation is able to handle multiple kprobes at the same address.) Past the first call to arch_prepare_kretprobe, we end up with the original address stored in the return probe instance pointing to our trampoline function. (This is a significant difference from the original arch_prepare_kretprobe design.) * Target function executes like normal and then returns to kretprobe_trampoline. * kprobe inserted on the first instruction of kretprobe_trampoline is fired and calls trampoline_probe_handler() (no change here) * trampoline_probe_handler() consumes each of the instances associated with the current task by calling the registered handler function and marking the instance as unused until an instance is found that has a return address different then the trampoline function. (change similar to my previous ia64 RFC) * If the task is killed with some left-over return probe instances (meaning that a target function was entered, but never returned), then we just free any instances associated with the task. (Not much different other then we can handle this without calling architecture specific functions.) There is a known problem that this patch does not yet solve where registering a return probe flush_old_exec or flush_thread will put us in a bad state. Most likely the best way to handle this is to not allow registering return probes on these two functions. (Significant change) This patch series applies to the 2.6.12-rc6-mm1 kernel, and provides: * kernel/kprobes.c changes * i386 patch of existing return probes implementation * x86_64 patch of existing return probe implementation * ia64 implementation * ppc64 implementation (provided by Ananth) This patch implements the architecture independant changes for a reworking of the kprobes based function return probes design. Changes include: * Removing functions for querying a return probe instance off a stack address * Removing the stack_addr field from the kretprobe_instance definition, and adding a task pointer * Adding architecture specific initialization via arch_init() * Removing extern definitions for the architecture trampoline functions (this isn't needed anymore since the architecture handles the initialization of the kprobe in the return probe trampoline function.) Signed-off-by: Rusty Lynch <rusty.lynch@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-27 15:23:52 -07:00
Ananth N Mavinakayanahalli	9ec4b1f356	[PATCH] kprobes: fix single-step out of line - take2 Now that PPC64 has no-execute support, here is a second try to fix the single step out of line during kprobe execution. Kprobes on x86_64 already solved this problem by allocating an executable page and using it as the scratch area for stepping out of line. Reuse that. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-27 15:23:52 -07:00
Jens Axboe	22e2c507c3	[PATCH] Update cfq io scheduler to time sliced design This updates the CFQ io scheduler to the new time sliced design (cfq v3). It provides full process fairness, while giving excellent aggregate system throughput even for many competing processes. It supports io priorities, either inherited from the cpu nice value or set directly with the ioprio_get/set syscalls. The latter closely mimic set/getpriority. This import is based on my latest from -mm. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-27 14:33:29 -07:00
Linus Torvalds	2031d0f586	Merge Christoph's freeze cleanup patch	2005-06-25 17:16:53 -07:00
Christoph Lameter	3e1d1d28d9	[PATCH] Cleanup patch for process freezing 1. Establish a simple API for process freezing defined in linux/include/sched.h: frozen(process) Check for frozen process freezing(process) Check if a process is being frozen freeze(process) Tell a process to freeze (go to refrigerator) thaw_process(process) Restart process frozen_process(process) Process is frozen now 2. Remove all references to PF_FREEZE and PF_FROZEN from all kernel sources except sched.h 3. Fix numerous locations where try_to_freeze is manually done by a driver 4. Remove the argument that is no longer necessary from two function calls. 5. Some whitespace cleanup 6. Clear potential race in refrigerator (provides an open window of PF_FREEZE cleared before setting PF_FROZEN, recalc_sigpending does not check PF_FROZEN). This patch does not address the problem of freeze_processes() violating the rule that a task may only modify its own flags by setting PF_FREEZE. This is not clean in an SMP environment. freeze(process) is therefore not SMP safe! Signed-off-by: Christoph Lameter <christoph@lameter.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 17:10:13 -07:00
Nick Wilson	8c0e33c133	[PATCH] Use ALIGN to remove duplicate code This patch makes use of ALIGN() to remove duplicate round-up code. Signed-off-by: Nick Wilson <njw@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:25:02 -07:00
Jesper Juhl	5a6b454f80	[PATCH] remove redundant NULL check before before kfree() in kernel/sysctl.c Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:59 -07:00
Domen Puncer	96ec3efdcb	[PATCH] kernel/timer: fix msleep_interruptible() comment The comment for msleep_interruptible() is wrong, as it will ignore wait-queue events, but will wake up early for signals. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:58 -07:00

1 2 3 4 5

211 commits