8468 commits

Author SHA1 Message Date
Tejun Heo
7c77fa4d51 libata: separate out ata_acpi_gtm_xfermask() from pacpi_discover_modes()
Finding out matching transfer mode from ACPI GTM values is useful for
other purposes too.  Separate out the function and timing tables from
pata_acpi::pacpi_discover_modes().

Other than checking shared-configuration bit after doing
ata_acpi_gtm() in pacpi_discover_modes() which should be safe, this
patch doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:12 -05:00
Tejun Heo
c88f90c377 libata: add ATA_CBL_PATA_IGN
ATA_CBL_PATA_UNK indicates that the cable type can't be determined
from the host side and might be either 80c or 40c.  libata applies
drive or other generic limit in this case.  However, there are
controllers where both host and drive side detections are
misimplemented and the driver has to rely solely on private method -
peeking BIOS or ACPI configuration or using some other private
mechanism.

This patch adds ATA_CBL_PATA_IGN which tells libata to ignore the
cable type completely and just let the LLD determine the transfer mode
via host transfer mode masks and ->mode_filter().

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:12 -05:00
Tejun Heo
7dc951aefd libata: xfer_mask is unsigned long not unsigned int
Jeff says xfer_mask is unsigned long not unsigned int.  Convert all
xfermask fields and handling functions to deal with unsigned longs.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:12 -05:00
Tejun Heo
9d3501ab96 libata: kill ata_id_to_dma_mode()
ata_id_to_dma_mode() isn't quite generic.  The function is basically
privately implemented ata_id_xfermask() combined with hardcoded mode
printing and configuration which are specific to ata_generic.

Kill the function and open code it in generic_set_mode() using generic
xfermode handling functions.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:11 -05:00
Tejun Heo
70cd071e4e libata: clean up xfermode / PATA timing related stuff
* s/ATA_BITS_(PIO|MWDMA|UDMA)/ATA_NR_\1_MODES/g

* Consistently use 0xff to indicate invalid transfer mode (0x00 is
  valid for PIO_SLOW).

* Make ata_xfer_mode2mask() return proper mode mask instead of just
  the highest bit.

* Sort ata_timing table in increasing xfermode order and update
  ata_timing_find_mode() accordingly.

This patch doesn't introduce any behavior change.
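
The ata_xfer_mode2mask() item above can be pictured with a minimal sketch. This is not the libata code: the helper names highest_bit_only() and mask_up_to_mode() and the idea of a per-class mode index (0 = slowest) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: mode_idx is the index of a mode inside its class. */

/* Behaviour before the cleanup, as described: only the selected bit. */
static uint32_t highest_bit_only(unsigned int mode_idx)
{
    assert(mode_idx < 32);
    return (uint32_t)1 << mode_idx;
}

/* Behaviour after the cleanup, as described: the selected mode plus
 * every slower mode of the same class. */
static uint32_t mask_up_to_mode(unsigned int mode_idx)
{
    assert(mode_idx < 31);
    return ((uint32_t)1 << (mode_idx + 1)) - 1;
}
```

With index 4, the first helper yields 0x10 while the second yields 0x1f, which is the form a caller can AND directly against a supported-modes mask.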

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:11 -05:00
Tejun Heo
6357357cae libata: export xfermode / PATA timing related functions
Export the following xfermode related functions.

* ata_pack_xfermask()
* ata_unpack_xfermask()
* ata_xfer_mask2mode()
* ata_xfer_mode2mask()
* ata_xfer_mode2shift()
* ata_mode_string()
* ata_id_xfermask()
* ata_timing_find_mode()

These functions will be used later by LLD updates.  While at it,
change unsigned short @speed to u8 @xfer_mode in
ata_timing_find_mode() for consistency.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:11 -05:00
Tejun Heo
00115e0f5b libata: implement ATA_DFLAG_DUBIOUS_XFER
ATA_DFLAG_DUBIOUS_XFER is set whenever data transfer speed or method
changes and gets cleared when data transfer command succeeds in the
newly configured transfer mode.

This will be used to improve speed down logic.
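
A minimal sketch of the flag lifecycle described above, under stated assumptions: struct dev_sketch, the bit position of DFLAG_DUBIOUS_XFER and both helper names are hypothetical and only mirror the set/clear rule from the description.

```c
#include <assert.h>
#include <stdbool.h>

#define DFLAG_DUBIOUS_XFER (1u << 0)    /* hypothetical bit position */

struct dev_sketch {
    unsigned int flags;
    unsigned int xfer_mode;
};

/* Changing the transfer mode marks the new setting as unproven. */
static void sketch_set_xfer_mode(struct dev_sketch *dev, unsigned int mode)
{
    assert(dev);
    if (dev->xfer_mode != mode) {
        dev->xfer_mode = mode;
        dev->flags |= DFLAG_DUBIOUS_XFER;
    }
}

/* A successful data transfer in the new mode clears the doubt;
 * a failed one leaves the flag set. */
static void sketch_data_command_done(struct dev_sketch *dev, bool success)
{
    assert(dev);
    if (success)
        dev->flags &= ~DFLAG_DUBIOUS_XFER;
}
```

Error handling code could then treat a failure seen while the flag is still set differently from one seen after the mode has proven itself.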

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:11 -05:00
Tejun Heo
3884f7b0a8 libata: clean up EH speed down implementation
Clean up EH speed down implementation.

* is_io boolean variable is replaced with eflags.  is_io is ATA_EFLAG_IS_IO.

* Error categories now have names.

* Better comments.

* Reorder 5min and 10min rules in ata_eh_speed_down_verdict()

* Use local variable @link to cache @dev->link in ata_eh_speed_down()

These changes are to improve readability and ease further changes.
This patch doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:10 -05:00
Tejun Heo
405e66b387 libata: implement protocol tests
Implement protocol tests - ata_is_atapi(), ata_is_nodata(),
ata_is_pio(), ata_is_dma(), ata_is_ncq() and ata_is_data() and use
them to replace is_atapi_taskfile() and hard coded protocol tests.
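
One way such protocol tests can be built is a per-protocol property mask with one predicate per property. The sketch below is an assumption-laden illustration: the enum values, the PF_* bits and the sketch_is_*() names are hypothetical and are not taken from libata.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical property bits. */
enum {
    PF_PIO   = 1 << 0,
    PF_DMA   = 1 << 1,
    PF_DATA  = PF_PIO | PF_DMA,
    PF_NCQ   = 1 << 2,
    PF_ATAPI = 1 << 3,
};

/* Hypothetical protocol list for the sketch. */
enum proto_sketch {
    P_NODATA, P_PIO, P_DMA, P_NCQ,
    P_ATAPI_NODATA, P_ATAPI_PIO, P_ATAPI_DMA,
    P_COUNT
};

static unsigned int proto_flags(enum proto_sketch p)
{
    static const unsigned int flags[P_COUNT] = {
        [P_NODATA]       = 0,
        [P_PIO]          = PF_PIO,
        [P_DMA]          = PF_DMA,
        [P_NCQ]          = PF_DMA | PF_NCQ,
        [P_ATAPI_NODATA] = PF_ATAPI,
        [P_ATAPI_PIO]    = PF_ATAPI | PF_PIO,
        [P_ATAPI_DMA]    = PF_ATAPI | PF_DMA,
    };

    assert((unsigned int)p < (unsigned int)P_COUNT);
    return flags[p];
}

static bool sketch_is_atapi(enum proto_sketch p)  { return proto_flags(p) & PF_ATAPI; }
static bool sketch_is_nodata(enum proto_sketch p) { return !(proto_flags(p) & PF_DATA); }
static bool sketch_is_pio(enum proto_sketch p)    { return proto_flags(p) & PF_PIO; }
static bool sketch_is_dma(enum proto_sketch p)    { return proto_flags(p) & PF_DMA; }
static bool sketch_is_ncq(enum proto_sketch p)    { return proto_flags(p) & PF_NCQ; }
static bool sketch_is_data(enum proto_sketch p)   { return proto_flags(p) & PF_DATA; }
```

Callers then ask a question about the protocol instead of comparing against a hard-coded list of protocol values.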

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:10 -05:00
Tejun Heo
f20ded38aa libata: rearrange ATA_DFLAG_*
Area for DFLAGs which are cleared on INIT is full.  Extend it by 8
bits.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:10 -05:00
Alan Cox
ae8d4ee7ff libata: Disable ATA8-ACS proposed Trusted Computing features by default
Historically word 48 in the identify data was used to mean 32bit I/O
was supported for VLB IDE etc. ATA8 reassigns this word to the Trusted
Computing Group, where it is used for TCG features. This means that
an ATA8 TCG drive is going to trigger 32bit I/O on some systems which
will be funny.

Anyway we need to sort this out ready for ATA8 so:
- Reorder the ata.h header a bit so the ata_version function occurs early
  in it
- Make dword_io check the ATA version
- Add an ATA8 version checking TCG presence test

While we are at it the current drafts have a flaw where it may not be
possible to disable TCG features at boot (and opt out of the trusted
model) as TCG intends because it relies on presence of a different
optional feature (DCS). Handle this in software by refusing the TCG
commands if libata.allow_tpm is not set. (We must make it possible
as some environments such as proprietary VDR devices will doubtless
want to use it to lock up content)

Finally as with CPRM print a warning so that the user knows they may
not be able to fully access and use the device.

Signed-off-by: Alan Cox <alan@redhat.com>
2008-01-23 05:24:09 -05:00
Johannes Berg
eb13ba8738 lockdep: fix workqueue creation API lockdep interaction
Dave Young reported warnings from lockdep that the workqueue API
can sometimes try to register lockdep classes with the same key
but different names. This is not permitted in lockdep.

Unfortunately, I was unaware of that restriction when I wrote
the code to debug workqueue problems with lockdep and used the
workqueue name as the lockdep class name. This can obviously
lead to the problem if the workqueue name is dynamic.

This patch solves the problem by always using a constant name
for the workqueue's lockdep class, namely either the constant
name that was passed in or a string consisting of the variable
name.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2008-01-16 09:51:58 +01:00
Alan Cox
121a09e590 libata: correct handling of TSS DVD
Devices that misreport the validity bit for word 93 look like SATA.  If
they are on the blacklist then we must not test for SATA but assume 40 wire
in the 40 wire case (The TSSCorp reports 80 wire on SATA it seems!)

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-15 16:35:21 -05:00
Linus Torvalds
c23f72cae9 Revert "writeback: introduce writeback_control.more_io to indicate more io"
This reverts commit 2e6883bdf4, as
requested by Fengguang Wu.  It's not quite fully baked yet, and while
there are patches around to fix the problems it caused, they should get
more testing.  Says Fengguang: "I'll resend them both for -mm later on,
in a more complete patchset".

See

	http://bugzilla.kernel.org/show_bug.cgi?id=9738

for some of this discussion.

Requested-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-14 21:21:29 -08:00
Jean Delvare
f9dd0194ff i2c: Driver IDs are optional
Document the fact that I2C driver IDs are optional.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-01-14 21:53:31 +01:00
Linus Torvalds
417009f64f Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  pnpacpi: print resource shortage message only once
  PM: ACPI and APM must not be enabled at the same time
  ACPI: apply quirk_ich6_lpc_acpi to more ICH8 and ICH9
  ACPICA: fix acpi_serialize hang regression
  ACPI : Not register gsi for PCI IDE controller in legacy mode
  ACPI: Reintroduce run time configurable max_cstate for !CPU_IDLE case
  ACPI: Make sysfs interface in ACPI power optional.
  ACPI: EC: Enable boot EC before bus_scan
  increase PNP_MAX_PORT to 40 from 24
2008-01-13 09:58:22 -08:00
Roland McGrath
84427eaef1 remove task_ppid_nr_ns
task_ppid_nr_ns is called in three places.  One of these should never
have called it.  In the other two, using it broke the existing
semantics.  This was presumably accidental.  If the function had not
been there, it would have been much more obvious to the eye that those
patches were changing the behavior.  We don't need this function.

In task_state, the pid of the ptracer is not the ppid of the ptracer.

In do_task_stat, ppid is the tgid of the real_parent, not its pid.
I also moved the call outside of lock_task_sighand, since it doesn't
need it.

In sys_getppid, ppid is the tgid of the real_parent, not its pid.
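
The pid/tgid distinction above can be shown with a small sketch. struct task_sketch and sketch_ppid() are hypothetical stand-ins, not kernel definitions; the only thing modelled is that the reported ppid is the tgid of real_parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down task: pid is per thread, tgid is the
 * id shared by all threads of one process. */
struct task_sketch {
    int pid;
    int tgid;
    struct task_sketch *real_parent;
};

/* ppid as described: the tgid of real_parent, not its pid. */
static int sketch_ppid(const struct task_sketch *t)
{
    assert(t != NULL && t->real_parent != NULL);
    return t->real_parent->tgid;
}
```

If a non-leader thread with pid 101 in thread group 100 creates a child, the child's ppid under this rule is 100, not 101.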

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-13 09:56:43 -08:00
Len Brown
6b74c92521 Pull bugzilla-9535 into release branch 2008-01-11 12:27:50 -05:00
Len Brown
02d5bccf8e Pull bugzilla-9194 into release branch 2008-01-11 12:27:13 -05:00
Len Brown
9f9adecd2d PM: ACPI and APM must not be enabled at the same time
ACPI and APM used "pm_active" to guarantee that
they would not be simultaneously active.

But pm_active was recently moved under CONFIG_PM_LEGACY,
so that without CONFIG_PM_LEGACY, pm_active became a NOP --
allowing ACPI and APM to both be simultaneously enabled.
This caused unpredictable results, including boot hangs.

Further, the code under CONFIG_PM_LEGACY is scheduled
for removal.

So replace pm_active with pm_flags.
pm_flags depends only on CONFIG_PM,
which is present for both CONFIG_APM and CONFIG_ACPI.

http://bugzilla.kernel.org/show_bug.cgi?id=9194

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2008-01-11 12:26:47 -05:00
Rusty Russell
b801a1e7db Don't blatt first element of prv in sg_chain()
I realize that sg chaining is a ploy to make the rest of the kernel
devs feel the pain of the SCSI subsystem.  But this was a little
unsubtle.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-11 10:12:55 +01:00
Zhao Yakui
d1ec7298fc ACPI: apply quirk_ich6_lpc_acpi to more ICH8 and ICH9
It is important that these resources be reserved
to avoid conflicts with well known ACPI registers.

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-01-11 00:24:55 -05:00
David S. Miller
a0a46196cd [NET]: Add NAPI_STATE_DISABLE.
Create a bit to signal that a napi_disable() is in progress.

This sets up infrastructure such that net_rx_action() can generically
break out of the ->poll() loop on a NAPI context that has a pending
napi_disable() yet is being bombed with packets (and thus would
otherwise poll endlessly and not allow the napi_disable() to finish).

Now, what napi_disable() does is first set the NAPI_STATE_DISABLE bit
(to indicate that a disable is pending), then it polls for the
NAPI_STATE_SCHED bit, and once the NAPI_STATE_SCHED bit is acquired
the NAPI_STATE_DISABLE bit is cleared.  Here, the test_and_set_bit()
provides the necessary memory barrier between the various bitops.

napi_schedule_prep() now tests for a pending disable as its first
action and won't try to obtain the NAPI_STATE_SCHED bit if a disable
is pending.

As a result, we can remove the netif_running() check in
netif_rx_schedule_prep() because the NAPI disable pending state serves
this purpose.  And, it does so in a NAPI centric manner which is what
we really want.
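
The bit protocol described above can be sketched in user space with C11 atomics. This is an illustration under assumptions, not the networking code: the ST_* bit values and all sketch_*() names are hypothetical, and the blocking wait in napi_disable() is split into a begin step and a retryable finish step so that the sketch never sleeps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define ST_SCHED   (1u << 0)    /* hypothetical bit values */
#define ST_DISABLE (1u << 1)

struct napi_sketch {
    atomic_uint state;
};

/* Atomically set a bit and report whether it was already set. */
static bool test_and_set(atomic_uint *state, unsigned int bit)
{
    return (atomic_fetch_or(state, bit) & bit) != 0;
}

/* Scheduling refuses to take SCHED while a disable is pending. */
static bool sketch_schedule_prep(struct napi_sketch *n)
{
    assert(n);
    if (atomic_load(&n->state) & ST_DISABLE)
        return false;
    return !test_and_set(&n->state, ST_SCHED);
}

/* The poller releases SCHED when it is done. */
static void sketch_poll_complete(struct napi_sketch *n)
{
    atomic_fetch_and(&n->state, ~ST_SCHED);
}

/* Disable, step 1: announce that a disable is pending. */
static void sketch_disable_begin(struct napi_sketch *n)
{
    atomic_fetch_or(&n->state, ST_DISABLE);
}

/* Disable, step 2 (retried until it succeeds): take SCHED, then
 * clear the pending bit. */
static bool sketch_disable_try_finish(struct napi_sketch *n)
{
    if (test_and_set(&n->state, ST_SCHED))
        return false;
    atomic_fetch_and(&n->state, ~ST_DISABLE);
    return true;
}
```

Once the pending bit is set, a poller that releases SCHED cannot re-acquire it, so the disabling side is guaranteed to win the next attempt.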

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-08 23:30:07 -08:00
David S. Miller
bdb95b1792 [NET]: Do not grab device reference when scheduling a NAPI poll.
It is pointless, because everything that can make a device go away
will do a napi_disable() first.

The main impetus behind this is that now we can legally do a NAPI
completion in generic code like net_rx_action() which a following
changeset needs to do.  net_rx_action() can only perform actions
in NAPI centric ways, because there may be a one to many mapping
between NAPI contexts and network devices (SKY2 is one example).

We also want to get rid of this because it's an extra atomic in the
NAPI paths, and also because it is one of the last instances where the
NAPI interfaces care about net devices.

The one remaining netdev detail the NAPI stuff cares about is the
netif_running() check which will be killed off in a subsequent
changeset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-08 23:30:07 -08:00
Alan Cox
bf5e5834bf pl2303: Fix mode switching regression
Cleaning out all the incorrect 'no change made' checks for termios
settings showed up a problem with the PL2303. The hardware here seems to
lose sync and bits if you tell it to make no changes. This shows up with
a real world application.

To fix this the driver check for meaningful hardware changes is restored
but doing the tests correctly and as a tty layer function so it doesn't
get duplicated wrongly everywhere if other drivers turn out to need it.

Signed-off-by: Alan Cox <alan@redhat.com>
Tested-by: Mirko Parthey <mirko.parthey@informatik.tu-chemnitz.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-08 16:16:34 -08:00
Sebastian Siewior
5b7741b332 KEYS: fix macro
Commit 664cceb009 changed the parameters of
the function make_key_ref().  The macros that are used in case CONFIG_KEY
is not defined did not change.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-08 16:10:35 -08:00
Ingo Molnar
a263898f62 CPU hotplug: fix cpu_is_offline() on !CONFIG_HOTPLUG_CPU
make randconfig bootup testing found that the cpufreq code
crashes on bootup, if the powernow-k8 driver is enabled and
if maxcpus=1 passed on the boot line to a !CONFIG_HOTPLUG_CPU
kernel.

First lockdep found out that there's an inconsistent unlock
sequence:

 =====================================
 [ BUG: bad unlock balance detected! ]
 -------------------------------------
 swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at:
 [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
 but there are no more locks to release!

Call Trace:
 [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
 [<ffffffff80251c29>] print_unlock_inbalance_bug+0x104/0x12c
 [<ffffffff80252f3a>] mark_held_locks+0x56/0x94
 [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
 [<ffffffff807008b6>] cpufreq_add_dev+0x2a8/0x5c4
 ...

then shortly afterwards the cpufreq code crashed on an assert:

 ------------[ cut here ]------------
 kernel BUG at drivers/cpufreq/cpufreq.c:1068!
 invalid opcode: 0000 [1] SMP
 [...]
 Call Trace:
  [<ffffffff805145d6>] sysdev_driver_unregister+0x5b/0x91
  [<ffffffff806ff520>] cpufreq_register_driver+0x15d/0x1a2
  [<ffffffff80cc0596>] powernowk8_init+0x86/0x94
 [...]
 ---[ end trace 1e9219be2b4431de ]---

the bug was caused by maxcpus=1 bootup, which brought up the
secondary core as !cpu_online() but !cpu_is_offline() either,
which on !CONFIG_HOTPLUG_CPU is always 0 (include/linux/cpu.h):

  /* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */
  static inline int cpu_is_offline(int cpu) { return 0; }

but the cpufreq code uses cpu_online() and cpu_is_offline() in
a mixed way - the low-level drivers use cpu_online(), while
the cpufreq core uses cpu_is_offline(). This opened up the
possibility to add the non-initialized sysdev device of the
secondary core:

 cpufreq-core: trying to register driver powernow-k8
 cpufreq-core: adding CPU 0
 powernow-k8: BIOS error - no PSB or ACPI _PSS objects
 cpufreq-core: initialization failed
 cpufreq-core: adding CPU 1
 cpufreq-core: initialization failed

which then blew up. The fix is to make cpu_is_offline() always
the negation of cpu_online(). With that fix applied the kernel
boots up fine without crashing:

 Calling initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94()
 powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00)
 powernow-k8: BIOS error - no PSB or ACPI _PSS objects
 initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() returned -19.
 initcall 0xffffffff80cc0510 ran for 19 msecs: powernowk8_init+0x0/0x94()
 Calling initcall 0xffffffff80cc328f: init_lapic_nmi_sysfs+0x0/0x39()

We could fix this by making CPU enumeration aware of max_cpus, but that
would be more fragile IMO, and the cpu_online(cpu) != cpu_is_offline(cpu)
possibility was quite confusing and a continuous source of bugs too.
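
The inconsistency and the fix can be reproduced in a few lines. This is a user-space sketch with hypothetical names (cpu_online_map_sketch, old_cpu_is_offline(), new_cpu_is_offline()); it only models the two definitions discussed above.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical stand-in for the online map. */
static bool cpu_online_map_sketch[NR_CPUS_SKETCH];

static bool sketch_cpu_online(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    return cpu_online_map_sketch[cpu];
}

/* Old !CONFIG_HOTPLUG_CPU variant: never reports a CPU as offline. */
static bool old_cpu_is_offline(int cpu)
{
    (void)cpu;
    return false;
}

/* Fixed variant: always the negation of the online test. */
static bool new_cpu_is_offline(int cpu)
{
    return !sketch_cpu_online(cpu);
}
```

With only CPU 0 marked online, CPU 1 is neither online nor offline under the old definition, which is the state the cpufreq core tripped over; the fixed variant reports it as offline.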

Most distributions have kernels with CPU hotplug enabled, so this bug
remained hidden for a long time.

Bug forensics:

The broken cpu_is_offline() API variant was introduced via:

 commit a59d2e4e6977e7b94e003c96a41f07e96cddc340
 Author: Rusty Russell <rusty@rustcorp.com.au>
 Date:   Mon Mar 8 06:06:03 2004 -0800

     [PATCH] minor cleanups for hotplug CPUs

( this predates linux-2.6.git, this commit is available from Thomas's
  historic git tree. )

Then 1.5 years later the cpufreq code made use of it:

 commit c32b6b8e52
 Author: Ashok Raj <ashok.raj@intel.com>
 Date:   Sun Oct 30 14:59:54 2005 -0800

     [PATCH] create and destroy cpufreq sysfs entries based on cpu notifiers

 +       if (cpu_is_offline(cpu))
 +               return 0;

which is a correct use of the subtly broken new API. v2.6.15 then
shipped with this bug included.

then it took two more years for random-kernel qa to hit it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-06 12:39:42 -08:00
Al Viro
831830b5a2 restrict reading from /proc/<pid>/maps to those who share ->mm or can ptrace pid
Contents of /proc/*/maps is sensitive and may become sensitive after
open() (e.g.  if target originally shares our ->mm and later does exec
on suid-root binary).

Check at read() (actually, ->start() of iterator) time that mm_struct
we'd grabbed and locked is
 - still the ->mm of target
 - equal to reader's ->mm or the target is ptracable by reader.
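
The two conditions above combine into a single predicate. The sketch below uses hypothetical types and a hypothetical may_read_maps() helper; the ptrace permission test is reduced to a boolean parameter because its real rules are outside the scope of this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm_sketch { int unused; };

struct task_mm_sketch {
    const struct mm_sketch *mm;
};

/* grabbed: the mm_struct taken and locked at read time.
 * reader_can_ptrace: result of the ptrace permission test (assumed
 * to be computed elsewhere). */
static bool may_read_maps(const struct mm_sketch *grabbed,
                          const struct task_mm_sketch *target,
                          const struct task_mm_sketch *reader,
                          bool reader_can_ptrace)
{
    assert(grabbed != NULL && target != NULL && reader != NULL);

    /* The target may have exec'ed since open(): mm must still match. */
    if (target->mm != grabbed)
        return false;

    /* Either the reader shares the mm or it may ptrace the target. */
    return reader->mm == grabbed || reader_can_ptrace;
}
```

Doing the test at read time rather than open time is what closes the window in which the target execs a suid-root binary after the file was opened.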

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-02 13:13:27 -08:00
Linus Torvalds
158a962422 Unify /proc/slabinfo configuration
Both SLUB and SLAB really did almost exactly the same thing for
/proc/slabinfo setup, using duplicate code and per-allocator #ifdef's.

This just creates a common CONFIG_SLABINFO that is enabled by both SLUB
and SLAB, and shares all the setup code.  Maybe SLOB will want this some
day too.

Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-02 13:04:48 -08:00
Pekka J Enberg
57ed3eda97 slub: provide /proc/slabinfo
This adds a read-only /proc/slabinfo file on SLUB, that makes slabtop work.

[ mingo@elte.hu: build fix. ]

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-01 11:32:02 -08:00
Len Brown
2c83819775 increase PNP_MAX_PORT to 40 from 24
a7839e9606
(PNP: increase the maximum number of resources)
increased PNP_MAX_PORT to 24 from 8.
It also added a test and a complaint when a
machine exceeded the limit, causing:

pnpacpi: exceeded the max number of IO resources: 24

http://bugzilla.kernel.org/show_bug.cgi?id=9535

We should have been squawking about this all along,
as this is a potentially serious issue.

For now, simply burn some dynamic bytes and
increase the limit by another 16 to 40.
There is no guarantee that this will satisfy
every system on Earth.  It probably will not,
but it should be an improvement.

In the future, PNPACPI should allocate resource
structures as needed, rather than max-sized arrays.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-12-27 23:55:13 -05:00
Stephen Hemminger
ecef969e5b [VETH]: move veth.h to include/linux
Move veth.h from net/ to linux/ since it is a user api, and add it to
user header processing Kbuild.

[ Use header-y as suggested by Sam Ravnborg.  -DaveM ]

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-26 19:36:35 -08:00
Stephen Hemminger
75ec533ec3 [NET] tc_nat: header install
iproute2 build needs tc_nat.h header from kernel make install_headers.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-26 19:36:35 -08:00
Christoph Lameter
ed367fc3a7 quicklists: do not release off node pages early
quicklists must keep even off node pages on the quicklists until the TLB
flush has been completed.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-23 12:54:36 -08:00
Neil Brown
91212507f9 dm: merge max_hw_sector
Make sure dm honours max_hw_sectors of underlying devices

  We still have no firm testing evidence in support of this patch but
  believe it may help to resolve some bug reports.  - agk

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-12-20 17:32:12 +00:00
Linus Torvalds
3e3b3916a9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"
  genirq: revert lazy irq disable for simple irqs
  x86: also define AT_VECTOR_SIZE_ARCH
  x86: kprobes bugfix
  x86: jprobe bugfix
  timer: kernel/timer.c section fixes
  genirq: add unlocked version of set_irq_handler()
  clockevents: fix reprogramming decision in oneshot broadcast
  oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters
2007-12-18 09:42:44 -08:00
Kevin Hilman
b019e57321 genirq: add unlocked version of set_irq_handler()
Add unlocked version for use by irq_chip.set_type handlers which may
wish to change handler to level or edge handler when IRQ type is
changed.

The normal set_irq_handler() call cannot be used because it tries to
take irq_desc.lock which is already held when the irq_chip.set_type
hook is called.

Signed-off-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-18 18:05:58 +01:00
Linus Torvalds
3c615e19a4 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Cleanup umem driver: fix most checkpatch warnings, conform to kernel
  block: let elv_register() return void
  as-iosched: fix write batch start point
  as-iosched: fix incorrect comments
  block: use jiffies conversion functions in scsi_ioctl.c
2007-12-18 08:04:24 -08:00
Linus Torvalds
d55653377d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
  mmc: remove unused 'mode' from the mmc_host structure
  sdhci: support JMicron JMB38x chips
  sdhci: use PIO when DMA can't satisfy the request
  sdhci: don't warn about sdhci 2.0 controllers
  sdhci: describe quirks
2007-12-18 08:03:01 -08:00
Adrian Bunk
2fdd82bd88 block: let elv_register() return void
elv_register() always returns 0, and there isn't anything it does where
it should return an error (the only error condition is so grave that
it's handled with a BUG_ON).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-12-18 08:29:28 +01:00
Linus Torvalds
ededa4d396 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: fix ATAPI draining
  libata: update atapi_eh_request_sense() such that lbam/lbah contains buffer size
  libata-acpi: implement _GTF command filtering
  libata-acpi: improve _GTF execution error handling and reporting
  libata-acpi: improve ACPI disabling
  libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM during resume
  libata-acpi: implement and use ata_acpi_init_gtm()
  libata-acpi: add new hooks ata_acpi_dissociate() and ata_acpi_on_disable()
  libata: ata_dev_disable() should be called from EH context
  libata: add more opcodes to ata.h
  libata: update ata_*_printk() macros such that level can be a variable
  libata-acpi: adjust constness in ata_acpi_gtm/stm() parameters
  sata_mv: improve warnings about Highpoint RocketRAID 23xx cards
  libata: add ST3160023AS / 3.42 to NCQ blacklist
  libata: clear link->eh_info.serror from ata_std_postreset()
  sata_sil: fix spurious IRQ handling
2007-12-17 19:29:32 -08:00
Nishanth Aravamudan
368d2c6358 Revert "hugetlb: Add hugetlb_dynamic_pool sysctl"
This reverts commit 54f9f80d65 ("hugetlb:
Add hugetlb_dynamic_pool sysctl")

Given the new sysctl nr_overcommit_hugepages, the boolean dynamic pool
sysctl is not needed, as its semantics can be expressed by 0 in the
overcommit sysctl (no dynamic pool) and non-0 in the overcommit sysctl
(pool enabled).

(Needed in 2.6.24 since it reverts a post-2.6.23 userspace-visible change)

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:17 -08:00
Nishanth Aravamudan
d1c3fb1f8f hugetlb: introduce nr_overcommit_hugepages sysctl
While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I
became convinced that having a boolean sysctl was insufficient:

1) To support per-node control of hugepages, I have previously submitted
patches to add a sysfs attribute related to nr_hugepages. However, with
a boolean global value and per-mount quota enforcement constraining the
dynamic pool, adding corresponding control of the dynamic pool on a
per-node basis seems inconsistent to me.

2) Administration of the hugetlb dynamic pool with multiple hugetlbfs
mount points is, arguably, more arduous than it needs to be. Each quota
would need to be set separately, and the sum would need to be monitored.

To ease the administration, and to help make the way for per-node
control of the static & dynamic hugepage pool, I added a separate
sysctl, nr_overcommit_hugepages. This value serves as a high watermark
for the overall hugepage pool, while nr_hugepages serves as a low
watermark. The boolean sysctl can then be removed, as the condition

	nr_overcommit_hugepages > 0

indicates the same administrative setting as

	hugetlb_dynamic_pool == 1

Quotas still serve as local enforcement of the size of the pool on a
per-mount basis.
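
The relationship between the two sysctls can be sketched as follows. The struct and helper names are hypothetical, and treating nr_overcommit_hugepages as a cap on the number of surplus pages is an assumption read from the caveats in this message, not a statement about the implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the pool state. */
struct hugepage_pool_sketch {
    unsigned long nr_hugepages;             /* static pool size */
    unsigned long nr_overcommit_hugepages;  /* new sysctl */
    unsigned long surplus;                  /* pages beyond the static pool */
};

/* Replaces the boolean sysctl: any non-zero overcommit value means
 * the dynamic pool is enabled. */
static bool dynamic_pool_enabled(const struct hugepage_pool_sketch *p)
{
    assert(p != NULL);
    return p->nr_overcommit_hugepages > 0;
}

/* Assumed rule: another surplus page is allowed only while the
 * surplus count is below the overcommit value. */
static bool may_add_surplus_page(const struct hugepage_pool_sketch *p)
{
    assert(p != NULL);
    return p->surplus < p->nr_overcommit_hugepages;
}
```

Setting the overcommit value to 0 therefore expresses the old hugetlb_dynamic_pool == 0 setting without a separate switch.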

A few caveats:

1) There is a race whereby the global surplus huge page counter is
incremented before a hugepage has been allocated. Another process could then
try to grow the pool, and fail to convert a surplus huge page to a normal
huge page and instead allocate a fresh huge page. I believe this is
benign, as no memory is leaked (the actual pages are still tracked
correctly) and the counters won't go out of sync.

2) Shrinking the static pool while a surplus is in effect will allow the
number of surplus huge pages to exceed the overcommit value. As long as
this condition holds, however, no more surplus huge pages will be
allowed on the system until one of the two sysctls is increased
sufficiently, or the surplus huge pages go out of use and are freed.

Successfully tested on x86_64 with the current libhugetlbfs snapshot,
modified to use the new sysctl.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:17 -08:00
Adam Jackson
8d936626dd apm_event{,info}_t are userspace types
These types define the size of data read from /dev/apm_bios.  They should
not be hidden behind #ifdef __KERNEL__.

This is killing my xserver compile, apm_event_t is used in the xserver
source.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:16 -08:00
Andrew Morton
755271358c fix headers_install
make[3]: *** No rule to make target `/usr/src/devel/include/linux/ticable.h', needed by `/usr/src/devel/usr/include/linux/ticable.h'.  Stop.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:15 -08:00
Tejun Heo
140b5e5911 libata: fix ATAPI draining
With ATAPI transfer chunk size properly programmed, libata PIO HSM
should be able to handle full spurious data chunks.  Also, it's a good
idea to suppress trailing data warning for misc ATAPI commands as
there can be many of them per command - for example, if the chunk size
is 16 and the drive tries to transfer 510 bytes, there can be 31
trailing data messages.

This patch makes the following updates to libata ATAPI PIO HSM
implementation.

* Make it drain full spurious chunks.

* Suppress trailing data warning message for misc commands.

* Put limit on how many bytes can be drained.

* If odd, round up consumed bytes and the number of bytes to be
  drained.  This gets the number of bytes to drain right for drivers
  which do 16bit PIO.
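
The rounding and limiting rules in the list above can be sketched as two small helpers. The names and the DRAIN_LIMIT_SKETCH value are hypothetical; the actual limit chosen by the patch is not stated in this message.

```c
#include <assert.h>
#include <limits.h>

#define DRAIN_LIMIT_SKETCH 65536u   /* hypothetical cap, in bytes */

/* 16bit PIO moves two bytes at a time, so odd counts round up. */
static unsigned int round_up_even(unsigned int bytes)
{
    assert(bytes < UINT_MAX);
    return bytes + (bytes & 1u);
}

/* Number of trailing bytes to drain: rounded up, then capped. */
static unsigned int bytes_to_drain(unsigned int trailing)
{
    unsigned int n = round_up_even(trailing);

    return n > DRAIN_LIMIT_SKETCH ? DRAIN_LIMIT_SKETCH : n;
}
```

The cap keeps a misbehaving device that never stops asserting data request from holding the state machine in the drain loop indefinitely.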

This patch is partial backport of improve-ATAPI-data-xfer patchset
pending for #upstream.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:43:28 -05:00
Tejun Heo
398e07826b libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM during resume
On certain implementations, _GTF evaluation depends on preceding _STM
and both can be pretty picky about the configuration.  Using _GTM
result cached during controller initialization satisfies the most
neurotic _STM implementation.  However, libata evaluates _GTF after
reset during device configuration and the hardware state can be
different from what _GTF expects and can cause evaluation failure.

This patch adds dev->gtf_cache and updates ata_dev_get_GTF() such that
it uses the cached value if available.  Cache is cleared with a call
to ata_acpi_clear_gtf().
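
The caching behaviour amounts to evaluate once, reuse until cleared. A minimal sketch, assuming a hypothetical struct dev_gtf_sketch and a stand-in evaluate_gtf() that merely counts calls instead of talking to ACPI:

```c
#include <assert.h>
#include <stddef.h>

struct dev_gtf_sketch {
    const char *gtf_cache;      /* NULL when nothing is cached */
    unsigned int evaluations;   /* counts simulated _GTF evaluations */
};

/* Stand-in for the real evaluation; returns a fixed placeholder. */
static const char *evaluate_gtf(struct dev_gtf_sketch *dev)
{
    dev->evaluations++;
    return "gtf-taskfiles";
}

/* Use the cached result when present, otherwise evaluate and cache. */
static const char *sketch_get_gtf(struct dev_gtf_sketch *dev)
{
    assert(dev != NULL);
    if (dev->gtf_cache == NULL)
        dev->gtf_cache = evaluate_gtf(dev);
    return dev->gtf_cache;
}

/* Drop the cached result so that the next lookup re-evaluates. */
static void sketch_clear_gtf(struct dev_gtf_sketch *dev)
{
    assert(dev != NULL);
    dev->gtf_cache = NULL;
}
```

Filling the cache right after _STM means later lookups never evaluate _GTF against a hardware state it does not expect.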

Because for SATA ACPI nodes _GTF must be evaluated after _SDD which
can't be done till IDENTIFY is complete, _GTF caching from
ata_acpi_on_resume() is used only for IDE ACPI nodes.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:14 -05:00
Tejun Heo
c05e6ff035 libata-acpi: implement and use ata_acpi_init_gtm()
_GTM fetches currently configured transfer mode while _STM configures
controller according to _GTM parameter and prepares transfer mode
configuration TFs for _GTF.  In many cases _GTM and _STM
implementations are quite brittle and can't cope with configuration
changed by libata.

libata does not depend on ATA ACPI to configure devices.  The only
reason libata performs _GTM and _STM is to make _GTF evaluation
succeed; libata also doesn't care how _GTF TFs configure the
transfer mode.  It overrides that configuration anyway, so from
libata's POV, it doesn't matter what value is fed to _STM as long
as evaluation succeeds for _STM and the following _GTF.

This patch adds dev->__acpi_init_gtm and stores the initial _GTM values
on host initialization, before they are modified by reset and mode
configuration.
If the field is valid, ata_acpi_init_gtm() returns pointer to the
saved _GTM structure; otherwise, NULL.

This saved value is used for _STM during resume and to peek at the
BIOS/firmware programmed initial timing for later use.  The accessor
is there to make building w/o ACPI easy as dev->__acpi_init_gtm
doesn't exist if ACPI is not enabled.

On driver detach, the initial BIOS configuration is restored by
executing _STM with the initial _GTM values such that the next driver
can also use the initial BIOS configured values.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:14 -05:00
Tejun Heo
ce2e0abbd3 libata: add more opcodes to ata.h
Add constants for DEVICE CONFIGURATION OVERLAY and SET_MAX to
include/linux/ata.h.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:12 -05:00
Tejun Heo
c2e366a107 libata: update ata_*_printk() macros such that level can be a variable
Make printk helpers format @lv together rather than prepending to the
format string as constant.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:12 -05:00