163625 commits

Author SHA1 Message Date
Philippe Gerum
6b8019c85e Blackfin: allow high priority domains to preempt schedule_tail()
ret_from_fork is always entered with hw interrupts off, which prevents
real-time domains from preempting the Linux kernel during part of the
initial context switch to the new task, which could in turn raise the
worst-case latency figures.

To avoid this, stall the root domain stage in the interrupt pipeline
to keep the scheduling tail code free from Linux-handled IRQs, then
enable hardware interrupts again.

Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-09-16 21:28:33 -04:00
Philippe Gerum
bc569f1a77 Blackfin: export show_stack() to modules
Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-09-16 21:28:32 -04:00
Philippe Gerum
b9c7eb498d Blackfin: fix misnomer of some I-pipe helpers
__ipipe_{stall, unstall}_root_raw() identifiers may leave the reader
under the impression that only the virtual state is affected by these
operations, which is wrong. Pick names following the convention used
throughout the interrupt pipeline code.

Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-09-16 21:28:30 -04:00
Philippe Gerum
d8ca63955a Blackfin: checkpatch --file arch/blackfin/kernel/ipipe.c
Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-09-16 21:28:29 -04:00
Robin Getz
ae4f073c40 Blackfin: make EVT3->EVT5 lowering more robust wrt IPEND[4]
We handle many exceptions at EVT5 (hardware error level) so that we can
catch exceptions in our exception handling code.  Today, if the global
interrupt enable bit (IPEND[4]) is set (interrupts disabled), our trap
handling code goes into an infinite loop, since we need interrupts to be
on to defer things to EVT5.

Normal kernel code should not trigger this for any reason as IPEND[4] gets
cleared early (when doing an interrupt context save) and the kernel stack
there should be sane (or something much worse is happening in the system).
But there have been a few times where this has happened, so this change
makes sure we dump a proper crash message even when things have gone south.

Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-09-16 21:28:28 -04:00
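The condition described in ae4f073c40 comes down to testing one bit of IPEND. The sketch below is not the Blackfin trap code; it only illustrates the bit test, and both the macro and the function name are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 4 of IPEND is the global interrupt flag described in the commit
 * message: when it is set, interrupts are disabled.
 * The macro name is hypothetical, chosen for this sketch. */
#define IPEND_GLOBAL_DISABLE_BIT 4u

/* Returns true when a snapshot of IPEND shows interrupts globally
 * disabled, i.e. the state in which deferring work to EVT5 cannot
 * make progress. Hypothetical helper, not a kernel function. */
static bool ipend_irqs_disabled(uint32_t ipend)
{
    return (ipend & (UINT32_C(1) << IPEND_GLOBAL_DISABLE_BIT)) != 0;
}
```

A handler could check this before attempting the EVT3->EVT5 lowering and fall back to dumping a crash message when the bit is set.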
Barry Song
d4b834c139 Blackfin: bf537-stamp: add resources for AD1938 audio card
Signed-off-by: Barry Song <barry.song@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-09-16 21:28:26 -04:00
Yi Li
e68d1ebc30 Blackfin: bf537-stamp: declare SPI IRQ resources
Signed-off-by: Yi Li <yi.li@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-09-16 21:28:24 -04:00
Michael Hennerich
f39d56ec46 Blackfin: bf537-stamp: update ADP5588 header name
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-09-16 21:28:23 -04:00
Jens Rosenboom
3933fc952a ipv6: Ignore route option with ROUTER_PREF_INVALID
RFC4191 says that "If the Reserved (10) value is received, the Route
Information Option MUST be ignored.", so this patch makes us conform
to the RFC. This is different to the usage of the Default Router
Preference, where an invalid value must indeed be treated as
PREF_MEDIUM.

Signed-off-by: Jens Rosenboom <me@jayr.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 17:10:38 -07:00
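The distinction drawn in 3933fc952a can be sketched as two small decoders for the 2-bit preference field of RFC 4191 (01 High, 00 Medium, 11 Low, 10 Reserved). This is an illustration only, not the kernel's ndisc code; the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* 2-bit preference encodings from RFC 4191.
 * The identifier names are hypothetical, chosen for this sketch. */
enum route_pref {
    PREF_MEDIUM   = 0x0,
    PREF_HIGH     = 0x1,
    PREF_RESERVED = 0x2,
    PREF_LOW      = 0x3
};

/* Default Router Preference: a reserved value is treated as Medium. */
static enum route_pref default_router_pref(unsigned int prf)
{
    prf &= 0x3;
    return prf == PREF_RESERVED ? PREF_MEDIUM : (enum route_pref)prf;
}

/* Route Information Option: a reserved value means the whole option
 * must be ignored. Returns false in that case and leaves *out alone. */
static bool route_info_pref(unsigned int prf, enum route_pref *out)
{
    prf &= 0x3;
    if (prf == PREF_RESERVED)
        return false;
    *out = (enum route_pref)prf;
    return true;
}
```

Keeping the two cases in separate functions makes it harder to apply the "fall back to Medium" rule to a Route Information Option by accident.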
Jiri Pirko
b9f602533e bonding: make ab_arp select active slaves as other modes
When I was implementing the primary_passive option (formerly named primary_lazy), I
ran into trouble with ab_arp. This is the only mode which does not use the
bond_select_active_slave() function to select the active slave and instead
selects it itself. This does not seem to be the right behaviour and it would be
better to do it in bond_select_active_slave() for all cases. This patch makes
this happen. Please review.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 17:04:58 -07:00
David S. Miller
c127bdf9f6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2009-09-16 17:01:24 -07:00
Johannes Berg
bbac31f4c0 cfg80211: fix SME connect
There's a check saying
	/* we're good if we have both BSSID and channel */
	if (wdev->conn->params.bssid && wdev->conn->params.channel) {

but that isn't true -- we need the BSS struct. This leads
to errors such as

    Trying to associate with 00:1b:53:11:dc:40 (SSID='TEST' freq=2412 MHz)
    ioctl[SIOCSIWFREQ]: No such file or directory
    ioctl[SIOCSIWESSID]: No such file or directory
    Association request to the driver failed
    Associated with 00:1b:53:11:dc:40

in wpa_supplicant, as reported by Holger.

Instead, we really need to have the BSS struct, and if we
don't, then we need to initiate a scan for it. But we may
already have the BSS struct here, so hang on to it if we
do and scan if we don't.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-09-16 16:21:00 -04:00
Pavel Roskin
8c6c03fe23 rc80211_minstrel: fix contention window calculation
The contention window is supposed to be a power of two minus one, i.e.
15, 31, 63, 127...  minstrel_rate_init() forgets to subtract 1, so the
sequence becomes 15, 32, 66, 134...

Bug reported by Dan Halperin <dhalperi@cs.washington.edu>

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-09-16 16:21:00 -04:00
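The two sequences quoted in 8c6c03fe23 can be reproduced with a few lines of arithmetic. The functions below are reconstructed from the numbers in the message and are not taken from minstrel_rate_init(); their names are hypothetical.

```c
/* Next contention window when the "- 1" is forgotten.
 * Starting from 15 this yields 32, 66, 134, matching the buggy
 * sequence in the commit message. */
static unsigned int cw_next_buggy(unsigned int cw)
{
    return (cw + 1) << 1;
}

/* Next contention window as a power of two minus one.
 * Starting from 15 this yields 31, 63, 127. */
static unsigned int cw_next_fixed(unsigned int cw)
{
    return ((cw + 1) << 1) - 1;
}
```

Once the first step lands on a value that is not a power of two minus one, every later step drifts further from the intended sequence, which is why the error grows (32, 66, 134) rather than staying off by one.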
Randy Dunlap
de32cce132 ssb/sdio: fix printk format warnings
Fix printk format warnings:

drivers/ssb/sdio.c:336: warning: format '%u' expects type 'unsigned int', but argument 7 has type 'size_t'
drivers/ssb/sdio.c:443: warning: format '%u' expects type 'unsigned int', but argument 7 has type 'size_t'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-09-16 16:21:00 -04:00
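The warnings quoted in de32cce132 arise whenever a size_t argument is paired with '%u'. The commit body does not show the actual change, so the following is only a userspace sketch of one common remedy, the 'z' length modifier, using snprintf() in place of printk(); the helper name is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats a size_t with "%zu", the conversion that matches size_t on
 * every ABI. Using "%u" here would trigger the same kind of warning as
 * in the commit message wherever size_t is wider than unsigned int.
 * Hypothetical helper for illustration. */
static int format_count(char *buf, size_t buflen, size_t count)
{
    return snprintf(buf, buflen, "count=%zu", count);
}
```

Casting the argument to unsigned int would also silence the warning, but it truncates large values on 64-bit targets, so matching the format to the type is usually the safer choice.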
Christian Lamparter
f7f71173ea p54usb: add Zcomax XG-705A usbid
This patch adds a new usbid for Zcomax XG-705A to the device table.

Cc: stable@kernel.org
Reported-by: Jari Jaakola <jari.jaakola@gmail.com>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-09-16 16:20:59 -04:00
Huang Weiyi
d4e54e871f ASoC: remove unused #include <linux/version.h>
Remove unused #include <linux/version.h>('s) in
  sound/soc/codecs/ad1836.c
  sound/soc/codecs/ad1938.c
  sound/soc/codecs/wm8974.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2009-09-16 21:08:54 +01:00
Anirban Sinha
f4c3f03838 Fix ia64 build breakage in head.S
The "cleanup console_print()" patch in commit 353f6dd2de introduced an
"extern" declaration into an assembly language file.  Remove it.

Signed-off-by: Anirban Sinha <asinha@zeugmasystems.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-16 12:28:52 -07:00
Ingo Molnar
40d9d82c8a Merge branch 'tip/tracing/core4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/core
2009-09-16 21:16:37 +02:00
Muralidharan Karicheri
85609c1ccd DaVinci: DM646x - platform changes for vpif capture and display drivers
VPIF display changes (Chaithrika)

Add platform device and resource structures. Also define a platform specific
clock setup function that can be accessed by the driver to configure the clock
and CPLD.

VPIF capture changes (Murali)

1) Modify vpif_subdev_info to add board_info, routing information and
   vpif interface configuration. Remove addr since it is part of
   board_info

2) Add code to setup channel mode and input decoder path for vpif
   capture driver

Also incorporated comments against version v0 of the patch series and
added a spinlock to protect writes to common registers

Tested on DM6467 on channel 0 using TVP514x. Following bootargs used
for drivers:

   vpif_capture.ch0_bufsize=829440 vpif_display.ch2_bufsize=829440

Signed-off-by: Manjunath Hadli <mrh@ti.com>
Signed-off-by: Brijesh Jadav <brijesh.j@ti.com>
Signed-off-by: Chaithrika U S <chaithrika@ti.com>
Reviewed-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Muralidharan Karicheri <m-karicheri2@ti.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
2009-09-16 10:28:46 -07:00
Muralidharan Karicheri
51e68e27d3 davinci: DM355 - platform changes for vpfe capture
DM355 platform and board setup

This has platform and board setup changes to support vpfe capture
driver for DM355 EVMs.

Tested video capture on DM355 using tvp514x

Reviewed-by: Hans Verkuil <hverkuil@xs4all.nl>
Reviewed-by: Laurent Pinchart <laurent.pinchart@skynet.be>
Reviewed-by: David Brownell <david-b@pacbell.net>
Signed-off-by: Muralidharan Karicheri <m-karicheri2@ti.com>
Signed-off-by: Denys Dmytriyenko <denis@denix.org>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
2009-09-16 10:25:45 -07:00
Muralidharan Karicheri
ab8e8df874 davinci: DM644x platform changes for vpfe capture
DM644x platform and board setup

This adds platform and board setup changes required to support
vpfe capture driver on DM644x

Tested video capture on DM6446 with tvp514x driver

Reviewed-by: Hans Verkuil <hverkuil@xs4all.nl>
Reviewed-by: Laurent Pinchart <laurent.pinchart@skynet.be>
Reviewed-by: David Brownell <david-b@pacbell.net>
Signed-off-by: Muralidharan Karicheri <m-karicheri2@ti.com>
Signed-off-by: Denys Dmytriyenko <denis@denix.org>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
2009-09-16 10:25:26 -07:00
Jan Kara
56fcad29d4 ext3: Flush disk caches on fsync when needed
In case we fsync() a file and the inode is not dirty, we don't force a transaction
to disk and hence don't flush disk caches. Thus file data could be just in disk
caches and not on persistent storage. Fix the problem by flushing disk caches
if we didn't force a transaction commit.

Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-16 17:44:11 +02:00
Chris Mason
4f003fd32b ext3: Add locking to ext3_do_update_inode
I've been struggling with this off and on while I've been testing the
data=guarded work.  The symptom is corrupted orphan lists and inodes
with the wrong i_size stored on disk.  I was convinced the
data=guarded code was just missing a call to ext3_mark_inode_dirty, but
tracing showed the i_disksize I was sending to ext3_mark_inode_dirty
wasn't actually making it to the drive.

ext3_mark_inode_dirty can be called without locks held (atime updates
and a few others), so the data=guarded code uses locks while updating
the in-memory inode, and then calls ext3_mark_inode_dirty
without any locks held.

But, ext3_mark_inode_dirty has no internal locking to make sure that
only one CPU is updating the buffer head at a time.  Generally this
works out ok because everyone that changes the inode then calls
ext3_mark_inode_dirty themselves.  Even though it races, eventually
someone updates the buffer heads and things move on.

But there is still a risk of the wrong values getting in, and the
data=guarded code seems to hit the race very often.

Since everyone that changes the inode also logs it, it should be
possible to fix this with some memory barriers.  I'll leave that as an
exercise to the reader and lock the buffer head instead.

It is probably a good idea to have a different patch series for lockless
bit flipping on the ext3 i_state field.  ext3_do_update_inode &= clears
EXT3_STATE_NEW without any locks held.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-16 17:44:11 +02:00
Jan Kara
00171d3c7e ext3: Fix possible deadlock between ext3_truncate() and ext3_get_blocks()
During truncate we are sometimes forced to start a new transaction as the
amount of blocks to be journaled is both quite large and hard to predict. So
far we restarted a transaction while holding truncate_mutex and that violates
lock ordering because truncate_mutex ranks below transaction start (and it
can lead to a real deadlock with ext3_get_blocks() allocating new blocks
from ext3_writepage()).

Luckily, the problem is easy to fix: We just drop the truncate_mutex before
restarting the transaction and acquire it afterwards. We are safe to do this as
by the time ext3_truncate() is called, all the page cache for the truncated
part of the file is dropped and so writepage() cannot come and allocate new
blocks in the part of the file we are truncating. The remaining writers are
stopped by us holding i_mutex.

Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-16 17:44:11 +02:00
Jan Kara
3adae9da0b jbd: Annotate transaction start also for journal_restart()
lockdep annotation for a transaction start has been at the end of
journal_start(). But a transaction is also started from journal_restart(). Move
the lockdep annotation to start_this_handle() which covers both cases.

Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-16 17:44:10 +02:00
Jan Kara
9c28cbccec jbd: Journal block numbers can only ever be 32-bit, so use unsigned int for them
It does not make sense to store journal block numbers as unsigned long
since they can only be 32-bit (because of an on-disk format limitation). So
change in-memory structures and variables to use unsigned int instead.

Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-16 17:44:10 +02:00
Jan Kara
19003c18e9 ext3: Update MAINTAINERS for ext3 and JBD
Stephen agreed that he's unlikely to find time for working on ext3/JBD in the
near future and has not been working on it for some time already. So remove him.
Added myself to JBD since after Andrew I'm probably the second most sensible
contact ;).

CC: Stephen Tweedie <sct@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-16 17:44:10 +02:00
Andreas Dilger
b449fc6fcc JBD: round commit timer up to avoid uncommitted transaction
Fix jiffy rounding in the jbd commit timer setup code.  Rounding down could cause
the timer to fire before the corresponding transaction has expired.  That
transaction can then stay uncommitted forever if no new transaction is created
and no explicit sync/umount happens.

Signed-off-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-16 17:44:10 +02:00
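The rounding problem in b449fc6fcc can be illustrated with plain integer arithmetic. This is a simplified sketch, not the kernel's jiffies rounding helpers (which also apply per-CPU skew); the function names and the tick values used with them are hypothetical.

```c
/* Round t down to a multiple of step. The result can be earlier than
 * t, so a timer armed this way may fire before the transaction
 * expires. Hypothetical helper for illustration. */
static unsigned long round_down_to(unsigned long t, unsigned long step)
{
    return t - (t % step);
}

/* Round t up to a multiple of step. The result is never earlier than
 * t, so the timer cannot fire before the expiry time. */
static unsigned long round_up_to(unsigned long t, unsigned long step)
{
    unsigned long rem = t % step;

    return rem ? t + (step - rem) : t;
}
```

With an expiry at tick 1250 and a step of 1000, rounding down arms the timer for tick 1000, before the transaction has expired, whereas rounding up arms it for tick 2000.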
Linus Torvalds
ab86e5765d Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  Driver Core: devtmpfs - kernel-maintained tmpfs-based /dev
  debugfs: Modify default debugfs directory for debugging pktcdvd.
  debugfs: Modified default dir of debugfs for debugging UHCI.
  debugfs: Change debugfs directory of IWMC3200
  debugfs: Change debugfs directory of trace-events-sample.h
  debugfs: Fix mount directory of debugfs by default in events.txt
  hpilo: add poll f_op
  hpilo: add interrupt handler
  hpilo: staging for interrupt handling
  driver core: platform_device_add_data(): use kmemdup()
  Driver core: Add support for compatibility classes
  uio: add generic driver for PCI 2.3 devices
  driver-core: move dma-coherent.c from kernel to driver/base
  mem_class: fix bug
  mem_class: use minor as index instead of searching the array
  driver model: constify attribute groups
  UIO: remove 'default n' from Kconfig
  Driver core: Add accessor for device platform data
  Driver core: move dev_get/set_drvdata to drivers/base/dd.c
  Driver core: add new device to bus's list before probing
2009-09-16 08:27:10 -07:00
Kurt Roeckx
f0adb134d8 [CPUFREQ] Fix NULL ptr regression in powernow-k8
Fixes bugzilla #13780

From: Kurt Roeckx <kurt@roeckx.be>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-09-16 11:18:55 -04:00
Linus Torvalds
7ea61767e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (641 commits)
  Staging: remove sxg driver
  Staging: remove heci driver
  Staging: remove at76_usb wireless driver.
  Staging: rspiusb: remove the driver
  Staging: meilhaus: remove the drivers
  Staging: remove me4000 driver.
  Staging: line6: ffzb returns an unsigned integer
  Staging: line6: pod.c: style cleanups
  Staging: iio: introduce missing kfree
  Staging: dream: introduce missing kfree
  Staging: comedi: addi-data: NULL dereference of amcc in v_pci_card_list_init()
  Staging: vt665x: fix built-in compiling
  Staging: rt3090: enable NATIVE_WPA_SUPPLICANT_SUPPORT option
  Staging: rt3090: port changes in WPA_MIX_PAIR_CIPHER to rt3090
  Staging: rt3090: rename device from raX to wlanX
  Staging: rt3090: remove possible conflict with rt2860
  Staging: rt2860/rt2870/rt3070/rt3090: fix compiler warning on x86_64
  Staging: rt2860: add new device ids
  Staging: rt3090: add device id 1462:891a
  Staging: asus_oled: Cleaned up checkpatch issues.
  ...
2009-09-16 08:11:54 -07:00
Linus Torvalds
0950efd1a1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pcmcia-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pcmcia-2.6:
  pcmcia: document return value of pcmcia_loop_config
  pcmcia: dtl1_cs: fix pcmcia_loop_config logic
  pcmcia: drop non-existent includes
  pcmcia: disable prefetch/burst for OZ6933
  pcmcia: fix incorrect argument order to list_add_tail()
  pcmcia: drivers/pcmcia/pcmcia_resource.c: Remove unnecessary semicolons
  pcmcia: Use phys_addr_t for physical addresses
  pcmcia: drivers/pcmcia: Make static
2009-09-16 08:11:23 -07:00
Linus Torvalds
4406c56d0a Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (75 commits)
  PCI hotplug: clean up acpi_run_hpp()
  PCI hotplug: acpiphp: use generic pci_configure_slot()
  PCI hotplug: shpchp: use generic pci_configure_slot()
  PCI hotplug: pciehp: use generic pci_configure_slot()
  PCI hotplug: add pci_configure_slot()
  PCI hotplug: clean up acpi_get_hp_params_from_firmware() interface
  PCI hotplug: acpiphp: don't cache hotplug_params in acpiphp_bridge
  PCI hotplug: acpiphp: remove superfluous _HPP/_HPX evaluation
  PCI: Clear saved_state after the state has been restored
  PCI PM: Return error codes from pci_pm_resume()
  PCI: use dev_printk in quirk messages
  PCI / PCIe portdrv: Fix pcie_portdrv_slot_reset()
  PCI Hotplug: convert acpi_pci_detect_ejectable() to take an acpi_handle
  PCI Hotplug: acpiphp: find bridges the easy way
  PCI: pcie portdrv: remove unused variable
  PCI / ACPI PM: Propagate wake-up enable for devices w/o ACPI support
  ACPI PM: Replace wakeup.prepared with reference counter
  PCI PM: Introduce device flag wakeup_prepared
  PCI / ACPI PM: Rework some debug messages
  PCI PM: Simplify PCI wake-up code
  ...

Fixed up conflict in arch/powerpc/kernel/pci_64.c due to OF device tree
scanning having been moved and merged for the 32- and 64-bit cases.  The
'needs_freset' initialization added in 6e19314cc ("PCI/powerpc: support
PCIe fundamental reset") is now in arch/powerpc/kernel/pci_of_scan.c.
2009-09-16 07:49:54 -07:00
Linus Torvalds
6b7b352f21 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: fix linkage problem with blk_iopoll and !CONFIG_BLOCK
2009-09-16 07:46:34 -07:00
Linus Torvalds
a3eb51ecfa Merge branch 'writeback' of git://git.kernel.dk/linux-2.6-block
* 'writeback' of git://git.kernel.dk/linux-2.6-block:
  writeback: fix possible bdi writeback refcounting problem
  writeback: Fix bdi use after free in wb_work_complete()
  writeback: improve scalability of bdi writeback work queues
  writeback: remove smp_mb(), it's not needed with list_add_tail_rcu()
  writeback: use schedule_timeout_interruptible()
  writeback: add comments to bdi_work structure
  writeback: splice dirty inode entries to default bdi on bdi_destroy()
  writeback: separate starting of sync vs opportunistic writeback
  writeback: inline allocation failure handling in bdi_alloc_queue_work()
  writeback: use RCU to protect bdi_list
  writeback: only use bdi_writeback_all() for WB_SYNC_NONE writeout
  fs: Assign bdi in super_block
  writeback: make wb_writeback() take an argument structure
  writeback: merely wakeup flusher thread if work allocation fails for WB_SYNC_NONE
  writeback: get rid of wbc->for_writepages
  fs: remove bdev->bd_inode_backing_dev_info
2009-09-16 07:45:38 -07:00
Peter Zijlstra
182a85f8a1 sched: Disable wakeup balancing
Sysbench thinks SD_BALANCE_WAKE is too aggressive and kbuild doesn't
really mind too much; SD_BALANCE_NEWIDLE picks up most of the
slack.

On a dual socket, quad core, dual thread nehalem system:

sysbench (--num_threads=16):

 SD_BALANCE_WAKE-: 13982 tx/s
 SD_BALANCE_WAKE+: 15688 tx/s

kbuild (-j16):

 SD_BALANCE_WAKE-: 47.648295846  seconds time elapsed   ( +-   0.312% )
 SD_BALANCE_WAKE+: 47.608607360  seconds time elapsed   ( +-   0.026% )

(same within noise)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-16 16:44:33 +02:00
Peter Zijlstra
5a9b86f647 sched: Rename flags to wake_flags
For consistency's sake, rename the argument (again).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-16 16:44:33 +02:00
Peter Zijlstra
5158f4e442 sched: Clean up the load_idx selection in select_task_rq_fair
Clean up the code a little.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-16 16:44:32 +02:00
Peter Zijlstra
3b64089422 sched: Optimize cgroup vs wakeup a bit
We don't need to call update_shares() for each domain we iterate,
just for the largest one.

However, we should call it before wake_affine() as well, so that
that can use up-to-date values too.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-16 16:44:32 +02:00
Nick Piggin
1ef7d9aa32 writeback: fix possible bdi writeback refcounting problem
wb_clear_pending AFAIKS should not be called after the item has been
put on the list, except by the worker threads. It could lead to the
situation where the refcount is decremented below 0 and cause lots of
problems.

Presumably the !wb_has_dirty_io case is not a common one, so it can
be discovered when the thread wakes up to check?

Also add a comment in bdi_work_clear.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:53 +02:00
Nick Piggin
77b9d059cb writeback: Fix bdi use after free in wb_work_complete()
By the time bdi_work_on_stack gets evaluated again in bdi_work_free, it
can already have been deallocated and used for something else in the
!on stack case, giving a false positive in this test and causing
corruption.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:52 +02:00
Nick Piggin
77fad5e625 writeback: improve scalability of bdi writeback work queues
If you're going to do an atomic RMW on each list entry, there's not much
point in all the RCU complexities of the list walking. This is only going
to help the multi-thread case I guess, but it doesn't hurt to do now.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:52 +02:00
Nick Piggin
deed62edff writeback: remove smp_mb(), it's not needed with list_add_tail_rcu()
list_add_tail_rcu contains required barriers.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:52 +02:00
Jens Axboe
49db041430 writeback: use schedule_timeout_interruptible()
Gets rid of a manual set_current_state().

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:52 +02:00
Jens Axboe
8010c3b634 writeback: add comments to bdi_work structure
And document its retriever, get_next_work_item().

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:52 +02:00
Jens Axboe
ce5f8e7795 writeback: splice dirty inode entries to default bdi on bdi_destroy()
We cannot safely ensure that the inodes are all gone at this point
in time, and we must not destroy this bdi with inodes hanging off it.
So just splice our entries to the default bdi since that one will
always persist.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:52 +02:00
Jens Axboe
b6e51316da writeback: separate starting of sync vs opportunistic writeback
bdi_start_writeback() is currently split into two paths, one for
WB_SYNC_NONE and one for WB_SYNC_ALL. Add bdi_sync_writeback()
for WB_SYNC_ALL writeback and let bdi_start_writeback() handle
only WB_SYNC_NONE.

Push down the writeback_control allocation and only accept the
parameters that make sense for each function. This cleans up
the API considerably.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:52 +02:00
Jens Axboe
bcddc3f01c writeback: inline allocation failure handling in bdi_alloc_queue_work()
This gets rid of work == NULL in bdi_queue_work() and puts the
OOM handling where it belongs.

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:52 +02:00
Jens Axboe
cfc4ba5365 writeback: use RCU to protect bdi_list
Now that bdi_writeback_all() no longer handles integrity writeback,
it doesn't have to block anymore. This means that we can switch
bdi_list reader side protection to RCU.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:51 +02:00
Jens Axboe
f11fcae840 writeback: only use bdi_writeback_all() for WB_SYNC_NONE writeout
Data integrity writeback must use bdi_start_writeback() and ensure
that wbc->sb and wbc->bdi are set.

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16 15:18:51 +02:00