Commit graph

4964 commits

Author SHA1 Message Date
Linus Torvalds
a15a82f42c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  Revert "x86: default to reboot via ACPI"
  x86: align DirectMap in /proc/meminfo
  AMD IOMMU: fix lazy IO/TLB flushing in unmap path
  x86: add smp_mb() before sending INVALIDATE_TLB_VECTOR
  x86: remove VISWS and PARAVIRT around NR_IRQS puzzle
  x86: mention ACPI in top-level Kconfig menu
  x86: size NR_IRQS on 32-bit systems the same way as 64-bit
  x86: don't allow nr_irqs > NR_IRQS
  x86/docs: remove noirqbalance param docs
  x86: don't use tsc_khz to calculate lpj if notsc is passed
  x86, voyager: fix smp_intr_init() compile breakage
  AMD IOMMU: fix detection of NP capable IOMMUs
2008-11-06 15:57:24 -08:00
Jeremy Fitzhardinge
47cb2ed9df x86, xen: fix use of pgd_page now that it really does return a page
Impact: fix 32-bit Xen guest boot crash

On 32-bit PAE, pud_page, for no good reason, didn't really return a
struct page *.  Since Jan Beulich's fix "i386/PAE: fix pud_page()",
pud_page does return a struct page *.

Because PAE has 3 pagetable levels, the pud level is folded into the
pgd level, so pgd_page() is the same as pud_page(), and now returns
a struct page *.  Update the xen/mmu.c code which uses pgd_page()
accordingly.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 23:20:47 +01:00
Eduardo Habkost
8d00450d29 Revert "x86: default to reboot via ACPI"
This reverts commit c7ffa6c262.

the assumptio of this change was that this would not break
any existing machine. Andrey Borzenkov reported troubles with
the ACPI reboot method: the system would hang on reboot, necessiating
a power cycle. Probably more systems are affected as well.

Also, there are patches queued up for v2.6.29 to disable virtualization
on emergency_restart() - which was the original motivation of
this change.

Reported-by: Andrey Borzenkov <arvidjaar@mail.ru>
Bisected-by: Andrey Borzenkov <arvidjaar@mail.ru>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 16:05:06 +01:00
Hugh Dickins
b9c3bfc24e x86: align DirectMap in /proc/meminfo
Impact: right-align /proc/meminfo consistent with other fields

When the split-LRU patches added Inactive(anon) and Inactive(file) lines
to /proc/meminfo, all counts were moved two columns rightwards to fit in.
Now move x86's DirectMap lines two columns rightwards to line up.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 15:27:37 +01:00
Ingo Molnar
31f297143b Merge branch 'iommu-fixes-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2008-11-06 15:23:35 +01:00
Joerg Roedel
80be308dfa AMD IOMMU: fix lazy IO/TLB flushing in unmap path
Lazy flushing needs to take care of the unmap path too which is not yet
implemented and leads to stale IO/TLB entries. This is fixed by this
patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2008-11-06 14:59:05 +01:00
Suresh Siddha
d6f0f39b7d x86: add smp_mb() before sending INVALIDATE_TLB_VECTOR
Impact: fix rare x2apic hang

On x86, x2apic mode accesses for sending IPI's don't have serializing
semantics. If the IPI receivner refers(in lock-free fashion) to some
memory setup by the sender, the need for smp_mb() before sending the
IPI becomes critical in x2apic mode.

Add the smp_mb() in native_flush_tlb_others() before sending the IPI.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 09:41:49 +01:00
Yinghai Lu
7db282fa67 x86: remove VISWS and PARAVIRT around NR_IRQS puzzle
Impact: fix warning message when PARAVIRT is set in config

Remove stale #ifdef components from our IRQ sizing logic.
x86/Voyager is the only holdout.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 09:35:34 +01:00
Bjorn Helgaas
da85f865b1 x86: mention ACPI in top-level Kconfig menu
Impact: clarify menuconfig text

Mention ACPI in the top-level menu to give a clue as to where
it lives. This matches what ia64 does.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 08:16:19 +01:00
Yinghai Lu
1b48976880 x86: size NR_IRQS on 32-bit systems the same way as 64-bit
Impact: make NR_IRQS big enough for system with lots of apic/pins

If lots of IO_APIC's are there (or can be there), size the same way
as 64-bit, depending on MAX_IO_APICS and NR_CPUS.

This fixes the boot problem reported by Ben Hutchings on a 32-bit
server with 5 IO-APICs and 240 IO-APIC pins.

Signed-off-by: Yinghai <yinghai@kernel.org>
Tested-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 07:23:22 +01:00
Ben Hutchings
c78d0cf292 x86: don't allow nr_irqs > NR_IRQS
Impact: fix boot hang on 32-bit systems with more than 224 IO-APIC pins

On some 32-bit systems with a lot of IO-APICs probe_nr_irqs() can
return a value larger than NR_IRQS. This will lead to probe_irq_on()
overrunning the irq_desc array.

I hit this when running net-next-2.6 (close to 2.6.28-rc3) on a
Supermicro dual Xeon system.  NR_IRQS is 224 but probe_nr_irqs() detects
5 IOAPICs and returns 240.  Here are the log messages:

Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
Tue Nov  4 16:53:47 2008 IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x02] address[0xfec81000] gsi_base[24])
Tue Nov  4 16:53:47 2008 IOAPIC[1]: apic_id 2, version 32, address 0xfec81000, GSI 24-47
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x03] address[0xfec81400] gsi_base[48])
Tue Nov  4 16:53:47 2008 IOAPIC[2]: apic_id 3, version 32, address 0xfec81400, GSI 48-71
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x04] address[0xfec82000] gsi_base[72])
Tue Nov  4 16:53:47 2008 IOAPIC[3]: apic_id 4, version 32, address 0xfec82000, GSI 72-95
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x05] address[0xfec82400] gsi_base[96])
Tue Nov  4 16:53:47 2008 IOAPIC[4]: apic_id 5, version 32, address 0xfec82400, GSI 96-119
Tue Nov  4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
Tue Nov  4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Tue Nov  4 16:53:47 2008 Enabling APIC mode:  Flat.  Using 5 I/O APICs

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 07:23:21 +01:00
Ingo Molnar
9fcd18c9e6 sched: re-tune balancing
Impact: improve wakeup affinity on NUMA systems, tweak SMP systems

Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain
balancing defaults on NUMA and SMP systems.

Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason
why we would not want to have wakeup affinity across nodes as well.
(we already do this in the standard NUMA template.)

lat_ctx on a NUMA box is particularly happy about this change:

before:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.60
 |   2 5.70

after:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.65
 |   2 2.07

a 2.75x speedup.

pipe-test is similarly happy about it too:

 |  phoenix:~/sched-tests> ./pipe-test
 |   18.26 usecs/loop.
 |   14.70 usecs/loop.
 |   14.38 usecs/loop.
 |   10.55 usecs/loop.              # +WAKE_AFFINE on domain0+domain1
 |   8.63 usecs/loop.
 |   8.59 usecs/loop.
 |   9.03 usecs/loop.
 |   8.94 usecs/loop.
 |   8.96 usecs/loop.
 |   8.63 usecs/loop.

Also:

 - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings)
 - enable SD_WAKE_BALANCE on SMP domains

Sysbench+postgresql improves all around the board, quite significantly:

           .28-rc3-11474e2c  .28-rc3-11474e2c-tune
-------------------------------------------------
    1:             571              688    +17.08%
    2:            1236             1206    -2.55%
    4:            2381             2642    +9.89%
    8:            4958             5164    +3.99%
   16:            9580             9574    -0.07%
   32:            7128             8118    +12.20%
   64:            7342             8266    +11.18%
  128:            7342             8064    +8.95%
  256:            7519             7884    +4.62%
  512:            7350             7731    +4.93%
-------------------------------------------------
  SUM:           55412            59341    +6.62%

So it's a win both for the runup portion, the peak area and the tail.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 18:04:38 +01:00
Alok Kataria
70de9a9704 x86: don't use tsc_khz to calculate lpj if notsc is passed
Impact: fix udelay when "notsc" boot parameter is passed

With notsc passed on commandline, tsc may not be used for
udelays, make sure that we do not use tsc_khz to calculate
the lpj value in such cases.

Reported-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04 09:55:26 +01:00
Linus Torvalds
da4a22cba7 Merge branch 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  io mapping: clean up #ifdefs
  io mapping: improve documentation
  i915: use io-mapping interfaces instead of a variety of mapping kludges
  resources: add io-mapping functions to dynamically map large device apertures
  x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps
2008-11-03 10:15:40 -08:00
Keith Packard
e5beae1690 io mapping: clean up #ifdefs
Impact: cleanup

clean up ifdefs: change #ifdef CONFIG_X86_32/64 to
CONFIG_HAVE_ATOMIC_IOMAP.

flip around the #ifdef sections to clean up the structure.

Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03 18:21:45 +01:00
James Bottomley
73557af5bf x86, voyager: fix smp_intr_init() compile breakage
Impact: fix x86/Voyager build

Looks like this became static on the rest of x86.  Fix it up by adding
an external definition to mach-voyager/setup.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03 10:52:21 +01:00
Linus Torvalds
67d1128425 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix AMDC1E and XTOPOLOGY conflict in cpufeature
  x86: build fix
2008-11-01 10:36:30 -07:00
Linus Torvalds
1f98757776 x86: Clean up late e820 resource allocation
This makes the late e820 resources use 'insert_resource_expand_to_fit()'
instead of doing a 'reserve_region_with_split()', and also avoids
marking them as IORESOURCE_BUSY.

This results in us being perfectly happy to use pre-existing PCI
resources even if they were marked as being in a reserved region, while
still avoiding any _new_ allocations in the reserved regions.  It also
makes for a simpler and more accurate resource tree.

Example resource allocation from Jonathan Corbet, who has firmware that
has an e820 reserved entry that covered a big range (e0000000-fed003ff),
and that had various PCI resources in it set up by firmware.

With old kernels, the reserved range would force us to re-allocate all
pre-existing PCI resources, and his reserved range would end up looking
like this:

	e0000000-fed003ff : reserved
	  fec00000-fec00fff : IOAPIC 0
	  fed00000-fed003ff : HPET 0

where only the pre-allocated special regions (IOAPIC and HPET) were kept
around.

With 2.6.28-rc2, which uses 'reserve_region_with_split()', Jonathan's
resource tree looked like this:

	e0000000-fe7fffff : reserved
	fe800000-fe8fffff : PCI Bus 0000:01
	 fe800000-fe8fffff : reserved
	fe900000-fe9d9aff : reserved
	fe9d9b00-fe9d9bff : 0000:00:1f.3
	 fe9d9b00-fe9d9bff : reserved
	fe9d9c00-fe9d9fff : 0000:00:1a.7
	 fe9d9c00-fe9d9fff : reserved
	fe9da000-fe9dafff : 0000:00:03.3
	 fe9da000-fe9dafff : reserved
	fe9db000-fe9dbfff : 0000:00:19.0
	 fe9db000-fe9dbfff : reserved
	fe9dc000-fe9dffff : 0000:00:1b.0
	 fe9dc000-fe9dffff : reserved
	fe9e0000-fe9fffff : 0000:00:19.0
	 fe9e0000-fe9fffff : reserved
	fea00000-fea7ffff : 0000:00:02.0
	 fea00000-fea7ffff : reserved
	fea80000-feafffff : 0000:00:02.1
	 fea80000-feafffff : reserved
	feb00000-febfffff : 0000:00:02.0
	 feb00000-febfffff : reserved
	fec00000-fed003ff : reserved
	 fec00000-fec00fff : IOAPIC 0
	 fed00000-fed003ff : HPET 0

and because the reserved entry had been split and moved into the
individual resources, and because it used the IORESOURCE_BUSY flag, the
drivers that actually wanted to _use_ those resources couldn't actually
attach to them:

	e1000e 0000:00:19.0: BAR 0: can't reserve mem region [0xfe9e0000-0xfe9fffff]
	HDA Intel 0000:00:1b.0: BAR 0: can't reserve mem region [0xfe9dc000-0xfe9dffff]

with this patch, the resource tree instead becomes

	e0000000-fed003ff : reserved
	  fe800000-fe8fffff : PCI Bus 0000:01
	  fe9d9b00-fe9d9bff : 0000:00:1f.3
	  fe9d9c00-fe9d9fff : 0000:00:1a.7
	    fe9d9c00-fe9d9fff : ehci_hcd
	  fe9da000-fe9dafff : 0000:00:03.3
	  fe9db000-fe9dbfff : 0000:00:19.0
	    fe9db000-fe9dbfff : e1000e
	  fe9dc000-fe9dffff : 0000:00:1b.0
	    fe9dc000-fe9dffff : ICH HD audio
	  fe9e0000-fe9fffff : 0000:00:19.0
	    fe9e0000-fe9fffff : e1000e
	  fea00000-fea7ffff : 0000:00:02.0
	  fea80000-feafffff : 0000:00:02.1
	  feb00000-febfffff : 0000:00:02.0
	  fec00000-fec00fff : IOAPIC 0
	  fed00000-fed003ff : HPET 0

ie the one reserved region now ends up surrounding all the PCI resources
that were allocated inside of it by firmware, and because it is not
marked BUSY, drivers have no problem attaching to the pre-allocated
resources.

Reported-and-tested-by: Jonathan Corbet <corbet@lwn.net>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01 10:17:22 -07:00
Venki Pallipadi
2576c99917 x86: fix AMDC1E and XTOPOLOGY conflict in cpufeature
Impact: fix xsave slowdown regression

Fix two features from conflicting in feature bits.

Fixes this performance regression:

   Subject: cpu2000(both float and int) 13% regression with 2.6.28-rc1
   http://lkml.org/lkml/2008/10/28/36

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Bisected-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 11:01:40 +01:00
Keith Packard
fd94093435 x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps
Impact: introduce new APIs, separate kmap code from CONFIG_HIGHMEM

This takes the code used for CONFIG_HIGHMEM memory mappings except that
it's designed for dynamic IO resource mapping.

These fixmaps are available even with CONFIG_HIGHMEM turned off.

Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 10:12:38 +01:00
Ingo Molnar
b342797c1e x86: build fix
Impact: build fix on certain UP configs

fix:

 arch/x86/kernel/cpu/common.c: In function 'cpu_init':
 arch/x86/kernel/cpu/common.c:1141: error: 'boot_cpu_id' undeclared (first use in this function)
 arch/x86/kernel/cpu/common.c:1141: error: (Each undeclared identifier is reported only once
 arch/x86/kernel/cpu/common.c:1141: error: for each function it appears in.)

Pull in asm/smp.h on UP, so that we get the definition of
boot_cpu_id.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 09:31:38 +01:00
Linus Torvalds
f2347dfcd1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: fix irq vectors.
  lguest: fix early_ioremap.
  lguest: fix example launcher compile after moved asm-x86 dir.
2008-10-30 18:35:09 -07:00
Linus Torvalds
74c75f524e Merge branch 'x86-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: cpu_index build fix
  x86/voyager: fix missing cpu_index initialisation
  x86/voyager: fix compile breakage caused by dc1e35c6e9
  x86: fix /dev/mem mmap breakage when PAT is disabled
  x86/voyager: fix compile breakage casued by x86: move prefill_possible_map calling early
  x86: use CONFIG_X86_SMP instead of CONFIG_SMP
  x86/voyager: fix boot breakage caused by x86: boot secondary cpus through initial_code
  x86, uv: fix compile error in uv_hub.h
  i386/PAE: fix pud_page()
  x86: remove debug code from arch_add_memory()
  x86: start annotating early ioremap pointers with __iomem
  x86: two trivial sparse annotations
  x86: fix init_memory_mapping for [dc000000 - e0000000) - v2
2008-10-30 18:33:46 -07:00
Rusty Russell
526e5ab200 lguest: fix irq vectors.
do_IRQ: cannot handle IRQ -1 vector 0x20 cpu 0
	------------[ cut here ]------------
	kernel BUG at arch/x86/kernel/irq_32.c:219!

We're not ISA: we have a 1:1 mapping from vectors to irqs.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-31 11:24:28 +11:00
Rusty Russell
ad5173ff8a lguest: fix early_ioremap.
dmi_scan_machine breaks under lguest:
	lguest: unhandled trap 14 at 0xc04edeae (0xffa00000)

This is because we use current_cr3 for the read_cr3() paravirt
function, and it isn't set until the first cr3 change.  We got away
with it until this happened.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-31 11:24:27 +11:00
Ingo Molnar
1c4acdb467 x86: cpu_index build fix
fix:

 arch/x86/kernel/cpu/common.c: In function 'early_identify_cpu':
 arch/x86/kernel/cpu/common.c:553: error: 'struct cpuinfo_x86' has no member named 'cpu_index'

as cpu_index is only available on SMP.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:43:03 +01:00
James Bottomley
bfcb4c1bec x86/voyager: fix missing cpu_index initialisation
Impact: fix /proc/cpuinfo output on x86/Voyager

Ever since

| commit 92cb7612ae
| Author: Mike Travis <travis@sgi.com>
| Date:   Fri Oct 19 20:35:04 2007 +0200
|
|     x86: convert cpuinfo_x86 array to a per_cpu array

We've had an extra field in cpuinfo_x86 which is cpu_index.
Unfortunately, voyager has never initialised this, although the only
noticeable impact seems to be that /proc/cpuinfo shows all zeros for
the processor ids.

Anyway, fix this by initialising the boot CPU properly and setting the
index when the secondaries update.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:19:37 +01:00
James Bottomley
b3572e361b x86/voyager: fix compile breakage caused by dc1e35c6e9
Impact: build fix on x86/Voyager

Given commits like this:

| Author: Suresh Siddha <suresh.b.siddha@intel.com>
| Date:   Tue Jul 29 10:29:19 2008 -0700
|
|     x86, xsave: enable xsave/xrstor on cpus with xsave support

Which deliberately expose boot cpu dependence to pieces of the system,
I think it's time to explicitly have a variable for it to prevent this
continual misassumption that the boot CPU is zero.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:19:33 +01:00
Ravikiran G Thirumalai
9e41bff270 x86: fix /dev/mem mmap breakage when PAT is disabled
Impact: allow /dev/mem mmaps on non-PAT CPUs/platforms

Fix mmap to /dev/mem when CONFIG_X86_PAT is off and CONFIG_STRICT_DEVMEM is
off

mmap to /dev/mem on kernel memory has been failing since the
introduction of PAT (CONFIG_STRICT_DEVMEM=n case).   Seems like
the check to avoid cache aliasing with PAT is kicking in even
when PAT is disabled. The bug seems to have crept in 2.6.26.

This patch makes sure that the mmap to regular
kernel memory succeeds if CONFIG_STRICT_DEVMEM=n and
PAT is disabled, and the checks to avoid cache aliasing
still happens if PAT is enabled.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Tested-by: Tim Sirianni <tim@scalemp.com>
Cc: <stable@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 23:54:41 +01:00
James Bottomley
ee477524b4 x86/voyager: fix compile breakage casued by x86: move prefill_possible_map calling early
Impact: fix build failure on x86/Voyager

Before:

| commit 329513a35d
| Author: Yinghai Lu <yhlu.kernel@gmail.com>
| Date:   Wed Jul 2 18:54:40 2008 -0700
|
|     x86: move prefill_possible_map calling early

prefill_possible_mask() was hidden under CONFIG_HOTPLUG_CPU rendering
it invisitble to voyager.  Since this commit it's exposed, but not
provided by the voyager subarch, so add a dummy stub to fix the link
breakage.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 22:55:51 +01:00
James Bottomley
017d9d20d8 x86: use CONFIG_X86_SMP instead of CONFIG_SMP
Impact: fix x86/Voyager boot

CONFIG_SMP is used for features which work on *all* x86 boxes.
CONFIG_X86_SMP is used for standard PC like x86 boxes (for things like
multi core and apics)

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 22:53:10 +01:00
James Bottomley
08c3330857 x86/voyager: fix boot breakage caused by x86: boot secondary cpus through initial_code
Impact: boot up secondary CPUs as well on x86/Voyager systems

This commit:

| commit 3e9704739d
| Author: Glauber Costa <gcosta@redhat.com>
| Date:   Wed May 28 13:01:54 2008 -0300
|
|     x86: boot secondary cpus through initial_code

removed the use of initialize_secondary.  However, it didn't update
voyager, so the secondary cpus no longer boot.  Fix this by adding the
initial_code switch to voyager as well.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 22:53:08 +01:00
Linus Torvalds
8bd93ca7b0 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, gart: fix gart detection for Fam11h CPUs
  x86: 64 bit print out absent pages num too
  x86, kdump: fix invalid access on i386 sparsemem
  x86: fix APIC_DEBUG with inquire_remote_apic
  x86: AMD microcode patch loader author update
  x86: microcode patch loader author update
  mailmap: add Peter Oruba
  x86, bts: improve help text for BTS config
  doc/x86: fix doc subdirs
2008-10-30 12:50:59 -07:00
Linus Torvalds
d6c3112abe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86/PCI: build failure at x86/kernel/pci-dma.c with !CONFIG_PCI
2008-10-30 12:09:44 -07:00
Mike Travis
c08b6acc9b x86, uv: fix compile error in uv_hub.h
Impact: include file dependency cleanup

Fix compile errors of files that include asm/uv/uv_hub.h but do
not include linux/timer.h.

[ such files are not mainline right now. ]

Signed-of-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 19:38:46 +01:00
Alexey Dobriyan
c17dad6905 .gitignore updates
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:45 -07:00
Joerg Roedel
ae9b940364 AMD IOMMU: fix detection of NP capable IOMMUs
This patch changes the code to use IOMMU_CAP_NPCACHE as a shift and not
as a mask.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2008-10-30 17:50:14 +01:00
Jan Beulich
ab00fee30c i386/PAE: fix pud_page()
Impact: cleanup

To the unsuspecting user it is quite annoying that this broken and
inconsistent with x86-64 definition still exists.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 11:47:50 +01:00
Gary Hade
fe8b868ecc x86: remove debug code from arch_add_memory()
Impact: remove incorrect WARN_ON(1)

Gets rid of dmesg spam created during physical memory hot-add which
will very likely confuse users.  The change removes what appears to
be debugging code which I assume was unintentionally included in:

  x86: arch/x86/mm/init_64.c printk fixes
  commit 10f22dde55

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-29 09:29:22 +01:00
Harvey Harrison
1d6cf1feb8 x86: start annotating early ioremap pointers with __iomem
Impact: some new sparse warnings in e820.c etc, but no functional change.

As with regular ioremap, iounmap etc, annotate with __iomem.

Fixes the following sparse warnings, will produce some new ones
elsewhere in arch/x86 that will get worked out over time.

arch/x86/mm/ioremap.c:402:9: warning: cast removes address space of expression
arch/x86/mm/ioremap.c:406:10: warning: cast adds address space to expression (<asn:2>)
arch/x86/mm/ioremap.c:782:19: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-29 08:05:14 +01:00
Harvey Harrison
9352f5698d x86: two trivial sparse annotations
Impact: fewer sparse warnings, no functional changes

arch/x86/kernel/vsmp_64.c:87:14: warning: incorrect type in argument 1 (different address spaces)
arch/x86/kernel/vsmp_64.c:87:14:    expected void const volatile [noderef] <asn:2>*addr
arch/x86/kernel/vsmp_64.c:87:14:    got void *[assigned] address
arch/x86/kernel/vsmp_64.c:88:22: warning: incorrect type in argument 1 (different address spaces)
arch/x86/kernel/vsmp_64.c:88:22:    expected void const volatile [noderef] <asn:2>*addr
arch/x86/kernel/vsmp_64.c:88:22:    got void *
arch/x86/kernel/vsmp_64.c💯23: warning: incorrect type in argument 2 (different address spaces)
arch/x86/kernel/vsmp_64.c💯23:    expected void volatile [noderef] <asn:2>*addr
arch/x86/kernel/vsmp_64.c💯23:    got void *
arch/x86/kernel/vsmp_64.c:101:23: warning: incorrect type in argument 1 (different address spaces)
arch/x86/kernel/vsmp_64.c:101:23:    expected void const volatile [noderef] <asn:2>*addr
arch/x86/kernel/vsmp_64.c:101:23:    got void *
arch/x86/mm/gup.c:235:6: warning: incorrect type in argument 1 (different base types)
arch/x86/mm/gup.c:235:6:    expected void const volatile [noderef] <asn:1>*<noident>
arch/x86/mm/gup.c:235:6:    got unsigned long [unsigned] [assigned] start

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-29 08:02:28 +01:00
Yinghai Lu
f96f57d91c x86: fix init_memory_mapping for [dc000000 - e0000000) - v2
Impact: change over-mapping to precise mapping, fix /proc/meminfo output

v2: fix less than 1G ram system handling

when gart aperture is 0xdc000000 - 0xe0000000
it return 0xc0000000 - 0xe0000000

that is not right.

this patch fix that will get exact mapping

on 256g sytem with that aperture after patch
LBSuse:~ # cat /proc/meminfo
MemTotal:       264742432 kB
MemFree:        263920628 kB
Buffers:            1416 kB
Cached:            24468 kB
...
DirectMap4k:      5760 kB
DirectMap2M:   3205120 kB
DirectMap1G:  265289728 kB

it is consistent to
LBSuse:~ # cat /sys/kernel/debug/kernel_page_tables
..
---[ Low Kernel Mapping ]---
0xffff880000000000-0xffff880000200000           2M     RW             GLB x  pte
0xffff880000200000-0xffff880040000000        1022M     RW         PSE GLB x  pmd
0xffff880040000000-0xffff8800c0000000           2G     RW         PSE GLB NX pud
0xffff8800c0000000-0xffff8800d7e00000         382M     RW         PSE GLB NX pmd
0xffff8800d7e00000-0xffff8800d7fa0000        1664K     RW             GLB NX pte
0xffff8800d7fa0000-0xffff8800d8000000         384K                           pte
0xffff8800d8000000-0xffff8800dc000000          64M                           pmd
0xffff8800dc000000-0xffff8800e0000000          64M     RW         PSE GLB NX pmd
0xffff8800e0000000-0xffff880100000000         512M                           pmd
0xffff880100000000-0xffff880800000000          28G     RW         PSE GLB NX pud
0xffff880800000000-0xffff880824600000         582M     RW         PSE GLB NX pmd
0xffff880824600000-0xffff8808247f0000        1984K     RW             GLB NX pte
0xffff8808247f0000-0xffff880824800000          64K     RW     PCD     GLB NX pte
0xffff880824800000-0xffff880840000000         440M     RW         PSE GLB NX pmd
0xffff880840000000-0xffff884000000000         223G     RW         PSE GLB NX pud
0xffff884000000000-0xffff884028000000         640M     RW         PSE GLB NX pmd
0xffff884028000000-0xffff884040000000         384M                           pmd
0xffff884040000000-0xffff888000000000         255G                           pud
0xffff888000000000-0xffffc20000000000       58880G                           pgd

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 20:54:47 +01:00
Linus Torvalds
e946217e4f Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
  ftrace: fix current_tracer error return
  tracing: fix a build error on alpha
  ftrace: use a real variable for ftrace_nop in x86
  tracing/ftrace: make boot tracer select the sched_switch tracer
  tracepoint: check if the probe has been registered
  asm-generic: define DIE_OOPS in asm-generic
  trace: fix printk warning for u64
  ftrace: warning in kernel/trace/ftrace.c
  ftrace: fix build failure
  ftrace, powerpc, sparc64, x86: remove notrace from arch ftrace file
  ftrace: remove ftrace hash
  ftrace: remove mcount set
  ftrace: remove daemon
  ftrace: disable dynamic ftrace for all archs that use daemon
  ftrace: add ftrace warn on to disable ftrace
  ftrace: only have ftrace_kill atomic
  ftrace: use probe_kernel
  ftrace: comment arch ftrace code
  ftrace: return error on failed modified text.
  ftrace: dynamic ftrace process only text section
  ...
2008-10-28 09:52:25 -07:00
Linus Torvalds
a186576925 Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: ia64: Makefile fix for forcing to re-generate asm-offsets.h
  KVM: Future-proof device assignment ABI
  KVM: ia64: Fix halt emulation logic
  KVM: Fix guest shared interrupt with in-kernel irqchip
  KVM: MMU: sync root on paravirt TLB flush
2008-10-28 09:50:11 -07:00
Linus Torvalds
0d8762c9ee Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: fix irqs on/off ip tracing
  lockdep: minor fix for debug_show_all_locks()
  x86: restore the old swiotlb alloc_coherent behavior
  x86: use GFP_DMA for 24bit coherent_dma_mask
  swiotlb: remove panic for alloc_coherent failure
  xen: compilation fix of drivers/xen/events.c on IA64
  xen: portability clean up and some minor clean up for xencomm.c
  xen: don't reload cr3 on suspend
  kernel/resource: fix reserve_region_with_split() section mismatch
  printk: remove unused code from kernel/printk.c
2008-10-28 09:49:27 -07:00
Joerg Roedel
87c6f40128 x86, gart: fix gart detection for Fam11h CPUs
Impact: fix AMD Family 11h boot hangs / USB device problems

The AMD Fam11h CPUs have a K8 northbridge. This northbridge is different
from other family's because it lacks GART support (as I just learned).

But the kernel implicitly expects a GART if it finds an AMD northbridge.

Fix this by removing the Fam11h northbridge id from the scan list of K8
northbridges. This patch also changes the message in the GART driver
about missing K8 northbridges to tell that the GART is missing which is
the correct information in this case.

Reported-by: Jouni Malinen <jkmalinen@gmail.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 17:10:27 +01:00
Yinghai Lu
11a6b0c933 x86: 64 bit print out absent pages num too
so users are not confused with memhole causing big total ram

we don't need to worry about 32 bit, because memhole is always
above max_low_pfn.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:50:49 +01:00
Ken'ichi Ohmichi
e7706fc691 x86, kdump: fix invalid access on i386 sparsemem
Impact: fix kdump crash on 32-bit sparsemem kernels

Since linux-2.6.27, kdump has failed on i386 sparsemem kernel.
1st-kernel gets a panic just before switching to 2nd-kernel.

The cause is that a kernel accesses invalid mem_section by
page_to_pfn(image->swap_page) at machine_kexec().
image->swap_page is allocated if kexec for hibernation, but
it is not allocated if kdump. So if kdump, a kernel should
not access the mem_section corresponding to image->swap_page.

The attached patch fixes this invalid access.

Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Cc: kexec-ml <kexec@lists.infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:48:24 +01:00
Yinghai Lu
1281675e9c x86: fix APIC_DEBUG with inquire_remote_apic
APIC_DEBUG is always 2.
need to update inquire_remote_apic to check apic_verbosity with
it instead.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:43:48 +01:00
Peter Oruba
3c52204bb9 x86: AMD microcode patch loader author update
Removed author's email address from MODULE_AUTHOR.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:41:28 +01:00
Peter Oruba
36b75da27b x86: microcode patch loader author update
Removed one author's email address from module init message.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:41:16 +01:00
Markus Metzger
531f6ed7de x86, bts: improve help text for BTS config
Improve the help text of the X86_PTRACE_BTS config.
Make X86_DS invisible and depend on X86_PTRACE_BTS.

Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:39:37 +01:00
Sheng Yang
5550af4df1 KVM: Fix guest shared interrupt with in-kernel irqchip
Every call of kvm_set_irq() should offer an irq_source_id, which is
allocated by kvm_request_irq_source_id(). Based on irq_source_id, we
identify the irq source and implement logical OR for shared level
interrupts.

The allocated irq_source_id can be freed by kvm_free_irq_source_id().

Currently, we support at most sizeof(unsigned long) different irq sources.

[Amit: - rebase to kvm.git HEAD
       - move definition of KVM_USERSPACE_IRQ_SOURCE_ID to common file
       - move kvm_request_irq_source_id to the update_irq ioctl]

[Xiantao: - Add kvm/ia64 stuff and make it work for kvm/ia64 guests]

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-10-28 14:21:34 +02:00
Marcelo Tosatti
6ad9f15c94 KVM: MMU: sync root on paravirt TLB flush
The pvmmu TLB flush handler should request a root sync, similarly to
a native read-write CR3.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-10-28 14:09:27 +02:00
Shaohua Li
60817c9b31 x86, memory hotplug: remove wrong -1 in calling init_memory_mapping()
Impact: fix crash with memory hotplug

Shuahua Li found:

| I just did some experiments on a desktop for memory hotplug and this bug
| triggered a crash in my test.
|
| Yinghai's suggestion also fixed the bug.

We don't need to round it, just remove that extra -1

Signed-off-by: Yinghai <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 09:33:17 +01:00
Yinghai Lu
3afa39493d x86: keep the /proc/meminfo page count correct
Impact: get correct page count in /proc/meminfo

found page count in /proc/meminfo is nor correct on 1G system in VirtualBox 2.0.4

# cat /proc/meminfo
MemTotal:        1017508 kB
MemFree:          822700 kB
Buffers:            1456 kB
Cached:            26632 kB
SwapCached:            0 kB
...
Hugepagesize:       2048 kB
DirectMap4k:      4032 kB
DirectMap2M:  18446744073709549568 kB

with this patch get:
...
DirectMap4k:      4032 kB
DirectMap2M:   1044480 kB

which is consistent to kernel_page_tables
---[ Low Kernel Mapping ]---
0xffff880000000000-0xffff880000001000           4K     RW     PCD     GLB x  pte
0xffff880000001000-0xffff88000009f000         632K     RW             GLB x  pte
0xffff88000009f000-0xffff8800000a0000           4K     RW     PCD     GLB x  pte
0xffff8800000a0000-0xffff880000200000        1408K     RW             GLB x  pte
0xffff880000200000-0xffff88003fe00000        1020M     RW         PSE GLB x  pmd
0xffff88003fe00000-0xffff88003fff0000        1984K     RW             GLB NX pte
0xffff88003fff0000-0xffff880040000000          64K                           pte
0xffff880040000000-0xffff888000000000         511G                           pud
0xffff888000000000-0xffffc20000000000       58880G                           pgd

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 18:55:26 +01:00
Steven Rostedt
8115f3f0c9 ftrace: use a real variable for ftrace_nop in x86
Impact: avoid section mismatch warning, clean up

The dynamic ftrace determines which nop is safe to use at start up.
When it finds a safe nop for patching, it sets a pointer called ftrace_nop
to point to the code. All call sites are then patched to this nop.

Later, when tracing is turned on, this ftrace_nop variable is again used
to compare the location to make sure it is a nop before we update it to
an mcount call. If this fails just once, a warning is printed and ftrace
is disabled.

Rakib Mullick noted that the code that sets up the nop is a .init section
where as the nop itself is in the .text section. This is needed because
the nop is used later on after boot up. The problem is that the test of the
nop jumps back to the setup code and causes a "section mismatch" warning.

Rakib first recommended to convert the nop to .init.text, but as stated
above, this would fail since that text is used later.

The real solution is to extend Rabik's patch, and to make the ftrace_nop
into an array, and just save the code from the assembly to this array.

Now the section can stay as an init section, and we have a nop to use
later on.

Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 16:52:01 +01:00
Cliff Wickman
ef020ab010 x86/uv: memory allocation at initialization
Impact: on SGI UV platforms, fix boot crash

UV initialization is currently called too late to call alloc_bootmem_pages().
The current sequence is:

 start_kernel()
   mem_init()
     free_all_bootmem()           <--- discard of bootmem
   rest_init()
     kernel_init()
       smp_prepare_cpus()
       native_smp_prepare_cpus()
         uv_system_init()         <--- uses alloc_bootmem_pages()

It should be calling kmalloc().

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 14:17:16 +01:00
Chris Lalancette
9f32d21c98 xen: fix Xen domU boot with batched mprotect
Impact: fix guest kernel boot crash on certain configs

Recent i686 2.6.27 kernels with a certain amount of memory (between
736 and 855MB) have a problem booting under a hypervisor that supports
batched mprotect (this includes the RHEL-5 Xen hypervisor as well as
any 3.3 or later Xen hypervisor).

The problem ends up being that xen_ptep_modify_prot_commit() is using
virt_to_machine to calculate which pfn to update.  However, this only
works for pages that are in the p2m list, and the pages coming from
change_pte_range() in mm/mprotect.c are kmap_atomic pages.  Because of
this, we can run into the situation where the lookup in the p2m table
returns an INVALID_MFN, which we then try to pass to the hypervisor,
which then (correctly) denies the request to a totally bogus pfn.

The right thing to do is to use arbitrary_virt_to_machine, so that we
can be sure we are modifying the right pfn.  This unfortunately
introduces a performance penalty because of a full page-table-walk,
but we can avoid that penalty for pages in the p2m list by checking if
virt_addr_valid is true, and if so, just doing the lookup in the p2m
table.

The attached patch implements this, and allows my 2.6.27 i686 based
guest with 768MB of memory to boot on a RHEL-5 hypervisor again.
Thanks to Jeremy for the suggestions about how to fix this particular
issue.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 14:11:20 +01:00
Ingo Molnar
4944dd62de Merge commit 'v2.6.28-rc2' into tracing/urgent 2008-10-27 10:50:54 +01:00
Fenghua Yu
3b15e58198 x86/PCI: build failure at x86/kernel/pci-dma.c with !CONFIG_PCI
On Thu, Oct 23, 2008 at 04:09:52PM -0700, Alexander Beregalov wrote:
> arch/x86/kernel/built-in.o: In function `iommu_setup':
> pci-dma.c:(.init.text+0x36ad): undefined reference to `forbid_dac'
> pci-dma.c:(.init.text+0x36cc): undefined reference to `forbid_dac'
> pci-dma.c:(.init.text+0x3711): undefined reference to `forbid_dac

This patch partially reverts a patch to add IOMMU support to ia64.  The
forbid_dac variable was incorrectly moved to quirks.c, which isn't built
when PCI is disabled.

Tested-by: "Alexander Beregalov" <a.beregalov@gmail.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-24 11:09:43 -07:00
FUJITA Tomonori
03967c5267 x86: restore the old swiotlb alloc_coherent behavior
This restores the old swiotlb alloc_coherent behavior (before the
alloc_coherent rewrite):

  http://lkml.org/lkml/2008/8/12/200

The old alloc_coherent avoids GFP_DMA allocation first and if the
allocated address is not fit for the device's coherent_dma_mask, then
dma_alloc_coherent does GFP_DMA allocation. If it fails,
alloc_coherent calls swiotlb_alloc_coherent (in short, we rarely used
swiotlb_alloc_coherent).

After the alloc_coherent rewrite, dma_alloc_coherent
(include/asm-x86/dma-mapping.h) directly calls swiotlb_alloc_coherent.
It means that we possibly can't handle a device having dma_masks >
24bit < 32bits since swiotlb_alloc_coherent doesn't have the above
GFP_DMA retry mechanism.

This patch fixes x86's swiotlb alloc_coherent to use the GFP_DMA retry
mechanism, which dma_generic_alloc_coherent() provides now
(pci-nommu.c and GART IOMMU driver also use
dma_generic_alloc_coherent).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 21:54:40 +02:00
FUJITA Tomonori
75bebb7f0c x86: use GFP_DMA for 24bit coherent_dma_mask
dma_alloc_coherent (include/asm-x86/dma-mapping.h) avoids GFP_DMA
allocation first and if the allocated address is not fit for the
device's coherent_dma_mask, then dma_alloc_coherent does GFP_DMA
allocation. This is because dma_alloc_coherent avoids precious GFP_DMA
zone if possible. This is also how the old dma_alloc_coherent
(arch/x86/kernel/pci-dma.c) works.

However, if the coherent_dma_mask of a device is 24bit, there is no
point to go into the above GFP_DMA retry mechanism. We had better use
GFP_DMA in the first place.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 21:54:39 +02:00
Linus Torvalds
c3c9897c63 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix section mismatch warning - apic_x2apic_phys
  x86: fix section mismatch warning - apic_x2apic_cluster
  x86: fix section mismatch warning - apic_x2apic_uv_x
  x86: fix section mismatch warning - apic_physflat
  x86: fix section mismatch warning - apic_flat
  x86: memtest fix use of reserve_early()
  x86 syscall.h: fix argument order
  x86/tlb_uv: remove strange mc146818rtc include
  x86: remove redundant KERN_DEBUG on pr_debug
  x86: do_boot_cpu - check if we have ESR register
  x86: MAINTAINERS change for AMD microcode patch loader
  x86/proc: fix /proc/cpuinfo cpu offline bug
  x86: call dmi-quirks for HP Laptops after early-quirks are executed
  x86, kexec: fix hang on i386 when panic occurs while console_sem is held
  MCE: Don't run 32bit machine checks with interrupts on
  x86: SB600: skip IRQ0 override if it is not routed to INT2 of IOAPIC
  x86: make variables static
2008-10-23 12:38:39 -07:00
Linus Torvalds
88ed86fee6 Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc
* 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits)
  proc: remove fs/proc/proc_misc.c
  proc: move /proc/vmcore creation to fs/proc/vmcore.c
  proc: move pagecount stuff to fs/proc/page.c
  proc: move all /proc/kcore stuff to fs/proc/kcore.c
  proc: move /proc/schedstat boilerplate to kernel/sched_stats.h
  proc: move /proc/modules boilerplate to kernel/module.c
  proc: move /proc/diskstats boilerplate to block/genhd.c
  proc: move /proc/zoneinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmstat boilerplate to mm/vmstat.c
  proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c
  proc: move /proc/buddyinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmallocinfo to mm/vmalloc.c
  proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c
  proc: move /proc/slab_allocators boilerplate to mm/slab.c
  proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c
  proc: move /proc/stat to fs/proc/stat.c
  proc: move rest of /proc/partitions code to block/genhd.c
  proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c
  proc: move /proc/devices code to fs/proc/devices.c
  proc: move rest of /proc/locks to fs/locks.c
  ...
2008-10-23 12:04:37 -07:00
Linus Torvalds
1f6d6e8ebe Merge branch 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
  hrtimers: add missing docbook comments to struct hrtimer
  hrtimers: simplify hrtimer_peek_ahead_timers()
  hrtimers: fix docbook comments
  DECLARE_PER_CPU needs linux/percpu.h
  hrtimers: fix typo
  rangetimers: fix the bug reported by Ingo for real
  rangetimer: fix BUG_ON reported by Ingo
  rangetimer: fix x86 build failure for the !HRTIMERS case
  select: fix alpha OSF wrapper
  select: fix alpha OSF wrapper
  hrtimer: peek at the timer queue just before going idle
  hrtimer: make the futex() system call use the per process slack value
  hrtimer: make the nanosleep() syscall use the per process slack
  hrtimer: fix signed/unsigned bug in slack estimator
  hrtimer: show the timer ranges in /proc/timer_list
  hrtimer: incorporate feedback from Peter Zijlstra
  hrtimer: add a hrtimer_start_range() function
  hrtimer: another build fix
  hrtimer: fix build bug found by Ingo
  hrtimer: make select() and poll() use the hrtimer range feature
  ...
2008-10-23 10:53:02 -07:00
Linus Torvalds
5b34653963 Merge branch 'x86/um-header' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86/um-header' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
  x86: canonicalize remaining header guards
  x86: drop double underscores from header guards
  x86: Fix ASM_X86__ header guards
  x86, um: get rid of uml-config.h
  x86, um: get rid of arch/um/Kconfig.arch
  x86, um: get rid of arch/um/os symlink
  x86, um: get rid of excessive includes of uml-config.h
  x86, um: get rid of header symlinks
  x86, um: merge Kconfig.i386 and Kconfig.x86_64
  x86, um: get rid of sysdep symlink
  x86, um: trim the junk from uml ptrace-*.h
  x86, um: take vm-flags.h to sysdep
  x86, um: get rid of uml asm/arch
  x86, um: get rid of uml highmem.h
  x86, um: get rid of uml unistd.h
  x86, um: get rid of system.h -> system.h include
  x86, um: uml atomic.h is not needed anymore
  x86, um: untangle uml ldt.h
  x86, um: get rid of more uml asm/arch uses
  x86, um: remove dead header (uml module-generic.h; never used these days)
  ...
2008-10-23 10:22:01 -07:00
Linus Torvalds
765426e8ee Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (123 commits)
  dock: make dock driver not a module
  ACPI: fix ia64 build warning
  ACPI: hack around sysfs warning with link order
  ACPI suspend: fix build warning when CONFIG_ACPI_SLEEP=n
  intel_menlo: fix build warning
  panasonic-laptop: fix build
  ACPICA: Update version to 20080926
  ACPICA: Add support for zero-length buffer-to-string conversions
  ACPICA: New: Validation for predefined ACPI methods/objects
  ACPICA: Fix for implicit return compatibility
  ACPICA: Fixed a couple memory leaks associated with "implicit return"
  ACPICA: Optimize buffer allocation procedure
  ACPICA: Fix possible memory leak, error exit path
  ACPICA: Fix fault after mem allocation failure in AML parser
  ACPICA: Remove unused ACPI register bit definition
  ACPICA: Update version to 20080829
  ACPICA: Fix possible memory leak in acpi_ns_get_external_pathname
  ACPICA: Cleanup for internal Reference Object
  ACPICA: Update comments - no functional changes
  ACPICA: Update for Reference ACPI_OPERAND_OBJECT
  ...
2008-10-23 10:20:36 -07:00
Linus Torvalds
92fb83afd6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile: (21 commits)
  OProfile: Fix buffer synchronization for IBS
  oprofile: hotplug cpu fix
  oprofile: fixing whitespaces in arch/x86/oprofile/*
  oprofile: fixing whitespaces in arch/x86/oprofile/*
  oprofile: fixing whitespaces in drivers/oprofile/*
  x86/oprofile: add the logic for enabling additional IBS bits
  x86/oprofile: reordering functions in nmi_int.c
  x86/oprofile: removing unused function parameter in add_ibs_begin()
  oprofile: more whitespace fixes
  oprofile: whitespace fixes
  OProfile: Rename IBS sysfs dir into "ibs_op"
  OProfile: Rework string handling in setup_ibs_files()
  OProfile: Rework oprofile_add_ibs_sample() function
  oprofile: discover counters for op ppro too
  oprofile: Implement Intel architectural perfmon support
  oprofile: Don't report Nehalem as core_2
  oprofile: drop const in num counters field
  Revert "Oprofile Multiplexing Patch"
  x86, oprofile: BUG: using smp_processor_id() in preemptible code
  x86/oprofile: fix on_each_cpu build error
  ...

Manually fixed trivial conflicts in
	drivers/oprofile/{cpu_buffer.c,event_buffer.h}
2008-10-23 10:05:40 -07:00
Linus Torvalds
6770ab5cf5 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  Admit to maintaining VT-d, for my sins.
  dmar: fix uninitialised 'ret' variable in dmar_parse_dev()
  intel-iommu: use coherent_dma_mask in alloc_coherent
  amd_iommu: fix nasty bug that caused ILLEGAL_DEVICE_TABLE_ENTRY errors
  intel-iommu: IA64 support
  dmar: remove the quirk which disables dma-remapping when intr-remapping enabled
  dmar: Use queued invalidation interface for IOTLB and context invalidation
  dmar: context cache and IOTLB invalidation using queued invalidation
  dmar: use spin_lock_irqsave() in qi_submit_sync()
2008-10-23 09:53:14 -07:00
Steven Rostedt
15adc04898 ftrace, powerpc, sparc64, x86: remove notrace from arch ftrace file
The entire file of ftrace.c in the arch code needs to be marked
as notrace. It is much cleaner to do this from the Makefile with
CFLAGS_REMOVE_ftrace.o.

[ powerpc already had this in its Makefile. ]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 16:00:25 +02:00
Steven Rostedt
4d296c2432 ftrace: remove mcount set
The arch dependent function ftrace_mcount_set was only used by the daemon
start up code. This patch removes it.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 16:00:23 +02:00
Steven Rostedt
ab9a0918cb ftrace: use probe_kernel
Andrew Morton suggested using the proper API for reading and writing
kernel areas that might fault.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 16:00:18 +02:00
Steven Rostedt
76aefee576 ftrace: comment arch ftrace code
Add comments to explain what is happening in the x86 arch ftrace code.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 16:00:17 +02:00
Steven Rostedt
593eb8a2d6 ftrace: return error on failed modified text.
Have the ftrace_modify_code return error values:

  -EFAULT on error of reading the address

  -EINVAL if what is read does not match what it expected

  -EPERM  if the write fails to update after a successful match.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23 16:00:13 +02:00
Alexey Dobriyan
e1759c215b proc: switch /proc/meminfo to seq_file
and move it to fs/proc/meminfo.c while I'm at it.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 13:52:40 +04:00
H. Peter Anvin
5e1b00758b x86: canonicalize remaining header guards
Canonicalize a few remaining header guards, with the exception for
those which are still in subarchitecture directories.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-23 00:20:33 -07:00
H. Peter Anvin
05e4d3169b x86: drop double underscores from header guards
Drop double underscores from header guards in arch/x86/include.  They
are used inconsistently, and are not necessary.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-23 00:01:39 -07:00
H. Peter Anvin
1965aae3c9 x86: Fix ASM_X86__ header guards
Change header guards named "ASM_X86__*" to "_ASM_X86_*" since:

a. the double underscore is ugly and pointless.
b. no leading underscore violates namespace constraints.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-22 22:55:23 -07:00
Al Viro
bb8985586b x86, um: ... and asm-x86 move
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-22 22:55:20 -07:00
Len Brown
1ca2cc728d Merge branch 'bugzilla-11715' into test 2008-10-23 01:28:19 -04:00
Len Brown
057316cc6a Merge branch 'linus' into test
Conflicts:
	MAINTAINERS
	arch/x86/kernel/acpi/boot.c
	arch/x86/kernel/acpi/sleep.c
	drivers/acpi/Kconfig
	drivers/pnp/Makefile
	drivers/pnp/quirks.c

Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-23 00:11:07 -04:00
Len Brown
1b79b27da1 Merge branch 'yinghai' into test 2008-10-22 23:35:56 -04:00
Len Brown
5f50ef453d Merge branch 'misc' into test 2008-10-22 23:28:38 -04:00
Len Brown
530bc23bfe Merge branch 'i7300_idle' into test 2008-10-22 23:28:36 -04:00
Len Brown
4538fad56e Merge branch 'cpuidle' into test 2008-10-22 23:20:05 -04:00
Len Brown
6b3c4f8b9c Merge branch 'FW_BUG' into test 2008-10-22 23:19:45 -04:00
Marcin Slusarz
3cfba08925 x86: fix section mismatch warning - apic_x2apic_phys
Impact: cleanup only, no functionality changed

WARNING: vmlinux.o(.data+0xc008): Section mismatch in reference from the variable apic_x2apic_phys to the function .init.text:x2apic_acpi_madt_oem_check()
The variable apic_x2apic_phys references
the function __init x2apic_acpi_madt_oem_check()

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 17:36:30 +02:00
Marcin Slusarz
2caa37150d x86: fix section mismatch warning - apic_x2apic_cluster
Impact: cleanup only, no functionality changed

WARNING: vmlinux.o(.data+0xbf88): Section mismatch in reference from the variable apic_x2apic_cluster to the function .init.text:x2apic_acpi_madt_oem_check()
The variable apic_x2apic_cluster references
the function __init x2apic_acpi_madt_oem_check()

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 17:36:29 +02:00
Marcin Slusarz
f8827c017f x86: fix section mismatch warning - apic_x2apic_uv_x
Impact: cleanup only, no functionality changed

WARNING: vmlinux.o(.data+0xbf08): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .init.text:uv_acpi_madt_oem_check()
The variable apic_x2apic_uv_x references
the function __init uv_acpi_madt_oem_check()

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 17:36:28 +02:00
Marcin Slusarz
fae1721601 x86: fix section mismatch warning - apic_physflat
Impact: cleanup only, no functionality changed

WARNING: vmlinux.o(.data+0xbe88): Section mismatch in reference from the variable apic_physflat to the function .init.text:physflat_acpi_madt_oem_check()
The variable apic_physflat references
the function __init physflat_acpi_madt_oem_check()

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 17:36:27 +02:00
Marcin Slusarz
983f91ff94 x86: fix section mismatch warning - apic_flat
Impact: cleanup only, no functionality changed

WARNING: vmlinux.o(.data+0xbe08): Section mismatch in reference from the variable apic_flat to the function .init.text:flat_acpi_madt_oem_check()
The variable apic_flat references
the function __init flat_acpi_madt_oem_check()

This is harmless, because the .acpi_madt_oem_check is only called
during init time. But we keep the function pointer around in a .data
function pointer template, so it's better we do not keep that stale
- so mark this function non-__init.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 17:36:26 +02:00
Daniele Calore
2cb0ebeeb6 x86: memtest fix use of reserve_early()
Hi all,

Wrong usage of 2nd parameter in reserve_early call.
66/75: reserve_early(start_bad, last_bad - start_bad, "BAD RAM");
                                ^^^^^^^^^^^^^^^^^^^^

The correct way is to use 'end' address and not 'size'.
As a bonus a fix to the printk format.

Signed-off-by: Daniele Calore <orkaan@orkaan.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 17:08:06 +02:00
Jeremy Fitzhardinge
aef8f5b8c2 x86/tlb_uv: remove strange mc146818rtc include
For some reason tlb_uv was including linux/mc146818rtc.h.  It really
just needs linux/seq_file.h

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citix.com>
Cc: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 16:56:23 +02:00
Gustavo F. Padovan
55410791c9 x86: remove redundant KERN_DEBUG on pr_debug
pr_debug don't need KERN_DEBUG.

Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 16:56:22 +02:00
Cyrill Gorcunov
db96b0a0e4 x86: do_boot_cpu - check if we have ESR register
Impact: fix APIC IRQ irregularities on certain older boxes

We should touch the APIC ESR register if only we have it.

The patch fixes the problem mentioned by Max Kellermann:

	http://lkml.org/lkml/2008/10/17/147

Bisected-by: Max Kellermann <mk@cm4all.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
[ mingo@elte.hu: build fix ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 16:56:16 +02:00
Lai Jiangshan
bc8bcc79ea x86/proc: fix /proc/cpuinfo cpu offline bug
Impact: fix missing CPUs in /proc/cpuinfo after CPU hotunplug/hotreplug

In my test, I found that if a cpu has been offline,
the next cpus may not be shown in the /proc/cpuinfo.

if one read() cannot consume the whole /proc/cpuinfo,
c_start() will be called again in the next read() calls.
And *pos has been increased by 1 by the caller(seq_read()).
if this time the cpu#*pos is offline, c_start() will return
NULL, and the next cpus can not be shown.

this fix use next_cpu_nr(*pos - 1, cpu_online_map) to
search the next unshown cpu.

the most easy way to reproduce this bug:
1) offline cpu#1             (cpu#0 is online)
2) dd ibs=2 if=/proc/cpuinfo
   the result is that only cpu#0 is shown.
   cpu#2 and cpu#3 .... cannot be shown in /proc/cpuinfo
   it's bug.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 14:29:37 +02:00
Andreas Herrmann
35af28219e x86: call dmi-quirks for HP Laptops after early-quirks are executed
Impact: make warning message disappear - functionality unchanged

Problems with bogus IRQ0 override of those laptops should be fixed
with commits

x86: SB600: skip IRQ0 override if it is not routed to INT2 of IOAPIC
x86: SB450: skip IRQ0 override if it is not routed to INT2 of IOAPIC

that introduce early-quirks based on chipset configuration.

For further information, see
http://bugzilla.kernel.org/show_bug.cgi?id=11516

Instead of removing the related dmi-quirks completely we'd like to
keep them for (at least) one kernel version -- to double-check whether
the early-quirks really took effect. But the dmi-quirks need to be
called after early-quirks are executed. With this patch calling
sequence for dmi-quriks is changed as follows:

 acpi_boot_table_init()   (dmi-quirks)
 ...
 early_quirks()           (detect bogus IRQ0 override)
 ...
 acpi_boot_init()         (late dmi-quirks and setup IO APIC)

Note: Plan is to remove the "late dmi-quirks" with next kernel version.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 14:15:31 +02:00
Neil Horman
cf52ebedba x86, kexec: fix hang on i386 when panic occurs while console_sem is held
There's a corner case in 32 bit x86 kdump at the moment.  When the box
panics via nmi, we call bust_spinlocks(1) to disable sensitivity to the
console_sem (allowing us to print to the console in all cases), but we don't
call crash_kexec, until after we call bust_spinlocks(0), which re-enables
console_sem sensitivity.

The result is that, if we get an nmi while the console_sem is held and
kdump is configured, and we try to print something to the console during
kdump shutdown (which we often do) we deadlock the box.  The fix is to
simply do what 64 bit die_nmi does which is to not call bust_spinlocks(0)
until after we call crash_kexec.

Patch below tested successfully by me.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 13:59:44 +02:00
Andi Kleen
d2f6f7aeee MCE: Don't run 32bit machine checks with interrupts on
Running machine checks with interrupt on is a extremly bad idea. The machine
check handler only runs when the system is broken and needs to finish
as quickly as possible.

Remove the respective bogus post 2.6.27 regression and call
the machine check vector directly again.

This removes only code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
[Cherry-picked from x86/mce]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-10-22 13:19:01 +02:00
Andreas Herrmann
2bfef69d9e x86: SB600: skip IRQ0 override if it is not routed to INT2 of IOAPIC
Impact: fix hung bootup and other misbehavior on certain laptops

On some more HP laptops BIOS reports an IRQ0 override
but the SB600 chipset is configured such that timer
interrupts go to INT0 of IOAPIC.

Check IRQ0 routing and if it is routed to INT0 of IOAPIC skip the
timer override.

See following bug reports:

  http://bugzilla.kernel.org/show_bug.cgi?id=11715
  http://bugzilla.kernel.org/show_bug.cgi?id=11516

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 12:00:10 +02:00
Thomas Gleixner
268a3dcfea Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2
Conflicts:

	kernel/time/tick-sched.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-10-22 09:48:06 +02:00
Ingo Molnar
debfcaf93e Merge branch 'tracing/ftrace' into tracing/urgent 2008-10-22 09:08:14 +02:00
roel kluin
8bcad30f2e x86: make variables static
These variables are only used in their source files, so make them static.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 07:31:28 +02:00
Andy Henroid
27471fdb32 i7300_idle driver v1.55
The Intel 7300 Memory Controller supports dynamic throttling of memory which can
be used to save power when system is idle. This driver does the memory
throttling when all CPUs are idle on such a system.

Refer to "Intel 7300 Memory Controller Hub (MCH)" datasheet
for the config space description.

Signed-off-by: Andy Henroid <andrew.d.henroid@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
2008-10-21 23:58:41 -04:00
Venkatesh Pallipadi
c7d87d79d1 x86 allow modules to register idle notifiers
needed if the i7300_idle driver is to be modular.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-21 23:58:20 -04:00
David Woodhouse
b876d08f81 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/pci/dmar.c
2008-10-21 19:42:20 +01:00
Ingo Molnar
e9f95e6373 genirq: fix off by one and coding style
Fix off-by-one in for_each_irq_desc_reverse().

Impact is near zero in practice, because nothing substantial wants to
iterate down to IRQ#0 - but fix it nevertheless.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-10-21 15:54:40 +02:00
Linus Torvalds
a0bfb673dc Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (41 commits)
  PCI: fix pci_ioremap_bar() on s390
  PCI: fix AER capability check
  PCI: use pci_find_ext_capability everywhere
  PCI: remove #ifdef DEBUG around dev_dbg call
  PCI hotplug: fix get_##name return value problem
  PCI: document the pcie_aspm kernel parameter
  PCI: introduce an pci_ioremap(pdev, barnr) function
  powerpc/PCI: Add legacy PCI access via sysfs
  PCI: Add ability to mmap legacy_io on some platforms
  PCI: probing debug message uniformization
  PCI: support PCIe ARI capability
  PCI: centralize the capabilities code in probe.c
  PCI: centralize the capabilities code in pci-sysfs.c
  PCI: fix 64-vbit prefetchable memory resource BARs
  PCI: replace cfg space size (256/4096) by macros.
  PCI: use resource_size() everywhere.
  PCI: use same arg names in PCI_VDEVICE comment
  PCI hotplug: rpaphp: make debug var unique
  PCI: use %pF instead of print_fn_descriptor_symbol() in quirks.c
  PCI: fix hotplug get_##name return value problem
  ...
2008-10-20 13:40:47 -07:00
Linus Torvalds
92b29b86fe Merge branch 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (131 commits)
  tracing/fastboot: improve help text
  tracing/stacktrace: improve help text
  tracing/fastboot: fix initcalls disposition in bootgraph.pl
  tracing/fastboot: fix bootgraph.pl initcall name regexp
  tracing/fastboot: fix issues and improve output of bootgraph.pl
  tracepoints: synchronize unregister static inline
  tracepoints: tracepoint_synchronize_unregister()
  ftrace: make ftrace_test_p6nop disassembler-friendly
  markers: fix synchronize marker unregister static inline
  tracing/fastboot: add better resolution to initcall debug/tracing
  trace: add build-time check to avoid overrunning hex buffer
  ftrace: fix hex output mode of ftrace
  tracing/fastboot: fix initcalls disposition in bootgraph.pl
  tracing/fastboot: fix printk format typo in boot tracer
  ftrace: return an error when setting a nonexistent tracer
  ftrace: make some tracers reentrant
  ring-buffer: make reentrant
  ring-buffer: move page indexes into page headers
  tracing/fastboot: only trace non-module initcalls
  ftrace: move pc counter in irqtrace
  ...

Manually fix conflicts:
 - init/main.c: initcall tracing
 - kernel/module.c: verbose level vs tracepoints
 - scripts/bootgraph.pl: fallout from cherry-picking commits.
2008-10-20 13:35:07 -07:00
Linus Torvalds
b9d7ccf56b Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86 ACPI: fix breakage of resume on 64-bit UP systems with SMP kernel
  Introduce is_vmalloc_or_module_addr() and use with DEBUG_VIRTUAL
2008-10-20 13:27:05 -07:00
Linus Torvalds
9301975ec2 Merge branch 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
This merges branches irq/genirq, irq/sparseirq-v4, timers/hpet-percpu
and x86/uv.

The sparseirq branch is just preliminary groundwork: no sparse IRQs are
actually implemented by this tree anymore - just the new APIs are added
while keeping the old way intact as well (the new APIs map 1:1 to
irq_desc[]).  The 'real' sparse IRQ support will then be a relatively
small patch ontop of this - with a v2.6.29 merge target.

* 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (178 commits)
  genirq: improve include files
  intr_remapping: fix typo
  io_apic: make irq_mis_count available on 64-bit too
  genirq: fix name space collisions of nr_irqs in arch/*
  genirq: fix name space collision of nr_irqs in autoprobe.c
  genirq: use iterators for irq_desc loops
  proc: fixup irq iterator
  genirq: add reverse iterator for irq_desc
  x86: move ack_bad_irq() to irq.c
  x86: unify show_interrupts() and proc helpers
  x86: cleanup show_interrupts
  genirq: cleanup the sparseirq modifications
  genirq: remove artifacts from sparseirq removal
  genirq: revert dynarray
  genirq: remove irq_to_desc_alloc
  genirq: remove sparse irq code
  genirq: use inline function for irq_to_desc
  genirq: consolidate nr_irqs and for_each_irq_desc()
  x86: remove sparse irq from Kconfig
  genirq: define nr_irqs for architectures with GENERIC_HARDIRQS=n
  ...
2008-10-20 13:23:01 -07:00
Dave Jones
f4432c5cae Update email addresses.
Update assorted email addresses and related info to point
to a single current, valid address.

additionally
- trivial CREDITS entry updates. (Not that this file means much any more)
- remove arjans dead redhat.com address from powernow driver

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 12:50:03 -07:00
David Woodhouse
b364776ad1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/pci/intel-iommu.c
2008-10-20 20:19:36 +01:00
Linus Torvalds
db7a6d8d01 Update .gitignore files for generated targets
The generated 'capflags.c' file wasn't properly ignored, and the list of
files in scripts/basic/ wasn't up-to-date.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 11:24:31 -07:00
Seth Heasley
37a84ec668 x86/PCI: irq and pci_ids patch for Intel Ibex Peak DeviceIDs
This patch updates the Intel Ibex Peak (PCH) LPC and SMBus Controller
DeviceIDs.

The LPC Controller ID is set by Firmware within the range of
0x3b00-3b1f.  This range is included in pci_ids.h using min and max
values, and irq.c now has code to handle the range (in lieu of 32
additions to a SWITCH statement).

The SMBus Controller ID is a fixed-value and will not change.

Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-20 10:53:48 -07:00
Bjorn Helgaas
d768cb6929 x86/PCI: follow lspci device/vendor style
Use "[%04x:%04x]" for PCI vendor/device IDs to follow the format
used by lspci(8).

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-20 10:53:43 -07:00
Steven Rostedt
606576ce81 ftrace: rename FTRACE to FUNCTION_TRACER
Due to confusion between the ftrace infrastructure and the gcc profiling
tracer "ftrace", this patch renames the config options from FTRACE to
FUNCTION_TRACER.  The other two names that are offspring from FTRACE
DYNAMIC_FTRACE and FTRACE_MCOUNT_RECORD will stay the same.

This patch was generated mostly by script, and partially by hand.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-20 18:27:03 +02:00
Steven Rostedt
c513867561 ftrace: do not enclose logic in WARN_ON
In ftrace, logic is defined in the WARN_ON_ONCE, which can become a
nop with some configs. This patch fixes it.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-20 18:27:00 +02:00
Adrian Bunk
357c6e6359 rtc: use bcd2bin/bin2bcd
Change various rtc related code to use the new bcd2bin/bin2bcd functions
instead of the obsolete BCD_TO_BIN/BIN_TO_BCD/BCD2BIN/BIN2BCD macros.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:41 -07:00
Vivek Goyal
57cac4d188 kdump: make elfcorehdr_addr independent of CONFIG_PROC_VMCORE
o elfcorehdr_addr is used by not only the code under CONFIG_PROC_VMCORE
  but also by the code which is not inside CONFIG_PROC_VMCORE.  For
  example, is_kdump_kernel() is used by powerpc code to determine if
  kernel is booting after a panic then use previous kernel's TCE table.
  So even if CONFIG_PROC_VMCORE is not set in second kernel, one should be
  able to correctly determine that we are booting after a panic and setup
  calgary iommu accordingly.

o So remove the assumption that elfcorehdr_addr is under
  CONFIG_PROC_VMCORE.

o Move definition of elfcorehdr_addr to arch dependent crash files.
  (Unfortunately crash dump does not have an arch independent file
  otherwise that would have been the best place).

o kexec.c is not the right place as one can Have CRASH_DUMP enabled in
  second kernel without KEXEC being enabled.

o I don't see sh setup code parsing the command line for
  elfcorehdr_addr.  I am wondering how does vmcore interface work on sh.
  Anyway, I am atleast defining elfcoredhr_addr so that compilation is not
  broken on sh.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Simon Horman <horms@verge.net.au>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:39 -07:00
Matt Helsley
dc52ddc0e6 container freezer: implement freezer cgroup subsystem
This patch implements a new freezer subsystem in the control groups
framework.  It provides a way to stop and resume execution of all tasks in
a cgroup by writing in the cgroup filesystem.

The freezer subsystem in the container filesystem defines a file named
freezer.state.  Writing "FROZEN" to the state file will freeze all tasks
in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
the cgroup.  Reading will return the current state.

* Examples of usage :

   # mkdir /containers/freezer
   # mount -t cgroup -ofreezer freezer  /containers
   # mkdir /containers/0
   # echo $some_pid > /containers/0/tasks

to get status of the freezer subsystem :

   # cat /containers/0/freezer.state
   RUNNING

to freeze all tasks in the container :

   # echo FROZEN > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   FREEZING
   # cat /containers/0/freezer.state
   FROZEN

to unfreeze all tasks in the container :

   # echo RUNNING > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   RUNNING

This is the basic mechanism which should do the right thing for user space
task in a simple scenario.

It's important to note that freezing can be incomplete.  In that case we
return EBUSY.  This means that some tasks in the cgroup are busy doing
something that prevents us from completely freezing the cgroup at this
time.  After EBUSY, the cgroup will remain partially frozen -- reflected
by freezer.state reporting "FREEZING" when read.  The state will remain
"FREEZING" until one of these things happens:

	1) Userspace cancels the freezing operation by writing "RUNNING" to
		the freezer.state file
	2) Userspace retries the freezing operation by writing "FROZEN" to
		the freezer.state file (writing "FREEZING" is not legal
		and returns EIO)
	3) The tasks that blocked the cgroup from entering the "FROZEN"
		state disappear from the cgroup's set of tasks.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: export thaw_process]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Tested-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:34 -07:00
Nick Piggin
db64fe0225 mm: rewrite vmap layer
Rewrite the vmap allocator to use rbtrees and lazy tlb flushing, and
provide a fast, scalable percpu frontend for small vmaps (requires a
slightly different API, though).

The biggest problem with vmap is actually vunmap.  Presently this requires
a global kernel TLB flush, which on most architectures is a broadcast IPI
to all CPUs to flush the cache.  This is all done under a global lock.  As
the number of CPUs increases, so will the number of vunmaps a scaled
workload will want to perform, and so will the cost of a global TLB flush.
 This gives terrible quadratic scalability characteristics.

Another problem is that the entire vmap subsystem works under a single
lock.  It is a rwlock, but it is actually taken for write in all the fast
paths, and the read locking would likely never be run concurrently anyway,
so it's just pointless.

This is a rewrite of vmap subsystem to solve those problems.  The existing
vmalloc API is implemented on top of the rewritten subsystem.

The TLB flushing problem is solved by using lazy TLB unmapping.  vmap
addresses do not have to be flushed immediately when they are vunmapped,
because the kernel will not reuse them again (would be a use-after-free)
until they are reallocated.  So the addresses aren't allocated again until
a subsequent TLB flush.  A single TLB flush then can flush multiple
vunmaps from each CPU.

XEN and PAT and such do not like deferred TLB flushing because they can't
always handle multiple aliasing virtual addresses to a physical address.
They now call vm_unmap_aliases() in order to flush any deferred mappings.
That call is very expensive (well, actually not a lot more expensive than
a single vunmap under the old scheme), however it should be OK if not
called too often.

The virtual memory extent information is stored in an rbtree rather than a
linked list to improve the algorithmic scalability.

There is a per-CPU allocator for small vmaps, which amortizes or avoids
global locking.

To use the per-CPU interface, the vm_map_ram / vm_unmap_ram interfaces
must be used in place of vmap and vunmap.  Vmalloc does not use these
interfaces at the moment, so it will not be quite so scalable (although it
will use lazy TLB flushing).

As a quick test of performance, I ran a test that loops in the kernel,
linearly mapping then touching then unmapping 4 pages.  Different numbers
of tests were run in parallel on an 4 core, 2 socket opteron.  Results are
in nanoseconds per map+touch+unmap.

threads           vanilla         vmap rewrite
1                 14700           2900
2                 33600           3000
4                 49500           2800
8                 70631           2900

So with a 8 cores, the rewritten version is already 25x faster.

In a slightly more realistic test (although with an older and less
scalable version of the patch), I ripped the not-very-good vunmap batching
code out of XFS, and implemented the large buffer mapping with vm_map_ram
and vm_unmap_ram...  along with a couple of other tricks, I was able to
speed up a large directory workload by 20x on a 64 CPU system.  I believe
vmap/vunmap is actually sped up a lot more than 20x on such a system, but
I'm running into other locks now.  vmap is pretty well blown off the
profiles.

Before:
1352059 total                                      0.1401
798784 _write_lock                              8320.6667 <- vmlist_lock
529313 default_idle                             1181.5022
 15242 smp_call_function                         15.8771  <- vmap tlb flushing
  2472 __get_vm_area_node                         1.9312  <- vmap
  1762 remove_vm_area                             4.5885  <- vunmap
   316 map_vm_area                                0.2297  <- vmap
   312 kfree                                      0.1950
   300 _spin_lock                                 3.1250
   252 sn_send_IPI_phys                           0.4375  <- tlb flushing
   238 vmap                                       0.8264  <- vmap
   216 find_lock_page                             0.5192
   196 find_next_bit                              0.3603
   136 sn2_send_IPI                               0.2024
   130 pio_phys_write_mmr                         2.0312
   118 unmap_kernel_range                         0.1229

After:
 78406 total                                      0.0081
 40053 default_idle                              89.4040
 33576 ia64_spinlock_contention                 349.7500
  1650 _spin_lock                                17.1875
   319 __reg_op                                   0.5538
   281 _atomic_dec_and_lock                       1.0977
   153 mutex_unlock                               1.5938
   123 iget_locked                                0.1671
   117 xfs_dir_lookup                             0.1662
   117 dput                                       0.1406
   114 xfs_iget_core                              0.0268
    92 xfs_da_hashname                            0.1917
    75 d_alloc                                    0.0670
    68 vmap_page_range                            0.0462 <- vmap
    58 kmem_cache_alloc                           0.0604
    57 memset                                     0.0540
    52 rb_next                                    0.1625
    50 __copy_user                                0.0208
    49 bitmap_find_free_region                    0.2188 <- vmap
    46 ia64_sn_udelay                             0.1106
    45 find_inode_fast                            0.1406
    42 memcmp                                     0.2188
    42 finish_task_switch                         0.1094
    42 __d_lookup                                 0.0410
    40 radix_tree_lookup_slot                     0.1250
    37 _spin_unlock_irqrestore                    0.3854
    36 xfs_bmapi                                  0.0050
    36 kmem_cache_free                            0.0256
    35 xfs_vn_getattr                             0.0322
    34 radix_tree_lookup                          0.1062
    33 __link_path_walk                           0.0035
    31 xfs_da_do_buf                              0.0091
    30 _xfs_buf_find                              0.0204
    28 find_get_page                              0.0875
    27 xfs_iread                                  0.0241
    27 __strncpy_from_user                        0.2812
    26 _xfs_buf_initialize                        0.0406
    24 _xfs_buf_lookup_pages                      0.0179
    24 vunmap_page_range                          0.0250 <- vunmap
    23 find_lock_page                             0.0799
    22 vm_map_ram                                 0.0087 <- vmap
    20 kfree                                      0.0125
    19 put_page                                   0.0330
    18 __kmalloc                                  0.0176
    17 xfs_da_node_lookup_int                     0.0086
    17 _read_lock                                 0.0885
    17 page_waitqueue                             0.0664

vmap has gone from being the top 5 on the profiles and flushing the crap
out of all TLBs, to using less than 1% of kernel time.

[akpm@linux-foundation.org: cleanups, section fix]
[akpm@linux-foundation.org: fix build on alpha]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:32 -07:00
Ingo Molnar
3e10e879a8 Merge branch 'linus' into tracing-v28-for-linus-v3
Conflicts:
	init/main.c
	kernel/module.c
	scripts/bootgraph.pl
2008-10-19 19:04:47 +02:00
Andreas Herrmann
f609891f42 amd_iommu: fix nasty bug that caused ILLEGAL_DEVICE_TABLE_ENTRY errors
We are on 64-bit so better use u64 instead of u32 to deal with
addresses:

static void __init iommu_set_device_table(struct amd_iommu *iommu)
{
        u64 entry;
  ...
        entry = virt_to_phys(amd_iommu_dev_table);
  ...

(I am wondering why gcc 4.2.x did not warn about the assignment
between u32 and unsigned long.)

Cc: iommu@lists.linux-foundation.org
Cc: stable@kernel.org
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-10-18 14:29:30 +01:00
Fenghua Yu
5b6985ce8e intel-iommu: IA64 support
The current Intel IOMMU code assumes that both host page size and Intel
IOMMU page size are 4KiB. The first patch supports variable page size.
This provides support for IA64 which has multiple page sizes.

This patch also adds some other code hooks for IA64 platform including
DMAR_OPERATION_TIMEOUT definition.

[dwmw2: some cleanup]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-10-18 14:29:15 +01:00
Eric Anholt
d1d8c925b7 Export kmap_atomic_pfn for DRM-GEM.
The driver would like to map IO space directly for copying data in when
appropriate, to avoid CPU cache flushing for streaming writes.
kmap_atomic_pfn lets us avoid IPIs associated with ioremap for this process.

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-10-18 07:10:12 +10:00
Arjan van de Ven
651dab4264 Merge commit 'linus/master' into merge-linus
Conflicts:

	arch/x86/kvm/i8254.c
2008-10-17 09:20:26 -07:00
Rafael J. Wysocki
3038edabf4 x86 ACPI: fix breakage of resume on 64-bit UP systems with SMP kernel
x86 ACPI: Fix breakage of resume on 64-bit UP systems with SMP kernel

We are now using per CPU GDT tables in head_64.S and the original
early_gdt_descr.address is invalidated after boot by
setup_per_cpu_areas().  This breaks resume from suspend to RAM on
x86_64 UP systems using SMP kernels, because this part of head_64.S
is also executed during the resume and the invalid GDT address
causes the system to crash.  It doesn't break on 'true' SMP systems,
because early_gdt_descr.address is modified every time
native_cpu_up() runs.  However, during resume it should point to the
GDT of the boot CPU rather than to another CPU's GDT.

For this reason, during suspend to RAM always make
early_gdt_descr.address point to the boot CPU's GDT.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=11568, which
is a regression from 2.6.26.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reported-and-tested-by: Andy Wettstein <ajw1980@gmail.com>
2008-10-17 14:19:49 +02:00
Venkatesh Pallipadi
89cedfefca cpuidle: upon BIOS bug, default to default_idle rather than polling
http://bugzilla.kernel.org/show_bug.cgi?id=11345

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-16 19:00:08 -04:00
Linus Torvalds
08d19f51f0 Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (134 commits)
  KVM: ia64: Add intel iommu support for guests.
  KVM: ia64: add directed mmio range support for kvm guests
  KVM: ia64: Make pmt table be able to hold physical mmio entries.
  KVM: Move irqchip_in_kernel() from ioapic.h to irq.h
  KVM: Separate irq ack notification out of arch/x86/kvm/irq.c
  KVM: Change is_mmio_pfn to kvm_is_mmio_pfn, and make it common for all archs
  KVM: Move device assignment logic to common code
  KVM: Device Assignment: Move vtd.c from arch/x86/kvm/ to virt/kvm/
  KVM: VMX: enable invlpg exiting if EPT is disabled
  KVM: x86: Silence various LAPIC-related host kernel messages
  KVM: Device Assignment: Map mmio pages into VT-d page table
  KVM: PIC: enhance IPI avoidance
  KVM: MMU: add "oos_shadow" parameter to disable oos
  KVM: MMU: speed up mmu_unsync_walk
  KVM: MMU: out of sync shadow core
  KVM: MMU: mmu_convert_notrap helper
  KVM: MMU: awareness of new kvm_mmu_zap_page behaviour
  KVM: MMU: mmu_parent_walk
  KVM: x86: trap invlpg
  KVM: MMU: sync roots on mmu reload
  ...
2008-10-16 15:36:00 -07:00
Linus Torvalds
e533b22705 Merge branch 'core-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  do_generic_file_read: s/EINTR/EIO/ if lock_page_killable() fails
  softirq, warning fix: correct a format to avoid a warning
  softirqs, debug: preemption check
  x86, pci-hotplug, calgary / rio: fix EBDA ioremap()
  IO resources, x86: ioremap sanity check to catch mapping requests exceeding, fix
  IO resources, x86: ioremap sanity check to catch mapping requests exceeding the BAR sizes
  softlockup: Documentation/sysctl/kernel.txt: fix softlockup_thresh description
  dmi scan: warn about too early calls to dmi_check_system()
  generic: redefine resource_size_t as phys_addr_t
  generic: make PFN_PHYS explicitly return phys_addr_t
  generic: add phys_addr_t for holding physical addresses
  softirq: allocate less vectors
  IO resources: fix/remove printk
  printk: robustify printk, update comment
  printk: robustify printk, fix #2
  printk: robustify printk, fix
  printk: robustify printk

Fixed up conflicts in:
	arch/powerpc/include/asm/types.h
	arch/powerpc/platforms/Kconfig.cputype
manually.
2008-10-16 15:17:40 -07:00
Linus Torvalds
0999d978dc Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix compat-vdso
  x86/mm: unify init task OOM handling
  x86/mm: do not trigger a kernel warning if user-space disables interrupts and generates a page fault
2008-10-16 15:08:45 -07:00
Andreas Herrmann
26adcfbf00 x86: SB600: skip ACPI IRQ0 override if it is not routed to INT2 of IOAPIC
On some more HP laptops BIOS reports an IRQ0 override
but the SB600 chipset is configured such that timer
interrupts go to INT0 of IOAPIC.

Check IRQ0 routing and if it is routed to INT0 of IOAPIC skip the
timer override.

http://bugzilla.kernel.org/show_bug.cgi?id=11715
http://bugzilla.kernel.org/show_bug.cgi?id=11516

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-16 15:48:15 -04:00
Linus Torvalds
c813b4e16e Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (46 commits)
  UIO: Fix mapping of logical and virtual memory
  UIO: add automata sercos3 pci card support
  UIO: Change driver name of uio_pdrv
  UIO: Add alignment warnings for uio-mem
  Driver core: add bus_sort_breadthfirst() function
  NET: convert the phy_device file to use bus_find_device_by_name
  kobject: Cleanup kobject_rename and !CONFIG_SYSFS
  kobject: Fix kobject_rename and !CONFIG_SYSFS
  sysfs: Make dir and name args to sysfs_notify() const
  platform: add new device registration helper
  sysfs: use ilookup5() instead of ilookup5_nowait()
  PNP: create device attributes via default device attributes
  Driver core: make bus_find_device_by_name() more robust
  usb: turn dev_warn+WARN_ON combos into dev_WARN
  debug: use dev_WARN() rather than WARN_ON() in device_pm_add()
  debug: Introduce a dev_WARN() function
  sysfs: fix deadlock
  device model: Do a quickcheck for driver binding before doing an expensive check
  Driver core: Fix cleanup in device_create_vargs().
  Driver core: Clarify device cleanup.
  ...
2008-10-16 12:40:26 -07:00
Michal Januszewski
4d31a2b74c fbdev: ignore VESA modes if framebuffer does not support them
Currently, it is possible to set a graphics VESA mode at boot time via the
vga= parameter even when no framebuffer driver supporting this is
configured.  This could lead to the system booting with a black screen,
without a usable console.

Fix this problem by only allowing to set graphics modes at boot time if a
supporting framebuffer driver is configured.

Signed-off-by: Michal Januszewski <spock@gentoo.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:45 -07:00
Joerg Roedel
036b4c50fe x86: convert Calgary IOMMU driver to generic iommu_num_pages function
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Joerg Roedel
e3c449f526 x86, AMD IOMMU: convert driver to generic iommu_num_pages function
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Joerg Roedel
1477b8e5f1 x86: convert GART driver to generic iommu_num_pages function
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Joerg Roedel
bdab0ba3d9 x86: rename iommu_num_pages function to iommu_nr_pages
This series of patches re-introduces the iommu_num_pages function so that
it can be used by each architecture specific IOMMU implementations.  The
series also changes IOMMU implementations for X86, Alpha, PowerPC and
UltraSparc.  The other implementations are not yet changed because the
modifications required are not obvious and I can't test them on real
hardware.

This patch:

This is a preparation patch for introducing a generic iommu_num_pages function.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Christoph Hellwig
b418da16dd compat: generic compat get/settimeofday
Nothing arch specific in get/settimeofday.  The details of the timeval
conversion varied a little from arch to arch, but all with the same
results.

Also add an extern declaration for sys_tz to linux/time.h because externs
in .c files are fowned upon.  I'll kill the externs in various other files
in a sparate patch.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Christoph Hellwig
f7a5000f7a compat: move cp_compat_stat to common code
struct stat / compat_stat is the same on all architectures, so
cp_compat_stat should be, too.

Turns out it is, except that various architectures have slightly and some
high2lowuid/high2lowgid or the direct assignment instead of the
SET_UID/SET_GID that expands to the correct one anyway.

This patch replaces the arch-specific cp_compat_stat implementations with
a common one based on the x86-64 one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Acked-by: Kyle McMartin <kyle@mcmartin.ca> [ parisc bits ]
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Andi Kleen
25ddbb18aa Make the taint flags reliable
It's somewhat unlikely that it happens, but right now a race window
between interrupts or machine checks or oopses could corrupt the tainted
bitmap because it is modified in a non atomic fashion.

Convert the taint variable to an unsigned long and use only atomic bit
operations on it.

Unfortunately this means the intvec sysctl functions cannot be used on it
anymore.

It turned out the taint sysctl handler could actually be simplified a bit
(since it only increases capabilities) so this patch actually removes
code.

[akpm@linux-foundation.org: remove unneeded include]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00
Jan Beulich
9ba16087d9 Kconfig: eliminate "def_bool n" constructs
Using "def_bool n" is pointless, simply using bool here appears more
appropriate.

Further, retaining such options that don't have a prompt and aren't
selected by anything seems also at least questionable.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00
Harvey Harrison
80a914dc05 misc: replace __FUNCTION__ with __func__
__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:30 -07:00
Greg Kroah-Hartman
a9b12619f7 device create: misc: convert device_create_drvdata to device_create
Now that device_create() has been audited, rename things back to the
original call to be sane.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-16 09:24:43 -07:00
Andrew Morton
ae87221d3c sysfs: crash debugging
Print the name of the last-accessed sysfs file when we oops, to help track
down oopses which occur in sysfs store/read handlers.  Because these oopses
tend to not leave any trace of the offending code in the stack traces.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-16 09:24:41 -07:00
Robert Richter
0f019cc477 oprofile: fixing whitespaces in arch/x86/oprofile/*
Signed-off-by: Robert Richter <robert.richter@amd.com>
2008-10-16 17:17:46 +02:00
Ingo Molnar
10e0298686 io_apic: make irq_mis_count available on 64-bit too
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-16 16:59:20 +02:00
Thomas Gleixner
249f6d9eab x86: move ack_bad_irq() to irq.c
Share more duplicated code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-10-16 16:53:30 +02:00