Commit graph

735 commits

Author SHA1 Message Date
Yinghai Lu
0450193bff x86, x2apic: Don't map lapic addr for preenabled x2apic systems
If x2apic is preenabled and used by the kernel, we don't need to map
the lapic address. That mapping will never be used.

So just skip that in register_lapic_address()

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <4CFDF69C.9070501@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-09 21:52:05 +01:00
Yinghai Lu
326a2e6bae x86, apic: Use register_lapic_address() in init_apic_mapping()
Remove the printk as well, we don't want to print when nothing
changed. We print in register_lapic_address() already.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <4CFDF68A.7020902@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-09 21:52:05 +01:00
Yinghai Lu
f115714163 x86, apic: Remove early_init_lapic_mapping()
It is almost the same as smp_register_lapic_addr(). We just need to
let smp_read_mpc() call smp_register_lapic_addr() when early==1.

Add the apic_printk to smp_register_lapic_address()

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <4CFDF681.3030509@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-09 21:52:04 +01:00
Yinghai Lu
c0104d38a7 x86, apic: Unify identical register_lapic_address() functions
They are the same, move the common function to apic.c to allow
further cleanups.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Len Brown <lenb@kernel.org>
LKML-Reference: <4CFDF675.4060305@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-09 21:52:04 +01:00
Thomas Gleixner
d834a9dcec Merge branch 'x86/amd-nb' into x86/apic-cleanups
Reason: apic cleanup series depends on x86/apic, x86/amd-nb x86/platform

Conflicts:
	arch/x86/include/asm/io_apic.h

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-09 18:17:25 +01:00
Thomas Gleixner
4720dd1b38 x86: io_apic: Avoid unused variable warning when CONFIG_GENERIC_PENDING_IRQ=n
arch/x86/kernel/apic/io_apic.c: In function 'ack_apic_level':
arch/x86/kernel/apic/io_apic.c:2433: warning: unused variable 'desc'

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <201010272107.o9RL7rse018212@imap1.linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-09 17:43:21 +01:00
Rakib Mullick
2c6cb1053a x86: Address 'unused' warning in hw_nmi.c again
arch/x86/kernel/apic/hw_nmi.c:29: warning: backtrace_mask defined but not used

commit 0e2af2a9(x86, hw_nmi: Move backtrace_mask declaration under
ARCH_HAS_NMI_WATCHDOG) addressed this warning, but it was reintroduced
by commit 5f2b0ba4(x86, nmi_watchdog: Remove the old nmi_watchdog).

Move backtrace_mask into the #ifdef arch_trigger_all_cpu_backtrace
section again.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <AANLkTi=rcc38QzoKa6LFy4m++-p_9=Zt4_kDQE=GeKxf@mail.gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-09 16:06:52 +01:00
Ingo Molnar
10a18d7dc0 Merge commit 'v2.6.37-rc5' into perf/core
Merge reason: Pick up the latest -rc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-07 07:49:51 +01:00
Sebastian Andrzej Siewior
a38c5380ef x86: io_apic: Split setup_ioapic_ids_from_mpc()
Sodaville needs to setup the IO_APIC ids as the boot loader leaves
them uninitialized. Split out the setter function so it can be called
unconditionally from the sodaville board code.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20101126165020.GA26361@www.tglx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-06 14:30:28 +01:00
Linus Torvalds
fbe6c4047f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  dmar, x86: Use function stubs when CONFIG_INTR_REMAP is disabled
  x86-64: Fix and clean up AMD Fam10 MMCONF enabling
  x86: UV: Address interrupt/IO port operation conflict
  x86: Use online node real index in calulate_tbl_offset()
  x86, asm: Fix binutils 2.15 build failure
2010-11-27 07:28:47 +09:00
Ingo Molnar
6c869e772c Merge branch 'perf/urgent' into perf/core
Conflicts:
	arch/x86/kernel/apic/hw_nmi.c

Merge reason: Resolve conflict, queue up dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-26 15:07:02 +01:00
Dimitri Sivanich
8191c9f692 x86: UV: Address interrupt/IO port operation conflict
This patch for SGI UV systems addresses a problem whereby
interrupt transactions being looped back from a local IOH,
through the hub to a local CPU can (erroneously) conflict with
IO port operations and other transactions.

To workaound this we set a high bit in the APIC IDs used for
interrupts. This bit appears to be ignored by the sockets, but
it avoids the conflict in the hub.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20101116222352.GA8155@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
___

 arch/x86/include/asm/uv/uv_hub.h   |    4 ++++
 arch/x86/include/asm/uv/uv_mmrs.h  |   19 ++++++++++++++++++-
 arch/x86/kernel/apic/x2apic_uv_x.c |   25 +++++++++++++++++++++++--
 arch/x86/platform/uv/tlb_uv.c      |    2 +-
 arch/x86/platform/uv/uv_time.c     |    4 +++-
 5 files changed, 49 insertions(+), 5 deletions(-)
2010-11-18 10:41:25 +01:00
Rakib Mullick
0e2af2a9ab x86, hw_nmi: Move backtrace_mask declaration under ARCH_HAS_NMI_WATCHDOG
backtrace_mask has been used under the code context of
ARCH_HAS_NMI_WATCHDOG. So put it into that context.
We were warned by the following warning:

  arch/x86/kernel/apic/hw_nmi.c:21: warning: ‘backtrace_mask’ defined but not used

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
LKML-Reference: <1289573455-3410-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 09:15:12 +01:00
Don Zickus
072b198a4a x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog
Now that the bulk of the old nmi_watchdog is gone, remove all
the stub variables and hooks associated with it.

This touches lots of files mainly because of how the io_apic
nmi_watchdog was implemented.  Now that the io_apic nmi_watchdog
is forever gone, remove all its fingers.

Most of this code was not being exercised by virtue of
nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything to
risky here.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: fweisbec@gmail.com
Cc: gorcunov@openvz.org
LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 09:08:23 +01:00
Don Zickus
5f2b0ba4d9 x86, nmi_watchdog: Remove the old nmi_watchdog
Now that we have a new nmi_watchdog that is more generic and
sits on top of the perf subsystem, we really do not need the old
nmi_watchdog any more.

In addition, the old nmi_watchdog doesn't really work if you are
using the default clocksource, hpet.  The old nmi_watchdog code
relied on local apic interrupts to determine if the cpu is still
alive.  With hpet as the clocksource, these interrupts don't
increment any more and the old nmi_watchdog triggers false
postives.

This piece removes the old nmi_watchdog code and stubs out any
variables and functions calls.  The stubs are the same ones used
by the new nmi_watchdog code, so it should be well tested.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: fweisbec@gmail.com
Cc: gorcunov@openvz.org
LKML-Reference: <1289578944-28564-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 09:08:23 +01:00
Jesper Juhl
2a8dcbd6cd x86, apic: Remove double #include
Remove the second <asm/atomic.h> inclusion.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
LKML-Reference: <alpine.LNX.2.00.1011072253360.26247@swampdragon.chaosbits.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-10 10:21:16 +01:00
Jack Steiner
62b0cfc240 x86, UV: Update node controller MMRs
A new version of the SGI UV hub node controller is being
developed. A few of the MMRs (control registers) that exist on
the current hub no longer exist on the new hub. Fortunately,
there are alternate MMRs that are are functionally equivalent
and that exist on both hubs.

This patch changes the UV code to use MMRs that exist in BOTH
versions of the hub node controller.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20101106204056.GA27584@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-10 10:06:38 +01:00
Yinghai Lu
7b79462a20 x86: Check irq_remapped instead of remapping_enabled in destroy_irq()
Russ Anderson reported:
| There is a regression that is causing a NULL pointer dereference
| in free_irte when shutting down xpc. git bisect narrowed it down
| to git commit d585d06(intr_remap: Simplify the code further), which
| changed free_irte(). Reverse applying the patch fixes the problem.

We need to use irq_remapped() for each irq instead of checking only
intr_remapping_enabled as there might be non remapped irqs even when
remapping is enabled.

[ tglx: use cfg instead of retrieving it again. Massaged changelog ]

Reported-bisected-and-tested-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4CCBD511.40607@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-10-30 10:28:31 +02:00
Linus Torvalds
2d10d8737c Merge branches 'x86-fixes-for-linus' and 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, alternative: Call stop_machine_text_poke() on all cpus
  x86-32: Restore irq stacks NUMA-aware allocations
  x86, memblock: Fix early_node_mem with big reserved region.

* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, uv: More Westmere support on SGI UV
  x86, uv: Enable Westmere support on SGI UV
2010-10-29 18:58:00 -07:00
Russ Anderson
0520bd8438 x86, uv: More Westmere support on SGI UV
Enable Westmere support for all APIC modes on SGI UV.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20101028224132.GB15804@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-10-28 22:38:07 -07:00
Linus Torvalds
18cb657ca1 Merge branch 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm

* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
  xen: register xen pci notifier
  xen: initialize cpu masks for pv guests in xen_smp_init
  xen: add a missing #include to arch/x86/pci/xen.c
  xen: mask the MTRR feature from the cpuid
  xen: make hvc_xen console work for dom0.
  xen: add the direct mapping area for ISA bus access
  xen: Initialize xenbus for dom0.
  xen: use vcpu_ops to setup cpu masks
  xen: map a dummy page for local apic and ioapic in xen_set_fixmap
  xen: remap MSIs into pirqs when running as initial domain
  xen: remap GSIs as pirqs when running as initial domain
  xen: introduce XEN_DOM0 as a silent option
  xen: map MSIs into pirqs
  xen: support GSI -> pirq remapping in PV on HVM guests
  xen: add xen hvm acpi_register_gsi variant
  acpi: use indirect call to register gsi in different modes
  xen: implement xen_hvm_register_pirq
  xen: get the maximum number of pirqs from xen
  xen: support pirq != irq

* 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits)
  X86/PCI: Remove the dependency on isapnp_disable.
  xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c
  MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright.
  x86: xen: Sanitse irq handling (part two)
  swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it.
  MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer.
  xen/pci: Request ACS when Xen-SWIOTLB is activated.
  xen-pcifront: Xen PCI frontend driver.
  xenbus: prevent warnings on unhandled enumeration values
  xenbus: Xen paravirtualised PCI hotplug support.
  xen/x86/PCI: Add support for the Xen PCI subsystem
  x86: Introduce x86_msi_ops
  msi: Introduce default_[teardown|setup]_msi_irqs with fallback.
  x86/PCI: Export pci_walk_bus function.
  x86/PCI: make sure _PAGE_IOMAP it set on pci mappings
  x86/PCI: Clean up pci_cache_line_size
  xen: fix shared irq device passthrough
  xen: Provide a variant of xen_poll_irq with timeout.
  xen: Find an unbound irq number in reverse order (high to low).
  xen: statically initialize cpu_evtchn_mask_p
  ...

Fix up trivial conflicts in drivers/pci/Makefile
2010-10-28 17:11:17 -07:00
Russ Anderson
c8f730b1ab x86, uv: Enable Westmere support on SGI UV
Enable Westmere support on SGI UV.  The UV initialization code is dependent on
the APICID bits.  Westmere-EX uses different APIC bit mapping than Nehalem-EX.
This code reads the apic shift value from a UV MMR to do the proper bit
decoding to determint the pnode.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20101026212728.GB15071@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-10-26 15:15:28 -07:00
Robert Richter
eb48c9cb20 apic, amd: Make firmware bug messages more meaningful
This improves error messages in case the BIOS was setting up
wrong LVT offsets.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1288015419-29543-6-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-25 18:59:43 +02:00
Thomas Gleixner
23f9b26715 x86: apic: Move probe_nr_irqs_gsi() into ioapic_init_mappings()
probe_br_irqs_gsi() is called right after ioapic_init_mappings() and
there are no other users. Move it into ioapic_init_mappings() so the
declaration can disappear and the function can become static.

Rename ioapic_init_mappings() to ioapic_and_gsi_init() to reflect that
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1287510389-8388-2-git-send-email-dirk.brandewie@gmail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
2010-10-23 17:27:50 +02:00
Thomas Gleixner
5a7ae78fd4 x86: Allow platforms to force enable apic
Some embedded x86 platforms don't setup the APIC in the
BIOS/bootloader and would be forced to add "lapic" on the kernel
command line. That's a bit akward.

Split out the force enable code from detect_init_APIC() and allow
platform code to call it from the platform setup. That avoids the
command line parameter and possible replication of the MSR dance in
the force enable code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1287510389-8388-1-git-send-email-dirk.brandewie@gmail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
2010-10-23 17:27:43 +02:00
Linus Torvalds
3044100e58 Merge branch 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (74 commits)
  x86-64: Only set max_pfn_mapped to 512 MiB if we enter via head_64.S
  xen: Cope with unmapped pages when initializing kernel pagetable
  memblock, bootmem: Round pfn properly for memory and reserved regions
  memblock: Annotate memblock functions with __init_memblock
  memblock: Allow memblock_init to be called early
  memblock/arm: Fix memblock_region_is_memory() typo
  x86, memblock: Remove __memblock_x86_find_in_range_size()
  memblock: Fix wraparound in find_region()
  x86-32, memblock: Make add_highpages honor early reserved ranges
  x86, memblock: Fix crashkernel allocation
  arm, memblock: Fix the sparsemem build
  memblock: Fix section mismatch warnings
  powerpc, memblock: Fix memblock API change fallout
  memblock, microblaze: Fix memblock API change fallout
  x86: Remove old bootmem code
  x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve
  x86: Remove not used early_res code
  x86, memblock: Replace e820_/_early string with memblock_
  x86: Use memblock to replace early_res
  x86, memblock: Use memblock_debug to control debug message print out
  ...

Fix up trivial conflicts in arch/x86/kernel/setup.c and kernel/Makefile
2010-10-21 18:52:11 -07:00
Robert Richter
27afdf2008 apic, x86: Use BIOS settings for IBS and MCE threshold interrupt LVT offsets
We want the BIOS to setup the EILVT APIC registers. The offsets
were hardcoded and BIOS settings were overwritten by the OS.
Now, the subsystems for MCE threshold and IBS determine the LVT
offset from the registers the BIOS has setup. If the BIOS setup
is buggy on a family 10h system, a workaround enables IBS. If
the OS determines an invalid register setup, a "[Firmware Bug]:
" error message is reported.

We need this change also for upcomming cpu families.

Signed-off-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <1286360874-1471-3-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-20 04:42:13 +02:00
Robert Richter
a68c439b19 apic, x86: Check if EILVT APIC registers are available (AMD only)
This patch implements checks for the availability of LVT entries
(APIC500-530) and reserves it if used. The check becomes
necessary since we want to let the BIOS provide the LVT offsets.
 The offsets should be determined by the subsystems using it
like those for MCE threshold or IBS.  On K8 only offset 0
(APIC500) and MCE interrupts are supported. Beginning with
family 10h at least 4 offsets are available.

Since offsets must be consistent for all cores, we keep track of
the LVT offsets in software and reserve the offset for the same
vector also to be used on other cores. An offset is freed by
setting the entry to APIC_EILVT_MASKED.

If the BIOS is right, there should be no conflicts. Otherwise a
"[Firmware Bug]: ..." error message is generated. However, if
software does not properly determines the offsets, it is not
necessarily a BIOS bug.

Signed-off-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <1286360874-1471-2-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-20 04:42:13 +02:00
Yinghai Lu
9717967c4b x86: ioapic: Call free_irte only if interrupt remapping enabled
On a system that support intr-rempping when booting with "intremap=off"

[  177.895501] BUG: unable to handle kernel NULL pointer dereference at 00000000000000f8
[  177.913316] IP: [<ffffffff8145fc18>] free_irte+0x47/0xc0
...
[  178.173326] Call Trace:
[  178.173574]  [<ffffffff810515b4>] destroy_irq+0x3a/0x75
[  178.192934]  [<ffffffff81051834>] arch_teardown_msi_irq+0xe/0x10
[  178.193418]  [<ffffffff81458dc3>] arch_teardown_msi_irqs+0x56/0x7f
[  178.213021]  [<ffffffff81458e79>] free_msi_irqs+0x8d/0xeb

Call free_irte only when interrupt remapping is enabled.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4CBCB274.7010108@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-10-19 09:25:33 +02:00
Stefano Stabellini
294ee6f89c x86: Introduce x86_msi_ops
Introduce an x86 specific indirect mechanism to setup MSIs.
The MSI setup functions become function pointers in an x86_msi_ops
struct, that defaults to the implementation in io_apic.c and msi.c.

[v2: Use HAVE_DEFAULT_* knobs]
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-18 10:49:34 -04:00
Jeremy Fitzhardinge
7b586d7185 x86/io_apic: add get_nr_irqs_gsi()
Impact: new interface to get max GSI

Add get_nr_irqs_gsi() to return nr_irqs_gsi.  Xen will use this to
determine how many irqs it needs to reserve for hardware irqs.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-18 10:40:30 -04:00
Thomas Gleixner
2ee3906598 x86: Switch sparse_irq allocations to GFP_KERNEL
No callers from atomic context (except boot) anymore.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:46 +02:00
Thomas Gleixner
ad9f43340f x86: Use sane enumeration
Instead of looping through all interrupts, use the bitmap lookup to
find the next.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:44 +02:00
Thomas Gleixner
48b2650196 x86: uv: Clean up the direct access to irq_desc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:43 +02:00
Thomas Gleixner
1a8ce7ff68 x86: Make io_apic.c local functions static
No users outside of io_apic.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:43 +02:00
Thomas Gleixner
1a0730d664 x86: Speed up the irq_remapped check in hot pathes
irq_2_iommu is in struct irq_cfg, so we can do the irq_remapped check
based on irq_cfg instead of going through a lookup function. That's
especially interesting in the eoi_ioapic_irq() hotpath.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-12 16:53:42 +02:00
Thomas Gleixner
bc5fdf9f3a x86: io_apic: Remove the now unused sparse_irq arch_* functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:40 +02:00
Thomas Gleixner
fbc6bff04a x86: ioapic: Cleanup sparse irq code
Switch over to the new allocator and remove all the magic which was
caused by the unability to destroy irq descriptors. Get rid of the
create_irq_nr() loop for sparse and non sparse irq.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:40 +02:00
Yinghai Lu
fe6dab4e79 x86: Don't setup ioapic irq for sci twice
The sparseirq rework triggered a warning in the iommu code, which was
caused by setting up ioapic for ACPI irq 9 twice. This function is
solely to handle interrupts which are on a secondary ioapic and
outside the legacy irq range.

Replace the sparse irq_to_desc check with a non ifdeffed version.

[ tglx: Moved it before the ioapic sparse conversion and simplified
  	the inverse logic ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4CB00122.3030301@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:40 +02:00
Thomas Gleixner
f981a3dc19 x86: io_apic: Prepare alloc/free_irq_cfg()
Rename the grossly misnamed get_one_free_irq_cfg() to alloc_irq_cfg().
Add a (not yet used) irq number argument to free_irq_cfg()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:40 +02:00
Thomas Gleixner
08c33db6d0 x86: Implement new allocator functions
Implement new allocator functions which make use of the core changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:40 +02:00
Thomas Gleixner
6e2fff50a5 x86: ioapic: Cleanup get_one_free_irq_cfg()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:39 +02:00
Thomas Gleixner
7e495529b6 x86: ioapic: Cleanup some more
Cleanup after the irq_chip conversion a bit.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:39 +02:00
Thomas Gleixner
be5b7bf738 x86: Convert ht set_affinity to new chip function
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-12 16:53:39 +02:00
Thomas Gleixner
0e09ddf2d7 x86: Cleanup hpet affinity setting
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:39 +02:00
Thomas Gleixner
fe52b2d259 x86: Convert dmar affinity setting to new chip function
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Cc: David Woodhouse <dwmw2@infradead.org>
2010-10-12 16:53:39 +02:00
Thomas Gleixner
b5d1c46579 x86: Convert remapped msi to new chip.irq_set_affinity function
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-12 16:53:38 +02:00
Thomas Gleixner
f19f5ecc92 x86: Convert remapped ioapic affinity setting to new irq chip function
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
2010-10-12 16:53:38 +02:00
Thomas Gleixner
5346b2a78f x86: Convert msi affinity setting to new chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-12 16:53:38 +02:00
Thomas Gleixner
f7e909eae4 x86: Prepare the affinity common functions for taking struct irq_data *
While at it rename it to sensible function names and fix the return
value from unsigned to int for __ioapic_set_affinity (set_desc_affinity).
Returning -1 in a function returning unsigned int is somewhat strange.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:38 +02:00
Thomas Gleixner
60c69948e5 x86: ioapic: Clean up the direct access to irq_desc
Most of it is useless pseudo optimization.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:38 +02:00
Thomas Gleixner
e9f7ac664b ht: Convert to new irq_chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-12 16:53:37 +02:00
Thomas Gleixner
5c2837fbaa dmar: Convert to new irq chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David Woodhouse <dwmw2@infradead.org>
2010-10-12 16:53:37 +02:00
Thomas Gleixner
d0fbca8f93 x86: ioapic/hpet: Convert to new chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:37 +02:00
Thomas Gleixner
90297c5fe7 x86: ioapic: Convert mask to new irq_chip function
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:37 +02:00
Thomas Gleixner
61a38ce3f5 x86: io_apic: Convert startup to new irq_chip function
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:37 +02:00
Thomas Gleixner
dd5f15e5cf x86: Cleanup io_apic
Sanitize functions. Remove irq_desc pointer magic.
Preparatory patch for further cleanups.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:36 +02:00
Thomas Gleixner
d4eba29770 x86: Cleanup access to irq_data
Fixup the open coded access to 
      irq_desc->[handler_data|chip_data|msi-desc]

Use the macros and inline functions for it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:36 +02:00
Thomas Gleixner
4305df947c x86: i8259: Convert to new irq_chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:53:36 +02:00
Thomas Gleixner
39431acb1a pci: Cleanup the irq_desc mess in msi
Handing down irq_desc to msi just so that msi can access
irq_desc.irq_data.msi_desc is a pretty stupid idea. The calling code
can hand down a pointer to msi_desc so msi code does not need to know
about the irq descriptor at all.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-12 16:53:34 +02:00
Thomas Gleixner
1c9db52534 pci: Convert msi to new irq_chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
2010-10-12 16:53:34 +02:00
Thomas Gleixner
7c5f13519a Merge branch 'x86/urgent' of into irq/sparseirq
Reason: Pull in the latest io_apic bugfixes

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-10-12 16:41:26 +02:00
Thomas Gleixner
5e62feabcc Merge branch 'x86/cleanups' into irq/sparseirq
Reason: Avoid conflicts with removal of boot_cpu_id

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-10-12 16:40:42 +02:00
Thomas Gleixner
8ffcfa4e2d Merge branch 'x86/x2apic' into irq/sparseirq
Reason: Avoid conflicts with the x2apic modifications

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-10-12 16:39:53 +02:00
Thomas Gleixner
b683de2b3c genirq: Query arch for number of early descriptors
sparse irq sets up NR_IRQS_LEGACY irq descriptors and archs then go
ahead and allocate more.

Use the unused return value of arch_probe_nr_irqs() to let the
architecture return the number of early allocations. Fix up all users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:39:08 +02:00
Ingo Molnar
153db80f8c Merge commit 'v2.6.36-rc7' into core/memblock
Merge reason: Update from -rc3 to -rc7.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-08 09:15:00 +02:00
Thomas Gleixner
1cf180c94e x86, irq: Plug memory leak in sparse irq
free_irq_cfg() is not freeing the cpumask_vars in irq_cfg. Fixing this
triggers a use after free caused by the fact that copying struct
irq_cfg is done with memcpy, which copies the pointer not the cpumask.

Fix both places.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
LKML-Reference: <alpine.LFD.2.00.1009282052570.2416@localhost6.localdomain6>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-30 15:57:35 -07:00
Suresh Siddha
fa47f7e528 x86, x2apic: Simplify apic init in SMP and UP builds
Move enable_IR_x2apic() inside the default_setup_apic_routing(),
and for SMP platforms, move the default_setup_apic_routing() after
smp_sanity_check(). This cleans up the code that tries to avoid multiple
calls to default_setup_apic_routing() when smp_sanity_check() fails (which
goes through the APIC_init_uniprocessor() path).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20100827181049.173087246@sbsiddha-MOBL3.sc.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-15 17:37:10 -07:00
Suresh Siddha
62a92f4c69 x86, intr-remap: Remove IRTE setup duplicate code
Remove IRTE setup duplicate code with prepare_irte().

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20100827181049.095067319@sbsiddha-MOBL3.sc.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-15 17:36:45 -07:00
Suresh Siddha
75e3cfbed6 x86, intr-remap: Set redirection hint in the IRTE
Currently the redirection hint in the interrupt-remapping table entry
is set to 0, which means the remapped interrupt is directed to the
processors listed in the destination. So in logical flat mode
in the presence of intr-remapping, this results in a single
interrupt multi-casted to multiple cpu's as specified by the destination
bit mask. But what we really want is to send that interrupt to one of the cpus
based on the lowest priority delivery mode.

Set the redirection hint in the IRTE to '1' to indicate that we want
the remapped interrupt to be directed to only one of the processors
listed in the destination.

This fixes the issue of same interrupt getting delivered to multiple cpu's
in the logical flat mode in the presence of interrupt-remapping. While
there is no functional issue observed with this behavior, this will
impact performance of such configurations (<=8 cpu's using logical flat
mode in the presence of interrupt-remapping)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20100827181049.013051492@sbsiddha-MOBL3.sc.intel.com>
Cc: Weidong Han <weidong.han@intel.com>
Cc: <stable@kernel.org> # [v2.6.32+]
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-15 17:36:37 -07:00
Jack Steiner
36ac4b987b x86, UV: Fix initialization of max_pnode
Fix calculation of "max_pnode" for systems where the the highest
blade has neither cpus or memory. (And, yes, although rare this
does occur).

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20100910150808.GA19802@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-10 17:15:49 +02:00
Ingo Molnar
daab7fc734 Merge commit 'v2.6.36-rc3' into x86/memblock
Conflicts:
	arch/x86/kernel/trampoline.c
	mm/memblock.c

Merge reason: Resolve the conflicts, update to latest upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-31 09:45:46 +02:00
Yinghai Lu
a9ce6bc151 x86, memblock: Replace e820_/_early string with memblock_
1.include linux/memblock.h directly. so later could reduce e820.h reference.
2 this patch is done by sed scripts mainly

-v2: use MEMBLOCK_ERROR instead of -1ULL or -1UL

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-27 11:13:47 -07:00
Daniel Kiper
05e407603e x86, apic: Fix apic=debug boot crash
Fix a boot crash when apic=debug is used and the APIC is
not properly initialized.

This issue appears during Xen Dom0 kernel boot but the
fix is generic and the crash could occur on real hardware
as well.

Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
Cc: xen-devel@lists.xensource.com
Cc: konrad.wilk@oracle.com
Cc: jeremy@goop.org
Cc: <stable@kernel.org> # .35.x, .34.x, .33.x, .32.x
LKML-Reference: <20100819224616.GB9967@router-fw-old.local.net-space.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-20 10:18:28 +02:00
Linus Torvalds
c206d44ffd Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, UV: Make kdump avoid stack dumps - fix !CONFIG_KEXEC breakage
  x86, UV: Initialize BAU hub map
  x86, UV: Make kdump avoid stack dumps
2010-08-13 18:00:25 -07:00
Linus Torvalds
c029b55af7 Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, asm: Use a lower case name for the end macro in atomic64_386_32.S
  x86, asm: Refactor atomic64_386_32.S to support old binutils and be cleaner
  x86: Document __phys_reloc_hide() usage in __pa_symbol()
  x86, apic: Map the local apic when parsing the MP table.
2010-08-13 10:35:48 -07:00
Robert Richter
f6e9456c92 x86, cleanup: Remove obsolete boot_cpu_id variable
boot_cpu_id is there for historical reasons and was renamed to
boot_cpu_physical_apicid in patch:

 c70dcb7 x86: change boot_cpu_id to boot_cpu_physical_apicid

However, there are some remaining occurrences of boot_cpu_id that are
never touched in the kernel and thus its value is always 0.

This patch removes boot_cpu_id completely.

Signed-off-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <1279731838-1522-8-git-send-email-robert.richter@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-08-12 14:01:38 -07:00
Linus Torvalds
75cb5fdce2 Merge branches 'x86-cleanups-for-linus', 'x86-vmware-for-linus', 'x86-mtrr-for-linus', 'x86-apic-for-linus', 'x86-fpu-for-linus' and 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Clean up arch/x86/kernel/cpu/mtrr/cleanup.c: use ";" not "," to terminate statements

* 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, vmware: Preset lpj values when on VMware.

* 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mtrr: Use stop machine context to rendezvous all the cpu's

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/apic/es7000_32: Remove unused variable

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Avoid unnecessary __clear_user() and xrstor in signal handling

* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, vdso: Unmap vdso pages
2010-08-06 16:22:59 -07:00
Linus Torvalds
1cfd2bda8c Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (30 commits)
  PCI: update for owner removal from struct device_attribute
  PCI: Fix warnings when CONFIG_DMI unset
  PCI: Do not run NVidia quirks related to MSI with MSI disabled
  x86/PCI: use for_each_pci_dev()
  PCI: use for_each_pci_dev()
  PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc()
  PCI: export SMBIOS provided firmware instance and label to sysfs
  PCI: Allow read/write access to sysfs I/O port resources
  x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLAN
  PCI: remove unused HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_{SIZE|BOUNDARY}
  PCI: disable mmio during bar sizing
  PCI: MSI: Remove unsafe and unnecessary hardware access
  PCI: Default PCIe ASPM control to on and require !EMBEDDED to disable
  PCI: kernel oops on access to pci proc file while hot-removal
  PCI: pci-sysfs: remove casts from void*
  ACPI: Disable ASPM if the platform won't provide _OSC control for PCIe
  PCI hotplug: make sure child bridges are enabled at hotplug time
  PCI hotplug: shpchp: Removed check for hotplug of display devices
  PCI hotplug: pciehp: Fixed return value sign for pciehp_unconfigure_device
  PCI: Don't enable aspm before drivers have had a chance to veto it
  ...
2010-08-06 11:44:36 -07:00
Linus Torvalds
4aed2fd8e3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
  tracing/kprobes: unregister_trace_probe needs to be called under mutex
  perf: expose event__process function
  perf events: Fix mmap offset determination
  perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
  perf, powerpc: Convert the FSL driver to use local64_t
  perf tools: Don't keep unreferenced maps when unmaps are detected
  perf session: Invalidate last_match when removing threads from rb_tree
  perf session: Free the ref_reloc_sym memory at the right place
  x86,mmiotrace: Add support for tracing STOS instruction
  perf, sched migration: Librarize task states and event headers helpers
  perf, sched migration: Librarize the GUI class
  perf, sched migration: Make the GUI class client agnostic
  perf, sched migration: Make it vertically scrollable
  perf, sched migration: Parameterize cpu height and spacing
  perf, sched migration: Fix key bindings
  perf, sched migration: Ignore unhandled task states
  perf, sched migration: Handle ignored migrate out events
  perf: New migration tool overview
  tracing: Drop cpparg() macro
  perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
  ...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
2010-08-06 09:30:52 -07:00
Eric W. Biederman
5989cd6a1c x86, apic: Map the local apic when parsing the MP table.
This fixes a regression in 2.6.35 from 2.6.34, that is
present for select models of Intel cpus when people are
using an MP table.

The commit cf7500c0ea
"x86, ioapic: In mpparse use mp_register_ioapic" started
calling mp_register_ioapic from MP_ioapic_info.  An extremely
simple change that was obviously correct.  Unfortunately
mp_register_ioapic did just a little more than the previous
hand crafted code and so we gained this call path.

The problem call path is:
MP_ioapic_info()
  mp_register_ioapic()
   io_apic_unique_id()
     io_apic_get_unique_id()
       get_physical_broadcast()
         modern_apic()
           lapic_get_version()
             apic_read(APIC_LVR)

Which turned out to be a problem because the local apic
was not mapped, at that point, unlike the similar point
in the ACPI parsing code.

This problem is fixed by mapping the local apic when
parsing the mptable as soon as we reasonably can.

Looking at the number of places we setup the fixmap for
the local apic, I see some serious simplification opportunities.
For the moment except for not duplicating the setting up of the
fixmap in init_apic_mappings, I have not acted on them.

The regression from 2.6.34 is tracked in bug
https://bugzilla.kernel.org/show_bug.cgi?id=16173

Cc: <stable@kernel.org> 2.6.35
Reported-by: David Hill <hilld@binarystorm.net>
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com>
Tested-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <m1eiee86jg.fsf_-_@fess.ebiederm.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-08-05 16:26:42 -07:00
Ingo Molnar
61be7fdec2 Merge branch 'perf/nmi' into perf/core
Conflicts:
	kernel/Makefile

Merge reason: Add the now complete topic, fix the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-05 08:45:05 +02:00
Jiri Kosina
d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Ben Hutchings
30da552428 PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc()
commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove
unsafe and unnecessary hardware access" changed read_msi_msg_desc() to
return the last MSI message written instead of reading it from the
device, since it may be called while the device is in a reduced
power state.

However, the pSeries platform code really does need to read messages
from the device, since they are initially written by firmware.
Therefore:
- Restore the previous behaviour of read_msi_msg_desc()
- Add new functions get_cached_msi_msg{,_desc}() which return the
  last MSI message written
- Use the new functions where appropriate

Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30 09:41:39 -07:00
Cliff Wickman
5edd19af18 x86, UV: Make kdump avoid stack dumps
UV NMI callback's should not write stack dumps when a kdump is to be written.

When invoking the crash kernel to write a dump, kdump_nmi_shootdown_cpus()
uses NMI's to get all the cpu's to save their register context and halt.

But the NMI interrupt handler runs a callback list.  This patch sets a flag
to prevent any of those callbacks from interfering with the halt of the cpu.

For UV, which currently has the only callback to which this is relevant, the
uv_handle_nmi() callback should not do dumping of stacks.

The 'in_crash_kexec' flag is defined as an extern in kdebug.h firstly
because x2apic_uv_x.c includes it.  Secondly because some future callback
might need the flag to know that it should not enter the debugger.
(Such a scenario was in fact present in the 2.6.32 kernel, SuSE distribution,
 where a call to kdb needed to be avoided.)

Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1ObLvt-0005UZ-Va@eag09.americas.sgi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-07-21 11:33:27 -07:00
Yinghai Lu
fd19dce7ac x86: Fix x2apic preenabled system with kexec
Found one x2apic system kexec loop test failed
when CONFIG_NMI_WATCHDOG=y (old) or CONFIG_LOCKUP_DETECTOR=y (current tip)

first kernel can kexec second kernel, but second kernel can not kexec third one.

it can be duplicated on another system with BIOS preenabled x2apic.
First kernel can not kexec second kernel.

It turns out, when kernel boot with pre-enabled x2apic, it will not execute
disable_local_APIC on shutdown path.

when init_apic_mappings() is called in setup_arch, it will skip setting of
apic_phys when x2apic_mode is set. ( x2apic_mode is much early check_x2apic())
Then later, disable_local_APIC() will bail out early because !apic_phys.

So check !x2apic_mode in x2apic_mode in disable_local_APIC with !apic_phys.

another solution could be updating init_apic_mappings() to set apic_phys even
for preenabled x2apic system. Actually even for x2apic system, that lapic
address is mapped already in early stage.

BTW: is there any x2apic preenabled system with apicid of boot cpu > 255?

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4C3EB22B.3000701@kernel.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-07-16 16:49:41 -07:00
Javier Martinez Canillas
5bbd4a336c x86/apic/es7000_32: Remove unused variable
In today's linux-next I got this compile warning:

 arch/x86/kernel/apic/es7000_32.c:132: warning: 'base' defined but not used

Current patch solves the issue removing the unused variable.

Signed-off-by: Javier Martinez Canillas <martinez.javier@gmail.com>
Cc: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Tejun Heo <tj@kernel.org>
LKML-Reference: <1278546719.9020.4.camel@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-08 09:14:37 +02:00
Jiri Kosina
f1bbbb6912 Merge branch 'master' into for-next 2010-06-16 18:08:13 +02:00
Uwe Kleine-König
421f91d21a fix typos concerning "initiali[zs]e"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-06-16 18:05:05 +02:00
Eric W. Biederman
a4384df3e2 x86, irq: Rename gsi_end gsi_top, and fix off by one errors
When I introduced the global variable gsi_end I thought gsi_end on
io_apics was one past the end of the gsi range for the io_apic.  After
it was pointed out the the range on io_apics was inclusive I changed
my global variable to match.  That was a big mistake.  Inclusive
semantics without a range start cannot describe the case when no gsi's
are allocated.  Describing the case where no gsi's are allocated is
important in sfi.c and mpparse.c so that we can assign gsi numbers
instead of blindly copying the gsi assignments the BIOS has done as we
do in the acpi case.

To keep from getting the global variable confused with the gsi range
end rename it gsi_top.

To allow describing the case where no gsi's are allocated have gsi_top
be one place the highest gsi number seen in the system.

This fixes an off by one bug in sfi.c:
Reported-by: jacob pan <jacob.jun.pan@linux.intel.com>

This fixes the same off by one bug in mpparse.c:

This fixes an off unreachable by one bug in acpi/boot.c:irq_to_gsi
Reported-by: Yinghai <yinghai.lu@oracle.com>

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <m17hm9jre7.fsf_-_@fess.ebiederm.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-06-09 13:34:06 -07:00
Kerstin Jonsson
8c3ba8d049 x86, apic: ack all pending irqs when crashed/on kexec
When the SMP kernel decides to crash_kexec() the local APICs may have
pending interrupts in their vector tables.

The setup routine for the local APIC has a deficient mechanism for
clearing these interrupts, it only handles interrupts that has already
been dispatched to the local core for servicing (the ISR register) safely,
it doesn't consider lower prioritized queued interrupts stored in the IRR
register.

If you have more than one pending interrupt within the same 32 bit word in
the LAPIC vector table registers you may find yourself entering the IO
APIC setup with pending interrupts left in the LAPIC.  This is a situation
for wich the IO APIC setup is not prepared.  Depending of what/which
interrupt vector/vectors are stuck in the APIC tables your system may show
various degrees of malfunctioning.  That was the reason why the
check_timer() failed in our system, the timer interrupts was blocked by
pending interrupts from the old kernel when routed trough the IO APIC.

Additional comment from Jiri Bohac:
==============
If this should go into stable release,
I'd add some kind of limit on the number of iterations, just to be safe from
hard to debug lock-ups:

+if (loops++  > MAX_LOOPS) {
+        printk("LAPIC pending clean-up")
+        break;
+}
 while (queued);

with MAX_LOOPS something like 1E9 this would leave plenty of time for the
pending IRQs to be cleared and would and still cause at most a second of delay
if the loop were to lock-up for whatever reason.

[trenn@suse.de:

V2: Use tsc if avail to bail out after 1 sec due to possible virtual
    apic_read calls which may take rather long (suggested by: Avi Kivity
    <avi@redhat.com>) If no tsc is available bail out quickly after
    cpu_khz, if we broke out too early and still have irqs pending (which
    should never happen?) we still get a WARN_ON...

V3: - Fixed indentation -> checkpatch clean
    - max_loops must be signed

V4: - Fix typo, mixed up tsc and ntsc in first rdtscll() call

V5: Adjust WARN_ON() condition to also catch error in cpu_has_tsc case]

Cc: <jbohac@novell.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
LKML-Reference: <201005241913.o4OJDGWM010865@imap1.linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-05-24 13:31:35 -07:00
Linus Torvalds
537b60d178 Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, UV: uv_irq.c: Fix all sparse warnings
  x86, UV: Improve BAU performance and error recovery
  x86, UV: Delete unneeded boot messages
  x86, UV: Clean up UV headers for MMR definitions
2010-05-18 09:46:35 -07:00
Linus Torvalds
98f01720cb Merge branch 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, acpi/irq: Define gsi_end when X86_IO_APIC is undefined
  x86, irq: Kill io_apic_renumber_irq
  x86, acpi/irq: Handle isa irqs that are not identity mapped to gsi's.
  x86, ioapic: Simplify probe_nr_irqs_gsi.
  x86, ioapic: Optimize pin_2_irq
  x86, ioapic: Move nr_ioapic_registers calculation to mp_register_ioapic.
  x86, ioapic: In mpparse use mp_register_ioapic
  x86, ioapic: Teach mp_register_ioapic to compute a global gsi_end
  x86, ioapic: Fix the types of gsi values
  x86, ioapic: Fix io_apic_redir_entries to return the number of entries.
  x86, ioapic: Only export mp_find_ioapic and mp_find_ioapic_pin in io_apic.h
  x86, acpi/irq: Generalize mp_config_acpi_legacy_irqs
  x86, acpi/irq: Fix acpi_sci_ioapic_setup so it has both bus_irq and gsi
  x86, acpi/irq: pci device dev->irq is an isa irq not a gsi
  x86, acpi/irq: Teach acpi_get_override_irq to take a gsi not an isa_irq
  x86, acpi/irq: Introduce apci_isa_irq_to_gsi
2010-05-18 09:15:57 -07:00
Don Zickus
cafcd80d21 lockup_detector: Cross arch compile fixes
Combining the softlockup and hardlockup code causes watchdog.c
to build even without the hardlockup detection support.

So if an arch, that has the previous and the new nmi watchdog
implementations cohabiting, wants to know if the generic one
is in use, CONFIG_LOCKUP_DETECTOR is not a reliable check.
We need to use CONFIG_HARDLOCKUP_DETECTOR instead.

Fixes:
	kernel/built-in.o: In function `touch_nmi_watchdog':
	(.text+0x449bc): multiple definition of `touch_nmi_watchdog'
	arch/sparc/kernel/built-in.o:(.text+0x11b28): first defined here

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <20100514151121.GR15159@redhat.com>
[ use CONFIG_HARDLOCKUP_DETECTOR instead of CONFIG_PERF_EVENTS_NMI]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-16 04:25:14 +02:00
Ingo Molnar
5e85391b3b x86, watchdog: Fix build error in hw_nmi.c
On some configs the following build error triggers:

 arch/x86/kernel/apic/hw_nmi.c:35: error: 'apic' undeclared (first use in this function)
 arch/x86/kernel/apic/hw_nmi.c:35: error: (Each undeclared identifier is reported only once
 arch/x86/kernel/apic/hw_nmi.c:35: error: for each function it appears in.)

Because asm/apic.h was only included implicitly. Include it explicitly.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <1273713674-8434-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-13 09:12:39 +02:00
Don Zickus
10f9014912 x86: Cleanup hw_nmi.c cruft
The design of the hardlockup watchdog has changed and cruft was left
behind in the hw_nmi.c file.  Just remove the code that isn't used
anymore.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <1273266711-18706-7-git-send-email-dzickus@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-12 23:55:48 +02:00
Don Zickus
7cbb7e7fa4 x86: Move trigger_all_cpu_backtrace to its own die_notifier
As part of the transition of the nmi watchdog to something more
generic, the trigger_all_cpu_backtrace code is getting left behind.
Put it in its own die_notifier so it can still be used.

V2:
- use arch_spin_locks

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <1273266711-18706-6-git-send-email-dzickus@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-12 23:55:47 +02:00
Don Zickus
58687acba5 lockup_detector: Combine nmi_watchdog and softlockup detector
The new nmi_watchdog (which uses the perf event subsystem) is very
similar in structure to the softlockup detector.  Using Ingo's
suggestion, I combined the two functionalities into one file:
kernel/watchdog.c.

Now both the nmi_watchdog (or hardlockup detector) and softlockup
detector sit on top of the perf event subsystem, which is run every
60 seconds or so to see if there are any lockups.

To detect hardlockups, cpus not responding to interrupts, I
implemented an hrtimer that runs 5 times for every perf event
overflow event.  If that stops counting on a cpu, then the cpu is
most likely in trouble.

To detect softlockups, tasks not yielding to the scheduler, I used the
previous kthread idea that now gets kicked every time the hrtimer fires.
If the kthread isn't being scheduled neither is anyone else and the
warning is printed to the console.

I tested this on x86_64 and both the softlockup and hardlockup paths
work.

V2:
- cleaned up the Kconfig and softlockup combination
- surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
- seperated out the softlockup case from perf event subsystem
- re-arranged the enabling/disabling nmi watchdog from proc space
- added cpumasks for hardlockup failure cases
- removed fallback to soft events if no PMU exists for hard events

V3:
- comment cleanups
- drop support for older softlockup code
- per_cpu cleanups
- completely remove software clock base hardlockup detector
- use per_cpu masking on hard/soft lockup detection
- #ifdef cleanups
- rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
- documentation additions

V4:
- documentation fixes
- convert per_cpu to __get_cpu_var
- powerpc compile fixes

V5:
- split apart warn flags for hard and soft lockups

TODO:
- figure out how to make an arch-agnostic clock2cycles call
  (if possible) to feed into perf events as a sample period

[fweisbec: merged conflict patch]

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-12 23:55:33 +02:00
Frederic Weisbecker
a9aa1d02de Merge commit 'v2.6.34-rc7' into perf/nmi
Merge reason: catch up with latest softlockup detector changes.
2010-05-12 23:20:33 +02:00
Eric W. Biederman
7b20bd5fb9 x86, irq: Kill io_apic_renumber_irq
Now that the generic irq layer is performing the exact same remapping as
io_apic_renumber_irq we can kill this weird  es7000 specific function.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1269936436-7039-15-git-send-email-ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-04 13:35:20 -07:00
Eric W. Biederman
988856ee16 x86, acpi/irq: Handle isa irqs that are not identity mapped to gsi's.
ACPI irq source overrides are allowed for the 16 isa irqs and are
allowed to map any gsi to any isa irq.  A few motherboards have been
seen to take advantage of this and put the isa irqs on the 2nd or
3rd ioapic.  This causes some problems, most notably the fact
that we can not use any gsi < 16.

To correct this move the gsis that are not isa irqs and have
a gsi number < 16 into the linux irq space just past gsi_end.
This is what the es7000 platform is doing today.  Moving only the
low 16 gsis above the rest of the gsi's only penalizes weird
platforms, leaving sane acpi implementations with a 1-1 mapping
of gsis and irqs.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1269936436-7039-14-git-send-email-ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-04 13:35:17 -07:00
Eric W. Biederman
4afc51a835 x86, ioapic: Simplify probe_nr_irqs_gsi.
Use the global gsi_end value now that all ioapics have
valid gsi numbers instead of a combination of acpi_probe_gsi
and walking all of the ioapics and couting their number of
entries by hand if acpi_probe_gsi gave us an answer we did
not like.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1269936436-7039-13-git-send-email-ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-04 13:35:11 -07:00
Eric W. Biederman
d464207c4f x86, ioapic: Optimize pin_2_irq
Now that all ioapics have valid gsi_base values use this to
accellerate pin_2_irq.  In the case of acpi this also ensures
that pin_2_irq will compute the same irq value for an ioapic
pin as acpi will.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1269936436-7039-12-git-send-email-ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-04 13:35:08 -07:00
Eric W. Biederman
7716a5c4ff x86, ioapic: Move nr_ioapic_registers calculation to mp_register_ioapic.
Now that all ioapic registration happens in mp_register_ioapic we can
move the calculation of nr_ioapic_registers there from enable_IO_APIC.
The number of ioapic registers is already calucated in mp_register_ioapic
so all that really needs to be done is to save the caluclated value
in nr_ioapic_registers.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1269936436-7039-11-git-send-email-ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-04 13:35:03 -07:00
Eric W. Biederman
5777372af5 x86, ioapic: Teach mp_register_ioapic to compute a global gsi_end
Add the global variable gsi_end and teach mp_register_ioapic
to keep it uptodate as we add more ioapics into the system.

ioapics can only be added early in boot so the code that
runs later can treat gsi_end as a constant.

Remove the have hacks in sfi.c to second guess mp_register_ioapic
by keeping t's own running total of how many gsi's have been seen,
and instead use the gsi_end.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1269936436-7039-9-git-send-email-ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-04 13:34:56 -07:00
Eric W. Biederman
eddb0c55a1 x86, ioapic: Fix the types of gsi values
This patches fixes the types of gsi_base and gsi_end values in
struct mp_ioapic_gsi, and the gsi parameter of mp_find_ioapic
and mp_find_ioapic_pin

A gsi is cannonically a u32, not an int.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1269936436-7039-8-git-send-email-ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-04 13:34:52 -07:00
Eric W. Biederman
4b6b19a1c7 x86, ioapic: Fix io_apic_redir_entries to return the number of entries.
io_apic_redir_entries has a huge conceptual bug.  It returns the maximum
redirection entry not the number of redirection entries.  Which simply
does not match what the name of the function.  This just caught me
and it caught  Feng Tang, and  Len Brown when they wrote sfi_parse_ioapic.

Modify io_apic_redir_entries to actually return the number of redirection
entries, and fix the callers so that they properly handle receiving the
number of the number of redirection table entries, instead of the
number of redirection table entries less one.

While the usage in sfi.c does not show up in this patch it is fixed
by virtue of the fact that io_apic_redir_entries now has the semantics
sfi_parse_ioapic most reasonably expects.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1269936436-7039-7-git-send-email-ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-04 13:34:48 -07:00
Eric W. Biederman
9a0a91bb56 x86, acpi/irq: Teach acpi_get_override_irq to take a gsi not an isa_irq
In perverse acpi implementations the isa irqs are not identity mapped
to the first 16 gsi.  Furthermore at least the extended interrupt
resource capability may return gsi's and not isa irqs.  So since
what we get from acpi is a gsi teach acpi_get_overrride_irq to
operate on a gsi instead of an isa_irq.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1269936436-7039-2-git-send-email-ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-04 13:34:27 -07:00
Prarit Bhargava
bbd391a15d x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests
Upstream PV guests fail to boot because of a NULL pointer in
irq_force_complete_move().  It is possible that xen guests have
irq_desc->chip_data = NULL.

Test for NULL chip_data pointer before attempting to complete an irq move.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
LKML-Reference: <20100427152434.16193.49104.sendpatchset@prarit.bos.redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org> [2.6.33]
2010-04-30 14:31:38 -07:00
Linus Torvalds
fb1ae63577 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
  x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards
  x86: Increase CONFIG_NODES_SHIFT max to 10
  ibft, x86: Change reserve_ibft_region() to find_ibft_region()
  x86, hpet: Fix bug in RTC emulation
  x86, hpet: Erratum workaround for read after write of HPET comparator
  bootmem, x86: Fix 32bit numa system without RAM on node 0
  nobootmem, x86: Fix 32bit numa system without RAM on node 0
  x86: Handle overlapping mptables
  x86: Make e820_remove_range to handle all covered case
  x86-32, resume: do a global tlb flush in S4 resume
2010-04-07 11:02:23 -07:00
Suresh Siddha
472a474c66 x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards
Jan Grossmann reported kernel boot panic while booting SMP
kernel on his system with a single core cpu. SMP kernels call
enable_IR_x2apic() from native_smp_prepare_cpus() and on
platforms where the kernel doesn't find SMP configuration we
ended up again calling enable_IR_x2apic() from the
APIC_init_uniprocessor() call in the smp_sanity_check(). Thus
leading to kernel panic.

Don't call enable_IR_x2apic() and default_setup_apic_routing()
from APIC_init_uniprocessor() in CONFIG_SMP case.

NOTE: this kind of non-idempotent and assymetric initialization
sequence is rather fragile and unclean, we'll clean that up
in v2.6.35. This is the minimal fix for v2.6.34.

Reported-by: Jan.Grossmann@kielnet.net
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: <jbarnes@virtuousgeek.org>
Cc: <david.woodhouse@intel.com>
Cc: <weidong.han@intel.com>
Cc: <youquan.song@intel.com>
Cc: <Jan.Grossmann@kielnet.net>
Cc: <stable@kernel.org> # [v2.6.32.x, v2.6.33.x]
LKML-Reference: <1270083887.7835.78.camel@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02 20:48:47 +02:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Jack Steiner
2acebe9ecb x86, UV: Delete unneeded boot messages
SGI:UV: Delete extra boot messages that describe the system
topology. These messages are no longer useful.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20100317154038.GA29346@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-17 17:00:55 +01:00
Suresh Siddha
36e9e1eab7 x86: Handle legacy PIC interrupts on all the cpu's
Ingo Molnar reported that with the recent changes of not
statically blocking IRQ0_VECTOR..IRQ15_VECTOR's on all the
cpu's, broke an AMD platform (with Nvidia chipset) boot when
"noapic" boot option is used.

On this platform, legacy PIC interrupts are getting delivered to
all the cpu's instead of just the boot cpu. Thus not
initializing the vector to irq mapping for the legacy irq's
resulted in not handling certain interrupts causing boot hang.

Fix this by initializing the vector to irq mapping on all the
logical cpu's, if the legacy IRQ is handled by the legacy PIC.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
[ -v2: io-apic-enabled improvement ]
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <1268692386.3296.43.camel@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-16 06:36:35 +01:00
Linus Torvalds
15c989d4d1 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, k8 nb: Fix boot crash: enable k8_northbridges unconditionally on AMD systems
  x86, UV: Fix target_cpus() in x2apic_uv_x.c
  x86: Reduce per cpu warning boot up messages
  x86: Reduce per cpu MCA boot up messages
  x86_64, cpa: Don't work hard in preserving kernel 2M mappings when using 4K already
2010-03-13 14:45:49 -08:00
Jack Steiner
8447b360a3 x86, UV: Fix target_cpus() in x2apic_uv_x.c
target_cpu() should initially target all cpus, not just cpu 0.
Otherwise systems with lots of disks can exhaust the interrupt
vectors on cpu 0 if a large number of disks are discovered
before the irq balancer is running.

Note: UV code only...

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20100311184328.GA21433@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-12 10:19:29 +01:00
Jiri Kosina
318ae2edc3 Merge branch 'for-next' into for-linus
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
2010-03-08 16:55:37 +01:00
Linus Torvalds
322aafa664 Merge branch 'x86-mrst-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mrst-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
  x86, mrst: Fix whitespace breakage in apb_timer.c
  x86, mrst: Fix APB timer per cpu clockevent
  x86, mrst: Remove X86_MRST dependency on PCI_IOAPIC
  x86, olpc: Use pci subarch init for OLPC
  x86, pci: Add arch_init to x86_init abstraction
  x86, mrst: Add Kconfig dependencies for Moorestown
  x86, pci: Exclude Moorestown PCI code if CONFIG_X86_MRST=n
  x86, numaq: Make CONFIG_X86_NUMAQ depend on CONFIG_PCI
  x86, pci: Add sanity check for PCI fixed bar probing
  x86, legacy_irq: Remove duplicate vector assigment
  x86, legacy_irq: Remove left over nr_legacy_irqs
  x86, mrst: Platform clock setup code
  x86, apbt: Moorestown APB system timer driver
  x86, mrst: Add vrtc platform data setup code
  x86, mrst: Add platform timer info parsing code
  x86, mrst: Fill in PCI functions in x86_init layer
  x86, mrst: Add dummy legacy pic to platform setup
  x86/PCI: Moorestown PCI support
  x86, ioapic: Add dummy ioapic functions
  x86, ioapic: Early enable ioapic for timer irq
  ...

Fixed up semantic conflict of new clocksources due to commit
17622339af ("clocksource: add argument to resume callback").
2010-03-07 15:59:39 -08:00
Linus Torvalds
fb7b096d94 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits)
  x86: Fix out of order of gsi
  x86: apic: Fix mismerge, add arch_probe_nr_irqs() again
  x86, irq: Keep chip_data in create_irq_nr and destroy_irq
  xen: Remove unnecessary arch specific xen irq functions.
  smp: Use nr_cpus= to set nr_cpu_ids early
  x86, irq: Remove arch_probe_nr_irqs
  sparseirq: Use radix_tree instead of ptrs array
  sparseirq: Change irq_desc_ptrs to static
  init: Move radix_tree_init() early
  irq: Remove unnecessary bootmem code
  x86: Add iMac9,1 to pci_reboot_dmi_table
  x86: Convert i8259_lock to raw_spinlock
  x86: Convert nmi_lock to raw_spinlock
  x86: Convert ioapic_lock and vector_lock to raw_spinlock
  x86: Avoid race condition in pci_enable_msix()
  x86: Fix SCI on IOAPIC != 0
  x86, ia32_aout: do not kill argument mapping
  x86, irq: Move __setup_vector_irq() before the first irq enable in cpu online path
  x86, irq: Update the vector domain for legacy irqs handled by io-apic
  x86, irq: Don't block IRQ0_VECTOR..IRQ15_VECTOR's on all cpu's
  ...
2010-03-03 08:15:37 -08:00
Linus Torvalds
0a135ba14d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: add __percpu sparse annotations to what's left
  percpu: add __percpu sparse annotations to fs
  percpu: add __percpu sparse annotations to core kernel subsystems
  local_t: Remove leftover local.h
  this_cpu: Remove pageset_notifier
  this_cpu: Page allocator conversion
  percpu, x86: Generic inc / dec percpu instructions
  local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c
  module: Use this_cpu_xx to dynamically allocate counters
  local_t: Remove cpu_local_xx macros
  percpu: refactor the code in pcpu_[de]populate_chunk()
  percpu: remove compile warnings caused by __verify_pcpu_ptr()
  percpu: make accessors check for percpu pointer in sparse
  percpu: add __percpu for sparse.
  percpu: make access macros universal
  percpu: remove per_cpu__ prefix.
2010-03-03 07:34:18 -08:00
Linus Torvalds
30ff056c42 Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, uv: Remove recursion in uv_heartbeat_enable()
  x86, uv: uv_global_gru_mmr_address() macro fix
  x86, uv: Add serial number parameter to uv_bios_get_sn_info()
2010-02-28 11:00:55 -08:00
Linus Torvalds
c7e15899d0 Merge branch 'x86-pci-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-pci-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Enable NMI on all cpus on UV
  vgaarb: Add user selectability of the number of GPUS in a system
  vgaarb: Fix VGA arbiter to accept PCI domains other than 0
  x86, uv: Update UV arch to target Legacy VGA I/O correctly.
  pci: Update pci_set_vga_state() to call arch functions
2010-02-28 10:59:18 -08:00
Linus Torvalds
43a834d86c Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Remove trailing spaces in messages
  x86, mtrr: Remove unused mtrr/state.c
  x86, trivial: Fix grammo in tsc comment about Geode TSC reliability
2010-02-28 10:36:48 -08:00
Eric W. Biederman
fad539956c x86: Fix out of order of gsi
Iranna D Ankad reported that IBM x3950 systems have boot
problems after this commit:

 |
 | commit b9c61b7007
 |
 |    x86/pci: update pirq_enable_irq() to setup io apic routing
 |

The problem is that with the patch, the machine freezes when
console=ttyS0,... kernel serial parameter is passed.

It seem to freeze at DVD initialization and the whole problem
seem to be DVD/pata related, but somehow exposed through the
serial parameter.

Such apic problems can expose really weird behavior:

  ACPI: IOAPIC (id[0x10] address[0xfecff000] gsi_base[0])
  IOAPIC[0]: apic_id 16, version 0, address 0xfecff000, GSI 0-2
  ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[3])
  IOAPIC[1]: apic_id 15, version 0, address 0xfec00000, GSI 3-38
  ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[39])
  IOAPIC[2]: apic_id 14, version 0, address 0xfec01000, GSI 39-74
  ACPI: INT_SRC_OVR (bus 0 bus_irq 1 global_irq 4 dfl dfl)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 5 dfl dfl)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 6 dfl dfl)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 4 global_irq 7 dfl dfl)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 6 global_irq 9 dfl dfl)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 7 global_irq 10 dfl dfl)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 11 low edge)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 12 dfl dfl)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 12 global_irq 15 dfl dfl)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 13 global_irq 16 dfl dfl)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 17 low edge)
  ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 18 dfl dfl)

It turns out that the system has three io apic controllers, but
boot ioapic routing is in the second one, and that gsi_base is
not 0 - it is using a bunch of INT_SRC_OVR...

So these recent changes:

 1. one set routing for first io apic controller
 2. assume irq = gsi

... will break that system.

So try to remap those gsis, need to seperate boot_ioapic_idx
detection out of enable_IO_APIC() and call them early.

So introduce boot_ioapic_idx, and remap_ioapic_gsi()...

 -v2: shift gsi with delta instead of gsi_base of boot_ioapic_idx

 -v3: double check with find_isa_irq_apic(0, mp_INT) to get right
      boot_ioapic_idx

 -v4: nr_legacy_irqs

 -v5: add print out for boot_ioapic_idx, and also make it could be
      applied for current kernel and previous kernel

 -v6: add bus_irq, in acpi_sci_ioapic_setup, so can get overwride
      for sci right mapping...

 -v7: looks like pnpacpi get irq instead of gsi, so need to revert
      them back...

 -v8: split into two patches

 -v9: according to Eric, use fixed 16 for shifting instead of remap

 -v10: still need to touch rsparser.c

 -v11: just revert back to way Eric suggest...
      anyway the ioapic in first ioapic is blocked by second...

 -v12: two patches, this one will add more loop but check apic_id and irq > 16

Reported-by: Iranna D Ankad <iranna.ankad@in.ibm.com>
Bisected-by: Iranna D Ankad <iranna.ankad@in.ibm.com>
Tested-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: len.brown@intel.com
LKML-Reference: <4B8A321A.1000008@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-28 10:33:25 +01:00
Ingo Molnar
21c2fd9970 x86: apic: Fix mismerge, add arch_probe_nr_irqs() again
Merge commit aef55d4922 mis-merged io_apic.c so we lost the
arch_probe_nr_irqs() method.

This caused subtle boot breakages (udev confusion likely
due to missing drivers) with certain configs.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20100207210250.GB8256@jenkins.home.ifup.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-27 12:49:56 +01:00
Russ Anderson
78c0617646 x86: Enable NMI on all cpus on UV
Enable NMI on all cpus in UV system and add an NMI handler
to dump_stack on each cpu.

By default on x86 all the cpus except the boot cpu have NMI
masked off.  This patch enables NMI on all cpus in UV system
and adds an NMI handler to dump_stack on each cpu.  This
way if a system hangs we can NMI the machine and get a
backtrace from all the cpus.

Version 2: Use x86_platform driver mechanism for nmi init, per
           Ingo's suggestion.

Version 3: Clean up Ingo's nits.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20100226164912.GA24439@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-27 12:34:21 +01:00
Don Zickus
47195d5763 nmi_watchdog: Clean up various small details
Mostly copy/paste whitespace damage with a couple of nitpicks by
the checkpatch script. Fix the struct definition as requested by Ingo too.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
LKML-Reference: <1266880143-24943-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
--
 arch/x86/kernel/apic/hw_nmi.c |   14 +++++------
 arch/x86/kernel/traps.c       |    6 ++--
 include/linux/nmi.h           |    2 -
 kernel/nmi_watchdog.c         |   51 ++++++++++++++++++++----------------------
 4 files changed, 36 insertions(+), 37 deletions(-)
2010-02-25 12:40:50 +01:00
Yinghai Lu
9eeeb09edb x86, legacy_irq: Remove duplicate vector assigment
Remove duplicated cfg[i].vector assignment.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B8493A0.6080501@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-24 11:01:34 -08:00
Yinghai Lu
28c6a0ba30 x86, legacy_irq: Remove left over nr_legacy_irqs
nr_legacy_irqs and its ilk have moved to legacy_pic.

-v2: there is one in ioapic_.c

Singed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B84AAC4.2020204@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-24 11:01:34 -08:00
Jacob Pan
05ddafb17a x86, ioapic: Early enable ioapic for timer irq
Moorestown platform needs apic ready early for the system timer irq
which is delievered via ioapic.  Should not impact other platforms.

In the longer term, once ioapic setup is moved before late time init,
we will not need this patch to do early apic enabling.

Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80D07@orsmsx508.amr.corp.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-23 23:13:19 -08:00
H. Peter Anvin
54b56170e4 Merge remote branch 'origin/x86/apic' into x86/mrst
Conflicts:
	arch/x86/kernel/apic/io_apic.c
2010-02-22 16:25:18 -08:00
H. Peter Anvin
d02e30c31c Merge branch 'x86/irq' into x86/apic
Merge reason:
	Conflicts in arch/x86/kernel/apic/io_apic.c

Resolved Conflicts:
	arch/x86/kernel/apic/io_apic.c

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-22 16:20:34 -08:00
H. Peter Anvin
aef55d4922 Merge branch 'x86/urgent' into x86/irq
Merge reason: conflict in arch/x86/kernel/apic/io_apic.c

Resolved Conflicts:
	arch/x86/kernel/apic/io_apic.c

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-20 22:54:05 -08:00
Jacob Pan
1f91233c26 x86, apic: Remove ioapic_disable_legacy()
The ioapic_disable_legacy() call is no longer needed for platforms do
not have legacy pic. the legacy pic abstraction has taken care it
automatically.

This patch also initialize irq-related static variables based on
information obtained from legacy_pic.

Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
LKML-Reference: <43F901BD926A4E43B106BF17856F0755A30A7660@orsmsx508.amr.corp.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-19 17:16:38 -08:00
Jacob Pan
b81bb373a7 x86, pic: Make use of legacy_pic abstraction
This patch replaces legacy PIC-related global variable and functions
with the new legacy_pic abstraction.

Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80D04@orsmsx508.amr.corp.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-19 16:25:17 -08:00
Alek Du
d39f6495f6 x86, ioapic: Improve handling of i8259A irq init
Since we already track the number of legacy vectors by nr_legacy_irqs, we
can avoid use static vector allocations -- we can use dynamic one.

Signed-off-by: Alek Du <alek.du@intel.com>
LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80D01@orsmsx508.amr.corp.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-19 16:25:17 -08:00
Thomas Gleixner
b72d0db9dd x86: Move pci init function to x86_init
The PCI initialization in pci_subsys_init() is a mess. pci_numaq_init,
pci_acpi_init, pci_visws_init and pci_legacy_init are called and each
implementation checks and eventually modifies the global variable
pcibios_scanned.

x86_init functions allow us to do this more elegant. The pci.init
function pointer is preset to pci_legacy_init. numaq, acpi and visws
can modify the pointer in their early setup functions. The functions
return 0 when they did the full initialization including bus scan. A
non zero return value indicates that pci_legacy_init needs to be
called either because the selected function failed or wants the
generic bus scan in pci_legacy_init to happen (e.g. visws).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80CFE@orsmsx508.amr.corp.intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-19 16:12:29 -08:00
Don Zickus
2cc4452bc3 nmi_watchdog: Fix undefined 'apic' build bug
Ingo provided me a config that fails to compile with:

  arch/x86/built-in.o: In function
  `arch_trigger_all_cpu_backtrace': (.text+0x17e78): undefined
  reference to `apic' make: *** [.tmp_vmlinux1] Error 1

I realized I changed the compile behaviour of the nmi code by
not wrapping it with CONFIG_LOCAL_APIC.  To fix this I add a
compile check for ARCH_HAS_NMI_WATCHDOG around
arch_trigger_all_cpu_backtrace.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: a.p.zijlstra@chello.nl
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
LKML-Reference: <1266548212-24243-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-19 23:04:13 +01:00
Brandon Philips
eb5b379406 x86, irq: Keep chip_data in create_irq_nr and destroy_irq
Version 4: use get_irq_chip_data() in destroy_irq() to get rid of some
local vars.

When two drivers are setting up MSI-X at the same time via
pci_enable_msix() there is a race.  See this dmesg excerpt:

[   85.170610] ixgbe 0000:02:00.1: irq 97 for MSI/MSI-X
[   85.170611]   alloc irq_desc for 99 on node -1
[   85.170613] igb 0000:08:00.1: irq 98 for MSI/MSI-X
[   85.170614]   alloc kstat_irqs on node -1
[   85.170616] alloc irq_2_iommu on node -1
[   85.170617]   alloc irq_desc for 100 on node -1
[   85.170619]   alloc kstat_irqs on node -1
[   85.170621] alloc irq_2_iommu on node -1
[   85.170625] ixgbe 0000:02:00.1: irq 99 for MSI/MSI-X
[   85.170626]   alloc irq_desc for 101 on node -1
[   85.170628] igb 0000:08:00.1: irq 100 for MSI/MSI-X
[   85.170630]   alloc kstat_irqs on node -1
[   85.170631] alloc irq_2_iommu on node -1
[   85.170635]   alloc irq_desc for 102 on node -1
[   85.170636]   alloc kstat_irqs on node -1
[   85.170639] alloc irq_2_iommu on node -1
[   85.170646] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000088

As you can see igb and ixgbe are both alternating on create_irq_nr()
via pci_enable_msix() in their probe function.

ixgbe: While looping through irq_desc_ptrs[] via create_irq_nr() ixgbe
choses irq_desc_ptrs[102] and exits the loop, drops vector_lock and
calls dynamic_irq_init. Then it sets irq_desc_ptrs[102]->chip_data =
NULL via dynamic_irq_init().

igb: Grabs the vector_lock now and starts looping over irq_desc_ptrs[]
via create_irq_nr(). It gets to irq_desc_ptrs[102] and does this:

	cfg_new = irq_desc_ptrs[102]->chip_data;
	if (cfg_new->vector != 0)
		continue;

This hits the NULL deref.

Another possible race exists via pci_disable_msix() in a driver or in
the number of error paths that call free_msi_irqs():

destroy_irq()
dynamic_irq_cleanup() which sets desc->chip_data = NULL
...race window...
desc->chip_data = cfg;

Remove the save and restore code for cfg in create_irq_nr() and
destroy_irq() and take the desc->lock when checking the irq_cfg.

Reported-and-analyzed-by: Brandon Philips <bphilips@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20100207210250.GB8256@jenkins.home.ifup.org>
Signed-off-by: Brandon Phiilps <bphilips@suse.de>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-18 21:53:15 -08:00
Yinghai Lu
6738762d73 x86, irq: Remove arch_probe_nr_irqs
So keep nr_irqs == NR_IRQS.  With radix trees is matters less.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-33-git-send-email-yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-17 17:29:21 -08:00
Thomas Gleixner
5619c28061 x86: Convert i8259_lock to raw_spinlock
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-02-16 18:21:32 +01:00
Thomas Gleixner
0fdc7a8022 x86: Convert nmi_lock to raw_spinlock
nmi_lock must be a spinning spinlock in -rt.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-02-16 18:08:07 +01:00
Thomas Gleixner
dade771692 x86: Convert ioapic_lock and vector_lock to raw_spinlock
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-02-16 17:34:21 +01:00
Don Zickus
504d7cf10e nmi_watchdog: Compile and portability fixes
The original patch was x86_64 centric.  Changed the code to make
it less so.

ested by building and running on a powerpc.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
LKML-Reference: <1266013161-31197-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-14 09:19:43 +01:00
Brandon Phiilps
ced5b697a7 x86: Avoid race condition in pci_enable_msix()
Keep chip_data in create_irq_nr and destroy_irq.

When two drivers are setting up MSI-X at the same time via
pci_enable_msix() there is a race.  See this dmesg excerpt:

[   85.170610] ixgbe 0000:02:00.1: irq 97 for MSI/MSI-X
[   85.170611]   alloc irq_desc for 99 on node -1
[   85.170613] igb 0000:08:00.1: irq 98 for MSI/MSI-X
[   85.170614]   alloc kstat_irqs on node -1
[   85.170616] alloc irq_2_iommu on node -1
[   85.170617]   alloc irq_desc for 100 on node -1
[   85.170619]   alloc kstat_irqs on node -1
[   85.170621] alloc irq_2_iommu on node -1
[   85.170625] ixgbe 0000:02:00.1: irq 99 for MSI/MSI-X
[   85.170626]   alloc irq_desc for 101 on node -1
[   85.170628] igb 0000:08:00.1: irq 100 for MSI/MSI-X
[   85.170630]   alloc kstat_irqs on node -1
[   85.170631] alloc irq_2_iommu on node -1
[   85.170635]   alloc irq_desc for 102 on node -1
[   85.170636]   alloc kstat_irqs on node -1
[   85.170639] alloc irq_2_iommu on node -1
[   85.170646] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000088

As you can see igb and ixgbe are both alternating on create_irq_nr()
via pci_enable_msix() in their probe function.

ixgbe: While looping through irq_desc_ptrs[] via create_irq_nr() ixgbe
choses irq_desc_ptrs[102] and exits the loop, drops vector_lock and
calls dynamic_irq_init. Then it sets irq_desc_ptrs[102]->chip_data =
NULL via dynamic_irq_init().

igb: Grabs the vector_lock now and starts looping over irq_desc_ptrs[]
via create_irq_nr(). It gets to irq_desc_ptrs[102] and does this:

	cfg_new = irq_desc_ptrs[102]->chip_data;
	if (cfg_new->vector != 0)
		continue;

This hits the NULL deref.

Another possible race exists via pci_disable_msix() in a driver or in
the number of error paths that call free_msi_irqs():

destroy_irq()
dynamic_irq_cleanup() which sets desc->chip_data = NULL
...race window...
desc->chip_data = cfg;

Remove the save and restore code for cfg in create_irq_nr() and
destroy_irq() and take the desc->lock when checking the irq_cfg.

Reported-and-analyzed-by: Brandon Philips <bphilips@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-3-git-send-email-yinghai@kernel.org>
Signed-off-by: Brandon Phililps <bphilips@suse.de>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-10 14:27:28 -08:00
Yinghai Lu
18dce6ba5c x86: Fix SCI on IOAPIC != 0
Thomas Renninger <trenn@suse.de> reported on IBM x3330

booting a latest kernel on this machine results in:

PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
PCI: Using configuration type 1 for base access bio: create slab <bio-0> at 0
ACPI: SCI (IRQ30) allocation failed
ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
ACPI: Unable to start the ACPI Interpreter

Later all kind of devices fail...

and bisect it down to this commit:
commit b9c61b7007

    x86/pci: update pirq_enable_irq() to setup io apic routing

it turns out we need to set irq routing for the sci on ioapic1 early.

-v2: make it work without sparseirq too.
-v3: fix checkpatch.pl warning, and cc to stable

Reported-by: Thomas Renninger <trenn@suse.de>
Bisected-by: Thomas Renninger <trenn@suse.de>
Tested-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-2-git-send-email-yinghai@kernel.org>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-10 13:47:39 -08:00
Suresh Siddha
681ee44d40 x86, apic: Don't use logical-flat mode when CPU hotplug may exceed 8 CPUs
We need to fall back from logical-flat APIC mode to physical-flat mode
when we have more than 8 CPUs.  However, in the presence of CPU
hotplug(with bios listing not enabled but possible cpus as disabled cpus in
MADT), we have to consider the number of possible CPUs rather than
the number of current CPUs; otherwise we may cross the 8-CPU boundary
when CPUs are added later.

32bit apic code can use more cleanups (like the removal of vendor checks in
32bit default_setup_apic_routing()) and more unifications with 64bit code.
Yinghai has some patches in works already. This patch addresses the boot issue
that is reported in the virtualization guest context.

[ hpa: incorporated function annotation feedback from Yinghai Lu ]

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1265767304.2833.19.camel@sbs-t61.sc.intel.com>
Acked-by: Shaohui Zheng <shaohui.zheng@intel.com>
Reviewed-by: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-09 20:51:11 -08:00
Don Zickus
84e478c6f1 nmi_watchdog: Config option to enable new nmi_watchdog
These are the bits that enable the new nmi_watchdog and safely
isolate the old nmi_watchdog.  Only one or the other can run,
not both at the same time.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-08 08:29:03 +01:00
Don Zickus
1fb9d6ad27 nmi_watchdog: Add new, generic implementation, using perf events
This is a new generic nmi_watchdog implementation using the perf
events infrastructure as suggested by Ingo.

The implementation is simple, just create an in-kernel perf
event and register an overflow handler to check for cpu lockups.

I created a generic implementation that lives in kernel/ and
the hardware specific part that for now lives in arch/x86.

This approach has a number of advantages:

 - It simplifies the x86 PMU implementation in the long run,
   in that it removes the hardcoded low-level PMU implementation
   that was the NMI watchdog before.

 - It allows new NMI watchdog features to be added in a central
   place.

 - It allows other architectures to enable the NMI watchdog,
   as long as they have perf events (that provide NMIs)
   implemented.

 - It also allows for more graceful co-existence of existing
   perf events apps and the NMI watchdog - before these changes
   the relationship was exclusive. (The NMI watchdog will 'spend'
   a perf event when enabled. In later iterations we might be
   able to piggyback from an existing NMI event without having
   to allocate a hardware event for the NMI watchdog - turning
   this into a no-hardware-cost feature.)

As for compatibility, we'll keep the old NMI watchdog code as
well until the new one can 100% replace it on all CPUs, old and
new alike.  That might take some time as the NMI watchdog has
been ported to many CPU models.

I have done light testing to make sure the framework works
correctly and it does.

 v2: Set the correct timeout values based on the old nmi
     watchdog

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-3-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-08 08:29:02 +01:00
Don Zickus
e40b17208b x86: Move notify_die from nmi.c to traps.c
In order to handle a new nmi_watchdog approach, I need to move
the notify_die() routine out of nmi_watchdog_tick() and into
default_do_nmi(). This lets me easily swap out the old
nmi_watchdog with the new one with just a config change.

The change probably makes sense from a high level perspective
because the nmi_watchdog shouldn't be handling notify_die
routines anyway.  However, this move does change the semantics a
little bit.  Instead of checking on every nmi interrupt if the
cpus are stuck, only check them on the nmi_watchdog interrupts.

 v2: Move notify_die call into #idef block

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-08 08:29:02 +01:00
Frans Pop
3235dc3f22 x86: Remove trailing spaces in messages
Signed-off-by: Frans Pop <elendil@planet.nl>
Cc: Avi Kivity <avi@redhat.com>
Cc: x86@kernel.org
LKML-Reference: <1265478443-31072-10-git-send-email-elendil@planet.nl>
[ Left out the KVM bits. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-07 17:47:51 +01:00
Mike Travis
841582ea9e x86, uv: Update UV arch to target Legacy VGA I/O correctly.
Add function to direct Legacy VGA I/O traffic to correct I/O Hub.

Signed-off-by: Mike Travis <travis@sgi.com>
LKML-Reference: <201002022238.o12McEbi018727@imap1.linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-05 14:05:41 -08:00
Jasper Spaans
e34b7005e5 arch/x86/kernel/apic/apic_flat_64.c: Make comment match the code
Make the comment match the code, this also holds for intel systems,
according to probe_64.c in the same directory.

Signed-off-by: Jasper Spaans <spaans@fox-it.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-02-04 11:55:47 +01:00
Tejun Heo
ab386128f2 Merge branch 'master' into percpu 2010-02-02 14:38:15 +09:00
Suresh Siddha
9d133e5db9 x86, irq: Move __setup_vector_irq() before the first irq enable in cpu online path
Lowest priority delivery of logical flat mode is broken on some systems,
such that even when IO-APIC RTE says deliver the interrupt to a particular CPU,
interrupt subsystem delivers the interrupt to totally different CPU.

For example, this behavior was observed on a P4 based system with SiS chipset
which was reported by Li Zefan. We have been handling this kind of behavior by
making sure that in logical flat mode, we assign the same vector to irq
mappings on all the 8 possible logical cpu's.

But we have been doing this initial assignment (__setup_vector_irq()) a little
late (before which interrupts were already enabled for a short duration).

Move the __setup_vector_irq() before the first irq enable point in the
cpu online path to avoid the issue of not handling some interrupts that
wrongly hit the cpu which is still coming online.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20100129194330.283696385@sbs-t61.sc.intel.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-29 14:47:22 -08:00
Suresh Siddha
69c89efb51 x86, irq: Update the vector domain for legacy irqs handled by io-apic
In the recent change of not reserving IRQ0_VECTOR..IRQ15_VECTOR's on all
cpu's, we start with irq 0..15 getting directed to (and handled on) cpu-0.

In the logical flat mode, once the AP's are online (and before irqbalance
comes into picture), kernel intends to handle these IRQ's on any cpu (as the
logical flat mode allows to specify multiple cpu's for the irq destination and
the chipset based routing can deliver to the interrupt to any one of
the specified cpu's). This was broken with our recent change, which was ending
up using only cpu 0 as the destination, even when the kernel was specifying to
use all online cpu's for the logical flat mode case.

Fix this by updating vector allocation domain (cfg->domain) for legacy irqs,
when the IO-APIC handles them.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20100129194330.207790269@sbs-t61.sc.intel.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-29 14:47:17 -08:00
Suresh Siddha
97943390b0 x86, irq: Don't block IRQ0_VECTOR..IRQ15_VECTOR's on all cpu's
Currently IRQ0..IRQ15 are assigned to IRQ0_VECTOR..IRQ15_VECTOR's on
all the cpu's.

If these IRQ's are handled by legacy pic controller, then the kernel
handles them only on cpu 0. So there is no need to block this vector
space on all cpu's.

Similarly if these IRQ's are handled by IO-APIC, then the IRQ affinity
will determine on which cpu's we need allocate the vector resource for
that particular IRQ. This can be done dynamically and here also there
is no need to block 16 vectors for IRQ0..IRQ15 on all cpu's.

Fix this by initially assigning IRQ0..IRQ15 to IRQ0_VECTOR..IRQ15_VECTOR's only
on cpu 0. If the legacy controllers like pic handles these irq's, then
this configuration will be fixed. If more modern controllers like IO-APIC
handle these IRQ's, then we start with this configuration and as IRQ's
migrate, vectors (/and cpu's) associated with these IRQ's change dynamically.

This will freeup the block of 16 vectors on other cpu's which don't handle
IRQ0..IRQ15, which can now be used for other IRQ's that the particular cpu
handle.

[ hpa: this also an architectural cleanup for future legacy-PIC-free
  configurations. ]
[ hpa: fixed typo NR_LEGACY_IRQS -> NR_IRQS_LEGACY ]

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1263932453.2814.52.camel@sbs-t61.sc.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-19 13:40:29 -08:00
Suresh Siddha
bb668da6d6 x86, apic: use logical flat for systems with <= 8 logical cpus
We can use logical flat mode if there are <= 8 logical cpu's
(irrespective of physical apic id values).  This will enable simplified
and efficient IPI and device interrupt routing on such platforms.

This has been tested to work on both Intel and AMD platforms.
Exceptions like IBM summit platform which can't use logical flat mode
are addressed by using OEM platform checks.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Chris McDermott <lcm@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-18 14:15:28 -08:00
Suresh Siddha
dfea91d5a7 x86, apic: use physical mode for IBM summit platforms
Chris McDermott from IBM confirmed that hurricane chipset in IBM summit
platforms doesn't support logical flat mode.  Irrespective of the other
things like apic_id's, total number of logical cpu's, Linux kernel
should default to physical mode for this system.

The 32-bit kernel does so using the OEM checks for the IBM summit
platform.  Add a similar OEM platform check for the 64bit kernel too.

Otherwise the linux kernel boot can hang on this platform under certain
bios/platform settings.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Chris McDermott <lcm@linux.vnet.ibm.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-18 14:15:27 -08:00
Suresh Siddha
6579b47457 x86, irq: Use 0x20 for the IRQ_MOVE_CLEANUP_VECTOR instead of 0x1f
After talking to some more folks inside intel (Peter Anvin, Asit Mallick),
the safest option (for future compatibility etc) seen was to use vector 0x20
for IRQ_MOVE_CLEANUP_VECTOR instead of using vector 0x1f (which is documented as
reserved vector in the Intel IA32 manuals).

Also we don't need to reserve the entire privilege level (all 16 vectors in
the priority bucket that IRQ_MOVE_CLEANUP_VECTOR falls into), as the
x86 architecture (section 10.9.3 in SDM Vol3a) specifies that with in the
priority level, the higher the vector number the higher the priority.
And hence we don't need to reserve the complete priority level 0x20-0x2f for
the IRQ migration cleanup logic.

So change the IRQ_MOVE_CLEANUP_VECTOR to 0x20 and  allow 0x21-0x2f to be used
for device interrupts. 0x30-0x3f will be used for ISA interrupts (these
also can be migrated in the context of IOAPIC and hence need to be at a higher
priority level than IRQ_MOVE_CLEANUP_VECTOR).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20100114002118.521826763@sbs-t61.sc.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-18 10:59:59 -08:00
Linus Torvalds
330a518a1a Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, uv: Ensure hub revision set for all ACPI modes.
  x86, uv: Add function retrieving node controller revision number
  x86: xen: 64-bit kernel RPL should be 0
  x86: kernel_thread() -- initialize SS to a known state
  x86/agp: Fix agp_amd64_init and agp_amd64_cleanup
  x86: SGI UV: Fix mapping of MMIO registers
  x86: mce.h: Fix warning in header checks
2010-01-16 12:31:42 -08:00
Russ Anderson
1d2c867c94 x86, uv: Ensure hub revision set for all ACPI modes.
Ensure that UV hub revision is set for all ACPI modes.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20100115180908.GB7757@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-15 11:09:04 -08:00
Jack Steiner
7a1110e861 x86, uv: Add function retrieving node controller revision number
Add function for determining the revision id of the SGI UV
node controller chip (HUB). This function is needed in a
subsequent patch.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20100112210904.GA24546@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-15 11:08:55 -08:00
Mike Travis
fcfbb2b5fa x86: SGI UV: Fix mapping of MMIO registers
This fixes the problem of the initialization code not correctly
mapping the entire MMIO space on a UV system.  A side effect is
the map_high() interface needed to be changed to accommodate
different address and size shifts.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: <stable@kernel.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4B479202.7080705@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 10:56:27 +01:00
Ananth N Mavinakayanahalli
066000dd85 Revert "x86, apic: Use logical flat on intel with <= 8 logical cpus"
Revert commit 2fbd07a5f5, as this commit
breaks an IBM platform with quad-core Xeon cpu's.

According to Suresh, this might be an IBM platform issue, as on other
Intel platforms with <= 8 logical cpu's, logical flat mode works fine
irespective of physical apic id values (inline with the xapic
architecture).

Revert this for now because of the IBM platform breakage.

Another version will be re-submitted after the complete analysis.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-11 16:47:57 -08:00
Roel Kluin
99659a929d x86, uv: Remove recursion in uv_heartbeat_enable()
The recursion is not needed and does not improve readability.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
LKML-Reference: <4B45F13E.3040202@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-07 11:50:06 -08:00
Suresh Siddha
7f41c2e152 x86, irq: Check move_in_progress before freeing the vector mapping
With the recent irq migration fixes (post 2.6.32), Gary Hade has noticed
"No IRQ handler for vector" messages during the 2.6.33-rc1 kernel boot on IBM
AMD platforms and root caused the issue to this commit:

> commit 23359a88e7
> Author: Suresh Siddha <suresh.b.siddha@intel.com>
> Date:   Mon Oct 26 14:24:33 2009 -0800
>
>    x86: Remove move_cleanup_count from irq_cfg

As part of this patch, we have removed the move_cleanup_count check
in smp_irq_move_cleanup_interrupt(). With this change, we can run into a
situation where an irq cleanup interrupt on a cpu can cleanup the vector
mappings associated with multiple irqs, of which one of the irq's migration
might be still in progress. As such when that irq hits the old cpu, we get
the "No IRQ handler" messages.

Fix this by checking for the irq_cfg's move_in_progress and if the move
is still in progress delay the vector cleanup to another irq cleanup
interrupt request (which will happen when the irq starts arriving at the
new cpu destination).

Reported-and-tested-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1262804191.2732.7.camel@sbs-t61.sc.intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-06 12:08:43 -08:00
H. Peter Anvin
ea94396629 x86, apic: Don't waste a vector to improve vector spread
We want to use a vector-assignment sequence that avoids stumbling onto
0x80 earlier in the sequence, in order to improve the spread of
vectors across priority levels on machines with a small number of
interrupt sources.  Right now, this is done by simply making the first
vector (0x31 or 0x41) completely unusable.  This is unnecessary; all
we need is to start assignment at a +1 offset, we don't actually need
to prohibit the usage of this vector once we have wrapped around.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4B426550.6000209@kernel.org>
2010-01-04 21:28:24 -08:00
Tejun Heo
32032df6c2 Merge branch 'master' into percpu
Conflicts:
	arch/powerpc/platforms/pseries/hvCall.S
	include/linux/percpu.h
2010-01-05 09:17:33 +09:00
Mike Travis
9a7262a056 x86_64 SGI UV: Fix writes to led registers on remote uv hubs.
The wrong address was being used to write the SCIR led regs on remote
hubs.  Also, there was an inconsistency between how BIOS and the kernel
indexed these regs.  Standardize on using the lower 6 bits of the APIC
ID as the index.

This patch fixes the problem of writing to an errant address to a
cpu # >= 64.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Jack Steiner <steiner@sgi.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-30 12:25:26 -08:00
Yinghai Lu
9959c888a3 x86: Increase NR_IRQS and nr_irqs
I have a system with lots of igb and ixgbe, when iov/vf are
enabled for them, we hit the limit of 3064.

when system has 20 pcie installed, and one card has 2
functions, and one function needs 64 msi-x,
 may need 20 * 2 * 64 = 2560 for msi-x

but if iov and vf are enabled
 may need 20 * 2 * 64 * 3 = 7680 for msi-x
assume system with 5 ioapic, nr_irqs_gsi will be 120.

NR_CPUS = 512, and nr_cpu_ids = 128
will have NR_IRQS = 256 + 512 * 64 = 33024
will have nr_irqs = 120 + 8 * 128 + 120 * 64 = 8824

When SPARSE_IRQ is not set, there is no increase with kernel data
size.

when NR_CPUS=128, and SPARSE_IRQ is set:
   text		   data	    bss		   dec		 hex	filename
21837444	4216564	12480736	38534744	24bfe58	vmlinux.before
21837442	4216580	12480736	38534758	24bfe66	vmlinux.after
when NR_CPUS=4096, and SPARSE_IRQ is set
   text		   data	    bss		   dec		 hex	filename
21878619	5610244	13415392	40904255	270263f	vmlinux.before
21878617	5610244	13415392	40904253	270263d	vmlinux.after

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B398ECD.1080506@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-30 11:55:59 +01:00
Mike Travis
39d3077099 x86: SGI UV: Fix writes to led registers on remote uv hubs
The wrong address was being used to write the SCIR led regs on
remote hubs.  Also, there was an inconsistency between how BIOS
and the kernel indexed these regs.  Standardize on using the
lower 6 bits of the APIC ID as the index.

This patch fixes the problem of writing to an errant address to
a cpu # >= 64.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@kernel.org
LKML-Reference: <4B3922F9.3060905@sgi.com>
[ v2: fix a number of annoying checkpatch artifacts and whitespace noise ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-29 06:47:39 +01:00
Ingo Molnar
605c1a187f Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2009-12-28 09:23:13 +01:00
Linus Torvalds
3981e15286 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, irq: Allow 0xff for /proc/irq/[n]/smp_affinity on an 8-cpu system
  Makefile: Unexport LC_ALL instead of clearing it
  x86: Fix objdump version check in arch/x86/tools/chkobjdump.awk
  x86: Reenable TSC sync check at boot, even with NONSTOP_TSC
  x86: Don't use POSIX character classes in gen-insn-attr-x86.awk
  Makefile: set LC_CTYPE, LC_COLLATE, LC_NUMERIC to C
  x86: Increase MAX_EARLY_RES; insufficient on 32-bit NUMA
  x86: Fix checking of SRAT when node 0 ram is not from 0
  x86, cpuid: Add "volatile" to asm in native_cpuid()
  x86, msr: msrs_alloc/free for CONFIG_SMP=n
  x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space
  x86: Add IA32_TSC_AUX MSR and use it
  x86, msr/cpuid: Register enough minors for the MSR and CPUID drivers
  initramfs: add missing decompressor error check
  bzip2: Add missing checks for malloc returning NULL
  bzip2/lzma/gzip: pre-boot malloc doesn't return NULL on failure
2009-12-19 09:48:14 -08:00
Suresh Siddha
18374d89e5 x86, irq: Allow 0xff for /proc/irq/[n]/smp_affinity on an 8-cpu system
John Blackwood reported:
> on an older Dell PowerEdge 6650 system with 8 cpus (4 are hyper-threaded),
> and  32 bit (x86) kernel, once you change the irq smp_affinity of an irq
> to be less than all cpus in the system, you can never change really the
> irq smp_affinity back to be all cpus in the system (0xff) again,
> even though no error status is returned on the "/bin/echo ff >
> /proc/irq/[n]/smp_affinity" operation.
>
> This is due to that fact that BAD_APICID has the same value as
> all cpus (0xff) on 32bit kernels, and thus the value returned from
> set_desc_affinity() via the cpu_mask_to_apicid_and() function is treated
> as a failure in set_ioapic_affinity_irq_desc(), and no affinity changes
> are made.

set_desc_affinity() is already checking if the incoming cpu mask
intersects with the cpu online mask or not. So there is no need
for the apic op cpu_mask_to_apicid_and() to check again
and return BAD_APICID.

Remove the BAD_APICID return value from cpu_mask_to_apicid_and()
and also fix set_desc_affinity() to return -1 instead of using BAD_APICID
to represent error conditions (as cpu_mask_to_apicid_and() can return
logical or physical apicid values and BAD_APICID is really to represent
bad physical apic id).

Reported-by: John Blackwood <john.blackwood@ccur.com>
Root-caused-by: John Blackwood <john.blackwood@ccur.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1261103386.2535.409.camel@sbs-t61>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-17 22:03:06 -08:00
Russ Anderson
b76365a18f x86, uv: Add serial number parameter to uv_bios_get_sn_info()
Add system_serial_number to the information returned by
uv_bios_get_sn_info() UV BIOS call.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20091217165323.GA30774@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-17 11:33:51 -08:00
Thomas Gleixner
239007b844 genirq: Convert irq_desc.lock to raw_spinlock
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:33 +01:00
Linus Torvalds
75b08038ce Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: Clean up thermal init by introducing intel_thermal_supported()
  x86, mce: Thermal monitoring depends on APIC being enabled
  x86: Gart: fix breakage due to IOMMU initialization cleanup
  x86: Move swiotlb initialization before dma32_free_bootmem
  x86: Fix build warning in arch/x86/mm/mmio-mod.c
  x86: Remove usedac in feature-removal-schedule.txt
  x86: Fix duplicated UV BAU interrupt vector
  nvram: Fix write beyond end condition; prove to gcc copy is safe
  mm: Adjust do_pages_stat() so gcc can see copy_from_user() is safe
  x86: Limit the number of processor bootup messages
  x86: Remove enabling x2apic message for every CPU
  doc: Add documentation for bootloader_{type,version}
  x86, msr: Add support for non-contiguous cpumasks
  x86: Use find_e820() instead of hard coded trampoline address
  x86, AMD: Fix stale cpuid4_info shared_map data in shared_cpu_map cpumasks

Trivial percpu-naming-introduced conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c
2009-12-14 12:36:46 -08:00
Linus Torvalds
d0316554d3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
  m68k: rename global variable vmalloc_end to m68k_vmalloc_end
  percpu: add missing per_cpu_ptr_to_phys() definition for UP
  percpu: Fix kdump failure if booted with percpu_alloc=page
  percpu: make misc percpu symbols unique
  percpu: make percpu symbols in ia64 unique
  percpu: make percpu symbols in powerpc unique
  percpu: make percpu symbols in x86 unique
  percpu: make percpu symbols in xen unique
  percpu: make percpu symbols in cpufreq unique
  percpu: make percpu symbols in oprofile unique
  percpu: make percpu symbols in tracer unique
  percpu: make percpu symbols under kernel/ and mm/ unique
  percpu: remove some sparse warnings
  percpu: make alloc_percpu() handle array types
  vmalloc: fix use of non-existent percpu variable in put_cpu_var()
  this_cpu: Use this_cpu_xx in trace_functions_graph.c
  this_cpu: Use this_cpu_xx for ftrace
  this_cpu: Use this_cpu_xx in nmi handling
  this_cpu: Use this_cpu operations in RCU
  this_cpu: Use this_cpu ops for VM statistics
  ...

Fix up trivial (famous last words) global per-cpu naming conflicts in
	arch/x86/kvm/svm.c
	mm/slab.c
2009-12-14 09:58:24 -08:00
Mike Travis
450b1e8dd1 x86: Remove enabling x2apic message for every CPU
Print only once that the system is supporting x2apic mode.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <4B226E92.5080904@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-11 14:32:31 -08:00
Joe Perches
5cd476effe x86: es7000_32.c: Use pr_<level> and add pr_fmt(fmt)
- Added #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 - Converted a few printk(KERN_INFO to pr_info(
 - Stripped "es7000_mipcfg" from pr_debug

Signed-off-by: Joe Perches <joe@perches.com>
LKML-Reference: <3b4375af246dec5941168858910210937c110af9.1260383912.git.joe@perches.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-10 08:57:49 +01:00
Ingo Molnar
4c68db38c8 Merge branch 'linus' into x86/urgent
Merge reason: We want to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-09 08:24:57 +01:00
Linus Torvalds
60d8ce2cd6 Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  timers, init: Limit the number of per cpu calibration bootup messages
  posix-cpu-timers: optimize and document timer_create callback
  clockevents: Add missing include to pacify sparse
  x86: vmiclock: Fix printk format
  x86: Fix printk format due to variable type change
  sparc: fix printk for change of variable type
  clocksource/events: Fix fallout of generic code changes
  nohz: Allow 32-bit machines to sleep for more than 2.15 seconds
  nohz: Track last do_timer() cpu
  nohz: Prevent clocksource wrapping during idle
  nohz: Type cast printk argument
  mips: Use generic mult/shift factor calculation for clocks
  clocksource: Provide a generic mult/shift factor calculation
  clockevents: Use u32 for mult and shift factors
  nohz: Introduce arch_needs_cpu
  nohz: Reuse ktime in sub-functions of tick_check_idle.
  time: Remove xtime_cache
  time: Implement logarithmic time accumulation
2009-12-08 19:27:08 -08:00
Linus Torvalds
849e8dea09 Merge branch 'timers-for-linus-hpet' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus-hpet' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: hpet: Make WARN_ON understandable
  x86: arch specific support for remapping HPET MSIs
  intr-remap: generic support for remapping HPET MSIs
  x86, hpet: Simplify the HPET code
  x86, hpet: Disable per-cpu hpet timer if ARAT is supported
2009-12-08 19:26:55 -08:00
Linus Torvalds
e33c019722 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits)
  x86, mm: Correct the implementation of is_untracked_pat_range()
  x86/pat: Trivial: don't create debugfs for memtype if pat is disabled
  x86, mtrr: Fix sorting of mtrr after subtracting
  x86: Move find_smp_config() earlier and avoid bootmem usage
  x86, platform: Change is_untracked_pat_range() to bool; cleanup init
  x86: Change is_ISA_range() into an inline function
  x86, mm: is_untracked_pat_range() takes a normal semiclosed range
  x86, mm: Call is_untracked_pat_range() rather than is_ISA_range()
  x86: UV SGI: Don't track GRU space in PAT
  x86: SGI UV: Fix BAU initialization
  x86, numa: Use near(er) online node instead of roundrobin for NUMA
  x86, numa, bootmem: Only free bootmem on NUMA failure path
  x86: Change crash kernel to reserve via reserve_early()
  x86: Eliminate redundant/contradicting cache line size config options
  x86: When cleaning MTRRs, do not fold WP into UC
  x86: remove "extern" from function prototypes in <asm/proto.h>
  x86, mm: Report state of NX protections during boot
  x86, mm: Clean up and simplify NX enablement
  x86, pageattr: Make set_memory_(x|nx) aware of NX support
  x86, sleep: Always save the value of EFER
  ...

Fix up conflicts (added both iommu_shutdown and is_untracked_pat_range)
to 'struct x86_platform_ops') in
	arch/x86/include/asm/x86_init.h
	arch/x86/kernel/x86_init.c
2009-12-08 13:27:33 -08:00
Thomas Gleixner
a946d8f11f x86: Fix bogus warning in apic_noop.apic_write()
apic_noop is used to provide dummy apic functions. It's installed
when the CPU has no APIC or when the APIC is disabled on the kernel
command line.

The apic_noop implementation of apic_write() warns when the CPU has
an APIC or when the APIC is not disabled.

That's bogus. The warning should only happen when the CPU has an
APIC _AND_ the APIC is not disabled. apic_noop.apic_read() has the
correct check.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: <stable@kernel.org> # in <= .32 this typo resides in native_apic_write_dummy()
LKML-Reference: <alpine.LFD.2.00.0912071255420.3089@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-07 13:16:37 +01:00
Linus Torvalds
6ec22f9b03 Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Limit number of per cpu TSC sync messages
  x86: dumpstack, 64-bit: Disable preemption when walking the IRQ/exception stacks
  x86: dumpstack: Clean up the x86_stack_ids[][] initalization and other details
  x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes
  x86: Suppress stack overrun message for init_task
  x86: Fix cpu_devs[] initialization in early_cpu_init()
  x86: Remove CPU cache size output for non-Intel too
  x86: Minimise printk spew from per-vendor init code
  x86: Remove the CPU cache size printk's
  cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c
  x86: Make sure we also print a Code: line for show_regs()
2009-12-05 15:33:27 -08:00
Linus Torvalds
a77d2e081b Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
  x86, apic: Enable lapic nmi watchdog on AMD Family 11h
  x86: Remove unnecessary mdelay() from cpu_disable_common()
  x86, ioapic: Document another case when level irq is seen as an edge
  x86, ioapic: Fix the EOI register detection mechanism
  x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq
  x86: SGI UV: Map low MMR ranges
  x86: apic: Print out SRAT table APIC id in hex
  x86: Re-get cfg_new in case reuse/move irq_desc
  x86: apic: Remove not needed #ifdef
  x86: io-apic: IO-APIC MMIO should not fail on resource insertion
  x86: Remove asm/apicnum.h
  x86: apic: Do not use stacked physid_mask_t
  x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit
  x86, ioapic: Use snrpintf while set names for IO-APIC resourses
  x86, apic: Use PAGE_SIZE instead of numbers
  x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs()
  x86: Use EOI register in io-apic on intel platforms
  x86: Force irq complete move during cpu offline
  x86: Remove move_cleanup_count from irq_cfg
  x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remapping
  ...
2009-12-05 15:31:25 -08:00
Suresh Siddha
1c83995b6c x86, ioapic: Document another case when level irq is seen as an edge
In the case when cpu goes offline, fixup_irqs() will forward any
unhandled interrupt on the offlined cpu to the new cpu
destination that is handling the corresponding interrupt. This
interrupt forwarding is done via IPI's. Hence, in this case also
level-triggered io-apic interrupt will be seen as an edge
interrupt in the cpu's APIC IRR.

Document this scenario in the code which handles this case by doing
an explicit EOI to the io-apic to clear remote IRR of the io-apic RTE.

Requested-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: ebiederm@xmission.com
Cc: garyhade@us.ibm.com
LKML-Reference: <20091201233335.143970505@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 10:11:01 +01:00
Suresh Siddha
c29d9db338 x86, ioapic: Fix the EOI register detection mechanism
Maciej W. Rozycki reported:

> 82093AA I/O APIC has its version set to 0x11 and it
> does not support the EOI register.  Similarly I/O APICs
> integrated into the 82379AB south bridge and the 82374EB/SB
> EISA component.

IO-APIC versions below 0x20 don't support EOI register.

Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic
version as 0x2. This is an error with documentation and these
ICH chips use io-apic's of version 0x20 and indeed has a working
EOI register for the io-apic.

Fix the EOI register detection mechanism to check for version
0x20 and beyond.

And also, a platform can potentially  have io-apic's with
different versions. Make the EOI register check per io-apic.

Reported-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: ebiederm@xmission.com
Cc: garyhade@us.ibm.com
LKML-Reference: <20091201233335.065361533@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 10:11:01 +01:00
Maciej W. Rozycki
ca64c47cec x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq
When the level-triggered interrupt is seen as an edge interrupt,
we try to clear the remoteIRR explicitly (using either an
io-apic eoi register when present or through the idea of
changing trigger mode of the io-apic RTE to edge and then back
to level). But this explicit try also needs to happen before we
try to migrate the irq. Otherwise irq migration attempt will
fail anyhow, as it postpones the irq migration to a later
attempt when it sees the remoteIRR in the io-apic RTE still set.

Signed-off-by: "Maciej W. Rozycki" <macro@linux-mips.org>
Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: ebiederm@xmission.com
Cc: garyhade@us.ibm.com
LKML-Reference: <20091201233334.975416130@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 10:11:00 +01:00
H. Peter Anvin
ccef086454 x86, mm: Correct the implementation of is_untracked_pat_range()
The semantics the PAT code expect of is_untracked_pat_range() is "is
this range completely contained inside the untracked region."  This
means that checkin 8a27138924 was
technically wrong, because the implementation needlessly confusing.

The sane interface is for it to take a semiclosed range like just
about everything else (as evidenced by the sheer number of "- 1"'s
removed by that patch) so change the actual implementation to match.

Reported-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <20091119202341.GA4420@sgi.com>
2009-11-30 21:33:51 -08:00
Jack Steiner
918bc960dc x86: SGI UV: Map low MMR ranges
Explicitly mmap the UV chipset MMR address ranges used to
access blade-local registers. Although these same MMRs are also
mmaped at higher addresses, the low range is more
convenient when accessing blade-local registers.

The low range addresses always alias to the local blade
regardless of the blade id.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20091125162018.GA25445@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-26 10:52:36 +01:00
Yinghai Lu
b24c2a925a x86: Move find_smp_config() earlier and avoid bootmem usage
Move the find_smp_config() call to before bootmem is initialized.
Use reserve_early() instead of reserve_bootmem() in it.

This simplifies the code, we only need to call find_smp_config()
once and can remove the now unneeded reserve parameter from
x86_init_mpparse::find_smp_config.

We thus also reduce x86's dependency on bootmem allocations.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B0BB9F2.70907@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-24 12:10:51 +01:00
H. Peter Anvin
eb41c8be89 x86, platform: Change is_untracked_pat_range() to bool; cleanup init
- Change is_untracked_pat_range() to return bool.
- Clean up the initialization of is_untracked_pat_range() -- by default,
  we simply point it at is_ISA_range() directly.
- Move is_untracked_pat_range to the end of struct x86_platform, since
  it is the newest field.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20091119202341.GA4420@sgi.com>
2009-11-23 17:09:59 -08:00
Jack Steiner
fd12a0d69a x86: UV SGI: Don't track GRU space in PAT
GRU space is always mapped as WB in the page table. There is
no need to track the mappings in the PAT. This also eliminates
the "freeing invalid memtype" messages when the GRU space is
unmapped.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20091119202341.GA4420@sgi.com>
[ v2: fix build failure ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23 19:47:42 +01:00
Yinghai Lu
37ef2a3029 x86: Re-get cfg_new in case reuse/move irq_desc
When irq_desc is moved, we need to make sure to use the right cfg_new.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B07A739.3030104@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23 09:56:05 +01:00
Yinghai Lu
e670761f12 x86: apic: Remove not needed #ifdef
Suresh made dmar_table_init() already have that protection.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B07A739.3030104@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23 09:54:15 +01:00
Cyrill Gorcunov
e79c65a97c x86: io-apic: IO-APIC MMIO should not fail on resource insertion
If IO-APIC base address is 1K aligned we should not fail
on resourse insertion procedure. For this sake we define
IO_APIC_SLOT_SIZE constant which should cover all IO-APIC
direct accessible registers.

An example of a such configuration is there

	http://marc.info/?l=linux-kernel&m=118114792006520

 |
 | Quoting the message
 |
 | IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
 | IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 24-47
 | IOAPIC[2]: apic_id 4, version 32, address 0xfec80400, GSI 48-71
 | IOAPIC[3]: apic_id 5, version 32, address 0xfec84000, GSI 72-95
 | IOAPIC[4]: apic_id 8, version 32, address 0xfec84400, GSI 96-119
 |

Reported-by: "Maciej W. Rozycki" <macro@linux-mips.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20091116151426.GC5653@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-16 16:37:10 +01:00
Thomas Gleixner
411462f62a x86: Fix printk format due to variable type change
clockevents.mult became u32. Fix the printk format.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-11-16 11:55:20 +01:00
Cyrill Gorcunov
7abc075313 x86: apic: Do not use stacked physid_mask_t
We should not use physid_mask_t as a stack based
variable in apic code. This type depends on MAX_APICS
parameter which may be huge enough.

Especially it became a problem with apic NOOP driver which
is portable between 32 bit and 64 bit environment
(where we have really huge MAX_APICS).

So apic driver should operate with pointers and a caller
in turn should aware of allocation physid_mask_t variable.

As a side (but positive) effect -- we may use already
implemented physid_set_mask_of_physid function eliminating
default_apicid_to_cpu_present completely.

Note that physids_coerce and physids_promote turned into static
inline from macro (since macro hides the fact that parameter is
being interpreted as unsigned long, make it explicit).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
LKML-Reference: <20091109220659.GA5568@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 05:52:07 +01:00
Cyrill Gorcunov
f4a70c5537 x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit
In fact it's never get used on x86-64 (for 64 bit platform
we use differ technique to enumerate io-units).

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20091108131645.GD5300@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 19:46:17 +01:00
Cyrill Gorcunov
4343fe1024 x86, ioapic: Use snrpintf while set names for IO-APIC resourses
We should be ready that one day MAX_IO_APICS may raise its
number. To prevent memory overwrite we're to use safe
snprintf while set IO-APIC resourse name.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20091108155431.GC25940@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 17:06:23 +01:00
Cyrill Gorcunov
46dc281b1b x86, apic: Use PAGE_SIZE instead of numbers
The whole page is reserved for IO-APIC fixmap
due to non-cacheable requirement. So lets note
this explicitly instead of playing with numbers.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
LKML-Reference: <20091108155356.GB25940@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 17:06:22 +01:00
Rusty Russell
ce7c42710e cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c
Ingo wants the certainty of a static cpumask (rather than a
cpumask_var_t), but cpumask_t will some day be undefined to
avoid on-stack declarations.

This is what DECLARE_BITMAP/to_cpumask() is for.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <200911031453.52394.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04 13:17:53 +01:00
Suresh Siddha
b3ec0a37a7 x86: Use EOI register in io-apic on intel platforms
IO-APIC's in intel chipsets support EOI register starting from
IO-APIC version 2. Use that when ever we need to clear the
IO-APIC RTE's RemoteIRR bit explicitly.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Gary Hade <garyhade@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <20091026230001.947855317@sbs-t61.sc.intel.com>
[ Marked use_eio_reg as __read_mostly, fixed small details ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:56:36 +01:00
Suresh Siddha
a5e74b8419 x86: Force irq complete move during cpu offline
When a cpu goes offline, fixup_irqs() try to move irq's
currently destined to the offline cpu to a new cpu. But this
attempt will fail if the irq is recently moved to this cpu and
the irq still hasn't arrived at this cpu (for non intr-remapping
platforms this is when we free the vector allocation at the
previous destination) that is about to go offline.

This will endup with the interrupt subsystem still pointing the
irq to the offline cpu, causing that irq to not work any more.

Fix this by forcing the irq to complete its move (its been a
long time we moved the irq to this cpu which we are offlining
now) and then move this irq to a new cpu before this cpu goes
offline.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Gary Hade <garyhade@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <20091026230001.848830905@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:56:36 +01:00
Suresh Siddha
23359a88e7 x86: Remove move_cleanup_count from irq_cfg
move_cleanup_count for each irq in irq_cfg is keeping track of
the total number of cpus that need to free the corresponding
vectors associated with the irq which has now been migrated to
new destination. As long as this move_cleanup_count is non-zero
(i.e., as long as we have n't freed the vector allocations on
the old destinations) we were preventing the irq's further
migration.

This cleanup count is unnecessary and it is enough to not allow
the irq migration till we send the cleanup vector to the
previous irq destination, for which we already have irq_cfg's
move_in_progress.  All we need to make sure is that we free the
vector at the old desintation but we don't need to wait till
that gets freed.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Gary Hade <garyhade@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <20091026230001.752968906@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:56:35 +01:00
Rusty Russell
dd17c8f729 percpu: remove per_cpu__ prefix.
Now that the return from alloc_percpu is compatible with the address
of per-cpu vars, it makes sense to hand around the address of per-cpu
variables.  To make this sane, we remove the per_cpu__ prefix we used
created to stop people accidentally using these vars directly.

Now we have sparse, we can use that (next patch).

tj: * Updated to convert stuff which were missed by or added after the
      original patch.

    * Kill per_cpu_var() macro.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
2009-10-29 22:34:15 +09:00
Andreas Herrmann
6f9b41006a x86, apic: Clear APIC Timer Initial Count Register on shutdown
Commit a98f8fd24f (x86: apic reset
counter on shutdown) set the counter to max to avoid spurious
interrupts when the timer is re-enabled.

(In theory) you'll still get a spurious interrupt if spending
more than 344 seconds with this interrupt disabled and then
unmasking it.

The right thing to do is to clear the register. This disables
the interrupt from happening (at least it does on AMD hardware).

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20091027100138.GB30802@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27 14:54:21 +01:00
Robin Holt
036ed8ba61 x86, UV: Fix information in __uv_hub_info structure
A few parts of the uv_hub_info structure are initialized
incorrectly.

 - n_val is being loaded with m_val.
 - gpa_mask is initialized with a bytes instead of an unsigned long.
 - Handle the case where none of the alias registers are used.

Lastly I converted the bau over to using the uv_hub_info->m_val
which is the correct value.

Without this patch, booting a large configuration hits a
problem where the upper bits of the gnode affect the pnode
and the bau will not operate.

Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Cc: Cliff Whickman <cpw@sgi.com>
Cc: stable@kernel.org
LKML-Reference: <20091015224946.396355000@alcatraz.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 08:18:34 +02:00
Cyrill Gorcunov
f88f2b4fdb x86: apic: Allow noop operations to be called almost at any time
As only apic noop is used we allow to use almost any operation
caller wants (and which of them noop driver supports of
course).

Initially it was reported by Ingo Molnar that apic noop
issue a warning for pkg id (which is actually false positive
and should be eliminated).

So we save checking (and warning issue) for read/write
operations while allow any other ops to be freely used.

Also:
 - fix noop_cpu_to_logical_apicid, it should be 0.
 - rename noop_default_phys_pkg_id to noop_phys_pkg_id
   (we use default_ prefix for more general routines
    in apic subsystem).

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
LKML-Reference: <20091015150416.GC5331@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 17:26:53 +02:00
Dimitri Sivanich
9338ad6ffb x86, apic: Move SGI UV functionality out of generic IO-APIC code
Move UV specific functionality out of the generic IO-APIC code.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20091013203236.GD20543@sgi.com>
[ Cleaned up the code some more in their new places. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:09 +02:00
Dimitri Sivanich
6c2c502910 x86: SGI UV: Fix irq affinity for hub based interrupts
This patch fixes handling of uv hub irq affinity.  IRQs with ALL or
NODE affinity can be routed to cpus other than their originally
assigned cpu.  Those with CPU affinity cannot be rerouted.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20090930160259.GA7822@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:01 +02:00
Cyrill Gorcunov
2626eb2b2f x86, apic: Limit apic dumping, introduce new show_lapic= setup option
In case if a system has a large number of cpus printing apics
contents may consume a long time period.

We limit such an output by 1 apic by default. But to have an
ability to see all apics or some part of them we introduce
"show_lapic" setup option which allow us to limit/unlimit the
number of APICs being dumped.

Example: apic=debug show_lapic=5, or apic=debug show_lapic=all

Also move apic_verbosity checking upper that way so helper routines
do not need to inspect it at all.

Suggested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: macro@linux-mips.org
LKML-Reference: <20091013201022.926793122@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:01 +02:00
Cyrill Gorcunov
a933c61829 x86, apic: Use apic noop driver
In case if apic were disabled we may use the whole apic NOOP driver
instead of sparse poking the some functions in apic driver.

Also NOOP would catch any inappropriate apic operation calls (not
just read/write).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: macro@linux-mips.org
LKML-Reference: <20091013201022.747817361@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:00 +02:00
Cyrill Gorcunov
9844ab11c7 x86, apic: Introduce the NOOP apic driver
Introduce NOOP APIC driver. We should use it in case if apic was
disabled due to hardware of software/firmware problems (including
user requested to disable it case).

The driver is attempting to catch any inappropriate apic operation
call with warning issue.

Also it is possible to use some apic operation like IPI calls,
read/write without checking for apic presence which should make
callers code easier.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: macro@linux-mips.org
LKML-Reference: <20091013201022.534682104@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:00 +02:00
Christoph Lameter
494f6a9e12 this_cpu: Use this_cpu_xx in nmi handling
this_cpu_inc/dec reduces the number of instructions needed.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-10-12 19:51:49 +09:00
Alexey Dobriyan
8d65af789f sysctl: remove "struct file *" argument of ->proc_handler
It's unused.

It isn't needed -- read or write flag is already passed and sysctl
shouldn't care about the rest.

It _was_ used in two places at arch/frv for some reason.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24 07:21:04 -07:00
Li Zefan
79f5599772 cpumask: use zalloc_cpumask_var() where possible
Remove open-coded zalloc_cpumask_var() and zalloc_cpumask_var_node().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24 09:34:24 +09:30
Linus Torvalds
43c1266ce4 Merge branch 'perfcounters-rename-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-rename-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Tidy up after the big rename
  perf: Do the big rename: Performance Counters -> Performance Events
  perf_counter: Rename 'event' to event_id/hw_event
  perf_counter: Rename list_entry -> group_entry, counter_list -> group_list

Manually resolved some fairly trivial conflicts with the tracing tree in
include/trace/ftrace.h and kernel/trace/trace_syscalls.c.
2009-09-21 09:15:07 -07:00
Ingo Molnar
cdd6c482c9 perf: Do the big rename: Performance Counters -> Performance Events
Bye-bye Performance Counters, welcome Performance Events!

In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.

Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.

All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)

The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.

Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.

User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)

This patch has been generated via the following script:

  FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

  sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES

  for N in $(find . -name perf_counter.[ch]); do
    M=$(echo $N | sed 's/perf_counter/perf_event/g')
    mv $N $M
  done

  FILES=$(find . -name perf_event.*)

  sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\<event\>/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES

... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.

Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.

( NOTE: 'counters' are still the proper terminology when we deal
  with hardware registers - and these sed scripts are a bit
  over-eager in renaming them. I've undone some of that, but
  in case there's something left where 'counter' would be
  better than 'event' we can undo that on an individual basis
  instead of touching an otherwise nicely automated patch. )

Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 14:28:04 +02:00
Ingo Molnar
bfefb7a0c6 Merge branch 'linus' into x86/urgent
Merge reason: Bring in changes that the next patch will depend on.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20 20:25:03 +02:00
Cyrill Gorcunov
8312136fa8 x86, apic: Fix missed handling of discrete apics
In case of discrete (pretty old) apics we may have cpu_has_apic bit
not set but have to check if smp_found_config (MP spec) is there
and apic was not disabled.

Also don't forget to print apic/io-apic for such case as well.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090915071230.GA10604@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20 20:18:07 +02:00
Suresh Siddha
2fbd07a5f5 x86, apic: Use logical flat on intel with <= 8 logical cpus
On Intel platforms, we can use logical flat mode if there are <= 8
logical cpu's (irrespective of physical apic id values). This will
enable simplified and efficient IPI and device interrupt routing on
such platforms.

Fix the relevant comments while we are at it.

We can clean up default_setup_apic_routing() by using apic->probe()
but that is a different item.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "yinghai@kernel.org" <yinghai@kernel.org>
LKML-Reference: <1253327399.3948.747.camel@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-19 09:20:05 +02:00
Linus Torvalds
78f28b7c55 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (38 commits)
  x86: Move get/set_wallclock to x86_platform_ops
  x86: platform: Fix section annotations
  x86: apic namespace cleanup
  x86: Distangle ioapic and i8259
  x86: Add Moorestown early detection
  x86: Add hardware_subarch ID for Moorestown
  x86: Add early platform detection
  x86: Move tsc_init to late_time_init
  x86: Move tsc_calibration to x86_init_ops
  x86: Replace the now identical time_32/64.c by time.c
  x86: time_32/64.c unify profile_pc
  x86: Move calibrate_cpu to tsc.c
  x86: Make timer setup and global variables the same in time_32/64.c
  x86: Remove mca bus ifdef from timer interrupt
  x86: Simplify timer_ack magic in time_32.c
  x86: Prepare unification of time_32/64.c
  x86: Remove do_timer hook
  x86: Add timer_init to x86_init_ops
  x86: Move percpu clockevents setup to x86_init_ops
  x86: Move xen_post_allocator_init into xen_pagetable_setup_done
  ...

Fix up conflicts in arch/x86/include/asm/io_apic.h
2009-09-18 14:05:47 -07:00
Jack Steiner
daf7b9c921 x86: SGI UV: Map MMIO-High memory range
UV depends on the MMRHI space being identity mapped. The patch:

	x86: Make 64-bit efi_ioremap use ioremap on MMIO regions

changed this to make efi regions at a different address using
ioremap. Add the identity mapping to uv_system_init.

( Note this code was previously present but was deleted when BIOS
  added the ranges to the EFI map - previous efi code identify
  mapped the ranges. )

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090909154339.GA7946@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-18 14:06:40 +02:00
Daniel Walker
c2777f98c2 x86: apic: Convert BUG() to BUG_ON()
This was done using Coccinelle's BUG_ON semantic patch.

Signed-off-by: Daniel Walker <dwalker@fifo99.com>
Cc: Julia Lawall <julia@diku.dk>
LKML-Reference: <1252777220-30796-1-git-send-email-dwalker@fifo99.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-18 13:45:33 +02:00
Linus Torvalds
df58bee21e Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits)
  x86, mce: Fix compilation with !CONFIG_DEBUG_FS in mce-severity.c
  x86, mce: CE in last bank prevents panic by unknown MCE
  x86, mce: Fake panic support for MCE testing
  x86, mce: Move debugfs mce dir creating to mce.c
  x86, mce: Support specifying raise mode for software MCE injection
  x86, mce: Support specifying context for software mce injection
  x86, mce: fix reporting of Thermal Monitoring mechanism enabled
  x86, mce: remove never executed code
  x86, mce: add missing __cpuinit tags
  x86, mce: fix "mce" boot option handling for CONFIG_X86_NEW_MCE
  x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUs
  x86: mce: Lower maximum number of banks to architecture limit
  x86: mce: macros to compute banks MSRs
  x86: mce: Move per bank data in a single datastructure
  x86: mce: Move code in mce.c
  x86: mce: Rename CONFIG_X86_NEW_MCE to CONFIG_X86_MCE
  x86: mce: Remove old i386 machine check code
  x86: mce: Update X86_MCE description in x86/Kconfig
  x86: mce: Make CONFIG_X86_ANCIENT_MCE dependent on CONFIG_X86_MCE
  x86, mce: use atomic_inc_return() instead of add by 1
  ...

Manually fixed up trivial conflicts:
	Documentation/feature-removal-schedule.txt
	arch/x86/kernel/cpu/mcheck/mce.c
2009-09-17 21:07:08 -07:00
Linus Torvalds
15b0404272 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Make memtype_seq_ops const
  x86: uv: Clean up uv_ptc_init(), use proc_create()
  x86: Use printk_once()
  x86/cpu: Clean up various files a bit
  x86: Remove duplicated #include
  x86, ipi: Clean up safe_smp_processor_id() by using the cpu_has_apic() macro helper
  x86: Clean up idt_descr and idt_tableby using NR_VECTORS instead of hardcoded number
  x86: Further clean up of mtrr/generic.c
  x86: Clean up mtrr/main.c
  x86: Clean up mtrr/state.c
  x86: Clean up mtrr/mtrr.h
  x86: Clean up mtrr/if.c
  x86: Clean up mtrr/generic.c
  x86: Clean up mtrr/cyrix.c
  x86: Clean up mtrr/cleanup.c
  x86: Clean up mtrr/centaur.c
  x86: Clean up mtrr/amd.c:
  x86: ds.c fix invalid assignment
2009-09-14 07:56:43 -07:00
Linus Torvalds
ffaf854b01 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
  ACPI, x86: expose some IO-APIC routines when CONFIG_ACPI=n
  x86, apic: Slim down stack usage in early_init_lapic_mapping()
  x86, ioapic: Get rid of needless check and simplify ioapic_setup_resources()
  x86, ioapic: Define IO_APIC_DEFAULT_PHYS_BASE constant
  x86: Fix x86_model test in es7000_apic_is_cluster()
  x86, apic: Move dmar_table_init() out of enable_IR()
  x86, ioapic: Panic on irq-pin binding only if needed
  x86/apic: Enable x2APIC without interrupt remapping under KVM
  x86, apic: Drop redundant bit assignment
  x86, ioapic: Throw BUG instead of NULL dereference
  x86, ioapic: Introduce for_each_irq_pin() helper
  x86: Remove superfluous NULL pointer check in destroy_irq()
  x86/ioapic.c: unify ioapic_retrigger_irq()
  x86/ioapic.c: convert __target_IO_APIC_irq to conventional for() loop
  x86/ioapic.c: clean up replace_pin_at_irq_node logic and comments
  x86/ioapic.c: convert replace_pin_at_irq_node to conventional for() loop
  x86/ioapic.c: simplify add_pin_to_irq_node()
  x86/ioapic.c: convert io_apic_level_ack_pending loop to normal for() loop
  x86/ioapic.c: move lost comment to what seems like appropriate place
  x86/ioapic.c: remove redundant declaration of irq_pin_list
  ...
2009-09-14 07:51:20 -07:00
Linus Torvalds
989aa44a5f Merge branch 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debug lockups: Improve lockup detection, fix generic arch fallback
  debug lockups: Improve lockup detection
2009-09-11 13:15:55 -07:00
Thomas Gleixner
e11dadabf4 x86: apic namespace cleanup
boot_cpu_physical_apicid is a global variable and used as function
argument as well. Rename the function arguments to avoid confusion.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 21:30:47 +02:00
Thomas Gleixner
bc07844a33 x86: Distangle ioapic and i8259
The proposed Moorestown support patches use an extra feature flag
mechanism to make the ioapic work w/o an i8259. There is a much
simpler solution.

Most i8259 specific functions are already called dependend on the irq
number less than NR_IRQS_LEGACY. Replacing that constant by a
read_mostly variable which can be set to 0 by the platform setup code
allows us to achieve the same without any special feature flags.

That trivial change allows us to proceed with MRST w/o doing a full
blown overhaul of the ioapic code which would delay MRST unduly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 19:23:09 +02:00
Thomas Gleixner
845b3944bb x86: Add timer_init to x86_init_ops
The timer init code is convoluted with several quirks and the paravirt
timer chooser. Figuring out which code path is actually taken is not
for the faint hearted.

Move the numaq TSC quirk to tsc_pre_init x86_init_ops function and
replace the paravirt time chooser and the remaining x86 quirk with a
simple x86_init_ops function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:46 +02:00
Thomas Gleixner
736decac64 x86: Move percpu clockevents setup to x86_init_ops
paravirt overrides the setup of the default apic timers as per cpu
timers. Moorestown needs to override that as well.

Move it to x86_init_ops setup and create a separate x86_cpuinit struct
which holds the function for the secondary evtl. hotplugabble CPUs.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:46 +02:00
Thomas Gleixner
428cf9025b x86: Move traps_init to x86_init_ops
Replace the quirks by a simple x86_init_ops function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:45 +02:00
Thomas Gleixner
66bcaf0bde x86: Move irq_init to x86_init_ops
irq_init is overridden by x86_quirks and by paravirts. Unify the whole
mess and make it an unconditional x86_init_ops function which defaults
to the standard function and can be overridden by the early platform
code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:45 +02:00
Thomas Gleixner
d9112f4302 x86: Move pre_intr_init to x86_init_ops
Replace the quirk machinery by a x86_init_ops function which
defaults to the standard implementation. This is also a preparatory
patch for Moorestown support which needs to replace the default
init_ISA_irqs as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:45 +02:00
Thomas Gleixner
b3f1b617f4 x86: Move get/find_smp_config to x86_init_ops
Replace the quirk machinery by a x86_init_ops function which defaults
to the standard implementation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:45 +02:00
Ingo Molnar
eebc57f73d Merge branch 'for-ingo' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-sfi-2.6 into x86/apic
Merge reason: the SFI (Simple Firmware Interface) feature in the ACPI
              tree needs this cleanup, pull it into the APIC branch as
	      well so that there's no interactions.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-29 09:31:47 +02:00
Feng Tang
2a4ab640d3 ACPI, x86: expose some IO-APIC routines when CONFIG_ACPI=n
Some IO-APIC routines are ACPI specific now, but need to
be exposed when CONFIG_ACPI=n for the benefit of SFI.

Remove #ifdef ACPI around these routines:

io_apic_get_unique_id(int ioapic, int apic_id);
io_apic_get_version(int ioapic);
io_apic_get_redir_entries(int ioapic);

Move these routines from ACPI-specific boot.c to io_apic.c:

uniq_ioapic_id(u8 id)
mp_find_ioapic()
mp_find_ioapic_pin()
mp_register_ioapic()

Also, since uniq_ioapic_id() is now no longer static,
re-name it to io_apic_unique_id() for consistency
with the other public io_apic routines.

For simplicity, do not #ifdef the resulting code ACPI || SFI,
thought that could be done in the future if it is important
to optimize the !ACPI !SFI IO-APIC x86 kernel for size.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
2009-08-28 19:57:27 -04:00
Suresh Siddha
c8bc6f3c80 x86: arch specific support for remapping HPET MSIs
x86 arch support for remapping HPET MSI's by associating the HPET timer block
with the interrupt-remapping HW unit and setting up appropriate irq_chip

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Jay Fenlason <fenlason@redhat.com>
LKML-Reference: <20090804190729.630510000@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27 23:33:20 +02:00
Thomas Gleixner
90e1c6969d x86: Move oem_bus_info to x86_init_ops
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27 17:12:53 +02:00
Thomas Gleixner
52fdb56846 x86: Move mpc_oem_pci_bus to x86_init_ops
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27 17:12:52 +02:00
Thomas Gleixner
72302142e1 x86: Move smp_read_mpc_oem to x86_init_ops.
Move smp_read_mpc_oem from quirks to x86_init.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27 17:12:52 +02:00
Thomas Gleixner
fd6c666149 x86: Move mpc_apic_id to x86_init_ops
The mpc_apic_id setup is handled by a x86_quirk. Make it a
x86_init_ops function with a default implementation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27 17:12:52 +02:00
Thomas Gleixner
de93410310 x86: Move ioapic_ids_setup to x86_init_ops
32bit and also the numaq code have special requirements on the
ioapic_id setup. Convert it to a x86_init_ops function and get rid
of the quirks and #ifdefs

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27 17:12:52 +02:00
Thomas Gleixner
f4848472cd x86: Sanitize smp_record and move it to x86_init_ops
The x86 quirkification introduced an extra ugly hackery with a
variable pointer in the mpparse code. If the pointer is initialized
then it is dereferenced and the variable set to 0 or incremented.

Create a x86_init_ops function and let the affected numaq code
hold the function. Default init is a setup noop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27 17:12:52 +02:00
Thomas Gleixner
6b18ae3e2f x86: Move memory_setup to x86_init_ops
memory_setup is overridden by x86_quirks and by paravirts with weak
functions and quirks. Unify the whole mess and make it an
unconditional x86_init_ops function which defaults to the standard
function and can be overridden by the early platform code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27 17:12:52 +02:00
Cyrill Gorcunov
d3a247bfb2 x86, apic: Slim down stack usage in early_init_lapic_mapping()
As far as I see there is no external poking of mp_lapic_addr in
this procedure which could lead to unpredited changes and
require local storage unit for it. Lets use it plain forward.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090826171324.GC4548@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-26 20:26:23 +02:00
Yinghai Lu
295594e9cf x86: Fix vSMP boot crash
2.6.31-rc7 does not boot on vSMP systems:

[    8.501108] CPU31: Thermal monitoring enabled (TM1)
[    8.501127] CPU 31 MCA banks SHD:2 SHD:3 SHD:5 SHD:6 SHD:8
[    8.650254] CPU31: Intel(R) Xeon(R) CPU           E5540  @ 2.53GHz stepping 04
[    8.710324] Brought up 32 CPUs
[    8.713916] Total of 32 processors activated (162314.96 BogoMIPS).
[    8.721489] ERROR: parent span is not a superset of domain->span
[    8.727686] ERROR: domain->groups does not contain CPU0
[    8.733091] ERROR: groups don't span domain->span
[    8.737975] ERROR: domain->cpu_power not set
[    8.742416]

Ravikiran Thirumalai bisected it to:

| commit 2759c3287d
| x86: don't call read_apic_id if !cpu_has_apic

The problem is that on vSMP systems the CPUID derived
initial-APICIDs are overlapping - so we need to fall
back on hard_smp_processor_id() which reads the local
APIC.

Both come from the hardware (influenced by firmware
though) so it's a tough call which one to trust.

Doing the quirk expresses the vSMP property properly
and also does not affect other systems, so we go for
this solution instead of a revert.

Reported-and-Tested-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Shai Fultheim <shai@scalex86.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4A944D3C.5030100@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-26 10:13:17 +02:00
Cyrill Gorcunov
ffc438366c x86, ioapic: Get rid of needless check and simplify ioapic_setup_resources()
alloc_bootmem() already panics on allocation failure. There is
no need to check the result.

Also there is a way to unbind global variable from its body and
use it as a parameter which allow us to simplify
ioapic_init_mappings as well -- "for" cycle already uses
nr_ioapics as a conditional variable and there is no need to
check if ioapic_setup_resources was returning NULL again.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090824175551.493629148@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-26 08:16:38 +02:00
Roel Kluin
005155b1f6 x86: Fix x86_model test in es7000_apic_is_cluster()
For the x86_model to be greater than 6 or less than 12 is
logically always true.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-25 15:58:12 +02:00
Ingo Molnar
5f9ece0240 Merge commit 'v2.6.31-rc7' into x86/cleanups
Merge reason: we were on -rc1 before - go up to -rc7

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-24 12:25:54 +02:00
Linus Torvalds
83d349f35e x86: don't send an IPI to the empty set of CPU's
The default_send_IPI_mask_logical() function uses the "flat" APIC mode
to send an IPI to a set of CPU's at once, but if that set happens to be
empty, some older local APIC's will apparently be rather unhappy.  So
just warn if a caller gives us an empty mask, and ignore it.

This fixes a regression in 2.6.30.x, due to commit 4595f9620 ("x86:
change flush_tlb_others to take a const struct cpumask"), documented
here:

	http://bugzilla.kernel.org/show_bug.cgi?id=13933

which causes a silent lock-up.  It only seems to happen on PPro, P2, P3
and Athlon XP cores.  Most developers sadly (or not so sadly, if you're
a developer..) have more modern CPU's.  Also, on x86-64 we don't use the
flat APIC mode, so it would never trigger there even if the APIC didn't
like sending an empty IPI mask.

Reported-by: Pavel Vilim <wylda@volny.cz>
Reported-and-tested-by: Thomas Björnell <thomas.bjornell@gmail.com>
Reported-and-tested-by: Martin Rogge <marogge@onlinehome.de>
Cc: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-21 09:23:57 -07:00
Yinghai Lu
b7f42ab2e2 x86, apic: Move dmar_table_init() out of enable_IR()
On an x2apic system, we got:

[    1.818072] ------------[ cut here ]------------
[    1.820376] WARNING: at kernel/lockdep.c:2461 lockdep_trace_alloc+0xa5/0xe9()
[    1.835282] Hardware name: ASSY,
[    1.839006] Modules linked in:
[    1.841253] Pid: 1, comm: swapper Not tainted 2.6.31-rc5-tip-03926-g39aaa80-dirty #510
[    1.858056] Call Trace:
[    1.859913]  [<ffffffff810d13aa>] ? lockdep_trace_alloc+0xa5/0xe9
[    1.876270]  [<ffffffff81093f37>] warn_slowpath_common+0x8d/0xd0
[    1.879132]  [<ffffffff81093fa1>] warn_slowpath_null+0x27/0x3d
[    1.896823]  [<ffffffff810d13aa>] lockdep_trace_alloc+0xa5/0xe9
[    1.900659]  [<ffffffff810cf5a0>] ? lock_release_holdtime+0x2f/0x199
[    1.917188]  [<ffffffff81167a3c>] kmem_cache_alloc_notrace+0x42/0x111
[    1.922320]  [<ffffffff8106fe8c>] ? reserve_memtype+0x152/0x518
[    1.938137]  [<ffffffff8106f8b1>] ? pat_pagerange_is_ram+0x4a/0x91
[    1.941730]  [<ffffffff8106fe8c>] reserve_memtype+0x152/0x518
[    1.958115]  [<ffffffff8106ce62>] __ioremap_caller+0x1dd/0x30f
[    1.975507]  [<ffffffff81ce2c5c>] ? acpi_os_map_memory+0x2a/0x47
[    1.978987]  [<ffffffff8106d0fd>] ioremap_nocache+0x2a/0x40
[    2.031400]  [<ffffffff810d0364>] ? trace_hardirqs_off+0x20/0x36
[    2.036096]  [<ffffffff81ce2c5c>] acpi_os_map_memory+0x2a/0x47
[    2.046263]  [<ffffffff815cd642>] acpi_tb_verify_table+0x3d/0x85
[    2.050349]  [<ffffffff81d34af7>] ? _spin_unlock_irqrestore+0x50/0x76
[    2.067327]  [<ffffffff815ccad6>] acpi_get_table_with_size+0x64/0xd9
[    2.070860]  [<ffffffff81d34af7>] ? _spin_unlock_irqrestore+0x50/0x76
[    2.088000]  [<ffffffff825c88d5>] dmar_table_detect+0x33/0x70
[    2.092047]  [<ffffffff825c8a01>] dmar_table_init+0x43/0x428
[    2.106854]  [<ffffffff825a7537>] enable_IR+0x1c/0x8d
[    2.110256]  [<ffffffff825a7624>] enable_IR_x2apic+0x7c/0x19e
[    2.127139]  [<ffffffff825a4876>] native_smp_prepare_cpus+0x139/0x3b8
[    2.145175]  [<ffffffff8259678d>] kernel_init+0x71/0x1da
[    2.148913]  [<ffffffff8104305a>] child_rip+0xa/0x20
[    2.152349]  [<ffffffff810429fc>] ? restore_args+0x0/0x30
[    2.167931]  [<ffffffff8259671c>] ? kernel_init+0x0/0x1da
[    2.171671]  [<ffffffff81043050>] ? child_rip+0x0/0x20
[    2.187607] ---[ end trace a7919e7f17c0a725 ]---

Venkatesh Pallipadi said:

| Looks like the problem started with this commit
|
| commit ce69a78450
| Author: Gleb Natapov <gleb@redhat.com>
| Date:   Mon Jul 20 15:24:17 2009 +0300
|
| x86/apic: Enable x2APIC without interrupt remapping under KVM
|
| Before this commit, dmar_table_init() was getting called
| with interrupts enabled and after this commit, it is getting
| called with interrupts disabled.

so try to move out dmar_table_init out of that function.

Analyzed-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
LKML-Reference: <4A899F3C.2050104@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17 20:22:56 +02:00
Leonardo Potenza
52459ab913 x86: Annotate section mismatch warnings in kernel/apic/x2apic_uv_x.c
The function uv_acpi_madt_oem_check() has been marked __init,
the struct apic_x2apic_uv_x has been marked __refdata.

The aim is to address the following section mismatch messages:

WARNING: arch/x86/kernel/apic/built-in.o(.data+0x1368): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
The variable apic_x2apic_uv_x references
the function __cpuinit uv_wakeup_secondary()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: arch/x86/kernel/built-in.o(.data+0x68e8): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
The variable apic_x2apic_uv_x references
the function __cpuinit uv_wakeup_secondary()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: arch/x86/built-in.o(.text+0x7b36f): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_ioremap()
The function uv_acpi_madt_oem_check() references
the function __init early_ioremap().
This is often because uv_acpi_madt_oem_check lacks a __init
annotation or the annotation of early_ioremap is wrong.

WARNING: arch/x86/built-in.o(.text+0x7b38d): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_iounmap()
The function uv_acpi_madt_oem_check() references
the function __init early_iounmap().
This is often because uv_acpi_madt_oem_check lacks a __init
annotation or the annotation of early_iounmap is wrong.

WARNING: arch/x86/built-in.o(.data+0x8668): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
The variable apic_x2apic_uv_x references
the function __cpuinit uv_wakeup_secondary()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

Signed-off-by: Leonardo Potenza <lpotenza@inwind.it>
LKML-Reference: <200908161855.48302.lpotenza@inwind.it>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16 19:44:13 +02:00
Cyrill Gorcunov
f3d1915a86 x86, ioapic: Panic on irq-pin binding only if needed
Though the most time we are to panic on irq-pin allocation
fails, for PCI interrupts it's not the case and we could
continue operate even if irq-pin allocation failed.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090805200931.GB5319@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08 17:20:03 +02:00
Yinghai Lu
087d7e56de x86: Fix MSI-X initialization by using online_mask for x2apic target_cpus
found a system where x2apic reports an MSI-X irq initialization
failure:

[  302.859446] igbvf 0000:81:10.4: enabling device (0000 -> 0002)
[  302.874369] igbvf 0000:81:10.4: using 64bit DMA mask
[  302.879023] igbvf 0000:81:10.4: using 64bit consistent DMA mask
[  302.894386] igbvf 0000:81:10.4: enabling bus mastering
[  302.898171] igbvf 0000:81:10.4: setting latency timer to 64
[  302.914050] reserve_memtype added 0xefb08000-0xefb0c000, track uncached-minus, req uncached-minus, ret uncached-minus
[  302.933839] reserve_memtype added 0xefb28000-0xefb29000, track uncached-minus, req uncached-minus, ret uncached-minus
[  302.940367]   alloc irq_desc for 265 on node 4
[  302.956874]   alloc kstat_irqs on node 4
[  302.959452] alloc irq_2_iommu on node 0
[  302.974328] igbvf 0000:81:10.4: irq 265 for MSI/MSI-X
[  302.977778]   alloc irq_desc for 266 on node 4
[  302.980347]   alloc kstat_irqs on node 4
[  302.995312] free_memtype request 0xefb28000-0xefb29000
[  302.998816] igbvf 0000:81:10.4: Failed to initialize MSI-X interrupts.

... it turns out that when trying to enable MSI-X,
__assign_irq_vector(new, cfg_new, apic->target_cpus()) can not
get vector because for x2apic target-cpus returns cpumask_of(0)

Update that to online_mask like xapic.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4A785AFF.3050902@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08 17:04:58 +02:00
Gleb Natapov
ce69a78450 x86/apic: Enable x2APIC without interrupt remapping under KVM
KVM would like to provide x2APIC interface to a guest without emulating
interrupt remapping device. The reason KVM prefers guest to use x2APIC
is that x2APIC interface is better virtualizable and provides better
performance than mmio xAPIC interface:

 - msr exits are faster than mmio (no page table walk, emulation)
 - no need to read back ICR to look at the busy bit
 - one 64 bit ICR write instead of two 32 bit writes
 - shared code with the Hyper-V paravirt interface

Included patch changes x2APIC enabling logic to enable it even if IR
initialization failed, but kernel runs under KVM and no apic id is
greater than 255 (if there is one spec requires BIOS to move to x2apic
mode before starting an OS).

-v2: fix build
-v3: fix bug causing compiler warning

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Sheng Yang <sheng@linux.intel.com>
Cc: "avi@redhat.com" <avi@redhat.com>
LKML-Reference: <20090720122417.GR5638@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-05 14:28:50 +02:00
Cyrill Gorcunov
9910887af8 x86, apic: Drop redundant bit assignment
cpu_has_apic has already investigated boot_cpu_data
X86_FEATURE_APIC bit for being clear if condition is
triggered.

So there is no need to clear this bit second time.

Signed-off-by: Cyrill Gorcuno v <gorcunov@openvz.org>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
LKML-Reference: <20090722205259.GE15805@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-05 10:30:52 +02:00
Cyrill Gorcunov
a7428cd2ef x86, ioapic: Throw BUG instead of NULL dereference
Instead of plain NULL deref we better throw error
message with a backtrace. Actually we need more
gracious error handling here. Meanwhile leave it
as is.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
LKML-Reference: <20090801075435.769301745@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-05 10:30:50 +02:00
Cyrill Gorcunov
2977fb3ffc x86, ioapic: Introduce for_each_irq_pin() helper
This allow us to save a few lines of code.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
LKML-Reference: <20090801075435.597863129@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-05 10:30:49 +02:00
Jack Steiner
2a5ef41661 x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq()
In uv_setup_irq(), the call to create_irq() initially assigns
IRQ vectors to cpu 0. The subsequent call to
assign_irq_vector() in arch_enable_uv_irq() migrates the IRQ to
another cpu and frees the cpu 0 vector - at least it will be
freed as soon as the "IRQ move" completes.

arch_enable_uv_irq() needs to send a cleanup IPI to complete
the IRQ move. Otherwise, assignment of GRU interrupts on large
systems (>200 cpus) will exhaust the cpu 0 interrupt vectors
and initialization of the GRU driver will fail.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090720142840.GA8885@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04 16:32:52 +02:00
Yinghai Lu
d8c7eb34c2 x86: Don't use current_cpu_data in x2apic phys_pkg_id
One system has socket 1 come up as BSP.

kexeced kernel reports BSP as:

[    1.524550] Initializing cgroup subsys cpuacct
[    1.536064] initial_apicid:20
[    1.537135] ht_mask_width:1
[    1.538128] core_select_mask:f
[    1.539126] core_plus_mask_width:5
[    1.558479] CPU: Physical Processor ID: 0
[    1.559501] CPU: Processor Core ID: 0
[    1.560539] CPU: L1 I cache: 32K, L1 D cache: 32K
[    1.579098] CPU: L2 cache: 256K
[    1.580085] CPU: L3 cache: 24576K
[    1.581108] CPU 0/0x20 -> Node 0
[    1.596193] CPU 0 microcode level: 0xffff0008

It doesn't have correct physical processor id and will get an
error:

[   38.840859] CPU0 attaching sched-domain:
[   38.848287]  domain 0: span 0,8,72 level SIBLING
[   38.851151]   groups: 0 8 72
[   38.858137]   domain 1: span 0,8-15,72-79 level MC
[   38.868944]    groups: 0,8,72 9,73 10,74 11,75 12,76 13,77 14,78 15,79
[   38.881383] ERROR: parent span is not a superset of domain->span
[   38.890724]    domain 2: span 0-7,64-71 level CPU
[   38.899237] ERROR: domain->groups does not contain CPU0
[   38.909229]     groups: 8-15,72-79
[   38.912547] ERROR: groups don't span domain->span
[   38.919665]     domain 3: span 0-127 level NODE
[   38.930739]      groups: 0-7,64-71 8-15,72-79 16-23,80-87 24-31,88-95 32-39,96-103 40-47,104-111 48-55,112-119 56-63,120-127

it turns out: we can not use current_cpu_data in phys_pgd_id
for x2apic.

identify_boot_cpu() is called by check_bugs() before
smp_prepare_cpus() and till smp_prepare_cpus() current_cpu_data
for bsp is assigned with boot_cpu_data.

Just make phys_pkg_id for x2apic is aligned to xapic.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A6ADD0D.10002@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04 16:22:44 +02:00
Jack Steiner
c5997fa8d7 x86, UV: Fix UV apic mode
Change SGI UV default apicid mode to "physical". This is
required to match settings in the UV hub chip.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090727143856.GA8905@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04 16:19:14 +02:00
Jack Steiner
cc5e4fa1bd x86, UV: Delete mapping of MMR rangs mapped by BIOS
The UV BIOS has added additional MMR ranges that are mapped via
EFI virtual mode mappings. These ranges should be deleted from
ranges mapped by uv_system_init().

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: linux-mm@kvack.org
LKML-Reference: <20090727143656.GA7698@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04 16:18:02 +02:00
Jack Steiner
6c7184b774 x86, UV: Handle missing blade-local memory correctly
UV blades may not have any blade-local memory. Add a field
(nid) to the UV blade structure to indicates whether the node
has local memory. This is needed by the GRU driver (pushed
separately).

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: linux-mm@kvack.org
LKML-Reference: <20090727143507.GA7006@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04 16:18:01 +02:00
Ingo Molnar
47cab6a722 debug lockups: Improve lockup detection, fix generic arch fallback
As Andrew noted, my previous patch ("debug lockups: Improve lockup
detection") broke/removed SysRq-L support from architecture that do
not provide a __trigger_all_cpu_backtrace implementation.

Restore a fallback path and clean up the SysRq-L machinery a bit:

 - Rename the arch method to arch_trigger_all_cpu_backtrace()

 - Simplify the define

 - Document the method a bit - in the hope of more architectures
   adding support for it.

[ The patch touches Sparc code for the rename. ]

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "David S. Miller" <davem@davemloft.net>
LKML-Reference: <20090802140809.7ec4bb6b.akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-03 09:56:52 +02:00
Bartlomiej Zolnierkiewicz
25f6e89bed x86: Remove superfluous NULL pointer check in destroy_irq()
This takes care of the following entry from Dan's list:

  arch/x86/kernel/apic/io_apic.c +3241 destroy_irq(11) warning: variable derefenced before check 'desc'

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Julia Lawall <julia@diku.dk>
LKML-Reference: <200907302321.19086.bzolnier@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-02 21:37:00 +02:00
Ingo Molnar
c1dc0b9c0c debug lockups: Improve lockup detection
When debugging a recent lockup bug i found various deficiencies
in how our current lockup detection helpers work:

 - SysRq-L is not very efficient as it uses a workqueue, hence
   it cannot punch through hard lockups and cannot see through
   most soft lockups either.

 - The SysRq-L code depends on the NMI watchdog - which is off
   by default.

 - We dont print backtraces from the RCU code's built-in
   'RCU state machine is stuck' debug code. This debug
   code tends to be one of the first (and only) mechanisms
   that show that a lockup has occured.

This patch changes the code so taht we:

 - Trigger the NMI backtrace code from SysRq-L instead of using
   a workqueue (which cannot punch through hard lockups)

 - Trigger print-all-CPU-backtraces from the RCU lockup detection
   code

Also decouple the backtrace printing code from the NMI watchdog:

 - Dont use variable size cpumasks (it might not be initialized
   and they are a bit more fragile anyway)

 - Trigger an NMI immediately via an IPI, instead of waiting
   for the NMI tick to occur. This is a lot faster and can
   produce more relevant backtraces. It will also work if the
   NMI watchdog is disabled.

 - Dont print the 'dazed and confused' message when we print
   a backtrace from the NMI

 - Do a show_regs() plus a dump_stack() to get maximum info
   out of the dump. Worst-case we get two stacktraces - which
   is not a big deal. Sometimes, if register content is
   corrupted, the precise stack walker in show_regs() wont
   give us a full backtrace - in this case dump_stack() will
   do it.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-02 13:27:17 +02:00
Linus Torvalds
499ee0710f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86/pci: insert ioapic resource before assigning unassigned resources
2009-07-17 10:51:55 -07:00
Jeremy Fitzhardinge
e25371d60c x86/ioapic.c: unify ioapic_retrigger_irq()
The 32 and 64-bit versions of ioapic_retrigger_irq() are identical
except the 64-bit one takes vector_lock.  vector_lock is defined and
used on 32-bit too, so just use a common ioapic_retrigger_irq().

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:51 -07:00
Jeremy Fitzhardinge
638f2f8c52 x86/ioapic.c: convert __target_IO_APIC_irq to conventional for() loop
Use a normal for() loop in __target_IO_APIC_irq().

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:50 -07:00
Jeremy Fitzhardinge
4eea6fff61 x86/ioapic.c: clean up replace_pin_at_irq_node logic and comments
There's no need for a control variable in replace_pin_at_irq_node();
it can just return if it finds the old apic/pin to replace.

If the loop terminates, then it didn't find the old apic/pin, so it can
add the new ones.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:50 -07:00
Jeremy Fitzhardinge
535b64291a x86/ioapic.c: convert replace_pin_at_irq_node to conventional for() loop
Use a conventional for() loop in replace_pin_at_irq_node().

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:50 -07:00
Jeremy Fitzhardinge
875e68ec32 x86/ioapic.c: simplify add_pin_to_irq_node()
Rather than duplicating the same alloc/init code twice, restructure
the function to look for duplicates and then add an entry
if none is found.

This function is not performance critical; all but one of its callers
are __init functions, and the non-__init caller is for PCI device setup.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:50 -07:00
Jeremy Fitzhardinge
d8c52063ed x86/ioapic.c: convert io_apic_level_ack_pending loop to normal for() loop
Convert the unconventional loop in io_apic_level_ack_pending() to
a conventional for() loop.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:50 -07:00
Jeremy Fitzhardinge
8e13d697fe x86/ioapic.c: move lost comment to what seems like appropriate place
The comment got separated from its subject, so move it to what
appears to be the right place, and update to describe the current
structure.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:49 -07:00
Jeremy Fitzhardinge
83c21bedf6 x86/ioapic.c: remove redundant declaration of irq_pin_list
The structure is defined immediately below, so there's no need
to forward declare it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:49 -07:00
Jeremy Fitzhardinge
916a0fe739 x86/ioapic.c: remove #ifdef for 82093AA workaround
While no 64-bit hardware will have a version 0x11 I/O APIC which needs
the level/edge bug workaround, that's not a particular reason to use
CONFIG_X86_32 to #ifdef the code out.  Most 32-bit machines will no
longer need the workaround either, so the test to see whether it is
necessary should be more fine-grained than "32-bit=yes, 64-bit=no".

(Also fix formatting of block comment.)

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:49 -07:00
Jeremy Fitzhardinge
890aeacf64 x86/ioapic.c: unify __mask_IO_APIC_irq()
The main difference between 32 and 64-bit __mask_IO_APIC_irq() does a
readback from the I/O APIC to synchronize it.

If there's a hardware requirement to do a readback sync after updating
an APIC register, then it will be a hardware requrement regardless of
whether the kernel is compiled 32 or 64-bit.

Unify __mask_IO_APIC_irq() using the 64-bit version which always syncs
with io_apic_sync().

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:49 -07:00
Jeremy Fitzhardinge
2f210deba9 x86/ioapic.c: ioapic_modify_irq is too large to inline
If ioapic_modify_irq() is marked inline, it gets inlined several times.
Un-inlining it saves around 200 bytes in .text for me.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-07-14 13:32:48 -07:00
Rakib Mullick
7473727be8 x86, apic: Fix false positive section mismatch in numaq_32.c
The variable apic_numaq placed in noninit section references the
function wakeup_secondary_cpu_via_nmi(), which is in __cpuinit
section. Thus causes a section mismatch warning. To avoid such
mismatch we mark apic_numaq as __refdata.

We were warned by the following warning:

  WARNING: arch/x86/kernel/built-in.o(.data+0x932c): Section mismatch in
  reference from the variable apic_numaq to the function
  .cpuinit.text:wakeup_secondary_cpu_via_nmi()

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <b9df5fa10907120407p6b4f67dtf4d563155488188a@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-13 11:03:27 +02:00
Rakib Mullick
151586d0f7 x86: Fix false positive section mismatch in es7000_32.c
The variable apic_es7000_cluster references the function __cpuinit
wakeup_secondary_cpu_via_mip() from a noninit section. So we've been
warned by the following warning. To avoid possible collision between
init/noninit, its best to mark the variable as __refdata.

We were warned by the following warning:

  LD      arch/x86/kernel/apic/built-in.o
  WARNING: arch/x86/kernel/apic/built-in.o(.data+0x198c): Section
  mismatch in reference from the variable apic_es7000_cluster to the
  function .cpuinit.text:wakeup_secondary_cpu_via_mip()

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <b9df5fa10907120404k6279a10ch5e9682432272706f@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-13 11:03:26 +02:00
Yinghai Lu
857fdc53a0 x86/pci: insert ioapic resource before assigning unassigned resources
Stephen reported that his DL585 G2 needed noapic after 2.6.22 (?)

Dann bisected it down to:
  commit 30a18d6c3f
  Date:   Tue Feb 19 03:21:20 2008 -0800

      x86: multi pci root bus with different io resource range, on
      64-bit

It turns out that:
  1. that AMD-based systems have two HT chains.
  2. BIOS doesn't allocate resources for BAR 6 of devices under 8132 etc
  3. that multi-peer-root patch will try to split root resources to peer
     root resources according to PCI conf of NB
  4. PCI core assigns unassigned resources, but they overlap with BARs
     that are used by ioapic addr of io4 and 8132.

The reason: at that point ioapic address are not inserted yet.  Solution
is to insert ioapic resources into the tree a bit earlier.

Reported-by: Stephen Frost <sfrost@snowman.net>
Reported-and-Tested-by: dann frazier <dannf@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@jbarnes-g45.(none)>
2009-07-10 13:03:14 -07:00
Cyrill Gorcunov
a1b4f1a5b7 x86, ipi: Clean up safe_smp_processor_id() by using the cpu_has_apic() macro helper
We already use a lot of cpu_has_ helpers.
Lets do here the same for consistency.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090705160154.GB4791@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-10 15:58:34 +02:00
Andi Kleen
c1ebf83561 x86: mce: Rename CONFIG_X86_NEW_MCE to CONFIG_X86_MCE
Drop the CONFIG_X86_NEW_MCE symbol and change all
references to it to check for CONFIG_X86_MCE directly.

No code changes

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-07-09 18:39:47 -07:00
Joe Perches
ad361c9884 Remove multiple KERN_ prefixes from printk formats
Commit 5fd29d6ccb ("printk: clean up
handling of log-levels and newlines") changed printk semantics.  printk
lines with multiple KERN_<level> prefixes are no longer emitted as
before the patch.

<level> is now included in the output on each additional use.

Remove all uses of multiple KERN_<level>s in formats.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-08 10:30:03 -07:00
Jaswinder Singh Rajput
23d0cd8e71 x86: Remove unused variable disable_x2apic
setup_nox2apic() is writing 1 to disable_x2apic but no one is reading it.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <1246554239.2242.27.camel@jaswinder.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-03 14:34:27 +02:00
Ingo Molnar
251e1e44b9 x86: Fix printk call in print_local_apic()
Instead of this:

[   75.690022] <7>printing local APIC contents on CPU#0/0:
[   75.704406] ... APIC ID:      00000000 (0)
[   75.707905] ... APIC VERSION: 00060015
[   75.722551] ... APIC TASKPRI: 00000000 (00)
[   75.725473] ... APIC PROCPRI: 00000000
[   75.728592] ... APIC LDR: 00000001
[   75.742137] ... APIC SPIV: 000001ff
[   75.744101] ... APIC ISR field:
[   75.746648] 0123456789abcdef0123456789abcdef
[   75.746649] <7>00000000000000000000000000000000

Improve the code to be saner and simpler and just print out
the bitfield in a single line using hexa values - not as a
(rather pointless) binary bitfield.

Partially reused Linus's initial fix for this.

Reported-and-Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A4C43BC.90506@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-02 08:54:08 +02:00
Weidong Han
f007e99c8e Intel-IOMMU, intr-remap: source-id checking
To support domain-isolation usages, the platform hardware must be
capable of uniquely identifying the requestor (source-id) for each
interrupt message. Without source-id checking for interrupt remapping
, a rouge guest/VM with assigned devices can launch interrupt attacks
to bring down anothe guest/VM or the VMM itself.

This patch adds source-id checking for interrupt remapping, and then
really isolates interrupts for guests/VMs with assigned devices.

Because PCI subsystem is not initialized yet when set up IOAPIC
entries, use read_pci_config_byte to access PCI config space directly.

Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-23 22:09:17 +01:00
Cyrill Gorcunov
3f4c3955ea x86, ioapic: Don't call disconnect_bsp_APIC if no APIC present
Vegard Nossum reported:

[  503.576724] ACPI: Preparing to enter system sleep state S5
[  503.710857] Disabling non-boot CPUs ...
[  503.716853] Power down.
[  503.717770] ------------[ cut here ]------------
[  503.717770] WARNING: at arch/x86/kernel/apic/apic.c:249 native_apic_write_du)
[  503.717770] Hardware name: OptiPlex GX100
[  503.717770] Modules linked in:
[  503.717770] Pid: 2136, comm: halt Not tainted 2.6.30 #443
[  503.717770] Call Trace:
[  503.717770]  [<c154d327>] ? printk+0x18/0x1a
[  503.717770]  [<c1017358>] ? native_apic_write_dummy+0x38/0x50
[  503.717770]  [<c10360fc>] warn_slowpath_common+0x6c/0xc0
[  503.717770]  [<c1017358>] ? native_apic_write_dummy+0x38/0x50
[  503.717770]  [<c1036165>] warn_slowpath_null+0x15/0x20
[  503.717770]  [<c1017358>] native_apic_write_dummy+0x38/0x50
[  503.717770]  [<c1017173>] disconnect_bsp_APIC+0x63/0x100
[  503.717770]  [<c1019e48>] disable_IO_APIC+0xb8/0xc0
[  503.717770]  [<c1214231>] ? acpi_power_off+0x0/0x29
[  503.717770]  [<c1015e55>] native_machine_shutdown+0x65/0x80
[  503.717770]  [<c1015c36>] native_machine_power_off+0x26/0x30
[  503.717770]  [<c1015c49>] machine_power_off+0x9/0x10
[  503.717770]  [<c1046596>] kernel_power_off+0x36/0x40
[  503.717770]  [<c104680d>] sys_reboot+0xfd/0x1f0
[  503.717770]  [<c109daa0>] ? perf_swcounter_event+0xb0/0x130
[  503.717770]  [<c109db7d>] ? perf_counter_task_sched_out+0x5d/0x120
[  503.717770]  [<c102dfc6>] ? finish_task_switch+0x56/0xd0
[  503.717770]  [<c154da1e>] ? schedule+0x49e/0xb40
[  503.717770]  [<c10444b0>] ? sys_kill+0x70/0x160
[  503.717770]  [<c119d9db>] ? selinux_file_ioctl+0x3b/0x50
[  503.717770]  [<c10dd443>] ? sys_ioctl+0x63/0x70
[  503.717770]  [<c1003024>] sysenter_do_call+0x12/0x22
[  503.717770] ---[ end trace 8157b5d0ed378f15 ]---

|
| That's including this commit:
|
| commit 103428e57b
|Author: Cyrill Gorcunov <gorcunov@openvz.org>
|Date:   Sun Jun 7 16:48:40 2009 +0400
|
|    x86, apic: Fix dummy apic read operation together with broken MP handling
|

If we have apic disabled we don't even switch to APIC mode and do not
calling for connect_bsp_APIC. Though on SMP compiled kernel the
native_machine_shutdown does try to write the apic register anyway.

Fix it with explicit check if we really should touch apic registers.

Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090617181322.GG10822@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-17 20:24:39 +02:00
Huang Weiyi
8653f88ff9 x86: Remove duplicated #include's
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
LKML-Reference: <1244895686-2348-1-git-send-email-weiyi.huang@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-17 19:02:35 +02:00
Jaswinder Singh Rajput
8f7007aabe x86: apic/io_apic.c: dmar_msi_type should be static
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-17 17:16:08 +02:00
Figo.zhang
50a8d4d297 x86, io_apic.c: Work around compiler warning
This compiler warning:

  arch/x86/kernel/apic/io_apic.c: In function ‘ioapic_write_entry’:
  arch/x86/kernel/apic/io_apic.c:466: warning: ‘eu’ is used uninitialized in this function
  arch/x86/kernel/apic/io_apic.c:465: note: ‘eu’ was declared here

Is bogus as 'eu' is always initialized. But annotate it away by
initializing the variable, to make it easier for people to notice
real warnings. A compiler that sees through this logic will
optimize away the initialization.

Signed-off-by: Figo.zhang <figo1802@gmail.com>
LKML-Reference: <1245248720.3312.27.camel@myhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-17 17:13:25 +02:00
Linus Torvalds
19035e5b5d Merge branch 'timers-for-linus-migration' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus-migration' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  timers: Logic to move non pinned timers
  timers: /proc/sys sysctl hook to enable timer migration
  timers: Identifying the existing pinned timers
  timers: Framework for identifying pinned timers
  timers: allow deferrable timers for intervals tv2-tv5 to be deferred

Fix up conflicts in kernel/sched.c and kernel/timer.c manually
2009-06-15 10:06:19 -07:00
Linus Torvalds
a2ee2981ae Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (80 commits)
  x86, mce: Add boot options for corrected errors
  x86, mce: Fix mce printing
  x86, mce: fix for mce counters
  x86, mce: support action-optional machine checks
  x86, mce: define MCE_VECTOR
  x86, mce: rename mce_notify_user to mce_notify_irq
  x86: fix panic with interrupts off (needed for MCE)
  x86, mce: export MCE severities coverage via debugfs
  x86, mce: implement new status bits
  x86, mce: print header/footer only once for multiple MCEs
  x86, mce: default to panic timeout for machine checks
  x86, mce: improve mce_get_rip
  x86, mce: make non Monarch panic message "Fatal machine check" too
  x86, mce: switch x86 machine check handler to Monarch election.
  x86, mce: implement panic synchronization
  x86, mce: implement bootstrapping for machine check wakeups
  x86, mce: check early in exception handler if panic is needed
  x86, mce: add table driven machine check grading
  x86, mce: remove TSC print heuristic
  x86, mce: log corrected errors when panicing
  ...
2009-06-13 13:14:51 -07:00
Yinghai Lu
12274e96b4 x86: use zalloc_cpumask_var in arch_early_irq_init
So we make sure MAXSMP gets a cleared cpumask

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-11 20:04:36 -07:00
Ingo Molnar
0d5959723e Merge branch 'linus' into x86/mce3
Conflicts:
	arch/x86/kernel/cpu/mcheck/mce_64.c
	arch/x86/kernel/irq.c

Merge reason: Resolve the conflicts above.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 23:31:52 +02:00
Linus Torvalds
8a1ca8cedd Merge branch 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (574 commits)
  perf_counter: Turn off by default
  perf_counter: Add counter->id to the throttle event
  perf_counter: Better align code
  perf_counter: Rename L2 to LL cache
  perf_counter: Standardize event names
  perf_counter: Rename enums
  perf_counter tools: Clean up u64 usage
  perf_counter: Rename perf_counter_limit sysctl
  perf_counter: More paranoia settings
  perf_counter: powerpc: Implement generalized cache events for POWER processors
  perf_counters: powerpc: Add support for POWER7 processors
  perf_counter: Accurate period data
  perf_counter: Introduce struct for sample data
  perf_counter tools: Normalize data using per sample period data
  perf_counter: Annotate exit ctx recursion
  perf_counter tools: Propagate signals properly
  perf_counter tools: Small frequency related fixes
  perf_counter: More aggressive frequency adjustment
  perf_counter/x86: Fix the model number of Intel Core2 processors
  perf_counter, x86: Correct some event and umask values for Intel processors
  ...
2009-06-11 14:01:07 -07:00
Yinghai Lu
dad213aeb5 irq/cpumask: make memoryless node zero happy
Don't hardcode to node zero for early boot IRQ setup memory allocations.

[ penberg@cs.helsinki.fi: minor cleanups ]
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-11 19:27:08 +03:00
Yinghai Lu
38c7fed2f5 x86: remove some alloc_bootmem_cpumask_var calling
Now that we set up the slab allocator earlier, we can get rid of some
alloc_bootmem_cpumask_var() calls in boot code.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-11 19:27:07 +03:00
Ingo Molnar
940010c5a3 Merge branch 'linus' into perfcounters/core
Conflicts:
	arch/x86/kernel/irqinit.c
	arch/x86/kernel/irqinit_64.c
	arch/x86/kernel/traps.c
	arch/x86/mm/fault.c
	include/linux/sched.h
	kernel/exit.c
2009-06-11 17:55:42 +02:00
Linus Torvalds
9b29e8228a Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Clear TS in irq_ts_save() when in an atomic section
  x86: Detect use of extended APIC ID for AMD CPUs
  x86: memtest: remove 64-bit division
  x86, UV: Fix macros for multiple coherency domains
  x86: Fix non-lazy GS handling in sys_vm86()
  x86: Add quirk for reboot stalls on a Dell Optiplex 360
  x86: Fix UV BAU activation descriptor init
2009-06-10 16:15:14 -07:00
Linus Torvalds
7dc3ca39cb Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, nmi: Use predefined numbers instead of hardcoded one
  x86: asm/processor.h: remove double declaration
  x86, mtrr: replace MTRRdefType_MSR with msr-index's MSR_MTRRdefType
  x86, mtrr: replace MTRRfix4K_C0000_MSR with msr-index's MSR_MTRRfix4K_C0000
  x86, mtrr: remove mtrr MSRs double declaration
  x86, mtrr: replace MTRRfix16K_80000_MSR with msr-index's MSR_MTRRfix16K_80000
  x86, mtrr: replace MTRRfix64K_00000_MSR with msr-index's MSR_MTRRfix64K_00000
  x86, mtrr: replace MTRRcap_MSR with msr-index's MSR_MTRRcap
  x86: mce: remove duplicated #include
  x86: msr-index.h remove duplicate MSR C001_0015 declaration
  x86: clean up arch/x86/kernel/tsc_sync.c a bit
  x86: use symbolic name for VM86_SIGNAL when used as vm86 default return
  x86: added 'ifndef _ASM_X86_IOMAP_H' to iomap.h
  x86: avoid multiple declaration of kstack_depth_to_print
  x86: vdso/vma.c declare vdso_enabled and arch_setup_additional_pages before they get used
  x86: clean up declarations and variables
  x86: apic/x2apic_cluster.c x86_cpu_to_logical_apicid should be static
  x86 early quirks: eliminate unused function
2009-06-10 15:49:36 -07:00
Jack Steiner
c4ed3f04ba x86, UV: Fix macros for multiple coherency domains
Fix bug in the SGI UV macros that support systems with multiple
coherency domains.  The macros used for referencing global MMR
(chipset registers) are failing to correctly "or" the NASID
(node identifier) bits that reside above M+N. These high bits
are supplied automatically by the chipset for memory accesses
coming from the processor socket.

However, the bits must be present for references to the special
global MMR space used to map chipset registers. (See uv_hub.h
for more details ...)

The bug results in references to invalid/incorrect nodes.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: <stable@kernel.org>
LKML-Reference: <20090608154405.GA16395@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-08 18:57:47 +02:00
Cyrill Gorcunov
103428e57b x86, apic: Fix dummy apic read operation together with broken MP handling
Ingo Molnar reported that read_apic is buggy novadays:

[    0.000000] Using APIC driver default
[    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[    0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
[    0.000000] APIC: disable apic facility
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: at arch/x86/kernel/apic/apic.c:254 native_apic_read_dummy+0x2d/0x3b()
[    0.000000] Hardware name: HP OmniBook PC

Indeed we still rely on apic->read operation for SMP compiled
kernel. And instead of disfigure the SMP code with #ifdef we
allow to call apic->read. To capture any unexpected results
we check for apic->read being called for sane reason via
WARN_ON_ONCE but(!) instead of OR we should use AND logical
operation (thanks Yinghai for spotting the root of the problem).

Along with that we could be have bad MP table and we are
to fix it that way no SMP started and no complains about
BIOS bug if apic was just disabled via command line.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090607124840.GD4547@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 16:08:05 +02:00
Jiri Slaby
3d58829b05 x86, apic: Restore irqs on fail paths
lapic_resume forgets to restore interrupts on fail paths.
Fix that.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <1243497289-18591-1-git-send-email-jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
2009-06-02 02:48:59 +02:00
Naga Chumbalkar
58f892e022 x86: Print real IOAPIC version for x86-64
Fix the fact that the IOAPIC version number in the x86_64 code path always
gets assigned to 0, instead of the correct value.

Before the patch: (from "dmesg" output):

 ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
 IOAPIC[0]: apic_id 8, version 0, address 0xfec00000, GSI 0-23     <---

 After the patch:
 ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
 IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23    <---

History:

io_apic_get_version() was compiled out of the x86_64 code path in the commit
f2c2cca3ac:

Author: Andi Kleen <ak@suse.de>
Date:   Tue Sep 26 10:52:37 2006 +0200

    [PATCH] Remove APIC version/cpu capability mpparse checking/printing

    ACPI went to great trouble to get the APIC version and CPU capabilities
    of different CPUs before passing them to the mpparser. But all
    that data was used was to print it out.  Actually it even faked some data
    based on the boot cpu, not on the actual CPU being booted.

    Remove all this code because it's not needed.

    Cc: len.brown@intel.com

At the time, the IOAPIC version number was deliberately not printed
in the x86_64 code path. However, after the x86 and x86_64 files were
merged, the net result is that the IOAPIC version is printed incorrectly
in the x86_64 code path.

The patch below provides a fix. I have tested it with acpi, and with
acpi=off, and did not see any problems.

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
LKML-Reference: <20090416014230.4885.94926.sendpatchset@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
*************************
2009-06-02 02:03:18 +02:00
H. Peter Anvin
48b1fddbb1 Merge branch 'irq/numa' into x86/mce3
Merge reason: arch/x86/kernel/irqinit_{32,64}.c unified in irq/numa
and modified in x86/mce3; this merge resolves the conflict.

Conflicts:
	arch/x86/kernel/irqinit.c

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-01 15:25:31 -07:00
Ingo Molnar
3d58f48ba0 Merge branch 'linus' into irq/numa
Conflicts:
	arch/mips/sibyte/bcm1480/irq.c
	arch/mips/sibyte/sb1250/irq.c

Merge reason: we gathered a few conflicts plus update to latest upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-01 21:06:21 +02:00
Ingo Molnar
23db9f430b Merge branch 'linus' into perfcounters/core
Merge reason: merge almost-rc8 into perfcounters/core, which was -rc6
              based - to pick up the latest upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-01 10:01:39 +02:00
Yong Wang
c323d95fa4 perf_counter/x86: Always use NMI for performance-monitoring interrupt
Always use NMI for performance-monitoring interrupt as there could be
racy situations if we switch between irq and nmi mode frequently.

Signed-off-by: Yong Wang <yong.y.wang@intel.com>
LKML-Reference: <20090529052835.GA13657@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-29 09:04:58 +02:00
Andi Kleen
4efc0670ba x86, mce: use 64bit machine check code on 32bit
The 64bit machine check code is in many ways much better than
the 32bit machine check code: it is more specification compliant,
is cleaner, only has a single code base versus one per CPU,
has better infrastructure for recovery, has a cleaner way to communicate
with user space etc. etc.

Use the 64bit code for 32bit too.

This is the second attempt to do this. There was one a couple of years
ago to unify this code for 32bit and 64bit.  Back then this ran into some
trouble with K7s and was reverted.

I believe this time the K7 problems (and some others) are addressed.
I went over the old handlers and was very careful to retain
all quirks.

But of course this needs a lot of testing on old systems. On newer
64bit capable systems I don't expect much problems because they have been
already tested with the 64bit kernel.

I made this a CONFIG for now that still allows to select the old
machine check code. This is mostly to make testing easier,
if someone runs into a problem we can ask them to try
with the CONFIG switched.

The new code is default y for more coverage.

Once there is confidence the 64bit code works well on older hardware
too the CONFIG_X86_OLD_MCE and the associated code can be easily
removed.

This causes a behaviour change for 32bit installations. They now
have to install the mcelog package to be able to log
corrected machine checks.

The 64bit machine check code only handles CPUs which support the
standard Intel machine check architecture described in the IA32 SDM.
The 32bit code has special support for some older CPUs which
have non standard machine check architectures, in particular
WinChip C3 and Intel P5.  I made those a separate CONFIG option
and kept them for now. The WinChip variant could be probably
removed without too much pain, it doesn't really do anything
interesting. P5 is also disabled by default (like it
was before) because many motherboards have it miswired, but
according to Alan Cox a few embedded setups use that one.

Forward ported/heavily changed version of old patch, original patch
included review/fixes from Thomas Gleixner, Bert Wesarg.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-28 09:24:13 -07:00
Peter Zijlstra
48e22d56ec perf_counter: x86: Remove interrupt throttle
remove the x86 specific interrupt throttle

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090525153931.616671838@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-25 21:41:12 +02:00
Paul Mackerras
a63eaf34ae perf_counter: Dynamically allocate tasks' perf_counter_context struct
This replaces the struct perf_counter_context in the task_struct with
a pointer to a dynamically allocated perf_counter_context struct.  The
main reason for doing is this is to allow us to transfer a
perf_counter_context from one task to another when we do lazy PMU
switching in a later patch.

This has a few side-benefits: the task_struct becomes a little smaller,
we save some memory because only tasks that have perf_counters attached
get a perf_counter_context allocated for them, and we can remove the
inclusion of <linux/perf_counter.h> in sched.h, meaning that we don't
end up recompiling nearly everything whenever perf_counter.h changes.

The perf_counter_context structures are reference-counted and freed
when the last reference is dropped.  A context can have references
from its task and the counters on its task.  Counters can outlive the
task so it is possible that a context will be freed well after its
task has exited.

Contexts are allocated on fork if the parent had a context, or
otherwise the first time that a per-task counter is created on a task.
In the latter case, we set the context pointer in the task struct
locklessly using an atomic compare-and-exchange operation in case we
raced with some other task in creating a context for the subject task.

This also removes the task pointer from the perf_counter struct.  The
task pointer was not used anywhere and would make it harder to move a
context from one task to another.  Anything that needed to know which
task a counter was attached to was already using counter->ctx->task.

The __perf_counter_init_context function moves up in perf_counter.c
so that it can be called from find_get_context, and now initializes
the refcount, but is otherwise unchanged.

We were potentially calling list_del_counter twice: once from
__perf_counter_exit_task when the task exits and once from
__perf_counter_remove_from_context when the counter's fd gets closed.
This adds a check in list_del_counter so it doesn't do anything if
the counter has already been removed from the lists.

Since perf_counter_task_sched_in doesn't do anything if the task doesn't
have a context, and leaves cpuctx->task_ctx = NULL, this adds code to
__perf_install_in_context to set cpuctx->task_ctx if necessary, i.e. in
the case where the current task adds the first counter to itself and
thus creates a context for itself.

This also adds similar code to __perf_counter_enable to handle a
similar situation which can arise when the counters have been disabled
using prctl; that also leaves cpuctx->task_ctx = NULL.

[ Impact: refactor counter context management to prepare for new feature ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <18966.10075.781053.231153@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-22 12:18:19 +02:00
Yinghai Lu
4c6f18fc81 x86, io-apic: Don't mark pin_programmed early
Peter bisected that:

| commit b9c61b7007
| Date:   Wed May 6 10:10:06 2009 -0700
|
|     x86/pci: update pirq_enable_irq() to setup io apic routing
|
|     So we can set io apic routing only when enabling the device irq.

wrecked his opteron box, ata1 interrupts fail to get through.

ata1 is using irq 11:

[    1.451839] sata_svw 0000:01:0e.0: version 2.3
[    1.456333] sata_svw 0000:01:0e.0: PCI INT A -> GSI 11 (level, low) -> IRQ 11
[    1.463639] scsi0 : sata_svw
[    1.466949] scsi1 : sata_svw
[    1.470022] scsi2 : sata_svw
[    1.473090] scsi3 : sata_svw
[    1.476112] ata1: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe000 irq 11
[    1.483490] ata2: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe100 irq 11
[    1.490870] ata3: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe200 irq 11
[    1.498247] ata4: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe300 irq 11

that pin is overlapped with pin with legacy ones.

We should not set bits in pin_programmed here, so that those bit could
be set later via io_apic_set_pci_routing().

[ Impact: fix boot hang on certain systems ]

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Yinghai Lu <yinghai.lu@kernel.org>
Tested-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jack Steiner <steiner@sgi.com>
LKML-Reference: <4A119990.9020606@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-19 14:26:51 +02:00
Yinghai Lu
2759c3287d x86: don't call read_apic_id if !cpu_has_apic
should not call that if apic is disabled.

[ Impact: fix crash on certain UP configs ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <4A09CCBB.2000306@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 08:43:25 +02:00
Yinghai Lu
e5198075c6 x86, apic: introduce io_apic_irq_attr
according to Ingo, io_apic irq-setup related functions have too many
parameters with a repetitive signature.

So reduce related funcs to get less params by passing a pointer
to a newly defined io_apic_irq_attr structure.

v2: io_apic_irq ==> irq_attr
    triggering ==> trigger

v3: add set_io_apic_irq_attr

[ Impact: cleanup ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Len Brown <lenb@kernel.org>
LKML-Reference: <4A08ACD3.2070401@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 08:38:55 +02:00
Arun R Bharadwaj
5c333864a6 timers: Identifying the existing pinned timers
* Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-04-16 12:11:36]:

The following pinned hrtimers have been identified and marked:
1)sched_rt_period_timer
2)tick_sched_timer
3)stack_trace_timer_fn

[ tglx: fixup the hrtimer pinned mode ]

Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-05-13 16:52:42 +02:00
Yinghai Lu
4797f6b021 x86: read apic ID in the !acpi_lapic case
Ed found that on 32-bit, boot_cpu_physical_apicid is not read right,
when the mptable is broken.

Interestingly, actually three paths use/set it:

 1. acpi: at that time that is already read from reg
 2. mptable: only read from mptable
 3. no madt, and no mptable, that use default apic id 0 for 64-bit, -1 for 32-bit

so we could read the apic id for the 2/3 path. We trust the hardware
register more than we trust a BIOS data structure (the mptable).

We can also avoid the double set_fixmap() when acpi_lapic
is used, and also need to move cpu_has_apic earlier and
call apic_disable().

Also when need to update the apic id, we'd better read and
set the apic version as well - so that quirks are applied precisely.

v2: make path 3 with 64bit, use -1 as apic id, so could read it later.
v3: fix whitespace problem pointed out by Ed Swierk
v5: fix boot crash

[ Impact: get correct apic id for bsp other than acpi path ]

Reported-by: Ed Swierk <eswierk@aristanetworks.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <49FC85A9.2070702@kernel.org>
[ v4: sanity-check in the ACPI case too ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-12 12:22:06 +02:00
Ingo Molnar
6cda3eb62e Merge branch 'x86/apic' into irq/numa
Merge reason: both topics modify the APIC code but were able to do it in
              parallel so far. An upcoming patch generates a conflict so
              merge them to avoid the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-12 12:17:36 +02:00
Cyrill Gorcunov
cec6be6d10 x86: apic: Fixmap apic address even if apic disabled
In case if apic were disabled by boot option
we still need read_apic operation. So fixmap
a fake apic area if needed.

[ Impact: fix boot crash ]

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: eswierk@aristanetworks.com
LKML-Reference: <20090511134140.GH4624@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 15:50:58 +02:00
Andreas Herrmann
97a5271465 x86: display extended apic registers with print_local_APIC and cpu_debug code
Both print_local_APIC (used when apic=debug kernel param is set) and
cpu_debug code missed support for some extended APIC registers that
I'd like to see.

This adds support to show:

 - extended APIC feature register
 - extended APIC control register
 - extended LVT registers

[ Impact: print more debug info ]

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <20090508162350.GO29045@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 14:37:36 +02:00
Yinghai Lu
4401da6111 x86: read apic ID in the !acpi_lapic case
Ed found that on 32-bit, boot_cpu_physical_apicid is not read right,
when the mptable is broken.

Interestingly, actually three paths use/set it:

 1. acpi: at that time that is already read from reg
 2. mptable: only read from mptable
 3. no madt, and no mptable, that use default apic id 0 for 64-bit, -1 for 32-bit

so we could read the apic id for the 2/3 path. We trust the hardware
register more than we trust a BIOS data structure (the mptable).

We can also avoid the double set_fixmap() when acpi_lapic
is used, and also need to move cpu_has_apic earlier and
call apic_disable().

Also when need to update the apic id, we'd better read and
set the apic version as well - so that quirks are applied precisely.

v2: make path 3 with 64bit, use -1 as apic id, so could read it later.
v3: fix whitespace problem pointed out by Ed Swierk

[ Impact: get correct apic id for bsp other than acpi path ]

Reported-by: Ed Swierk <eswierk@aristanetworks.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <49FC85A9.2070702@kernel.org>
[ v4: sanity-check in the ACPI case too ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 11:29:23 +02:00
Yinghai Lu
61fe91e131 x86: apic: Check rev 3 fadt correctly for physical_apic bit
Impact: fix fadt version checking

FADT2_REVISION_ID has value 3 aka rev 3 FADT. So need to use >= instead
of >, as other places in the code do.

[ Impact: extend scope of APIC boot quirk ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 10:52:40 +02:00
Yinghai Lu
b9c61b7007 x86/pci: update pirq_enable_irq() to setup io apic routing
So we can set io apic routing only when enabling the device irq.

This is advantageous for IRQ descriptor allocation affinity: if we set up
the IO-APIC entry later, we have a chance to allocate the IRQ descriptor
later and know which device it is on and can set affinity accordingly.

[ Impact: standardize/enhance irq-enabling sequence for mptable irqs ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A01C46E.8000501@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 10:35:10 +02:00
Yinghai Lu
5ef2183768 x86/acpi: move setup io apic routing out of CONFIG_ACPI scope
So we could set io apic routing when ACPI is not enabled.

[ Impact: prepare for new functionality ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Len Brown <lenb@kernel.org>
LKML-Reference: <4A01C422.5070400@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 10:35:09 +02:00
Yinghai Lu
e20c06fd69 x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector()
To prepare those params for pcibios_irq_enable() to call setup_io_apic_routing().

[ Impact: extend function call API to prepare for new functionality ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A01C406.2040303@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 10:35:09 +02:00
Yinghai Lu
bdfe8ac153 x86/acpi: move pin_programmed bit map to io_apic.c
Prepare to call setup_io_apic_routing() in pcibios_irq_enable()
also remove not needed member apic_id.

[ Impact: clean up, prepare for future change ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Len Brown <lenb@kernel.org>
LKML-Reference: <4A01C3DD.3050104@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 10:35:08 +02:00
Ingo Molnar
7a309490da Merge commit 'v2.6.30-rc5' into x86/apic
Merge reason: this branch was on a .30-rc2 base - sync it up with
              all the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 09:50:02 +02:00
Sam Ravnborg
b74d446f1f x86: Fix false positive section mismatch warnings in the apic code
[ Impact: reduce kernel image size a bit, annotate away warnings ]

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
[ modified and tested it ]
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
LKML-Reference: <b9df5fa10905090235s4bfd26a8o979f93809c9727ad@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-10 09:26:54 +02:00
Cyrill Gorcunov
9a8709d441 x86: uv - prevent NULL dereference in uv_system_init()
We may reach NULL dereference oops if kmalloc failed.
Prevent it with explicit BUG_ON.

[ Impact: more controlled assert in 'impossible' scenario ]

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090501202511.GE4633@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-03 10:49:31 +02:00
Cyrill Gorcunov
1cbac972ba x86: uv io-apic - use BUILD_BUG_ON instead of BUG_ON
The expression is known to be true/false at compilation
time so we're allowed to use build-time instead of
run-time check. Also align 'entry' items assignment.

[ Impact: shrink kernel a bit, cleanup ]

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090502093956.GB4791@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-03 10:38:21 +02:00
Cyrill Gorcunov
6f0aced639 x86, apic: use pr_ macro
Replace recenly appeared printk with pr_ macro
(the file already use a lot of them).

[ Impact: cleanup ]

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <20090501195425.GB4633@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-01 21:56:25 +02:00
Ingo Molnar
4420471f14 Merge branch 'x86/apic' into irq/numa
Conflicts:
	arch/x86/kernel/apic/io_apic.c

Merge reason: non-trivial interaction between ongoing work in io_apic.c
              and the NUMA migration feature in the irq tree.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-01 19:02:50 +02:00
Yinghai Lu
15e957d08d x86/irq: use move_irq_desc() in create_irq_nr()
move_irq_desc() will try to move irq_desc to the home node if
the allocated one is not correct, in create_irq_nr().

( This can happen on devices that are on different nodes that
  are using MSI, when drivers are loaded and unloaded randomly. )

v2: fix non-smp build
v3: add NUMA_IRQ_DESC to eliminate #ifdefs

[ Impact: improve irq descriptor locality on NUMA systems ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F95EAE.2050903@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-01 19:01:12 +02:00
Ingo Molnar
e7fd5d4b3d Merge branch 'linus' into perfcounters/core
Merge reason: This brach was on -rc1, refresh it to almost-rc4 to pick up
              the latest upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29 14:47:05 +02:00
Yinghai Lu
d047f53a2e x86/irq: change MSI irq_desc to be more numa aware
Try to get irq_desc on the home node in create_irq_nr().

v2: don't check if we can move it when sparse_irq is not used
v3: use move_irq_des, if that node is not what we want

[ Impact: optimization, make MSI IRQ descriptors more NUMA aware ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F6559F.7070005@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28 12:21:18 +02:00
Yinghai Lu
024154cfdd irq: change io_apic_set_pci_routing() to use device parameter
Make actual use of the device parameter passed down to
io_apic_set_pci_routing() - to have the IRQ descriptor
on the home node of the device.

If no device has been passed down, we assume it's a platform
device and use the boot node ID for the IRQ descriptor.

[ Impact: optimization, make IO-APIC code more NUMA aware ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F6557E.3080101@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28 12:21:18 +02:00
Yinghai Lu
a2f809b08a irq: change ACPI GSI APIs to also take a device argument
We want to use dev_to_node() later on, to be aware of the 'home node'
of the GSI in question.

[ Impact: cleanup, prepare the IRQ code to be more NUMA aware ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Len Brown <lenb@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Len Brown <lenb@kernel.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-acpi@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
LKML-Reference: <49F65560.20904@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28 12:21:17 +02:00
Yinghai Lu
85ac16d033 x86/irq: change irq_desc_alloc() to take node instead of cpu
This simplifies the node awareness of the code. All our allocators
only deal with a NUMA node ID locality not with CPU ids anyway - so
there's no need to maintain (and transform) a CPU id all across the
IRq layer.

v2: keep move_irq_desc related

[ Impact: cleanup, prepare IRQ code to be NUMA-aware ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
LKML-Reference: <49F65536.2020300@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28 12:21:17 +02:00
Yinghai Lu
d5dedd4507 irq: change ->set_affinity() to return status
according to Ingo, change set_affinity() in irq_chip should return int,
because that way we can handle failure cases in a much cleaner way, in
the genirq layer.

v2: fix two typos

[ Impact: extend API ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-arch@vger.kernel.org
LKML-Reference: <49F654E9.4070809@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28 12:21:16 +02:00
Yinghai Lu
fcef5911c7 x86/irq: remove leftover code from NUMA_MIGRATE_IRQ_DESC
The original feature of migrating irq_desc dynamic was too fragile
and was causing problems: it caused crashes on systems with lots of
cards with MSI-X when user-space irq-balancer was enabled.

We now have new patches that create irq_desc according to device
numa node. This patch removes the leftover bits of the dynamic balancer.

[ Impact: remove dead code ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F654AF.8000808@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28 12:21:15 +02:00
Yinghai Lu
e0e42142ba x86: Use dmi check in apic_is_clustered() on 64-bit to mark the TSC unstable
We will have systems with 2 and more sockets 8cores/2thread,
but we treat them as multi chassis - while they could have
a stable TSC domain.

Use DMI check instead.

[ Impact: do not turn possibly stable TSCs off incorrectly ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
LKML-Reference: <49F5532A.5000802@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-27 09:23:52 +02:00
Yinghai Lu
b2ba83ff4f x86: apic: Remove duplicated macros
XAPIC_DEST_* is dupliicated to the one in apicdef.h

[ Impact: cleanup ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49F552D0.5050505@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-27 09:13:36 +02:00
Linus Torvalds
9b820a8c5f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, hpet: Stop soliciting hpet=force users on ICH4M
  x86: check boundary in setup_node_bootmem()
  uv_time: add parameter to uv_read_rtc()
  x86: hpet: fix periodic mode programming on AMD 81xx
  x86: more than 8 32-bit CPUs requires X86_BIGSMP
  x86: avoid theoretical spurious NMI backtraces with CONFIG_CPUMASK_OFFSTACK=y
  x86: fix boot crash in NMI watchdog with CONFIG_CPUMASK_OFFSTACK=y and flat APIC
  x86-64: fix FPU corruption with signals and preemption
  x86/uv: fix for no memory at paddr 0
  docs, x86: add nox2apic back to kernel-parameters.txt
  x86: mm/numa_32.c calculate_numa_remap_pages should use __init
  x86, kbuild: make "make install" not depend on vmlinux
  x86/uv: fix init of cpu-less nodes
  x86/uv: fix init of memory-less nodes
2009-04-26 10:29:38 -07:00
Linus Torvalds
fc2e3180a7 Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/irq: mark NUMA_MIGRATE_IRQ_DESC broken
  x86, irq: Remove IRQ_DISABLED check in process context IRQ move
2009-04-26 10:29:18 -07:00