License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that were:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                             # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                         270
GPL-2.0+ WITH Linux-syscall-note                        169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
LGPL-2.1+ WITH Linux-syscall-note                        15
GPL-1.0+ WITH Linux-syscall-note                         14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
LGPL-2.0+ WITH Linux-syscall-note                         4
LGPL-2.1 WITH Linux-syscall-note                          3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
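The heuristics above can be read as a small decision procedure. The following is only an illustrative sketch of that logic, not the spreadsheet process the authors actually used; the names `conclude`, `default_spdx` and the `verdict` enum are hypothetical, and a scanner reporting "no license traces" is modelled here as a NULL string.

```c
#include <stddef.h>
#include <string.h>

/* Outcome of comparing the two scanners' results for one file. */
enum verdict {
    USE_DEFAULT,   /* neither scanner found anything: top-level COPYING applies */
    USE_DETECTED,  /* both scanners agree: that becomes the concluded license */
    MANUAL_REVIEW  /* scanners disagree: a person inspects the file */
};

/*
 * a and b are the license expressions reported by the two scanners,
 * or NULL when a scanner found no license traces (an assumption of
 * this sketch, not a property of the real scanner output).
 */
enum verdict conclude(const char *a, const char *b)
{
    if (a == NULL && b == NULL)
        return USE_DEFAULT;
    if (a != NULL && b != NULL && strcmp(a, b) == 0)
        return USE_DETECTED;
    return MANUAL_REVIEW;
}

/* Default identifier for a file that carries no license information. */
const char *default_spdx(const char *path)
{
    return strstr(path, "/uapi/") != NULL
        ? "GPL-2.0 WITH Linux-syscall-note"
        : "GPL-2.0";
}
```

For example, `conclude("GPL-2.0", NULL)` yields `MANUAL_REVIEW`, which corresponds to the case where one scanner detected a license and the other did not.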
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
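The script itself is not reproduced in this commit message. As a rough sketch of the "different comment types" point, the helper below picks a comment style from the file suffix. The function names are hypothetical, and the mapping (line comment for .c files, block comment for headers and assembly) is assumed from the kernel's usual SPDX convention rather than taken from the script.

```c
#include <stdio.h>
#include <string.h>

/* Return nonzero if path ends with the given suffix. */
static int has_suffix(const char *path, const char *suffix)
{
    size_t n = strlen(path), m = strlen(suffix);
    return n >= m && strcmp(path + n - m, suffix) == 0;
}

/*
 * Format the SPDX tag line for a file into buf.
 * Returns what snprintf returns (the length that would be written).
 */
int spdx_line(char *buf, size_t len, const char *path, const char *id)
{
    if (has_suffix(path, ".c"))
        return snprintf(buf, len, "// SPDX-License-Identifier: %s", id);
    /* Assumed default for headers and assembly: a block comment. */
    return snprintf(buf, len, "/* SPDX-License-Identifier: %s */", id);
}
```

Called with the path of an assembly file and "GPL-2.0", this produces the same block-comment form as the tag line this patch added to the file shown here.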
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 14:07:57 +00:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0 */
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
|
|
|
*
|
|
|
|
* Copyright (C) 1991, 1992 Linus Torvalds
|
|
|
|
*
|
|
|
|
* Enhanced CPU detection and feature setting code by Mike Jagdis
|
|
|
|
* and Martin Mares, November 1997.
|
|
|
|
*/
|
|
|
|
|
|
|
|
.text
|
|
|
|
#include <linux/threads.h>
|
2008-01-30 12:33:28 +00:00
|
|
|
#include <linux/init.h>
|
2005-04-16 22:20:36 +00:00
|
|
|
#include <linux/linkage.h>
|
|
|
|
#include <asm/segment.h>
|
2009-02-13 19:14:01 +00:00
|
|
|
#include <asm/page_types.h>
|
|
|
|
#include <asm/pgtable_types.h>
|
2005-04-16 22:20:36 +00:00
|
|
|
#include <asm/cache.h>
|
|
|
|
#include <asm/thread_info.h>
|
2005-09-09 17:28:28 +00:00
|
|
|
#include <asm/asm-offsets.h>
|
2005-04-16 22:20:36 +00:00
|
|
|
#include <asm/setup.h>
|
2008-02-09 22:24:09 +00:00
|
|
|
#include <asm/processor-flags.h>
|
2009-11-13 23:28:13 +00:00
|
|
|
#include <asm/msr-index.h>
|
2016-01-26 21:12:04 +00:00
|
|
|
#include <asm/cpufeatures.h>
|
2009-02-09 13:17:40 +00:00
|
|
|
#include <asm/percpu.h>
|
2012-04-19 00:16:50 +00:00
|
|
|
#include <asm/nops.h>
|
2015-02-19 07:34:58 +00:00
|
|
|
#include <asm/bootparam.h>
|
2016-01-11 16:04:34 +00:00
|
|
|
#include <asm/export.h>
|
2016-12-08 16:44:31 +00:00
|
|
|
#include <asm/pgtable_32.h>
|
2008-02-09 22:24:09 +00:00
|
|
|
|
|
|
|
/* Physical address */
|
|
|
|
#define pa(X) ((X) - __PAGE_OFFSET)
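To make the pa() arithmetic concrete, here is a small C sketch of the same subtraction. The value 0xC0000000 for the page offset is an assumption (it depends on the kernel configuration), and `PAGE_OFFSET_EXAMPLE` and `va` are names invented for this illustration.

```c
#include <stdint.h>

/* Assumed example value; the real __PAGE_OFFSET comes from the kernel config. */
#define PAGE_OFFSET_EXAMPLE 0xC0000000u

/* Mirror of pa(X): turn a link-time (virtual) address into a physical one. */
static inline uint32_t pa(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET_EXAMPLE;
}

/* Inverse direction, as done later when the stack pointer is shifted back. */
static inline uint32_t va(uint32_t paddr)
{
    return paddr + PAGE_OFFSET_EXAMPLE;
}
```

Under that assumption a symbol linked at 0xC1000000 is reached at physical address 0x01000000 while paging is still off.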
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* References to members of the new_cpu_data structure.
|
|
|
|
*/
|
|
|
|
|
|
|
|
#define X86 new_cpu_data+CPUINFO_x86
|
|
|
|
#define X86_VENDOR new_cpu_data+CPUINFO_x86_vendor
|
|
|
|
#define X86_MODEL new_cpu_data+CPUINFO_x86_model
|
2018-01-01 01:52:10 +00:00
|
|
|
#define X86_STEPPING new_cpu_data+CPUINFO_x86_stepping
|
2005-04-16 22:20:36 +00:00
|
|
|
#define X86_HARD_MATH new_cpu_data+CPUINFO_hard_math
|
|
|
|
#define X86_CPUID new_cpu_data+CPUINFO_cpuid_level
|
|
|
|
#define X86_CAPABILITY new_cpu_data+CPUINFO_x86_capability
|
|
|
|
#define X86_VENDOR_ID new_cpu_data+CPUINFO_x86_vendor_id
|
|
|
|
|
2007-05-02 17:27:16 +00:00
|
|
|
|
2016-09-21 21:04:06 +00:00
|
|
|
#define SIZEOF_PTREGS 17*4
|
|
|
|
|
2009-03-16 19:07:54 +00:00
|
|
|
/*
|
|
|
|
* Worst-case size of the kernel mapping we need to make:
|
2010-12-17 03:11:09 +00:00
|
|
|
* a relocatable kernel can live anywhere in lowmem, so we need to be able
|
|
|
|
* to map all of lowmem.
|
2009-03-16 19:07:54 +00:00
|
|
|
*/
|
2010-12-17 03:11:09 +00:00
|
|
|
KERNEL_PAGES = LOWMEM_PAGES
|
2009-03-16 19:07:54 +00:00
|
|
|
|
2011-02-25 20:46:13 +00:00
|
|
|
INIT_MAP_SIZE = PAGE_TABLE_SIZE(KERNEL_PAGES) * PAGE_SIZE
|
2009-03-09 08:15:57 +00:00
|
|
|
RESERVE_BRK(pagetables, INIT_MAP_SIZE)
|
2009-03-12 23:09:49 +00:00
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
|
|
|
* 32-bit kernel entrypoint; only used by the boot CPU. On entry,
|
|
|
|
* %esi points to the real-mode code as a 32-bit pointer.
|
|
|
|
* CS and DS must be 4 GB flat segments, but we don't depend on
|
|
|
|
* any particular GDT layout, because we load our own as soon as we
|
|
|
|
* can.
|
|
|
|
*/
|
2009-09-16 20:44:28 +00:00
|
|
|
__HEAD
|
2019-10-11 11:51:05 +00:00
|
|
|
SYM_CODE_START(startup_32)
|
2016-08-18 15:59:03 +00:00
|
|
|
movl pa(initial_stack),%ecx
|
2011-02-05 00:14:11 +00:00
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
|
|
|
* Set segments to known values.
|
|
|
|
*/
|
2008-02-09 22:24:09 +00:00
|
|
|
lgdt pa(boot_gdt_descr)
|
2005-04-16 22:20:36 +00:00
|
|
|
movl $(__BOOT_DS),%eax
|
|
|
|
movl %eax,%ds
|
|
|
|
movl %eax,%es
|
|
|
|
movl %eax,%fs
|
|
|
|
movl %eax,%gs
|
2011-02-05 00:14:11 +00:00
|
|
|
movl %eax,%ss
|
|
|
|
leal -__PAGE_OFFSET(%ecx),%esp
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Clear BSS first so that there are no surprises...
|
|
|
|
*/
|
2007-10-21 23:41:35 +00:00
|
|
|
cld
|
2005-04-16 22:20:36 +00:00
|
|
|
xorl %eax,%eax
|
2008-02-09 22:24:09 +00:00
|
|
|
movl $pa(__bss_start),%edi
|
|
|
|
movl $pa(__bss_stop),%ecx
|
2005-04-16 22:20:36 +00:00
|
|
|
subl %edi,%ecx
|
|
|
|
shrl $2,%ecx
|
|
|
|
rep ; stosl
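The BSS clearing sequence above computes a byte length, divides it by four, and stores that many zero dwords. A C rendering of the same idea, with a hypothetical function name and ordinary pointers standing in for the physical addresses of __bss_start and __bss_stop, might look like this:

```c
#include <stddef.h>
#include <stdint.h>

/* Zero the dwords in [start, stop), like "rep stosl" with %eax = 0. */
static void clear_bss(uint32_t *start, uint32_t *stop)
{
    /* subl %edi,%ecx ; shrl $2,%ecx : byte length converted to dword count */
    size_t count = ((uintptr_t)stop - (uintptr_t)start) >> 2;

    while (count--)
        *start++ = 0;
}
```

As in the assembly, any trailing bytes beyond a multiple of four would not be cleared; the sketch assumes the region is dword-sized.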
|
2005-09-03 22:56:31 +00:00
|
|
|
/*
|
|
|
|
* Copy bootup parameters out of the way.
|
|
|
|
* Note: %esi still has the pointer to the real-mode data.
|
|
|
|
* With the kexec as boot loader, parameter segment might be loaded beyond
|
|
|
|
* kernel image and might not even be addressable by early boot page tables.
|
|
|
|
* (kexec on panic case). Hence copy out the parameters before initializing
|
|
|
|
* page tables.
|
|
|
|
*/
|
2008-02-09 22:24:09 +00:00
|
|
|
movl $pa(boot_params),%edi
|
2005-09-03 22:56:31 +00:00
|
|
|
movl $(PARAM_SIZE/4),%ecx
|
|
|
|
cld
|
|
|
|
rep
|
|
|
|
movsl
|
2008-02-09 22:24:09 +00:00
|
|
|
movl pa(boot_params) + NEW_CL_POINTER,%esi
|
2005-09-03 22:56:31 +00:00
|
|
|
andl %esi,%esi
|
tree-wide: fix comment/printk typos
"gadget", "through", "command", "maintain", "maintain", "controller", "address",
"between", "initiali[zs]e", "instead", "function", "select", "already",
"equal", "access", "management", "hierarchy", "registration", "interest",
"relative", "memory", "offset", "already",
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-01 19:38:34 +00:00
|
|
|
jz 1f # No command line
|
2008-02-09 22:24:09 +00:00
|
|
|
movl $pa(boot_command_line),%edi
|
2005-09-03 22:56:31 +00:00
|
|
|
movl $(COMMAND_LINE_SIZE/4),%ecx
|
|
|
|
rep
|
|
|
|
movsl
|
|
|
|
1:
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2011-02-23 09:08:31 +00:00
|
|
|
#ifdef CONFIG_OLPC
|
2010-06-18 21:46:53 +00:00
|
|
|
/* save OFW's pgdir table for later use when calling into OFW */
|
|
|
|
movl %cr3, %eax
|
|
|
|
movl %eax, pa(olpc_ofw_pgd)
|
|
|
|
#endif
|
|
|
|
|
2015-10-20 09:54:45 +00:00
|
|
|
#ifdef CONFIG_MICROCODE
|
2012-12-21 07:44:29 +00:00
|
|
|
/* Early load ucode on BSP. */
|
|
|
|
call load_ucode_bsp
|
|
|
|
#endif
|
|
|
|
|
2016-12-08 16:44:31 +00:00
|
|
|
/* Create early pagetables. */
|
|
|
|
call mk_early_pgtbl_32
|
2008-02-09 22:24:09 +00:00
|
|
|
|
|
|
|
/* Do early initialization of the fixmap area */
|
2010-08-28 13:58:33 +00:00
|
|
|
movl $pa(initial_pg_fixmap)+PDE_IDENT_ATTR,%eax
|
2016-12-08 16:44:31 +00:00
|
|
|
#ifdef CONFIG_X86_PAE
|
|
|
|
#define KPMDS (((-__PAGE_OFFSET) >> 30) & 3) /* Number of kernel PMDs */
|
2010-08-28 13:58:33 +00:00
|
|
|
movl %eax,pa(initial_pg_pmd+0x1000*KPMDS-8)
|
2016-12-08 16:44:31 +00:00
|
|
|
#else
|
2010-08-28 13:58:33 +00:00
|
|
|
movl %eax,pa(initial_page_table+0xffc)
|
2008-02-09 22:24:09 +00:00
|
|
|
#endif
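The KPMDS expression relies on 32-bit wraparound: negating the page offset gives the size of the kernel's share of the address space, and shifting by 30 counts it in 1 GiB units (one PMD page each under PAE). The sketch below evaluates the same expression in C; the function name and the sample page offsets are illustrative assumptions.

```c
#include <stdint.h>

/* Number of kernel PMDs for a given page offset, as in the KPMDS macro. */
static unsigned int kpmds(uint32_t page_offset)
{
    /* Unsigned negation yields the span from page_offset up to 4 GiB. */
    uint32_t kernel_span = (uint32_t)0 - page_offset;

    return (kernel_span >> 30) & 3;
}
```

With a page offset of 0xC0000000 this gives 1, so `initial_pg_pmd+0x1000*KPMDS-8` addresses the last 8-byte entry of the single kernel PMD page.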
|
2011-01-04 06:50:54 +00:00
|
|
|
|
2016-09-21 21:03:59 +00:00
|
|
|
jmp .Ldefault_entry
|
2019-10-11 11:51:05 +00:00
|
|
|
SYM_CODE_END(startup_32)
|
2011-01-04 06:50:54 +00:00
|
|
|
|
2012-11-13 19:32:45 +00:00
|
|
|
#ifdef CONFIG_HOTPLUG_CPU
|
|
|
|
/*
|
|
|
|
* Boot CPU0 entry point. It's called from play_dead(). Everything has been set
|
|
|
|
* up already except stack. We just set up stack here. Then call
|
|
|
|
* start_secondary().
|
|
|
|
*/
|
2019-10-11 11:51:07 +00:00
|
|
|
SYM_FUNC_START(start_cpu0)
|
2016-08-18 15:59:03 +00:00
|
|
|
movl initial_stack, %ecx
|
2012-11-13 19:32:45 +00:00
|
|
|
movl %ecx, %esp
|
2016-09-21 21:04:02 +00:00
|
|
|
call *(initial_code)
|
|
|
|
1: jmp 1b
|
2019-10-11 11:51:07 +00:00
|
|
|
SYM_FUNC_END(start_cpu0)
|
2012-11-13 19:32:45 +00:00
|
|
|
#endif
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
|
|
|
* Non-boot CPU entry point; entered from trampoline.S
|
|
|
|
* We can't lgdt here, because lgdt itself uses a data segment, but
|
2007-05-02 17:27:10 +00:00
|
|
|
* we know the trampoline has already loaded the boot_gdt for us.
|
2007-02-13 12:26:22 +00:00
|
|
|
*
|
|
|
|
* If cpu hotplug is not supported then this code can go in init section
|
|
|
|
* which will be freed later
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2019-10-11 11:51:07 +00:00
|
|
|
SYM_FUNC_START(startup_32_smp)
|
2005-04-16 22:20:36 +00:00
|
|
|
cld
|
|
|
|
movl $(__BOOT_DS),%eax
|
|
|
|
movl %eax,%ds
|
|
|
|
movl %eax,%es
|
|
|
|
movl %eax,%fs
|
|
|
|
movl %eax,%gs
|
2016-08-18 15:59:03 +00:00
|
|
|
movl pa(initial_stack),%ecx
|
2011-02-05 00:14:11 +00:00
|
|
|
movl %eax,%ss
|
|
|
|
leal -__PAGE_OFFSET(%ecx),%esp
|
2012-05-08 18:22:28 +00:00
|
|
|
|
2015-10-20 09:54:45 +00:00
|
|
|
#ifdef CONFIG_MICROCODE
|
2012-12-21 07:44:29 +00:00
|
|
|
/* Early load ucode on AP. */
|
|
|
|
call load_ucode_ap
|
|
|
|
#endif
|
|
|
|
|
2016-09-21 21:03:59 +00:00
|
|
|
.Ldefault_entry:
|
x86-32: Start out cr0 clean, disable paging before modifying cr3/4
Patch
5a5a51db78e x86-32: Start out eflags and cr4 clean
... made x86-32 match x86-64 in that we initialize %eflags and %cr4
from scratch. This broke OLPC XO-1.5, because the XO enters the
kernel with paging enabled, which the kernel doesn't expect.
Since we no longer support 386 (the source of most of the variability
in %cr0 configuration), we can simply match further x86-64 and
initialize %cr0 to a fixed value -- the one variable part remaining in
%cr0 is for FPU control, but all that is handled later on in
initialization; in particular, configuring %cr0 as if the FPU is
present until proven otherwise is correct and necessary for the probe
to work.
To deal with the XO case sanely, explicitly disable paging in %cr0
before we muck with %cr3, %cr4 or EFER -- those operations are
inherently unsafe with paging enabled.
NOTE: There is still a lot of 386-related junk in head_32.S which we
can and should get rid of, however, this is intended as a minimal fix
whereas the cleanup can be deferred to the next merge window.
Reported-by: Andres Salomon <dilinger@queued.net>
Tested-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/50FA0661.2060400@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-19 18:29:37 +00:00
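As a worked example of the fixed %cr0 value, the sketch below builds it from the architectural CR0 bit positions. The composition of CR0_STATE used here (PE, MP, ET, NE, WP, AM, PG) is an assumption, since its definition is not shown in this file; the `_EXAMPLE` name and the helper function are hypothetical.

```c
#include <stdint.h>

/* CR0 bit positions as defined by the x86 architecture. */
#define CR0_PE (1u << 0)   /* protection enable */
#define CR0_MP (1u << 1)   /* monitor coprocessor */
#define CR0_ET (1u << 4)   /* extension type */
#define CR0_NE (1u << 5)   /* numeric error */
#define CR0_WP (1u << 16)  /* write protect */
#define CR0_AM (1u << 18)  /* alignment mask */
#define CR0_PG (1u << 31)  /* paging */

/* Assumed composition of CR0_STATE; not taken from this file. */
#define CR0_STATE_EXAMPLE \
    (CR0_PE | CR0_MP | CR0_ET | CR0_NE | CR0_WP | CR0_AM | CR0_PG)

/* Value loaded before %cr3/%cr4/EFER are touched: everything except PG. */
static uint32_t cr0_before_paging(void)
{
    return CR0_STATE_EXAMPLE & ~CR0_PG;
}
```

The same bit positions account for the older constants that still appear later in the file: 0x50022 is AM, WP, NE and MP, and 0x80000011 is PG, PE and ET.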
|
|
|
movl $(CR0_STATE & ~X86_CR0_PG),%eax
|
|
|
|
movl %eax,%cr0
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
2013-02-11 14:22:16 +00:00
|
|
|
* We want to start out with EFLAGS unambiguously cleared. Some BIOSes leave
|
|
|
|
* bits like NT set. This would confuse the debugger if this code is traced. So
|
|
|
|
* initialize them properly now before switching to protected mode. That means
|
|
|
|
* DF in particular (even though we have cleared it earlier after copying the
|
|
|
|
* command line) because GCC expects it.
|
|
|
|
*/
|
|
|
|
pushl $0
|
|
|
|
popfl
|
|
|
|
|
|
|
|
/*
|
|
|
|
* New page tables may be in 4Mbyte page mode and may be using the global pages.
|
2005-04-16 22:20:36 +00:00
|
|
|
*
|
2013-02-11 14:22:16 +00:00
|
|
|
* NOTE! If we are on a 486 we may have no cr4 at all! Specifically, cr4 exists
|
|
|
|
* if and only if CPUID exists and has flags other than the FPU flag set.
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2013-02-11 14:22:16 +00:00
|
|
|
movl $-1,pa(X86_CPUID) # preset CPUID level
|
2012-09-24 23:05:48 +00:00
|
|
|
movl $X86_EFLAGS_ID,%ecx
|
|
|
|
pushl %ecx
|
2013-02-11 14:22:16 +00:00
|
|
|
popfl # set EFLAGS=ID
|
2012-09-24 23:05:48 +00:00
|
|
|
pushfl
|
2013-02-11 14:22:16 +00:00
|
|
|
popl %eax # get EFLAGS
|
|
|
|
testl $X86_EFLAGS_ID,%eax # did EFLAGS.ID remain set?
|
2016-09-21 21:03:59 +00:00
|
|
|
jz .Lenable_paging # hw disallowed setting of ID bit
|
2013-02-11 14:22:16 +00:00
|
|
|
# which means no CPUID and no CR4
|
|
|
|
|
|
|
|
xorl %eax,%eax
|
|
|
|
cpuid
|
|
|
|
movl %eax,pa(X86_CPUID) # save largest std CPUID function
|
2012-09-24 23:05:48 +00:00
|
|
|
|
2012-11-27 16:54:36 +00:00
|
|
|
movl $1,%eax
|
|
|
|
cpuid
|
2013-02-11 14:22:16 +00:00
|
|
|
andl $~1,%edx # Ignore CPUID.FPU
|
2016-09-21 21:03:59 +00:00
|
|
|
jz .Lenable_paging # No flags or only CPUID.FPU = no CR4
|
2012-11-27 16:54:36 +00:00
|
|
|
|
2012-09-24 23:05:48 +00:00
|
|
|
movl pa(mmu_cr4_features),%eax
|
2005-04-16 22:20:36 +00:00
|
|
|
movl %eax,%cr4
|
|
|
|
|
2009-11-13 23:28:13 +00:00
|
|
|
testb $X86_CR4_PAE, %al # check if PAE is enabled
|
2016-09-21 21:03:59 +00:00
|
|
|
jz .Lenable_paging
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
/* Check if extended functions are implemented */
|
|
|
|
movl $0x80000000, %eax
|
|
|
|
cpuid
|
2009-11-13 23:28:13 +00:00
|
|
|
/* Value must be in the range 0x80000001 to 0x8000ffff */
|
|
|
|
subl $0x80000001, %eax
|
|
|
|
cmpl $(0x8000ffff-0x80000001), %eax
|
2016-09-21 21:03:59 +00:00
|
|
|
ja .Lenable_paging
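The subtract-then-compare sequence above is the usual single-branch unsigned range check: values below the lower bound wrap around to very large numbers and fail the same comparison as values above the upper bound. A C sketch of the idea, with a hypothetical function name:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the reported maximum extended leaf lies in 0x80000001..0x8000ffff. */
static bool ext_cpuid_max_is_valid(uint32_t max_leaf)
{
    /* subl $0x80000001,%eax ; cmpl $(0x8000ffff-0x80000001),%eax ; ja -> invalid */
    return (uint32_t)(max_leaf - 0x80000001u) <= (0x8000ffffu - 0x80000001u);
}
```

A result of false corresponds to the `ja .Lenable_paging` branch being taken, that is, the extended functions are treated as absent.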
|
2010-11-10 18:35:53 +00:00
|
|
|
|
|
|
|
/* Clear bogus XD_DISABLE bits */
|
|
|
|
call verify_cpu
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
mov $0x80000001, %eax
|
|
|
|
cpuid
|
|
|
|
/* Execute Disable bit supported? */
|
2009-11-13 23:28:13 +00:00
|
|
|
btl $(X86_FEATURE_NX & 31), %edx
|
2016-09-21 21:03:59 +00:00
|
|
|
jnc .Lenable_paging
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
/* Setup EFER (Extended Feature Enable Register) */
|
2009-11-13 23:28:13 +00:00
|
|
|
movl $MSR_EFER, %ecx
|
2005-04-16 22:20:36 +00:00
|
|
|
rdmsr
|
|
|
|
|
2009-11-13 23:28:13 +00:00
|
|
|
btsl $_EFER_NX, %eax
|
2005-04-16 22:20:36 +00:00
|
|
|
/* Make changes effective */
|
|
|
|
wrmsr
|
|
|
|
|
2016-09-21 21:03:59 +00:00
|
|
|
.Lenable_paging:
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Enable paging
|
|
|
|
*/
|
2010-08-28 13:58:33 +00:00
|
|
|
movl $pa(initial_page_table), %eax
|
2005-04-16 22:20:36 +00:00
|
|
|
movl %eax,%cr3 /* set the page table pointer.. */
|
x86-32: Start out cr0 clean, disable paging before modifying cr3/4
2013-01-19 18:29:37 +00:00
|
|
|
movl $CR0_STATE,%eax
|
2005-04-16 22:20:36 +00:00
|
|
|
movl %eax,%cr0 /* ..and set paging (PG) bit */
|
|
|
|
ljmp $__BOOT_CS,$1f /* Clear prefetch and normalize %eip */
|
|
|
|
1:
|
2011-02-05 00:14:11 +00:00
|
|
|
/* Shift the stack pointer to a virtual address */
|
|
|
|
addl $__PAGE_OFFSET, %esp
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* start system 32-bit setup. We need to re-do some of the things done
|
|
|
|
* in 16-bit mode for the "real" operations.
|
|
|
|
*/
|
2012-04-19 00:16:50 +00:00
|
|
|
movl setup_once_ref,%eax
|
|
|
|
andl %eax,%eax
|
|
|
|
jz 1f # Did we do this already?
|
|
|
|
call *%eax
|
|
|
|
1:
|
2013-02-11 14:22:15 +00:00
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
2013-02-11 14:22:15 +00:00
|
|
|
* Check if it is 486
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2013-06-28 14:45:16 +00:00
|
|
|
movb $4,X86 # at least 486
|
2013-02-11 14:22:17 +00:00
|
|
|
cmpl $-1,X86_CPUID
|
2016-09-21 21:03:59 +00:00
|
|
|
je .Lis486
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
/* get vendor info */
|
|
|
|
xorl %eax,%eax # call CPUID with 0 -> return vendor ID
|
|
|
|
cpuid
|
|
|
|
movl %eax,X86_CPUID # save CPUID level
|
|
|
|
movl %ebx,X86_VENDOR_ID # lo 4 chars
|
|
|
|
movl %edx,X86_VENDOR_ID+4 # next 4 chars
|
|
|
|
movl %ecx,X86_VENDOR_ID+8 # last 4 chars
|
|
|
|
|
|
|
|
orl %eax,%eax # do we have processor info as well?
|
2016-09-21 21:03:59 +00:00
|
|
|
je .Lis486
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
movl $1,%eax # Use the CPUID instruction to get CPU type
|
|
|
|
cpuid
|
|
|
|
movb %al,%cl # save reg for future use
|
|
|
|
andb $0x0f,%ah # mask processor family
|
|
|
|
movb %ah,X86
|
|
|
|
andb $0xf0,%al # mask model
|
|
|
|
shrb $4,%al
|
|
|
|
movb %al,X86_MODEL
|
|
|
|
andb $0x0f,%cl # mask revision
|
2018-01-01 01:52:10 +00:00
|
|
|
movb %cl,X86_STEPPING
|
2005-04-16 22:20:36 +00:00
|
|
|
movl %edx,X86_CAPABILITY
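The masking above extracts three 4-bit fields from the value CPUID leaf 1 returns in %eax. The following C sketch performs the same extraction; the struct and function names are hypothetical, and like the assembly it deliberately ignores the extended family and model fields in the upper bits.

```c
#include <stdint.h>

struct cpu_sig {
    uint8_t family;    /* bits 11:8, stored to X86 */
    uint8_t model;     /* bits 7:4,  stored to X86_MODEL */
    uint8_t stepping;  /* bits 3:0,  stored to X86_STEPPING */
};

/* Decode the base family/model/stepping from CPUID.1:EAX. */
static struct cpu_sig decode_cpuid1_eax(uint32_t eax)
{
    struct cpu_sig s;

    s.family   = (eax >> 8) & 0x0f;  /* andb $0x0f,%ah */
    s.model    = (eax >> 4) & 0x0f;  /* andb $0xf0,%al ; shrb $4,%al */
    s.stepping = eax & 0x0f;         /* andb $0x0f,%cl */
    return s;
}
```

For instance, a signature of 0x00000633 decodes to family 6, model 3, stepping 3.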
|
|
|
|
|
2016-09-21 21:03:59 +00:00
|
|
|
.Lis486:
|
2013-02-11 14:22:17 +00:00
|
|
|
movl $0x50022,%ecx # set AM, WP, NE and MP
|
2013-02-11 14:22:15 +00:00
|
|
|
movl %cr0,%eax
|
2005-04-16 22:20:36 +00:00
|
|
|
andl $0x80000011,%eax # Save PG,PE,ET
|
|
|
|
orl %ecx,%eax
|
|
|
|
movl %eax,%cr0
|
|
|
|
|
2007-02-13 12:26:26 +00:00
|
|
|
lgdt early_gdt_descr
|
2005-04-16 22:20:36 +00:00
|
|
|
ljmp $(__KERNEL_CS),$1f
|
|
|
|
1: movl $(__KERNEL_DS),%eax # reload all the segment registers
|
|
|
|
movl %eax,%ss # after changing gdt.
|
|
|
|
|
|
|
|
movl $(__USER_DS),%eax # DS/ES contains default USER segment
|
|
|
|
movl %eax,%ds
|
|
|
|
movl %eax,%es
|
|
|
|
|
2009-01-21 08:26:05 +00:00
|
|
|
movl $(__KERNEL_PERCPU), %eax
|
|
|
|
movl %eax,%fs # set this cpu's percpu
|
|
|
|
|
x86/stackprotector/32: Make the canary into a regular percpu variable
On 32-bit kernels, the stackprotector canary is quite nasty -- it is
stored at %gs:(20), which is nasty because 32-bit kernels use %fs for
percpu storage. It's even nastier because it means that whether %gs
contains userspace state or kernel state while running kernel code
depends on whether stackprotector is enabled (this is
CONFIG_X86_32_LAZY_GS), and this setting radically changes the way
that segment selectors work. Supporting both variants is a
maintenance and testing mess.
Merely rearranging so that percpu and the stack canary
share the same segment would be messy as the 32-bit percpu address
layout isn't currently compatible with putting a variable at a fixed
offset.
Fortunately, GCC 8.1 added options that allow the stack canary to be
accessed as %fs:__stack_chk_guard, effectively turning it into an ordinary
percpu variable. This lets us get rid of all of the code to manage the
stack canary GDT descriptor and the CONFIG_X86_32_LAZY_GS mess.
(That name is special. We could use any symbol we want for the
%fs-relative mode, but for CONFIG_SMP=n, gcc refuses to let us use any
name other than __stack_chk_guard.)
Forcibly disable stackprotector on older compilers that don't support
the new options and turn the stack canary into a percpu variable. The
"lazy GS" approach is now used for all 32-bit configurations.
Also makes load_gs_index() work on 32-bit kernels. On 64-bit kernels,
it loads the GS selector and updates the user GSBASE accordingly. (This
is unchanged.) On 32-bit kernels, it loads the GS selector and updates
GSBASE, which is now always the user base. This means that the overall
effect is the same on 32-bit and 64-bit, which avoids some ifdeffery.
[ bp: Massage commit message. ]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/c0ff7dba14041c7e5d1cae5d4df052f03759bef3.1613243844.git.luto@kernel.org
2021-02-13 19:19:44 +00:00
|
|
|
xorl %eax,%eax
|
|
|
|
movl %eax,%gs # clear possible garbage in %gs
|
2009-02-09 13:17:40 +00:00
|
|
|
|
|
|
|
xorl %eax,%eax # Clear LDT
|
2005-04-16 22:20:36 +00:00
|
|
|
lldt %ax
|
[PATCH] i386: Use %gs as the PDA base-segment in the kernel
This patch is the meat of the PDA change. This patch makes several related
changes:
1: Most significantly, %gs is now used in the kernel. This means that on
entry, the old value of %gs is saved away, and it is reloaded with
__KERNEL_PDA.
2: entry.S constructs the stack in the shape of struct pt_regs, and this
is passed around the kernel so that the process's saved register
state can be accessed.
Unfortunately struct pt_regs doesn't currently have space for %gs
(or %fs). This patch extends pt_regs to add space for gs (no space
is allocated for %fs, since it won't be used, and it would just
complicate the code in entry.S to work around the space).
3: Because %gs is now saved on the stack like %ds, %es and the integer
registers, there are a number of places where it no longer needs to
be handled specially; namely context switch, and saving/restoring the
register state in a signal context.
4: And since kernel threads run in kernel space and call normal kernel
code, they need to be created with their %gs == __KERNEL_PDA.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 01:14:02 +00:00
|
|
|
|
2016-09-21 21:04:02 +00:00
|
|
|
call *(initial_code)
|
|
|
|
1: jmp 1b
|
2019-10-11 11:51:07 +00:00
|
|
|
SYM_FUNC_END(startup_32_smp)
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2012-04-19 00:16:50 +00:00
|
|
|
#include "verify_cpu.S"
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
2012-04-19 00:16:50 +00:00
|
|
|
* setup_once
|
2005-04-16 22:20:36 +00:00
|
|
|
*
|
2012-04-19 00:16:50 +00:00
|
|
|
* The setup work we only want to run on the BSP.
|
2005-04-16 22:20:36 +00:00
|
|
|
*
|
|
|
|
* Warning: %esi is live across this function.
|
|
|
|
*/
|
2012-04-19 00:16:50 +00:00
|
|
|
__INIT
|
|
|
|
setup_once:
|
|
|
|
andl $0,setup_once_ref /* Once is enough, thanks */
|
2005-04-16 22:20:36 +00:00
|
|
|
ret
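setup_once works together with setup_once_ref: the entry path calls through the pointer only if it is non-zero, and the routine clears the pointer so later CPUs skip it. A minimal C sketch of that run-once pattern follows; `cpu_entry` and the `setup_runs` counter are hypothetical additions for illustration.

```c
#include <stddef.h>

static int setup_runs;  /* illustration only: counts how often setup ran */

static void setup_once(void);

/* Starts out pointing at setup_once, like SYM_DATA(setup_once_ref, ...). */
static void (*setup_once_ref)(void) = setup_once;

static void setup_once(void)
{
    setup_once_ref = NULL;  /* andl $0,setup_once_ref : once is enough */
    setup_runs++;
}

/* Stand-in for the shared boot path taken by every CPU. */
static void cpu_entry(void)
{
    void (*fn)(void) = setup_once_ref;  /* movl setup_once_ref,%eax */

    if (fn != NULL)                     /* andl %eax,%eax ; jz 1f */
        fn();                           /* call *%eax */
}
```

Calling `cpu_entry()` twice runs the setup body only the first time, which is the behaviour the boot CPU and the secondary CPUs rely on.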
|
|
|
|
|
2019-10-11 11:51:07 +00:00
|
|
|
SYM_FUNC_START(early_idt_handler_array)
|
2012-04-19 00:16:50 +00:00
|
|
|
# 36(%esp) %eflags
|
|
|
|
# 32(%esp) %cs
|
|
|
|
# 28(%esp) %eip
|
|
|
|
# 24(%esp) error code
|
|
|
|
i = 0
|
|
|
|
.rept NUM_EXCEPTION_VECTORS
|
2017-10-20 16:21:35 +00:00
|
|
|
.if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0
|
2012-04-19 00:16:50 +00:00
|
|
|
pushl $0 # Dummy error code, to make stack frame uniform
|
|
|
|
.endif
|
|
|
|
pushl $i # 20(%esp) Vector number
|
2015-05-22 23:15:47 +00:00
|
|
|
jmp early_idt_handler_common
|
2012-04-19 00:16:50 +00:00
|
|
|
i = i + 1
|
2015-05-22 23:15:47 +00:00
|
|
|
.fill early_idt_handler_array + i*EARLY_IDT_HANDLER_SIZE - ., 1, 0xcc
|
2012-04-19 00:16:50 +00:00
|
|
|
.endr
|
2019-10-11 11:51:07 +00:00
|
|
|
SYM_FUNC_END(early_idt_handler_array)
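Each stub in the array pushes a dummy error code only for vectors where the CPU does not push one itself, so the common handler always sees the same frame. The C sketch below shows the bit test; the mask value 0x00027d00 (vectors 8, 10 to 14 and 17) is an assumption about EXCEPTION_ERRCODE_MASK, which is defined outside this file, and all names are hypothetical.

```c
#include <stdbool.h>

/* Assumed value of EXCEPTION_ERRCODE_MASK; not defined in this file. */
#define ERRCODE_MASK_EXAMPLE 0x00027d00u

/* True if the CPU itself pushes an error code for this exception vector. */
static bool cpu_pushes_errcode(unsigned int vector)
{
    return vector < 32 && ((ERRCODE_MASK_EXAMPLE >> vector) & 1) != 0;
}

/* True if the stub must push a dummy 0 to keep the stack frame uniform. */
static bool stub_pushes_dummy(unsigned int vector)
{
    return !cpu_pushes_errcode(vector);
}
```

Under that assumed mask, the page fault vector (14) gets no dummy push while the divide error vector (0) does.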
|
2012-04-19 00:16:50 +00:00
|
|
|
|
2019-10-11 11:50:45 +00:00
|
|
|
SYM_CODE_START_LOCAL(early_idt_handler_common)
|
2015-05-22 23:15:47 +00:00
|
|
|
/*
|
|
|
|
* The stack is the hardware frame, an error code or zero, and the
|
|
|
|
* vector number.
|
|
|
|
*/
|
2012-04-19 00:16:50 +00:00
|
|
|
cld
|
2014-03-07 23:05:20 +00:00
|
|
|
|
2012-04-19 00:16:50 +00:00
|
|
|
incl %ss:early_recursion_flag
|
2006-09-26 08:52:39 +00:00
|
|
|
|
2016-04-02 14:01:32 +00:00
|
|
|
/* The vector number is in pt_regs->gs */
|
2006-09-26 08:52:39 +00:00
|
|
|
|
2016-04-02 14:01:32 +00:00
|
|
|
cld
|
2017-07-28 13:00:31 +00:00
|
|
|
pushl %fs /* pt_regs->fs (__fsh varies by model) */
|
|
|
|
pushl %es /* pt_regs->es (__esh varies by model) */
|
|
|
|
pushl %ds /* pt_regs->ds (__dsh varies by model) */
|
2016-04-02 14:01:32 +00:00
|
|
|
pushl %eax /* pt_regs->ax */
|
|
|
|
pushl %ebp /* pt_regs->bp */
|
|
|
|
pushl %edi /* pt_regs->di */
|
|
|
|
pushl %esi /* pt_regs->si */
|
|
|
|
pushl %edx /* pt_regs->dx */
|
|
|
|
pushl %ecx /* pt_regs->cx */
|
|
|
|
pushl %ebx /* pt_regs->bx */
|
|
|
|
|
|
|
|
/* Fix up DS and ES */
|
|
|
|
movl $(__KERNEL_DS), %ecx
|
|
|
|
movl %ecx, %ds
|
|
|
|
movl %ecx, %es
|
|
|
|
|
|
|
|
/* Load the vector number into EDX */
|
|
|
|
movl PT_GS(%esp), %edx
|
|
|
|
|
2017-07-28 13:00:31 +00:00
|
|
|
/* Load GS into pt_regs->gs (and maybe clobber __gsh) */
|
2016-04-02 14:01:32 +00:00
|
|
|
movw %gs, PT_GS(%esp)
|
|
|
|
|
|
|
|
movl %esp, %eax /* args are pt_regs (EAX), trapnr (EDX) */
|
|
|
|
call early_fixup_exception
|
|
|
|
|
|
|
|
popl %ebx /* pt_regs->bx */
|
|
|
|
popl %ecx /* pt_regs->cx */
|
|
|
|
popl %edx /* pt_regs->dx */
|
|
|
|
popl %esi /* pt_regs->si */
|
|
|
|
popl %edi /* pt_regs->di */
|
|
|
|
popl %ebp /* pt_regs->bp */
|
|
|
|
popl %eax /* pt_regs->ax */
|
2017-07-28 13:00:31 +00:00
|
|
|
popl %ds /* pt_regs->ds (always ignores __dsh) */
|
|
|
|
popl %es /* pt_regs->es (always ignores __esh) */
|
|
|
|
popl %fs /* pt_regs->fs (always ignores __fsh) */
|
|
|
|
popl %gs /* pt_regs->gs (always ignores __gsh) */
|
2016-04-02 14:01:32 +00:00
|
|
|
decl %ss:early_recursion_flag
|
|
|
|
addl $4, %esp /* pop pt_regs->orig_ax */
|
|
|
|
iret
|
2019-10-11 11:50:45 +00:00
|
|
|
SYM_CODE_END(early_idt_handler_common)
|
2012-04-19 00:16:50 +00:00
|
|
|
|
/* This is the default interrupt "handler" :-) */
SYM_FUNC_START(early_ignore_irq)
	cld
#ifdef CONFIG_PRINTK
	pushl %eax
	pushl %ecx
	pushl %edx
	pushl %es
	pushl %ds
	movl $(__KERNEL_DS),%eax
	movl %eax,%ds
	movl %eax,%es
	cmpl $2,early_recursion_flag
	je hlt_loop
	incl early_recursion_flag
	pushl 16(%esp)
	pushl 24(%esp)
	pushl 32(%esp)
	pushl 40(%esp)
	pushl $int_msg
	call printk

	call dump_stack

	addl $(5*4),%esp
	popl %ds
	popl %es
	popl %edx
	popl %ecx
	popl %eax
#endif
	iret

hlt_loop:
	hlt
	jmp hlt_loop
SYM_FUNC_END(early_ignore_irq)

__INITDATA
	.align 4
SYM_DATA(early_recursion_flag, .long 0)

__REFDATA
	.align 4
SYM_DATA(initial_code, .long i386_start_kernel)
SYM_DATA(setup_once_ref, .long setup_once)

#ifdef CONFIG_PAGE_TABLE_ISOLATION
#define PGD_ALIGN (2 * PAGE_SIZE)
#define PTI_USER_PGD_FILL 1024
#else
#define PGD_ALIGN (PAGE_SIZE)
#define PTI_USER_PGD_FILL 0
#endif
/*
 * BSS section
 */
__PAGE_ALIGNED_BSS
	.align PGD_ALIGN
#ifdef CONFIG_X86_PAE
.globl initial_pg_pmd
initial_pg_pmd:
	.fill 1024*KPMDS,4,0
#else
.globl initial_page_table
initial_page_table:
	.fill 1024,4,0
#endif
	.align PGD_ALIGN
initial_pg_fixmap:
	.fill 1024,4,0
.globl swapper_pg_dir
	.align PGD_ALIGN
swapper_pg_dir:
	.fill 1024,4,0
	.fill PTI_USER_PGD_FILL,4,0
.globl empty_zero_page
empty_zero_page:
	.fill 4096,1,0
EXPORT_SYMBOL(empty_zero_page)

/*
 * This starts the data section.
 */
#ifdef CONFIG_X86_PAE
__PAGE_ALIGNED_DATA
	/* Page-aligned for the benefit of paravirt? */
	.align PGD_ALIGN
SYM_DATA_START(initial_page_table)
	.long pa(initial_pg_pmd+PGD_IDENT_ATTR),0 /* low identity map */
# if KPMDS == 3
	.long pa(initial_pg_pmd+PGD_IDENT_ATTR),0
	.long pa(initial_pg_pmd+PGD_IDENT_ATTR+0x1000),0
	.long pa(initial_pg_pmd+PGD_IDENT_ATTR+0x2000),0
# elif KPMDS == 2
	.long 0,0
	.long pa(initial_pg_pmd+PGD_IDENT_ATTR),0
	.long pa(initial_pg_pmd+PGD_IDENT_ATTR+0x1000),0
# elif KPMDS == 1
	.long 0,0
	.long 0,0
	.long pa(initial_pg_pmd+PGD_IDENT_ATTR),0
# else
# error "Kernel PMDs should be 1, 2 or 3"
# endif
	.align PAGE_SIZE /* needs to be page-sized too */

#ifdef CONFIG_PAGE_TABLE_ISOLATION
	/*
	 * PTI needs another page so sync_initial_pagetable() works correctly
	 * and does not scribble over the data which is placed behind the
	 * actual initial_page_table. See clone_pgd_range().
	 */
	.fill 1024, 4, 0
#endif

SYM_DATA_END(initial_page_table)
#endif

.data
.balign 4
/*
 * The SIZEOF_PTREGS gap is a convention which helps the in-kernel unwinder
 * reliably detect the end of the stack.
 */
SYM_DATA(initial_stack,
		.long init_thread_union + THREAD_SIZE -
		SIZEOF_PTREGS - TOP_OF_KERNEL_STACK_PADDING)

__INITRODATA
int_msg:
	.asciz "Unknown interrupt or fault at: %p %p %p\n"

#include "../../x86/xen/xen-head.S"
xen: Core Xen implementation
This patch is a rollup of all the core pieces of the Xen
implementation, including:
- booting and setup
- pagetable setup
- privileged instructions
- segmentation
- interrupt flags
- upcalls
- multicall batching
BOOTING AND SETUP
The vmlinux image is decorated with ELF notes which tell the Xen
domain builder what the kernel's requirements are; the domain builder
then constructs the address space accordingly and starts the kernel.
Xen has its own entrypoint for the kernel (contained in an ELF note).
The ELF notes are set up by xen-head.S, which is included into head.S.
In principle it could be linked separately, but it seems to provoke
lots of binutils bugs.
Because the domain builder starts the kernel in a fairly sane state
(32-bit protected mode, paging enabled, flat segments set up), there's
not a lot of setup needed before starting the kernel proper. The main
steps are:
1. Install the Xen paravirt_ops, which is simply a matter of a
structure assignment.
2. Set init_mm to use the Xen-supplied pagetables (analogous to the
head.S generated pagetables in a native boot).
3. Reserve address space for Xen, since it takes a chunk at the top
of the address space for its own use.
4. Call start_kernel()
PAGETABLE SETUP
Once we hit the main kernel boot sequence, it will end up calling back
via paravirt_ops to set up various pieces of Xen specific state. One
of the critical things which requires a bit of extra care is the
construction of the initial init_mm pagetable. Because Xen places
tight constraints on pagetables (an active pagetable must always be
valid, and must always be mapped read-only to the guest domain), we
need to be careful when constructing the new pagetable to keep these
constraints in mind. It turns out that the easiest way to do this is
to use the initial Xen-provided pagetable as a template, and then just
insert new mappings for memory where a mapping doesn't already exist.
This means that during pagetable setup, it uses a special version of
xen_set_pte which ignores any attempt to remap a read-only page as
read-write (since Xen will map its own initial pagetable as RO), but
lets other changes to the ptes happen, so that things like NX are set
properly.
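
The rule described above (refuse to turn a read-only mapping into a
read-write one, but let every other change through) can be sketched in a
few lines of C. This is only an illustration under stated assumptions: the
type, the flag names and the function name are hypothetical, the bit
positions are merely chosen to resemble the x86 layout, and none of this is
the code the patch adds.

```c
#include <stdint.h>

/* Hypothetical PTE type and flag bits, for illustration only. */
typedef uint64_t sketch_pte_t;

#define SKETCH_PTE_PRESENT ((sketch_pte_t)1 << 0)
#define SKETCH_PTE_RW      ((sketch_pte_t)1 << 1)
#define SKETCH_PTE_NX      ((sketch_pte_t)1 << 63)

/*
 * Init-time PTE update: take the new value as requested, except that a
 * page which is currently present and read-only stays read-only. Other
 * bits in the new value (NX, for example) are applied unchanged.
 */
void sketch_set_pte_init(sketch_pte_t *ptep, sketch_pte_t newval)
{
	sketch_pte_t old = *ptep;

	if ((old & SKETCH_PTE_PRESENT) && !(old & SKETCH_PTE_RW))
		newval &= ~SKETCH_PTE_RW;	/* drop the attempted RW upgrade */

	*ptep = newval;
}
```

With this shape, a request to remap a read-only pagetable page as
read-write with NX set ends up read-only with NX set, while an empty slot
accepts whatever mapping is written to it.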
PRIVILEGED INSTRUCTIONS AND SEGMENTATION
When the kernel runs under Xen, it runs in ring 1 rather than ring 0.
This means that it is more privileged than user-mode in ring 3, but it
still can't run privileged instructions directly. Non-performance
critical instructions are dealt with by taking a privilege exception
and trapping into the hypervisor and emulating the instruction, but
more performance-critical instructions have their own specific
paravirt_ops. In many cases we can avoid having to do any hypercalls
for these instructions, or the Xen implementation is quite different
from the normal native version.
The privileged instructions fall into the broad classes of:

  Segmentation: setting up the GDT and the GDT entries, LDT,
    TLS and so on. Xen doesn't allow the GDT to be directly
    modified; all GDT updates are done via hypercalls where the new
    entries can be validated. This is important because Xen uses
    segment limits to prevent the guest kernel from damaging the
    hypervisor itself.

  Traps and exceptions: Xen uses a special format for trap entrypoints,
    so when the kernel wants to set an IDT entry, it needs to be
    converted to the form Xen expects. Xen sets int 0x80 up specially
    so that the trap goes straight from userspace into the guest kernel
    without going via the hypervisor. sysenter isn't supported.

  Kernel stack: The esp0 entry is extracted from the tss and provided to
    Xen.

  TLB operations: the various TLB calls are mapped into corresponding
    Xen hypercalls.

  Control registers: all the control registers are privileged. The most
    important is cr3, which points to the base of the current pagetable,
    and we handle it specially.

Another instruction we treat specially is CPUID, even though it's not
privileged. We want to control what CPU features are visible to the
rest of the kernel, and so CPUID ends up going into a paravirt_op.
Xen implements this mainly to disable the ACPI and APIC subsystems.
INTERRUPT FLAGS
Xen maintains its own separate flag for masking events, which is
contained within the per-cpu vcpu_info structure. Because the guest
kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely
ignored (and must be, because even if a guest domain disables
interrupts for itself, it can't disable them overall).
(A note on terminology: "events" and interrupts are effectively
synonymous. However, rather than using an "enable flag", Xen uses a
"mask flag", which blocks event delivery when it is non-zero.)
There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which
are implemented to manage the Xen event mask state. The only thing
worth noting is that when events are unmasked, we need to explicitly
see if there's a pending event and call into the hypervisor to make
sure it gets delivered.
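
As a rough illustration of the mask-flag scheme described above, the
cli/sti/save_fl/restore_fl operations might be modelled as below. The
structure, field and function names are hypothetical stand-ins, the
"hypercall" is just a counter, and a real implementation would also have to
deal with ordering and preemption, which this sketch ignores.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-cpu vcpu_info state. */
struct sketch_vcpu_info {
	uint8_t pending;	/* non-zero: an event is waiting for delivery */
	uint8_t mask;		/* non-zero: event delivery is blocked */
};

/* Counts how often the stand-in "deliver pending events" call was made. */
int sketch_forced_callbacks;

void sketch_force_callback(void)
{
	sketch_forced_callbacks++;
}

/* cli analogue: block event delivery. */
void sketch_irq_disable(struct sketch_vcpu_info *v)
{
	v->mask = 1;
}

/* sti analogue: unmask, then explicitly check for a pending event. */
void sketch_irq_enable(struct sketch_vcpu_info *v)
{
	v->mask = 0;
	if (v->pending)
		sketch_force_callback();
}

/* save_fl analogue: non-zero means events are currently enabled. */
unsigned long sketch_save_fl(const struct sketch_vcpu_info *v)
{
	return v->mask ? 0 : 1;
}

/* restore_fl analogue: put back a state obtained from sketch_save_fl(). */
void sketch_restore_fl(struct sketch_vcpu_info *v, unsigned long flags)
{
	if (flags)
		sketch_irq_enable(v);
	else
		sketch_irq_disable(v);
}
```

The point of the explicit check in the enable path is the one made in the
text: an event that arrived while the mask was set is not delivered on its
own when the mask is cleared, so the guest has to ask for it.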
UPCALLS
Xen needs a couple of upcall (or callback) functions to be implemented
by each guest. One is the event upcall, which is how events
(interrupts, effectively) are delivered to the guests. The other is
the failsafe callback, which is used to report errors in either
reloading a segment register, or caused by iret. These are
implemented in i386/kernel/entry.S so they can jump into the normal
iret_exc path when necessary.
MULTICALL BATCHING
Xen provides a multicall mechanism, which allows multiple hypercalls
to be issued at once in order to mitigate the cost of trapping into
the hypervisor. This is particularly useful for context switches,
since the 4-5 hypercalls they would normally need (reload cr3, update
TLS, maybe update LDT) can be reduced to one. This patch implements a
generic batching mechanism for hypercalls, which gets used in many
places in the Xen code.
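
The batching idea can be sketched as a small queue that collects calls and
hands them over in one go. Everything here is an assumption made for
illustration (the structure layout, the queue size, the names, and the
counters that stand in for the actual trap into the hypervisor); it is not
the interface this patch introduces.

```c
#include <stddef.h>

#define SKETCH_BATCH_MAX 8	/* hypothetical queue depth */

/* One queued call: an operation number and a single argument. */
struct sketch_call {
	unsigned long op;
	unsigned long arg;
};

struct sketch_batch {
	struct sketch_call calls[SKETCH_BATCH_MAX];
	size_t count;		/* calls queued but not yet issued */
	unsigned long traps;	/* stand-in: number of hypervisor entries */
	unsigned long executed;	/* stand-in: number of calls handed over */
};

/* Issue everything queued so far with a single (simulated) trap. */
void sketch_batch_flush(struct sketch_batch *b)
{
	if (b->count == 0)
		return;
	b->traps++;
	b->executed += b->count;
	b->count = 0;
}

/* Queue one call, flushing first if the queue is already full. */
void sketch_batch_add(struct sketch_batch *b, unsigned long op,
		      unsigned long arg)
{
	if (b->count == SKETCH_BATCH_MAX)
		sketch_batch_flush(b);
	b->calls[b->count].op = op;
	b->calls[b->count].arg = arg;
	b->count++;
}
```

In the context-switch case from the text, the cr3 reload, the TLS update
and the optional LDT update would each be queued with sketch_batch_add()
and then issued by one sketch_batch_flush(), so the cost of entering the
hypervisor is paid once rather than once per call.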
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ian Pratt <ian.pratt@xensource.com>
Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Cc: Adrian Bunk <bunk@stusta.de>
2007-07-18 01:37:04 +00:00
/*
 * The IDT and GDT 'descriptors' are a strange 48-bit object
 * only used by the lidt and lgdt instructions. They are not
 * like usual segment descriptors - they consist of a 16-bit
 * segment size, and 32-bit linear address value:
 */

	.data
	ALIGN
# early boot GDT descriptor (must use 1:1 address mapping)
	.word 0 # 32 bit align gdt_desc.address
SYM_DATA_START_LOCAL(boot_gdt_descr)
	.word __BOOT_DS+7
	.long boot_gdt - __PAGE_OFFSET
SYM_DATA_END(boot_gdt_descr)

# boot GDT descriptor (later on used by CPU#0):
	.word 0 # 32 bit align gdt_desc.address
SYM_DATA_START(early_gdt_descr)
	.word GDT_ENTRIES*8-1
	.long gdt_page /* Overwritten for secondary CPUs */
SYM_DATA_END(early_gdt_descr)

/*
 * The boot_gdt must mirror the equivalent in setup.S and is
 * used only for booting.
 */
	.align L1_CACHE_BYTES
SYM_DATA_START(boot_gdt)
	.fill GDT_ENTRY_BOOT_CS,8,0
	.quad 0x00cf9a000000ffff /* kernel 4GB code at 0x00000000 */
	.quad 0x00cf92000000ffff /* kernel 4GB data at 0x00000000 */
SYM_DATA_END(boot_gdt)