linux-stable/mm/kasan/init.c
Andrey Konovalov 11f094e312 kasan: drop unnecessary GPL text from comment headers
Patch series "kasan: add hardware tag-based mode for arm64", v11.

This patchset adds a new hardware tag-based mode to KASAN [1].  The new
mode is similar to the existing software tag-based KASAN, but relies on
arm64 Memory Tagging Extension (MTE) [2] to perform memory and pointer
tagging (instead of shadow memory and compiler instrumentation).

This patchset is co-developed and tested by
Vincenzo Frascino <vincenzo.frascino@arm.com>.

This patchset is available here:

https://github.com/xairy/linux/tree/up-kasan-mte-v11

For testing in QEMU hardware tag-based KASAN requires:

1. QEMU built from master [4] (use "-machine virt,mte=on -cpu max" arguments
   to run).
2. GCC version 10.

[1] https://www.kernel.org/doc/html/latest/dev-tools/kasan.html
[2] https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety
[3] git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux for-next/mte
[4] https://github.com/qemu/qemu

====== Overview

The underlying ideas of the approach used by hardware tag-based KASAN are:

1. By relying on the Top Byte Ignore (TBI) arm64 CPU feature, pointer tags
   are stored in the top byte of each kernel pointer.

2. With the Memory Tagging Extension (MTE) arm64 CPU feature, memory tags
   for kernel memory allocations are stored in a dedicated memory not
   accessible via normal instuctions.

3. On each memory allocation, a random tag is generated, embedded it into
   the returned pointer, and the corresponding memory is tagged with the
   same tag value.

4. With MTE the CPU performs a check on each memory access to make sure
   that the pointer tag matches the memory tag.

5. On a tag mismatch the CPU generates a tag fault, and a KASAN report is
   printed.

Same as other KASAN modes, hardware tag-based KASAN is intended as a
debugging feature at this point.

====== Rationale

There are two main reasons for this new hardware tag-based mode:

1. Previously implemented software tag-based KASAN is being successfully
   used on dogfood testing devices due to its low memory overhead (as
   initially planned). The new hardware mode keeps the same low memory
   overhead, and is expected to have significantly lower performance
   impact, due to the tag checks being performed by the hardware.
   Therefore the new mode can be used as a better alternative in dogfood
   testing for hardware that supports MTE.

2. The new mode lays the groundwork for the planned in-kernel MTE-based
   memory corruption mitigation to be used in production.

====== Technical details

Considering the implementation perspective, hardware tag-based KASAN is
almost identical to the software mode.  The key difference is using MTE
for assigning and checking tags.

Compared to the software mode, the hardware mode uses 4 bits per tag, as
dictated by MTE.  Pointer tags are stored in bits [56:60), the top 4 bits
have the normal value 0xF.  Having less distict tags increases the
probablity of false negatives (from ~1/256 to ~1/16) in certain cases.

Only synchronous exceptions are set up and used by hardware tag-based KASAN.

====== Benchmarks

Note: all measurements have been performed with software emulation of Memory
Tagging Extension, performance numbers for hardware tag-based KASAN on the
actual hardware are expected to be better.

Boot time [1]:
* 2.8 sec for clean kernel
* 5.7 sec for hardware tag-based KASAN
* 11.8 sec for software tag-based KASAN
* 11.6 sec for generic KASAN

Slab memory usage after boot [2]:
* 7.0 kb for clean kernel
* 9.7 kb for hardware tag-based KASAN
* 9.7 kb for software tag-based KASAN
* 41.3 kb for generic KASAN

Measurements have been performed with:
* defconfig-based configs
* Manually built QEMU master
* QEMU arguments: -machine virt,mte=on -cpu max
* CONFIG_KASAN_STACK_ENABLE disabled
* CONFIG_KASAN_INLINE enabled
* clang-10 as the compiler and gcc-10 as the assembler

[1] Time before the ext4 driver is initialized.
[2] Measured as `cat /proc/meminfo | grep Slab`.

====== Notes

The cover letter for software tag-based KASAN patchset can be found here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0116523cfffa62aeb5aa3b85ce7419f3dae0c1b8

===== Tags

Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>

This patch (of 41):

Don't mention "GNU General Public License version 2" text explicitly, as
it's already covered by the SPDX-License-Identifier.

Link: https://lkml.kernel.org/r/cover.1606161801.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/6ea9f5f4aa9dbbffa0d0c0a780b37699a4531034.1606161801.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-22 12:55:06 -08:00

489 lines
11 KiB
C

// SPDX-License-Identifier: GPL-2.0
/*
* This file contains some kasan initialization code.
*
* Copyright (c) 2015 Samsung Electronics Co., Ltd.
* Author: Andrey Ryabinin <ryabinin.a.a@gmail.com>
*/
#include <linux/memblock.h>
#include <linux/init.h>
#include <linux/kasan.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/pfn.h>
#include <linux/slab.h>
#include <asm/page.h>
#include <asm/pgalloc.h>
#include "kasan.h"
/*
* This page serves two purposes:
* - It used as early shadow memory. The entire shadow region populated
* with this page, before we will be able to setup normal shadow memory.
* - Latter it reused it as zero shadow to cover large ranges of memory
* that allowed to access, but not handled by kasan (vmalloc/vmemmap ...).
*/
unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;
#if CONFIG_PGTABLE_LEVELS > 4
p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;
static inline bool kasan_p4d_table(pgd_t pgd)
{
return pgd_page(pgd) == virt_to_page(lm_alias(kasan_early_shadow_p4d));
}
#else
static inline bool kasan_p4d_table(pgd_t pgd)
{
return false;
}
#endif
#if CONFIG_PGTABLE_LEVELS > 3
pud_t kasan_early_shadow_pud[PTRS_PER_PUD] __page_aligned_bss;
static inline bool kasan_pud_table(p4d_t p4d)
{
return p4d_page(p4d) == virt_to_page(lm_alias(kasan_early_shadow_pud));
}
#else
static inline bool kasan_pud_table(p4d_t p4d)
{
return false;
}
#endif
#if CONFIG_PGTABLE_LEVELS > 2
pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD] __page_aligned_bss;
static inline bool kasan_pmd_table(pud_t pud)
{
return pud_page(pud) == virt_to_page(lm_alias(kasan_early_shadow_pmd));
}
#else
static inline bool kasan_pmd_table(pud_t pud)
{
return false;
}
#endif
pte_t kasan_early_shadow_pte[PTRS_PER_PTE] __page_aligned_bss;
static inline bool kasan_pte_table(pmd_t pmd)
{
return pmd_page(pmd) == virt_to_page(lm_alias(kasan_early_shadow_pte));
}
static inline bool kasan_early_shadow_page_entry(pte_t pte)
{
return pte_page(pte) == virt_to_page(lm_alias(kasan_early_shadow_page));
}
static __init void *early_alloc(size_t size, int node)
{
void *ptr = memblock_alloc_try_nid(size, size, __pa(MAX_DMA_ADDRESS),
MEMBLOCK_ALLOC_ACCESSIBLE, node);
if (!ptr)
panic("%s: Failed to allocate %zu bytes align=%zx nid=%d from=%llx\n",
__func__, size, size, node, (u64)__pa(MAX_DMA_ADDRESS));
return ptr;
}
static void __ref zero_pte_populate(pmd_t *pmd, unsigned long addr,
unsigned long end)
{
pte_t *pte = pte_offset_kernel(pmd, addr);
pte_t zero_pte;
zero_pte = pfn_pte(PFN_DOWN(__pa_symbol(kasan_early_shadow_page)),
PAGE_KERNEL);
zero_pte = pte_wrprotect(zero_pte);
while (addr + PAGE_SIZE <= end) {
set_pte_at(&init_mm, addr, pte, zero_pte);
addr += PAGE_SIZE;
pte = pte_offset_kernel(pmd, addr);
}
}
static int __ref zero_pmd_populate(pud_t *pud, unsigned long addr,
unsigned long end)
{
pmd_t *pmd = pmd_offset(pud, addr);
unsigned long next;
do {
next = pmd_addr_end(addr, end);
if (IS_ALIGNED(addr, PMD_SIZE) && end - addr >= PMD_SIZE) {
pmd_populate_kernel(&init_mm, pmd,
lm_alias(kasan_early_shadow_pte));
continue;
}
if (pmd_none(*pmd)) {
pte_t *p;
if (slab_is_available())
p = pte_alloc_one_kernel(&init_mm);
else
p = early_alloc(PAGE_SIZE, NUMA_NO_NODE);
if (!p)
return -ENOMEM;
pmd_populate_kernel(&init_mm, pmd, p);
}
zero_pte_populate(pmd, addr, next);
} while (pmd++, addr = next, addr != end);
return 0;
}
static int __ref zero_pud_populate(p4d_t *p4d, unsigned long addr,
unsigned long end)
{
pud_t *pud = pud_offset(p4d, addr);
unsigned long next;
do {
next = pud_addr_end(addr, end);
if (IS_ALIGNED(addr, PUD_SIZE) && end - addr >= PUD_SIZE) {
pmd_t *pmd;
pud_populate(&init_mm, pud,
lm_alias(kasan_early_shadow_pmd));
pmd = pmd_offset(pud, addr);
pmd_populate_kernel(&init_mm, pmd,
lm_alias(kasan_early_shadow_pte));
continue;
}
if (pud_none(*pud)) {
pmd_t *p;
if (slab_is_available()) {
p = pmd_alloc(&init_mm, pud, addr);
if (!p)
return -ENOMEM;
} else {
pud_populate(&init_mm, pud,
early_alloc(PAGE_SIZE, NUMA_NO_NODE));
}
}
zero_pmd_populate(pud, addr, next);
} while (pud++, addr = next, addr != end);
return 0;
}
static int __ref zero_p4d_populate(pgd_t *pgd, unsigned long addr,
unsigned long end)
{
p4d_t *p4d = p4d_offset(pgd, addr);
unsigned long next;
do {
next = p4d_addr_end(addr, end);
if (IS_ALIGNED(addr, P4D_SIZE) && end - addr >= P4D_SIZE) {
pud_t *pud;
pmd_t *pmd;
p4d_populate(&init_mm, p4d,
lm_alias(kasan_early_shadow_pud));
pud = pud_offset(p4d, addr);
pud_populate(&init_mm, pud,
lm_alias(kasan_early_shadow_pmd));
pmd = pmd_offset(pud, addr);
pmd_populate_kernel(&init_mm, pmd,
lm_alias(kasan_early_shadow_pte));
continue;
}
if (p4d_none(*p4d)) {
pud_t *p;
if (slab_is_available()) {
p = pud_alloc(&init_mm, p4d, addr);
if (!p)
return -ENOMEM;
} else {
p4d_populate(&init_mm, p4d,
early_alloc(PAGE_SIZE, NUMA_NO_NODE));
}
}
zero_pud_populate(p4d, addr, next);
} while (p4d++, addr = next, addr != end);
return 0;
}
/**
* kasan_populate_early_shadow - populate shadow memory region with
* kasan_early_shadow_page
* @shadow_start - start of the memory range to populate
* @shadow_end - end of the memory range to populate
*/
int __ref kasan_populate_early_shadow(const void *shadow_start,
const void *shadow_end)
{
unsigned long addr = (unsigned long)shadow_start;
unsigned long end = (unsigned long)shadow_end;
pgd_t *pgd = pgd_offset_k(addr);
unsigned long next;
do {
next = pgd_addr_end(addr, end);
if (IS_ALIGNED(addr, PGDIR_SIZE) && end - addr >= PGDIR_SIZE) {
p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
/*
* kasan_early_shadow_pud should be populated with pmds
* at this moment.
* [pud,pmd]_populate*() below needed only for
* 3,2 - level page tables where we don't have
* puds,pmds, so pgd_populate(), pud_populate()
* is noops.
*/
pgd_populate(&init_mm, pgd,
lm_alias(kasan_early_shadow_p4d));
p4d = p4d_offset(pgd, addr);
p4d_populate(&init_mm, p4d,
lm_alias(kasan_early_shadow_pud));
pud = pud_offset(p4d, addr);
pud_populate(&init_mm, pud,
lm_alias(kasan_early_shadow_pmd));
pmd = pmd_offset(pud, addr);
pmd_populate_kernel(&init_mm, pmd,
lm_alias(kasan_early_shadow_pte));
continue;
}
if (pgd_none(*pgd)) {
p4d_t *p;
if (slab_is_available()) {
p = p4d_alloc(&init_mm, pgd, addr);
if (!p)
return -ENOMEM;
} else {
pgd_populate(&init_mm, pgd,
early_alloc(PAGE_SIZE, NUMA_NO_NODE));
}
}
zero_p4d_populate(pgd, addr, next);
} while (pgd++, addr = next, addr != end);
return 0;
}
static void kasan_free_pte(pte_t *pte_start, pmd_t *pmd)
{
pte_t *pte;
int i;
for (i = 0; i < PTRS_PER_PTE; i++) {
pte = pte_start + i;
if (!pte_none(*pte))
return;
}
pte_free_kernel(&init_mm, (pte_t *)page_to_virt(pmd_page(*pmd)));
pmd_clear(pmd);
}
static void kasan_free_pmd(pmd_t *pmd_start, pud_t *pud)
{
pmd_t *pmd;
int i;
for (i = 0; i < PTRS_PER_PMD; i++) {
pmd = pmd_start + i;
if (!pmd_none(*pmd))
return;
}
pmd_free(&init_mm, (pmd_t *)page_to_virt(pud_page(*pud)));
pud_clear(pud);
}
static void kasan_free_pud(pud_t *pud_start, p4d_t *p4d)
{
pud_t *pud;
int i;
for (i = 0; i < PTRS_PER_PUD; i++) {
pud = pud_start + i;
if (!pud_none(*pud))
return;
}
pud_free(&init_mm, (pud_t *)page_to_virt(p4d_page(*p4d)));
p4d_clear(p4d);
}
static void kasan_free_p4d(p4d_t *p4d_start, pgd_t *pgd)
{
p4d_t *p4d;
int i;
for (i = 0; i < PTRS_PER_P4D; i++) {
p4d = p4d_start + i;
if (!p4d_none(*p4d))
return;
}
p4d_free(&init_mm, (p4d_t *)page_to_virt(pgd_page(*pgd)));
pgd_clear(pgd);
}
static void kasan_remove_pte_table(pte_t *pte, unsigned long addr,
unsigned long end)
{
unsigned long next;
for (; addr < end; addr = next, pte++) {
next = (addr + PAGE_SIZE) & PAGE_MASK;
if (next > end)
next = end;
if (!pte_present(*pte))
continue;
if (WARN_ON(!kasan_early_shadow_page_entry(*pte)))
continue;
pte_clear(&init_mm, addr, pte);
}
}
static void kasan_remove_pmd_table(pmd_t *pmd, unsigned long addr,
unsigned long end)
{
unsigned long next;
for (; addr < end; addr = next, pmd++) {
pte_t *pte;
next = pmd_addr_end(addr, end);
if (!pmd_present(*pmd))
continue;
if (kasan_pte_table(*pmd)) {
if (IS_ALIGNED(addr, PMD_SIZE) &&
IS_ALIGNED(next, PMD_SIZE))
pmd_clear(pmd);
continue;
}
pte = pte_offset_kernel(pmd, addr);
kasan_remove_pte_table(pte, addr, next);
kasan_free_pte(pte_offset_kernel(pmd, 0), pmd);
}
}
static void kasan_remove_pud_table(pud_t *pud, unsigned long addr,
unsigned long end)
{
unsigned long next;
for (; addr < end; addr = next, pud++) {
pmd_t *pmd, *pmd_base;
next = pud_addr_end(addr, end);
if (!pud_present(*pud))
continue;
if (kasan_pmd_table(*pud)) {
if (IS_ALIGNED(addr, PUD_SIZE) &&
IS_ALIGNED(next, PUD_SIZE))
pud_clear(pud);
continue;
}
pmd = pmd_offset(pud, addr);
pmd_base = pmd_offset(pud, 0);
kasan_remove_pmd_table(pmd, addr, next);
kasan_free_pmd(pmd_base, pud);
}
}
static void kasan_remove_p4d_table(p4d_t *p4d, unsigned long addr,
unsigned long end)
{
unsigned long next;
for (; addr < end; addr = next, p4d++) {
pud_t *pud;
next = p4d_addr_end(addr, end);
if (!p4d_present(*p4d))
continue;
if (kasan_pud_table(*p4d)) {
if (IS_ALIGNED(addr, P4D_SIZE) &&
IS_ALIGNED(next, P4D_SIZE))
p4d_clear(p4d);
continue;
}
pud = pud_offset(p4d, addr);
kasan_remove_pud_table(pud, addr, next);
kasan_free_pud(pud_offset(p4d, 0), p4d);
}
}
void kasan_remove_zero_shadow(void *start, unsigned long size)
{
unsigned long addr, end, next;
pgd_t *pgd;
addr = (unsigned long)kasan_mem_to_shadow(start);
end = addr + (size >> KASAN_SHADOW_SCALE_SHIFT);
if (WARN_ON((unsigned long)start %
(KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE)) ||
WARN_ON(size % (KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE)))
return;
for (; addr < end; addr = next) {
p4d_t *p4d;
next = pgd_addr_end(addr, end);
pgd = pgd_offset_k(addr);
if (!pgd_present(*pgd))
continue;
if (kasan_p4d_table(*pgd)) {
if (IS_ALIGNED(addr, PGDIR_SIZE) &&
IS_ALIGNED(next, PGDIR_SIZE))
pgd_clear(pgd);
continue;
}
p4d = p4d_offset(pgd, addr);
kasan_remove_p4d_table(p4d, addr, next);
kasan_free_p4d(p4d_offset(pgd, 0), pgd);
}
}
int kasan_add_zero_shadow(void *start, unsigned long size)
{
int ret;
void *shadow_start, *shadow_end;
shadow_start = kasan_mem_to_shadow(start);
shadow_end = shadow_start + (size >> KASAN_SHADOW_SCALE_SHIFT);
if (WARN_ON((unsigned long)start %
(KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE)) ||
WARN_ON(size % (KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE)))
return -EINVAL;
ret = kasan_populate_early_shadow(shadow_start, shadow_end);
if (ret)
kasan_remove_zero_shadow(shadow_start,
size >> KASAN_SHADOW_SCALE_SHIFT);
return ret;
}