linux-stable/include/linux/sysctl.h
Xiaoming Ni 3ddd9a808c sysctl: add a new register_sysctl_init() interface
Patch series "sysctl: first set of kernel/sysctl cleanups", v2.

Finally had time to respin the series of the work we had started last
year on cleaning up the kernel/sysct.c kitchen sink.  People keeps
stuffing their sysctls in that file and this creates a maintenance
burden.  So this effort is aimed at placing sysctls where they actually
belong.

I'm going to split patches up into series as there is quite a bit of
work.

This first set adds register_sysctl_init() for uses of registerting a
sysctl on the init path, adds const where missing to a few places,
generalizes common values so to be more easy to share, and starts the
move of a few kernel/sysctl.c out where they belong.

The majority of rework on v2 in this first patch set is 0-day fixes.
Eric Biederman's feedback is later addressed in subsequent patch sets.

I'll only post the first two patch sets for now.  We can address the
rest once the first two patch sets get completely reviewed / Acked.

This patch (of 9):

The kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to places
where they actually belong.  The proc sysctl maintainers do not want to
know what sysctl knobs you wish to add for your own piece of code, we
just care about the core logic.

Today though folks heavily rely on tables on kernel/sysctl.c so they can
easily just extend this table with their needed sysctls.  In order to
help users move their sysctls out we need to provide a helper which can
be used during code initialization.

We special-case the initialization use of register_sysctl() since it
*is* safe to fail, given all that sysctls do is provide a dynamic
interface to query or modify at runtime an existing variable.  So the
use case of register_sysctl() on init should *not* stop if the sysctls
don't end up getting registered.  It would be counter productive to stop
boot if a simple sysctl registration failed.

Provide a helper for init then, and document the recommended init levels
to use for callers of this routine.  We will later use this in
subsequent patches to start slimming down kernel/sysctl.c tables and
moving sysctl registration to the code which actually needs these
sysctls.

[mcgrof@kernel.org: major commit log and documentation rephrasing also moved to fs/proc/proc_sysctl.c                  ]

Link: https://lkml.kernel.org/r/20211123202347.818157-1-mcgrof@kernel.org
Link: https://lkml.kernel.org/r/20211123202347.818157-2-mcgrof@kernel.org
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Qing Wang <wangqing@vivo.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Stephen Kitt <steve@sk2.org>
Cc: Antti Palosaari <crope@iki.fi>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Lukas Middendorf <kernel@tuxforce.de>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Phillip Potter <phil@philpotter.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-22 08:33:34 +02:00

252 lines
7.9 KiB
C

/* SPDX-License-Identifier: GPL-2.0 */
/*
* sysctl.h: General linux system control interface
*
* Begun 24 March 1995, Stephen Tweedie
*
****************************************************************
****************************************************************
**
** WARNING:
** The values in this file are exported to user space via
** the sysctl() binary interface. Do *NOT* change the
** numbering of any existing values here, and do not change
** any numbers within any one set of values. If you have to
** redefine an existing interface, use a new number for it.
** The kernel will then return -ENOTDIR to any application using
** the old binary interface.
**
****************************************************************
****************************************************************
*/
#ifndef _LINUX_SYSCTL_H
#define _LINUX_SYSCTL_H
#include <linux/list.h>
#include <linux/rcupdate.h>
#include <linux/wait.h>
#include <linux/rbtree.h>
#include <linux/uidgid.h>
#include <uapi/linux/sysctl.h>
/* For the /proc/sys support */
struct completion;
struct ctl_table;
struct nsproxy;
struct ctl_table_root;
struct ctl_table_header;
struct ctl_dir;
/* Keep the same order as in fs/proc/proc_sysctl.c */
#define SYSCTL_ZERO ((void *)&sysctl_vals[0])
#define SYSCTL_ONE ((void *)&sysctl_vals[1])
#define SYSCTL_INT_MAX ((void *)&sysctl_vals[2])
extern const int sysctl_vals[];
typedef int proc_handler(struct ctl_table *ctl, int write, void *buffer,
size_t *lenp, loff_t *ppos);
int proc_dostring(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_dobool(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos);
int proc_dointvec(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_douintvec(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_dointvec_minmax(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_douintvec_minmax(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos);
int proc_dou8vec_minmax(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos);
int proc_dointvec_jiffies(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_dointvec_userhz_jiffies(struct ctl_table *, int, void *, size_t *,
loff_t *);
int proc_dointvec_ms_jiffies(struct ctl_table *, int, void *, size_t *,
loff_t *);
int proc_doulongvec_minmax(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_doulongvec_ms_jiffies_minmax(struct ctl_table *table, int, void *,
size_t *, loff_t *);
int proc_do_large_bitmap(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_do_static_key(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos);
/*
* Register a set of sysctl names by calling register_sysctl_table
* with an initialised array of struct ctl_table's. An entry with
* NULL procname terminates the table. table->de will be
* set up by the registration and need not be initialised in advance.
*
* sysctl names can be mirrored automatically under /proc/sys. The
* procname supplied controls /proc naming.
*
* The table's mode will be honoured for proc-fs access.
*
* Leaf nodes in the sysctl tree will be represented by a single file
* under /proc; non-leaf nodes will be represented by directories. A
* null procname disables /proc mirroring at this node.
*
* The data and maxlen fields of the ctl_table
* struct enable minimal validation of the values being written to be
* performed, and the mode field allows minimal authentication.
*
* There must be a proc_handler routine for any terminal nodes
* mirrored under /proc/sys (non-terminals are handled by a built-in
* directory handler). Several default handlers are available to
* cover common cases.
*/
/* Support for userspace poll() to watch for changes */
struct ctl_table_poll {
atomic_t event;
wait_queue_head_t wait;
};
static inline void *proc_sys_poll_event(struct ctl_table_poll *poll)
{
return (void *)(unsigned long)atomic_read(&poll->event);
}
#define __CTL_TABLE_POLL_INITIALIZER(name) { \
.event = ATOMIC_INIT(0), \
.wait = __WAIT_QUEUE_HEAD_INITIALIZER(name.wait) }
#define DEFINE_CTL_TABLE_POLL(name) \
struct ctl_table_poll name = __CTL_TABLE_POLL_INITIALIZER(name)
/* A sysctl table is an array of struct ctl_table: */
struct ctl_table {
const char *procname; /* Text ID for /proc/sys, or zero */
void *data;
int maxlen;
umode_t mode;
struct ctl_table *child; /* Deprecated */
proc_handler *proc_handler; /* Callback for text formatting */
struct ctl_table_poll *poll;
void *extra1;
void *extra2;
} __randomize_layout;
struct ctl_node {
struct rb_node node;
struct ctl_table_header *header;
};
/* struct ctl_table_header is used to maintain dynamic lists of
struct ctl_table trees. */
struct ctl_table_header {
union {
struct {
struct ctl_table *ctl_table;
int used;
int count;
int nreg;
};
struct rcu_head rcu;
};
struct completion *unregistering;
struct ctl_table *ctl_table_arg;
struct ctl_table_root *root;
struct ctl_table_set *set;
struct ctl_dir *parent;
struct ctl_node *node;
struct hlist_head inodes; /* head for proc_inode->sysctl_inodes */
};
struct ctl_dir {
/* Header must be at the start of ctl_dir */
struct ctl_table_header header;
struct rb_root root;
};
struct ctl_table_set {
int (*is_seen)(struct ctl_table_set *);
struct ctl_dir dir;
};
struct ctl_table_root {
struct ctl_table_set default_set;
struct ctl_table_set *(*lookup)(struct ctl_table_root *root);
void (*set_ownership)(struct ctl_table_header *head,
struct ctl_table *table,
kuid_t *uid, kgid_t *gid);
int (*permissions)(struct ctl_table_header *head, struct ctl_table *table);
};
/* struct ctl_path describes where in the hierarchy a table is added */
struct ctl_path {
const char *procname;
};
#ifdef CONFIG_SYSCTL
void proc_sys_poll_notify(struct ctl_table_poll *poll);
extern void setup_sysctl_set(struct ctl_table_set *p,
struct ctl_table_root *root,
int (*is_seen)(struct ctl_table_set *));
extern void retire_sysctl_set(struct ctl_table_set *set);
struct ctl_table_header *__register_sysctl_table(
struct ctl_table_set *set,
const char *path, struct ctl_table *table);
struct ctl_table_header *__register_sysctl_paths(
struct ctl_table_set *set,
const struct ctl_path *path, struct ctl_table *table);
struct ctl_table_header *register_sysctl(const char *path, struct ctl_table *table);
struct ctl_table_header *register_sysctl_table(struct ctl_table * table);
struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
struct ctl_table *table);
void unregister_sysctl_table(struct ctl_table_header * table);
extern int sysctl_init(void);
extern void __register_sysctl_init(const char *path, struct ctl_table *table,
const char *table_name);
#define register_sysctl_init(path, table) __register_sysctl_init(path, table, #table)
void do_sysctl_args(void);
extern int pwrsw_enabled;
extern int unaligned_enabled;
extern int unaligned_dump_stack;
extern int no_unaligned_warning;
extern struct ctl_table sysctl_mount_point[];
extern struct ctl_table random_table[];
extern struct ctl_table firmware_config_table[];
extern struct ctl_table epoll_table[];
#else /* CONFIG_SYSCTL */
static inline struct ctl_table_header *register_sysctl_table(struct ctl_table * table)
{
return NULL;
}
static inline struct ctl_table_header *register_sysctl_paths(
const struct ctl_path *path, struct ctl_table *table)
{
return NULL;
}
static inline struct ctl_table_header *register_sysctl(const char *path, struct ctl_table *table)
{
return NULL;
}
static inline void unregister_sysctl_table(struct ctl_table_header * table)
{
}
static inline void setup_sysctl_set(struct ctl_table_set *p,
struct ctl_table_root *root,
int (*is_seen)(struct ctl_table_set *))
{
}
static inline void do_sysctl_args(void)
{
}
#endif /* CONFIG_SYSCTL */
int sysctl_max_threads(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos);
#endif /* _LINUX_SYSCTL_H */