Merge tag 'trace-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Allow kernel trace instance creation to specify what events are
   created

   Inside the kernel, a subsystem may create a tracing instance that it
   can use to send events to user space. This subsystem may not care
   about the thousands of events that exist in eventfs. Allow the
   subsystem to specify which systems of events it cares about, and
   only those events are exposed to this instance (see the sketch
   below).
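
   A minimal sketch of how a kernel module might use this new API (the
   module, instance, and system names are illustrative, not taken from
   this series; the API itself is the one updated in the diffs below):

       #include <linux/kernel.h>
       #include <linux/module.h>
       #include <linux/trace.h>

       static struct trace_array *tr;

       static int __init demo_init(void)
       {
               /* The second argument is new in 6.8; NULL means "all systems" */
               tr = trace_array_get_by_name("demo-instance", "sched,timer");
               if (!tr)
                       return -ENOMEM;
               trace_array_init_printk(tr);
               trace_array_printk(tr, _THIS_IP_, "demo instance up\n");
               return 0;
       }

       static void __exit demo_exit(void)
       {
               trace_array_put(tr);
               trace_array_destroy(tr);
       }

       module_init(demo_init);
       module_exit(demo_exit);
       MODULE_LICENSE("GPL");

   Only the "sched" and "timer" event directories are then created in
   this instance's eventfs.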

 - Allow the ring buffer to be broken up into bigger sub-buffers than
   just the architecture page size.

   A new tracefs file called "buffer_subbuf_size_kb" is created. The
   user can now specify the minimum size a sub-buffer may be, in
   kilobytes. Note that the implementation currently makes the
   sub-buffer size a power of 2 pages (1, 2, 4, 8, 16, ...), but the
   user only writes in a kilobyte size, and the sub-buffer is rounded
   up to the next size that can accommodate it. If the user writes in
   10, the size becomes 4 pages on x86 (16K), as that is the next
   available size that can hold 10K (see the rounding sketch below).
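
   A userspace re-derivation of that rounding (a sketch; the kernel
   does the same with DIV_ROUND_UP() and fls() in its
   buffer_subbuf_size_write() handler, shown in the trace.c diff
   below):

       #include <stdio.h>

       /* like the kernel's fls(): position of the highest set bit */
       static int fls_ul(unsigned long x)
       {
               int r = 0;

               while (x) {
                       r++;
                       x >>= 1;
               }
               return r;
       }

       int main(void)
       {
               const unsigned long page_size = 4096;   /* x86 */
               unsigned long kb = 10;  /* echo 10 > buffer_subbuf_size_kb */
               unsigned long pages = (kb * 1024 + page_size - 1) / page_size;
               int order = fls_ul(pages - 1);  /* 3 pages -> order 2 */

               /* prints "order 2 -> sub-buffer 16 KB" */
               printf("order %d -> sub-buffer %lu KB\n",
                      order, (page_size << order) / 1024);
               return 0;
       }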

 - Update the debug output when a corrupt time is detected in the ring
   buffer. If the ring buffer detects inconsistent timestamps, there's
   a debug config option that will dump the meta data of the affected
   sub-buffer. Add some more information to this dump to help with
   debugging.

 - Add more timestamp debugging checks (these only trigger when the
   config is enabled)

 - Increase the trace_seq iterator to 2 page sizes.

 - Allow strings written into trace_marker to be larger: up to just
   under 2 page sizes (based on what trace_seq can hold). See the
   sketch below.
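
   For example, from user space (illustrative; assumes tracefs is
   mounted at /sys/kernel/tracing and that buffer_subbuf_size_kb has
   been raised enough for the event to fit, otherwise the kernel
   truncates the write to what the sub-buffer can hold):

       #include <fcntl.h>
       #include <stdio.h>
       #include <string.h>
       #include <unistd.h>

       int main(void)
       {
               char buf[6000];         /* larger than one 4K page */
               int fd = open("/sys/kernel/tracing/trace_marker", O_WRONLY);

               if (fd < 0) {
                       perror("trace_marker");
                       return 1;
               }
               memset(buf, 'X', sizeof(buf) - 1);
               buf[sizeof(buf) - 1] = '\n';
               /* oversized writes are truncated, never rejected */
               if (write(fd, buf, sizeof(buf)) < 0)
                       perror("write");
               close(fd);
               return 0;
       }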

 - Increase the trace_marker_raw write to be as big as a sub-buffer can
   hold.

 - Remove the 32-bit timestamp logic, now that rb_time_cmpxchg() has
   been removed.

 - More selftests were added.

 - Some code cleanups as well.

* tag 'trace-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (29 commits)
  ring-buffer: Remove stale comment from ring_buffer_size()
  tracing histograms: Simplify parse_actions() function
  tracing/selftests: Remove exec permissions from trace_marker.tc test
  ring-buffer: Use subbuf_order for buffer page masking
  tracing: Update subbuffer with kilobytes not page order
  ringbuffer/selftest: Add basic selftest to test changing subbuf order
  ring-buffer: Add documentation on the buffer_subbuf_order file
  ring-buffer: Just update the subbuffers when changing their allocation order
  ring-buffer: Keep the same size when updating the order
  tracing: Stop the tracing while changing the ring buffer subbuf size
  tracing: Update snapshot order along with main buffer order
  ring-buffer: Make sure the spare sub buffer used for reads has same size
  ring-buffer: Do no swap cpu buffers if order is different
  ring-buffer: Clear pages on error in ring_buffer_subbuf_order_set() failure
  ring-buffer: Read and write to ring buffers with custom sub buffer size
  ring-buffer: Set new size of the ring buffer sub page
  ring-buffer: Add interface for configuring trace sub buffer size
  ring-buffer: Page size per ring buffer
  ring-buffer: Have ring_buffer_print_page_header() be able to access ring_buffer_iter
  ring-buffer: Check if absolute timestamp goes backwards
  ...
commit a2ded784cd (Linus Torvalds, 2024-01-18 14:35:29 -08:00)
16 changed files with 1004 additions and 379 deletions

Documentation/trace/ftrace.rst

@@ -218,6 +218,27 @@ of ftrace. Here is a list of some of the key files:
This displays the total combined size of all the trace buffers.
buffer_subbuf_size_kb:
This sets or displays the sub buffer size. The ring buffer is broken up
into several same size "sub buffers". An event can not be bigger than
the size of the sub buffer. Normally, the sub buffer is the size of the
architecture's page (4K on x86). The sub buffer also contains meta data
at the start which also limits the size of an event. That means when
the sub buffer is a page size, no event can be larger than the page
size minus the sub buffer meta data.
Note, the buffer_subbuf_size_kb is a way for the user to specify the
minimum size of the subbuffer. The kernel may make it bigger due to the
implementation details, or simply fail the operation if the kernel can
not handle the request.
Changing the sub buffer size allows for events to be larger than the
page size.
Note: When changing the sub-buffer size, tracing is stopped and any
data in the ring buffer and the snapshot buffer will be discarded.
free_buffer:
If a process is performing tracing, and the ring buffer should be

drivers/scsi/qla2xxx/qla_os.c

@@ -2889,7 +2889,7 @@ static void qla2x00_iocb_work_fn(struct work_struct *work)
static void
qla_trace_init(void)
{
qla_trc_array = trace_array_get_by_name("qla2xxx");
qla_trc_array = trace_array_get_by_name("qla2xxx", NULL);
if (!qla_trc_array) {
ql_log(ql_log_fatal, NULL, 0x0001,
"Unable to create qla2xxx trace instance, instance logging will be disabled.\n");

include/linux/ring_buffer.h

@@ -141,6 +141,7 @@ int ring_buffer_iter_empty(struct ring_buffer_iter *iter);
bool ring_buffer_iter_dropped(struct ring_buffer_iter *iter);
unsigned long ring_buffer_size(struct trace_buffer *buffer, int cpu);
unsigned long ring_buffer_max_event_size(struct trace_buffer *buffer);
void ring_buffer_reset_cpu(struct trace_buffer *buffer, int cpu);
void ring_buffer_reset_online_cpus(struct trace_buffer *buffer);
@@ -191,15 +192,24 @@ bool ring_buffer_time_stamp_abs(struct trace_buffer *buffer);
size_t ring_buffer_nr_pages(struct trace_buffer *buffer, int cpu);
size_t ring_buffer_nr_dirty_pages(struct trace_buffer *buffer, int cpu);
void *ring_buffer_alloc_read_page(struct trace_buffer *buffer, int cpu);
void ring_buffer_free_read_page(struct trace_buffer *buffer, int cpu, void *data);
int ring_buffer_read_page(struct trace_buffer *buffer, void **data_page,
struct buffer_data_read_page;
struct buffer_data_read_page *
ring_buffer_alloc_read_page(struct trace_buffer *buffer, int cpu);
void ring_buffer_free_read_page(struct trace_buffer *buffer, int cpu,
struct buffer_data_read_page *page);
int ring_buffer_read_page(struct trace_buffer *buffer,
struct buffer_data_read_page *data_page,
size_t len, int cpu, int full);
void *ring_buffer_read_page_data(struct buffer_data_read_page *page);
struct trace_seq;
int ring_buffer_print_entry_header(struct trace_seq *s);
int ring_buffer_print_page_header(struct trace_seq *s);
int ring_buffer_print_page_header(struct trace_buffer *buffer, struct trace_seq *s);
int ring_buffer_subbuf_order_get(struct trace_buffer *buffer);
int ring_buffer_subbuf_order_set(struct trace_buffer *buffer, int order);
int ring_buffer_subbuf_size_get(struct trace_buffer *buffer);
enum ring_buffer_flags {
RB_FL_OVERWRITE = 1 << 0,

include/linux/trace.h

@@ -51,7 +51,7 @@ int trace_array_printk(struct trace_array *tr, unsigned long ip,
const char *fmt, ...);
int trace_array_init_printk(struct trace_array *tr);
void trace_array_put(struct trace_array *tr);
struct trace_array *trace_array_get_by_name(const char *name);
struct trace_array *trace_array_get_by_name(const char *name, const char *systems);
int trace_array_destroy(struct trace_array *tr);
/* For osnoise tracer */
@@ -84,7 +84,7 @@ static inline int trace_array_init_printk(struct trace_array *tr)
static inline void trace_array_put(struct trace_array *tr)
{
}
static inline struct trace_array *trace_array_get_by_name(const char *name)
static inline struct trace_array *trace_array_get_by_name(const char *name, const char *systems)
{
return NULL;
}

include/linux/trace_seq.h

@@ -8,11 +8,14 @@
/*
* Trace sequences are used to allow a function to call several other functions
* to create a string of data to use (up to a max of PAGE_SIZE).
* to create a string of data to use.
*/
#define TRACE_SEQ_BUFFER_SIZE (PAGE_SIZE * 2 - \
(sizeof(struct seq_buf) + sizeof(size_t) + sizeof(int)))
struct trace_seq {
char buffer[PAGE_SIZE];
char buffer[TRACE_SEQ_BUFFER_SIZE];
struct seq_buf seq;
size_t readpos;
int full;
@@ -21,7 +24,7 @@ struct trace_seq {
static inline void
trace_seq_init(struct trace_seq *s)
{
seq_buf_init(&s->seq, s->buffer, PAGE_SIZE);
seq_buf_init(&s->seq, s->buffer, TRACE_SEQ_BUFFER_SIZE);
s->full = 0;
s->readpos = 0;
}

kernel/trace/ring_buffer.c (diff suppressed because it is too large)

kernel/trace/ring_buffer_benchmark.c

@@ -104,10 +104,11 @@ static enum event_status read_event(int cpu)
static enum event_status read_page(int cpu)
{
struct buffer_data_read_page *bpage;
struct ring_buffer_event *event;
struct rb_page *rpage;
unsigned long commit;
void *bpage;
int page_size;
int *entry;
int ret;
int inc;
@@ -117,14 +118,15 @@ static enum event_status read_page(int cpu)
if (IS_ERR(bpage))
return EVENT_DROPPED;
ret = ring_buffer_read_page(buffer, &bpage, PAGE_SIZE, cpu, 1);
page_size = ring_buffer_subbuf_size_get(buffer);
ret = ring_buffer_read_page(buffer, bpage, page_size, cpu, 1);
if (ret >= 0) {
rpage = bpage;
rpage = ring_buffer_read_page_data(bpage);
/* The commit may have missed event flags set, clear them */
commit = local_read(&rpage->commit) & 0xfffff;
for (i = 0; i < commit && !test_error ; i += inc) {
if (i >= (PAGE_SIZE - offsetof(struct rb_page, data))) {
if (i >= (page_size - offsetof(struct rb_page, data))) {
TEST_ERROR();
break;
}

kernel/trace/trace.c

@@ -1263,10 +1263,17 @@ static void set_buffer_entries(struct array_buffer *buf, unsigned long val);
int tracing_alloc_snapshot_instance(struct trace_array *tr)
{
int order;
int ret;
if (!tr->allocated_snapshot) {
/* Make the snapshot buffer have the same order as main buffer */
order = ring_buffer_subbuf_order_get(tr->array_buffer.buffer);
ret = ring_buffer_subbuf_order_set(tr->max_buffer.buffer, order);
if (ret < 0)
return ret;
/* allocate spare buffer */
ret = resize_buffer_duplicate_size(&tr->max_buffer,
&tr->array_buffer, RING_BUFFER_ALL_CPUS);
@@ -1286,6 +1293,7 @@ static void free_snapshot(struct trace_array *tr)
* The max_tr ring buffer has some state (e.g. ring->clock) and
* we want preserve it.
*/
ring_buffer_subbuf_order_set(tr->max_buffer.buffer, 0);
ring_buffer_resize(tr->max_buffer.buffer, 1, RING_BUFFER_ALL_CPUS);
set_buffer_entries(&tr->max_buffer, 1);
tracing_reset_online_cpus(&tr->max_buffer);
@@ -3767,7 +3775,7 @@ static bool trace_safe_str(struct trace_iterator *iter, const char *str,
/* OK if part of the temp seq buffer */
if ((addr >= (unsigned long)iter->tmp_seq.buffer) &&
(addr < (unsigned long)iter->tmp_seq.buffer + PAGE_SIZE))
(addr < (unsigned long)iter->tmp_seq.buffer + TRACE_SEQ_BUFFER_SIZE))
return true;
/* Core rodata can not be freed */
@@ -5032,7 +5040,7 @@ static int tracing_release(struct inode *inode, struct file *file)
return 0;
}
static int tracing_release_generic_tr(struct inode *inode, struct file *file)
int tracing_release_generic_tr(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
@@ -6946,8 +6954,8 @@ waitagain:
goto out;
}
if (cnt >= PAGE_SIZE)
cnt = PAGE_SIZE - 1;
if (cnt >= TRACE_SEQ_BUFFER_SIZE)
cnt = TRACE_SEQ_BUFFER_SIZE - 1;
/* reset all but tr, trace, and overruns */
trace_iterator_reset(iter);
@@ -7292,8 +7300,9 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
enum event_trigger_type tt = ETT_NONE;
struct trace_buffer *buffer;
struct print_entry *entry;
int meta_size;
ssize_t written;
int size;
size_t size;
int len;
/* Used in tracing_mark_raw_write() as well */
@@ -7306,23 +7315,44 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (!(tr->trace_flags & TRACE_ITER_MARKERS))
return -EINVAL;
if (cnt > TRACE_BUF_SIZE)
cnt = TRACE_BUF_SIZE;
if ((ssize_t)cnt < 0)
return -EINVAL;
BUILD_BUG_ON(TRACE_BUF_SIZE >= PAGE_SIZE);
size = sizeof(*entry) + cnt + 2; /* add '\0' and possible '\n' */
meta_size = sizeof(*entry) + 2; /* add '\0' and possible '\n' */
again:
size = cnt + meta_size;
/* If less than "<faulted>", then make sure we can still add that */
if (cnt < FAULTED_SIZE)
size += FAULTED_SIZE - cnt;
if (size > TRACE_SEQ_BUFFER_SIZE) {
cnt -= size - TRACE_SEQ_BUFFER_SIZE;
goto again;
}
buffer = tr->array_buffer.buffer;
event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
tracing_gen_ctx());
if (unlikely(!event))
if (unlikely(!event)) {
/*
* If the size was greater than what was allowed, then
* make it smaller and try again.
*/
if (size > ring_buffer_max_event_size(buffer)) {
/* cnt < FAULTED size should never be bigger than max */
if (WARN_ON_ONCE(cnt < FAULTED_SIZE))
return -EBADF;
cnt = ring_buffer_max_event_size(buffer) - meta_size;
/* The above should only happen once */
if (WARN_ON_ONCE(cnt + meta_size == size))
return -EBADF;
goto again;
}
/* Ring buffer disabled, return as if not open for write */
return -EBADF;
}
entry = ring_buffer_event_data(event);
entry->ip = _THIS_IP_;
@@ -7357,9 +7387,6 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
return written;
}
/* Limit it for now to 3K (including tag) */
#define RAW_DATA_MAX_SIZE (1024*3)
static ssize_t
tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *fpos)
@@ -7381,19 +7408,18 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
return -EINVAL;
/* The marker must at least have a tag id */
if (cnt < sizeof(unsigned int) || cnt > RAW_DATA_MAX_SIZE)
if (cnt < sizeof(unsigned int))
return -EINVAL;
if (cnt > TRACE_BUF_SIZE)
cnt = TRACE_BUF_SIZE;
BUILD_BUG_ON(TRACE_BUF_SIZE >= PAGE_SIZE);
size = sizeof(*entry) + cnt;
if (cnt < FAULT_SIZE_ID)
size += FAULT_SIZE_ID - cnt;
buffer = tr->array_buffer.buffer;
if (size > ring_buffer_max_event_size(buffer))
return -EINVAL;
event = __trace_buffer_lock_reserve(buffer, TRACE_RAW_DATA, size,
tracing_gen_ctx());
if (!event)
@@ -7578,6 +7604,7 @@ struct ftrace_buffer_info {
struct trace_iterator iter;
void *spare;
unsigned int spare_cpu;
unsigned int spare_size;
unsigned int read;
};
@@ -8282,6 +8309,8 @@ tracing_buffers_read(struct file *filp, char __user *ubuf,
{
struct ftrace_buffer_info *info = filp->private_data;
struct trace_iterator *iter = &info->iter;
void *trace_data;
int page_size;
ssize_t ret = 0;
ssize_t size;
@@ -8293,6 +8322,17 @@ tracing_buffers_read(struct file *filp, char __user *ubuf,
return -EBUSY;
#endif
page_size = ring_buffer_subbuf_size_get(iter->array_buffer->buffer);
/* Make sure the spare matches the current sub buffer size */
if (info->spare) {
if (page_size != info->spare_size) {
ring_buffer_free_read_page(iter->array_buffer->buffer,
info->spare_cpu, info->spare);
info->spare = NULL;
}
}
if (!info->spare) {
info->spare = ring_buffer_alloc_read_page(iter->array_buffer->buffer,
iter->cpu_file);
@@ -8301,19 +8341,20 @@ tracing_buffers_read(struct file *filp, char __user *ubuf,
info->spare = NULL;
} else {
info->spare_cpu = iter->cpu_file;
info->spare_size = page_size;
}
}
if (!info->spare)
return ret;
/* Do we have previous read data to read? */
if (info->read < PAGE_SIZE)
if (info->read < page_size)
goto read;
again:
trace_access_lock(iter->cpu_file);
ret = ring_buffer_read_page(iter->array_buffer->buffer,
&info->spare,
info->spare,
count,
iter->cpu_file, 0);
trace_access_unlock(iter->cpu_file);
@@ -8334,11 +8375,11 @@ tracing_buffers_read(struct file *filp, char __user *ubuf,
info->read = 0;
read:
size = PAGE_SIZE - info->read;
size = page_size - info->read;
if (size > count)
size = count;
ret = copy_to_user(ubuf, info->spare + info->read, size);
trace_data = ring_buffer_read_page_data(info->spare);
ret = copy_to_user(ubuf, trace_data + info->read, size);
if (ret == size)
return -EFAULT;
@@ -8449,6 +8490,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
.spd_release = buffer_spd_release,
};
struct buffer_ref *ref;
int page_size;
int entries, i;
ssize_t ret = 0;
@@ -8457,13 +8499,14 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
return -EBUSY;
#endif
if (*ppos & (PAGE_SIZE - 1))
page_size = ring_buffer_subbuf_size_get(iter->array_buffer->buffer);
if (*ppos & (page_size - 1))
return -EINVAL;
if (len & (PAGE_SIZE - 1)) {
if (len < PAGE_SIZE)
if (len & (page_size - 1)) {
if (len < page_size)
return -EINVAL;
len &= PAGE_MASK;
len &= (~(page_size - 1));
}
if (splice_grow_spd(pipe, &spd))
@@ -8473,7 +8516,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
trace_access_lock(iter->cpu_file);
entries = ring_buffer_entries_cpu(iter->array_buffer->buffer, iter->cpu_file);
for (i = 0; i < spd.nr_pages_max && len && entries; i++, len -= PAGE_SIZE) {
for (i = 0; i < spd.nr_pages_max && len && entries; i++, len -= page_size) {
struct page *page;
int r;
@@ -8494,7 +8537,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
}
ref->cpu = iter->cpu_file;
r = ring_buffer_read_page(ref->buffer, &ref->page,
r = ring_buffer_read_page(ref->buffer, ref->page,
len, iter->cpu_file, 1);
if (r < 0) {
ring_buffer_free_read_page(ref->buffer, ref->cpu,
@@ -8503,14 +8546,14 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
break;
}
page = virt_to_page(ref->page);
page = virt_to_page(ring_buffer_read_page_data(ref->page));
spd.pages[i] = page;
spd.partial[i].len = PAGE_SIZE;
spd.partial[i].len = page_size;
spd.partial[i].offset = 0;
spd.partial[i].private = (unsigned long)ref;
spd.nr_pages++;
*ppos += PAGE_SIZE;
*ppos += page_size;
entries = ring_buffer_entries_cpu(iter->array_buffer->buffer, iter->cpu_file);
}
@@ -9354,6 +9397,103 @@ static const struct file_operations buffer_percent_fops = {
.llseek = default_llseek,
};
static ssize_t
buffer_subbuf_size_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
{
struct trace_array *tr = filp->private_data;
size_t size;
char buf[64];
int order;
int r;
order = ring_buffer_subbuf_order_get(tr->array_buffer.buffer);
size = (PAGE_SIZE << order) / 1024;
r = sprintf(buf, "%zd\n", size);
return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
}
static ssize_t
buffer_subbuf_size_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *ppos)
{
struct trace_array *tr = filp->private_data;
unsigned long val;
int old_order;
int order;
int pages;
int ret;
ret = kstrtoul_from_user(ubuf, cnt, 10, &val);
if (ret)
return ret;
val *= 1024; /* value passed in is in KB */
pages = DIV_ROUND_UP(val, PAGE_SIZE);
order = fls(pages - 1);
/* limit between 1 and 128 system pages */
if (order < 0 || order > 7)
return -EINVAL;
/* Do not allow tracing while changing the order of the ring buffer */
tracing_stop_tr(tr);
old_order = ring_buffer_subbuf_order_get(tr->array_buffer.buffer);
if (old_order == order)
goto out;
ret = ring_buffer_subbuf_order_set(tr->array_buffer.buffer, order);
if (ret)
goto out;
#ifdef CONFIG_TRACER_MAX_TRACE
if (!tr->allocated_snapshot)
goto out_max;
ret = ring_buffer_subbuf_order_set(tr->max_buffer.buffer, order);
if (ret) {
/* Put back the old order */
cnt = ring_buffer_subbuf_order_set(tr->array_buffer.buffer, old_order);
if (WARN_ON_ONCE(cnt)) {
/*
* AARGH! We are left with different orders!
* The max buffer is our "snapshot" buffer.
* When a tracer needs a snapshot (one of the
* latency tracers), it swaps the max buffer
* with the saved snap shot. We succeeded to
* update the order of the main buffer, but failed to
* update the order of the max buffer. But when we tried
* to reset the main buffer to the original size, we
* failed there too. This is very unlikely to
* happen, but if it does, warn and kill all
* tracing.
*/
tracing_disabled = 1;
}
goto out;
}
out_max:
#endif
(*ppos)++;
out:
if (ret)
cnt = ret;
tracing_start_tr(tr);
return cnt;
}
static const struct file_operations buffer_subbuf_size_fops = {
.open = tracing_open_generic_tr,
.read = buffer_subbuf_size_read,
.write = buffer_subbuf_size_write,
.release = tracing_release_generic_tr,
.llseek = default_llseek,
};
static struct dentry *trace_instance_dir;
static void
@@ -9504,7 +9644,8 @@ static int trace_array_create_dir(struct trace_array *tr)
return ret;
}
static struct trace_array *trace_array_create(const char *name)
static struct trace_array *
trace_array_create_systems(const char *name, const char *systems)
{
struct trace_array *tr;
int ret;
@@ -9524,6 +9665,12 @@ static struct trace_array *trace_array_create(const char *name)
if (!zalloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
goto out_free_tr;
if (systems) {
tr->system_names = kstrdup_const(systems, GFP_KERNEL);
if (!tr->system_names)
goto out_free_tr;
}
tr->trace_flags = global_trace.trace_flags & ~ZEROED_TRACE_FLAGS;
cpumask_copy(tr->tracing_cpumask, cpu_all_mask);
@@ -9570,12 +9717,18 @@ static struct trace_array *trace_array_create(const char *name)
free_trace_buffers(tr);
free_cpumask_var(tr->pipe_cpumask);
free_cpumask_var(tr->tracing_cpumask);
kfree_const(tr->system_names);
kfree(tr->name);
kfree(tr);
return ERR_PTR(ret);
}
static struct trace_array *trace_array_create(const char *name)
{
return trace_array_create_systems(name, NULL);
}
static int instance_mkdir(const char *name)
{
struct trace_array *tr;
@@ -9601,6 +9754,7 @@ out_unlock:
/**
* trace_array_get_by_name - Create/Lookup a trace array, given its name.
* @name: The name of the trace array to be looked up/created.
* @systems: A list of systems to create event directories for (NULL for all)
*
* Returns pointer to trace array with given name.
* NULL, if it cannot be created.
@@ -9614,7 +9768,7 @@ out_unlock:
* trace_array_put() is called, user space can not delete it.
*
*/
struct trace_array *trace_array_get_by_name(const char *name)
struct trace_array *trace_array_get_by_name(const char *name, const char *systems)
{
struct trace_array *tr;
@@ -9626,7 +9780,7 @@ struct trace_array *trace_array_get_by_name(const char *name)
goto out_unlock;
}
tr = trace_array_create(name);
tr = trace_array_create_systems(name, systems);
if (IS_ERR(tr))
tr = NULL;
@@ -9673,6 +9827,7 @@ static int __remove_instance(struct trace_array *tr)
free_cpumask_var(tr->pipe_cpumask);
free_cpumask_var(tr->tracing_cpumask);
kfree_const(tr->system_names);
kfree(tr->name);
kfree(tr);
@@ -9805,6 +9960,9 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer)
trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer,
tr, &buffer_percent_fops);
trace_create_file("buffer_subbuf_size_kb", TRACE_MODE_WRITE, d_tracer,
tr, &buffer_subbuf_size_fops);
create_trace_options_dir(tr);
#ifdef CONFIG_TRACER_MAX_TRACE
@@ -10391,7 +10549,7 @@ __init static void enable_instances(void)
if (IS_ENABLED(CONFIG_TRACER_MAX_TRACE))
do_allocate_snapshot(tok);
tr = trace_array_get_by_name(tok);
tr = trace_array_get_by_name(tok, NULL);
if (!tr) {
pr_warn("Failed to create instance buffer %s\n", curr_str);
continue;

kernel/trace/trace.h

@@ -377,6 +377,7 @@ struct trace_array {
unsigned char trace_flags_index[TRACE_FLAGS_MAX_SIZE];
unsigned int flags;
raw_spinlock_t start_lock;
const char *system_names;
struct list_head err_log;
struct dentry *dir;
struct dentry *options;
@@ -615,6 +616,7 @@ void tracing_reset_all_online_cpus(void);
void tracing_reset_all_online_cpus_unlocked(void);
int tracing_open_generic(struct inode *inode, struct file *filp);
int tracing_open_generic_tr(struct inode *inode, struct file *filp);
int tracing_release_generic_tr(struct inode *inode, struct file *file);
int tracing_open_file_tr(struct inode *inode, struct file *filp);
int tracing_release_file_tr(struct inode *inode, struct file *filp);
int tracing_single_release_file_tr(struct inode *inode, struct file *filp);

kernel/trace/trace_boot.c

@@ -633,7 +633,7 @@ trace_boot_init_instances(struct xbc_node *node)
if (!p || *p == '\0')
continue;
tr = trace_array_get_by_name(p);
tr = trace_array_get_by_name(p, NULL);
if (!tr) {
pr_err("Failed to get trace instance %s\n", p);
continue;

kernel/trace/trace_events.c

@@ -1893,9 +1893,9 @@ subsystem_filter_write(struct file *filp, const char __user *ubuf, size_t cnt,
}
static ssize_t
show_header(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
show_header_page_file(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
{
int (*func)(struct trace_seq *s) = filp->private_data;
struct trace_array *tr = filp->private_data;
struct trace_seq *s;
int r;
@@ -1908,7 +1908,31 @@ show_header(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
trace_seq_init(s);
func(s);
ring_buffer_print_page_header(tr->array_buffer.buffer, s);
r = simple_read_from_buffer(ubuf, cnt, ppos,
s->buffer, trace_seq_used(s));
kfree(s);
return r;
}
static ssize_t
show_header_event_file(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
{
struct trace_seq *s;
int r;
if (*ppos)
return 0;
s = kmalloc(sizeof(*s), GFP_KERNEL);
if (!s)
return -ENOMEM;
trace_seq_init(s);
ring_buffer_print_entry_header(s);
r = simple_read_from_buffer(ubuf, cnt, ppos,
s->buffer, trace_seq_used(s));
@@ -2165,10 +2189,18 @@ static const struct file_operations ftrace_tr_enable_fops = {
.release = subsystem_release,
};
static const struct file_operations ftrace_show_header_fops = {
.open = tracing_open_generic,
.read = show_header,
static const struct file_operations ftrace_show_header_page_fops = {
.open = tracing_open_generic_tr,
.read = show_header_page_file,
.llseek = default_llseek,
.release = tracing_release_generic_tr,
};
static const struct file_operations ftrace_show_header_event_fops = {
.open = tracing_open_generic_tr,
.read = show_header_event_file,
.llseek = default_llseek,
.release = tracing_release_generic_tr,
};
static int
@@ -2896,6 +2928,27 @@ void trace_event_eval_update(struct trace_eval_map **map, int len)
up_write(&trace_event_sem);
}
static bool event_in_systems(struct trace_event_call *call,
const char *systems)
{
const char *system;
const char *p;
if (!systems)
return true;
system = call->class->system;
p = strstr(systems, system);
if (!p)
return false;
if (p != systems && !isspace(*(p - 1)) && *(p - 1) != ',')
return false;
p += strlen(system);
return !*p || isspace(*p) || *p == ',';
}
static struct trace_event_file *
trace_create_new_event(struct trace_event_call *call,
struct trace_array *tr)
@@ -2905,9 +2958,12 @@ trace_create_new_event(struct trace_event_call *call,
struct trace_event_file *file;
unsigned int first;
if (!event_in_systems(call, tr->system_names))
return NULL;
file = kmem_cache_alloc(file_cachep, GFP_TRACE);
if (!file)
return NULL;
return ERR_PTR(-ENOMEM);
pid_list = rcu_dereference_protected(tr->filtered_pids,
lockdep_is_held(&event_mutex));
@@ -2972,8 +3028,17 @@ __trace_add_new_event(struct trace_event_call *call, struct trace_array *tr)
struct trace_event_file *file;
file = trace_create_new_event(call, tr);
/*
* trace_create_new_event() returns ERR_PTR(-ENOMEM) if failed
* allocation, or NULL if the event is not part of the tr->system_names.
* When the event is not part of the tr->system_names, return zero, not
* an error.
*/
if (!file)
return -ENOMEM;
return 0;
if (IS_ERR(file))
return PTR_ERR(file);
if (eventdir_initialized)
return event_create_dir(tr->event_dir, file);
@@ -3012,8 +3077,17 @@ __trace_early_add_new_event(struct trace_event_call *call,
int ret;
file = trace_create_new_event(call, tr);
/*
* trace_create_new_event() returns ERR_PTR(-ENOMEM) if failed
* allocation, or NULL if the event is not part of the tr->system_names.
* When the event is not part of the tr->system_names, return zero, not
* an error.
*/
if (!file)
return -ENOMEM;
return 0;
if (IS_ERR(file))
return PTR_ERR(file);
ret = event_define_fields(call);
if (ret)
@@ -3752,17 +3826,16 @@ static int events_callback(const char *name, umode_t *mode, void **data,
return 1;
}
if (strcmp(name, "header_page") == 0)
*data = ring_buffer_print_page_header;
if (strcmp(name, "header_page") == 0) {
*mode = TRACE_MODE_READ;
*fops = &ftrace_show_header_page_fops;
else if (strcmp(name, "header_event") == 0)
*data = ring_buffer_print_entry_header;
else
} else if (strcmp(name, "header_event") == 0) {
*mode = TRACE_MODE_READ;
*fops = &ftrace_show_header_event_fops;
} else
return 0;
*mode = TRACE_MODE_READ;
*fops = &ftrace_show_header_fops;
return 1;
}

kernel/trace/trace_events_hist.c

@@ -4805,36 +4805,35 @@ static int parse_actions(struct hist_trigger_data *hist_data)
int len;
for (i = 0; i < hist_data->attrs->n_actions; i++) {
enum handler_id hid = 0;
char *action_str;
str = hist_data->attrs->action_str[i];
if ((len = str_has_prefix(str, "onmatch("))) {
char *action_str = str + len;
if ((len = str_has_prefix(str, "onmatch(")))
hid = HANDLER_ONMATCH;
else if ((len = str_has_prefix(str, "onmax(")))
hid = HANDLER_ONMAX;
else if ((len = str_has_prefix(str, "onchange(")))
hid = HANDLER_ONCHANGE;
action_str = str + len;
switch (hid) {
case HANDLER_ONMATCH:
data = onmatch_parse(tr, action_str);
if (IS_ERR(data)) {
ret = PTR_ERR(data);
break;
}
} else if ((len = str_has_prefix(str, "onmax("))) {
char *action_str = str + len;
break;
case HANDLER_ONMAX:
case HANDLER_ONCHANGE:
data = track_data_parse(hist_data, action_str, hid);
break;
default:
data = ERR_PTR(-EINVAL);
break;
}
data = track_data_parse(hist_data, action_str,
HANDLER_ONMAX);
if (IS_ERR(data)) {
ret = PTR_ERR(data);
break;
}
} else if ((len = str_has_prefix(str, "onchange("))) {
char *action_str = str + len;
data = track_data_parse(hist_data, action_str,
HANDLER_ONCHANGE);
if (IS_ERR(data)) {
ret = PTR_ERR(data);
break;
}
} else {
ret = -EINVAL;
if (IS_ERR(data)) {
ret = PTR_ERR(data);
break;
}

kernel/trace/trace_seq.c

@@ -13,9 +13,6 @@
* trace_seq_init() more than once to reset the trace_seq to start
* from scratch.
*
* The buffer size is currently PAGE_SIZE, although it may become dynamic
* in the future.
*
* A write to the buffer will either succeed or fail. That is, unlike
* sprintf() there will not be a partial write (well it may write into
* the buffer but it wont update the pointers). This allows users to

samples/ftrace/sample-trace-array.c

@@ -105,7 +105,7 @@ static int __init sample_trace_array_init(void)
* NOTE: This function increments the reference counter
* associated with the trace array - "tr".
*/
tr = trace_array_get_by_name("sample-instance");
tr = trace_array_get_by_name("sample-instance", "sched,timer,kprobes");
if (!tr)
return -1;

tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_subbuf_size.tc (new file)

@@ -0,0 +1,95 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: Change the ringbuffer sub-buffer size
# requires: buffer_subbuf_size_kb
# flags: instance
get_buffer_data_size() {
sed -ne 's/^.*data.*size:\([0-9][0-9]*\).*/\1/p' events/header_page
}
get_buffer_data_offset() {
sed -ne 's/^.*data.*offset:\([0-9][0-9]*\).*/\1/p' events/header_page
}
get_event_header_size() {
type_len=`sed -ne 's/^.*type_len.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
time_len=`sed -ne 's/^.*time_delta.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
array_len=`sed -ne 's/^.*array.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
total_bits=$((type_len+time_len+array_len))
total_bits=$((total_bits+7))
echo $((total_bits/8))
}
get_print_event_buf_offset() {
sed -ne 's/^.*buf.*offset:\([0-9][0-9]*\).*/\1/p' events/ftrace/print/format
}
event_header_size=`get_event_header_size`
print_header_size=`get_print_event_buf_offset`
data_offset=`get_buffer_data_offset`
marker_meta=$((event_header_size+print_header_size))
make_str() {
cnt=$1
printf -- 'X%.0s' $(seq $cnt)
}
write_buffer() {
size=$1
str=`make_str $size`
# clear the buffer
echo > trace
# write the string into the marker
echo $str > trace_marker
echo $str
}
test_buffer() {
size_kb=$1
page_size=$((size_kb*1024))
size=`get_buffer_data_size`
# the size must be greater than or equal to page_size - data_offset
page_size=$((page_size-data_offset))
if [ $size -lt $page_size ]; then
exit fail
fi
# Now add a little more the meta data overhead will overflow
str=`write_buffer $size`
# Make sure the line was broken
new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; exit}' trace`
if [ "$new_str" = "$str" ]; then
exit fail;
fi
# Make sure the entire line can be found
new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; }' trace`
if [ "$new_str" != "$str" ]; then
exit fail;
fi
}
ORIG=`cat buffer_subbuf_size_kb`
# Could test bigger sizes than 32K, but then creating the string
# to write into the ring buffer takes too long
for a in 4 8 16 32 ; do
echo $a > buffer_subbuf_size_kb
test_buffer $a
done
echo $ORIG > buffer_subbuf_size_kb

tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc (new file)

@@ -0,0 +1,82 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: Basic tests on writing to trace_marker
# requires: trace_marker
# flags: instance
get_buffer_data_size() {
sed -ne 's/^.*data.*size:\([0-9][0-9]*\).*/\1/p' events/header_page
}
get_buffer_data_offset() {
sed -ne 's/^.*data.*offset:\([0-9][0-9]*\).*/\1/p' events/header_page
}
get_event_header_size() {
type_len=`sed -ne 's/^.*type_len.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
time_len=`sed -ne 's/^.*time_delta.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
array_len=`sed -ne 's/^.*array.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
total_bits=$((type_len+time_len+array_len))
total_bits=$((total_bits+7))
echo $((total_bits/8))
}
get_print_event_buf_offset() {
sed -ne 's/^.*buf.*offset:\([0-9][0-9]*\).*/\1/p' events/ftrace/print/format
}
event_header_size=`get_event_header_size`
print_header_size=`get_print_event_buf_offset`
data_offset=`get_buffer_data_offset`
marker_meta=$((event_header_size+print_header_size))
make_str() {
cnt=$1
# subtract two for \n\0 as marker adds these
cnt=$((cnt-2))
printf -- 'X%.0s' $(seq $cnt)
}
write_buffer() {
size=$1
str=`make_str $size`
# clear the buffer
echo > trace
# write the string into the marker
echo -n $str > trace_marker
echo $str
}
test_buffer() {
size=`get_buffer_data_size`
oneline_size=$((size-marker_meta))
echo size = $size
echo meta size = $marker_meta
# Now add a little more the meta data overhead will overflow
str=`write_buffer $size`
# Make sure the line was broken
new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; exit}' trace`
if [ "$new_str" = "$str" ]; then
exit fail;
fi
# Make sure the entire line can be found
new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; }' trace`
if [ "$new_str" != "$str" ]; then
exit fail;
fi
}
test_buffer