2022-08-07 22:37:47

by Alan Maguire

Subject: [RFC tracing 0/4] tracing: support > 8 byte filter predicates

For cases like IPv6 addresses, having a means to supply tracing
predicates for fields with more than 8 bytes would be convenient.
This series provides a simple way to support this by allowing
simple ==, != memory comparison with the predicate supplied when
the size of the field exceeds 8 bytes. For example, to trace
::1, the predicate

"dst == 0x00000000000000000000000000000001"

...could be used.
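As a rough illustration of how such a predicate string might be produced from a textual address, here is a minimal user-space sketch. It assumes the field holds the address in network byte order, so the hex digits are simply the 16 address bytes in memory order. ipv6_filter_string() is a hypothetical helper for illustration only and is not part of this series.

```c
#include <arpa/inet.h>
#include <stdio.h>

/*
 * Build "<field> == 0x<32 hex digits>" from a textual IPv6 address.
 * Returns 0 on success, -1 on a bad address or a too-small buffer.
 * Hypothetical helper, not part of the series.
 */
static int ipv6_filter_string(const char *field, const char *addr,
			      char *out, size_t outlen)
{
	struct in6_addr in6;
	size_t off;
	int n;

	if (inet_pton(AF_INET6, addr, &in6) != 1)
		return -1;

	n = snprintf(out, outlen, "%s == 0x", field);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	off = (size_t)n;

	/* Two hex digits per byte, in network (memory) order. */
	for (size_t i = 0; i < sizeof(in6.s6_addr); i++) {
		if (outlen - off < 3)
			return -1;
		snprintf(out + off, outlen - off, "%02x", in6.s6_addr[i]);
		off += 2;
	}
	return 0;
}
```

The resulting string can then be written to the event's filter file in the usual way.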

When investigating this initially, I stumbled upon a kernel
crash when specifying a predicate for a non-string field that is
not 1, 2, 4, or 8 bytes in size. Patch 1 fixes it. Patch 2
provides the support for > 8 byte fields via a memcmp()-style
predicate. Patch 3 adds tests for filter predicates, and patch 4
documents the fact that for > 8 bytes, only == and != are
supported.

Alan Maguire (2):
tracing: predicate matching trigger crashes for > 8-byte arrays
tracing: support > 8 byte array filter predicates

Oracle Public Cloud User (2):
selftests/ftrace: add test coverage for filter predicates
tracing: document > 8 byte numeric filtering support

Documentation/trace/events.rst | 9 +++
kernel/trace/trace_events_filter.c | 59 +++++++++++++++++-
.../selftests/ftrace/test.d/event/filter.tc | 62 +++++++++++++++++++
3 files changed, 129 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/ftrace/test.d/event/filter.tc

--
2.31.1


2022-08-07 22:41:48

by Alan Maguire

Subject: [RFC tracing 2/4] tracing: support > 8 byte array filter predicates

For > 8 byte values, allow simple binary '==', '!=' predicates
where the user passes in a hex ASCII representation of the
desired value. This representation must match the field size
exactly, and a simple memory comparison between predicate and
actual values is carried out. This allows predicates on, for
example, IPv6 addresses to be supported, such as filtering
on ::1:

cd /sys/kernel/debug/tracing/events/tcp/tcp_receive_reset
echo "saddr_v6 == 0x00000000000000000000000000000001" > filter
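
The behaviour described above (exact-length hex string, converted to bytes once, then compared with memcmp() and inverted for '!=') can be modelled in user space roughly as follows. This is only a sketch: struct mem_pred, mem_pred_parse() and mem_pred_match() are hypothetical stand-ins for the kernel's struct filter_pred handling, and the negate member plays the role of pred->not. The sketch also checks the allocation result and accepts only hex digits.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for the state the patch keeps in struct filter_pred. */
struct mem_pred {
	uint8_t *val;	/* binary form of the hex string */
	size_t size;	/* field size in bytes */
	int negate;	/* 1 for '!=', 0 for '==' */
};

static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)tolower((unsigned char)c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/* Parse "0x..." into pred; the digit count must be exactly 2 * size. */
static int mem_pred_parse(struct mem_pred *pred, const char *str,
			  size_t size, int is_ne)
{
	if (str[0] != '0' || tolower((unsigned char)str[1]) != 'x')
		return -1;
	str += 2;
	if (strlen(str) != size * 2)
		return -1;

	pred->val = calloc(1, size);
	if (!pred->val)
		return -1;

	for (size_t i = 0; i < size; i++) {
		int hi = hex_nibble(str[2 * i]);
		int lo = hex_nibble(str[2 * i + 1]);

		if (hi < 0 || lo < 0) {
			free(pred->val);
			pred->val = NULL;
			return -1;
		}
		pred->val[i] = (uint8_t)(hi << 4 | lo);
	}
	pred->size = size;
	pred->negate = is_ne;
	return 0;
}

/* Same shape as filter_pred_memcmp(): equality result XOR the negate flag. */
static int mem_pred_match(const struct mem_pred *pred, const void *field)
{
	return (memcmp(field, pred->val, pred->size) == 0) ^ pred->negate;
}
```

With a 16-byte field, "0x0100007f" is rejected at parse time because it supplies 8 hex digits rather than the required 32.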

Signed-off-by: Alan Maguire <[email protected]>
---
kernel/trace/trace_events_filter.c | 54 +++++++++++++++++++++++++++++-
1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 65e01c8d48d9..31c900b6a83c 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -147,6 +147,8 @@ enum {
PROCESS_OR = 4,
};

+static int filter_pred_memcmp(struct filter_pred *pred, void *event);
+
/*
* Without going into a formal proof, this explains the method that is used in
* parsing the logical expressions.
@@ -583,8 +585,11 @@ predicate_parse(const char *str, int nr_parens, int nr_preds,
kfree(op_stack);
kfree(inverts);
if (prog_stack) {
- for (i = 0; prog_stack[i].pred; i++)
+ for (i = 0; prog_stack[i].pred; i++) {
+ if (prog_stack[i].pred->fn == filter_pred_memcmp)
+ kfree((u8 *)prog_stack[i].pred->val);
kfree(prog_stack[i].pred);
+ }
kfree(prog_stack);
}
return ERR_PTR(ret);
@@ -841,6 +846,14 @@ static int filter_pred_none(struct filter_pred *pred, void *event)
return 0;
}

+static int filter_pred_memcmp(struct filter_pred *pred, void *event)
+{
+ u8 *mem = (u8 *)(event + pred->offset);
+ u8 *cmp = (u8 *)(pred->val);
+
+ return (memcmp(mem, cmp, pred->field->size) == 0) ^ pred->not;
+}
+
/*
* regex_match_foo - Basic regex callbacks
*
@@ -1443,6 +1456,45 @@ static int parse_pred(const char *str, void *data,
/* go past the last quote */
i++;

+ } else if (str[i] == '0' && tolower(str[i + 1]) == 'x' &&
+ field->size > 8) {
+ u8 *pred_val;
+
+ /* For sizes > 8 bytes, we store a binary representation
+ * for comparison; only '==' and '!=' are supported.
+ * To keep things simple, the predicate value must specify
+ * a value that matches the field size exactly, with leading
+ * 0s if necessary.
+ */
+ if (pred->op != OP_EQ && pred->op != OP_NE) {
+ parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
+ goto err_free;
+ }
+
+ /* skip required 0x */
+ s += 2;
+ i += 2;
+
+ while (isalnum(str[i]))
+ i++;
+
+ len = i - s;
+ if (len != (field->size * 2)) {
+ parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + s);
+ goto err_free;
+ }
+
+ pred_val = kzalloc(field->size, GFP_KERNEL);
+ if (hex2bin(pred_val, str + s, field->size)) {
+ parse_error(pe, FILT_ERR_ILLEGAL_INTVAL, pos + s);
+ kfree(pred_val);
+ goto err_free;
+ }
+ pred->val = (u64)pred_val;
+ pred->fn = filter_pred_memcmp;
+ if (pred->op == OP_NE)
+ pred->not = 1;
+
} else if (isdigit(str[i]) || str[i] == '-') {

/* Make sure the field is not a string */
--
2.31.1

2022-08-07 22:42:29

by Alan Maguire

Subject: [RFC tracing 3/4] selftests/ftrace: add test coverage for filter predicates

Add tests verifying that filter predicates work for 1/2/4/8/16 byte values
and strings; use predicates at both event and subsystem level.

Signed-off-by: Alan Maguire <[email protected]>
---
.../selftests/ftrace/test.d/event/filter.tc | 62 +++++++++++++++++++
1 file changed, 62 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/event/filter.tc

diff --git a/tools/testing/selftests/ftrace/test.d/event/filter.tc b/tools/testing/selftests/ftrace/test.d/event/filter.tc
new file mode 100644
index 000000000000..396383519f84
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/event/filter.tc
@@ -0,0 +1,62 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: event tracing - enable filter predicates
+# requires: set_event events/sched
+# flags:
+
+do_reset() {
+ echo 0 > ${event}/enable
+ echo 0 > ${event}/filter
+ clear_trace
+}
+
+fail() { #msg
+ echo $1
+ exit_fail
+}
+
+# verify filter predicates at trace event/subsys level for
+# - string (prev_comm)
+# - 1-byte value (common_flags)
+# - 2-byte value (common_type)
+# - 4-byte value (next_pid)
+# - 8-byte value (prev_state)
+
+for event in events/sched/sched_switch events/sched
+do
+ for filter in "prev_comm == 'ping'" \
+ "common_flags != 0" \
+ "common_type >= 0" \
+ "next_pid > 0" \
+ "prev_state != 0"
+ do
+ echo "$filter" > ${event}/filter
+ echo 1 > ${event}/enable
+ yield
+ count=`grep sched_switch trace|wc -l`
+ if [ $count -lt 1 ]; then
+ fail "at least one $event should be recorded for '$filter'"
+ fi
+ do_reset
+ done
+done
+
+# verify '==', '!=' filter predicates for 16-byte array at event/subsys
+# level
+
+LOCALHOST="-6 ::1"
+for event in events/fib6/fib6_table_lookup events/fib6 ; do
+ for filter in "dst == 0x00000000000000000000000000000001" \
+ "src != 0x00000000000000000000000000000001"
+ do
+ echo "$filter" > ${event}/filter
+ echo 1 > ${event}/enable
+ yield
+ count=`grep fib6_table_lookup trace|wc -l`
+ if [ $count -lt 1 ]; then
+ fail "at least one $event should be recorded for '$filter'"
+ fi
+ do_reset
+ done
+done
+exit 0
--
2.31.1

2022-08-07 22:43:33

by Alan Maguire

Subject: [RFC tracing 1/4] tracing: predicate matching trigger crashes for > 8-byte arrays

The following (wrong) use of tracepoint filtering was enough to trigger
a null-pointer dereference crash:

cd /sys/kernel/debug/tracing
echo "saddr_v6 == 0x0100007f" > tcp/tcp_receive_reset/filter
echo 1 > tcp/tcp_receive_reset/enable
wget https://localhost

This works fine if saddr - a 4-byte array representing the source address -
is used instead.

The fix is to handle the case where we encounter an unexpected field size.
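
The failure mode can be modelled in user space as below. The sketch assumes that the comparison-function selector returns NULL for any size it does not know about, which is what the added !pred->fn check suggests. cmp_fn, select_eq_fn() and setup_pred() are hypothetical names used only for illustration, and only the '==' case is modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int (*cmp_fn)(const void *field, uint64_t val);

static int eq_u8(const void *f, uint64_t v)
{
	uint8_t x;

	memcpy(&x, f, sizeof(x));
	return x == (uint8_t)v;
}

static int eq_u16(const void *f, uint64_t v)
{
	uint16_t x;

	memcpy(&x, f, sizeof(x));
	return x == (uint16_t)v;
}

static int eq_u32(const void *f, uint64_t v)
{
	uint32_t x;

	memcpy(&x, f, sizeof(x));
	return x == (uint32_t)v;
}

static int eq_u64(const void *f, uint64_t v)
{
	uint64_t x;

	memcpy(&x, f, sizeof(x));
	return x == v;
}

/* Returns NULL for any size other than 1, 2, 4 or 8 bytes. */
static cmp_fn select_eq_fn(size_t field_size)
{
	switch (field_size) {
	case 1: return eq_u8;
	case 2: return eq_u16;
	case 4: return eq_u32;
	case 8: return eq_u64;
	default: return NULL;
	}
}

/* Reject the predicate at setup time instead of storing a NULL callback. */
static int setup_pred(cmp_fn *fn, size_t field_size)
{
	*fn = select_eq_fn(field_size);
	if (!*fn)
		return -1;	/* the patch reports FILT_ERR_ILLEGAL_FIELD_OP here */
	return 0;
}
```

Without the check in setup_pred(), a 16-byte field would leave a NULL callback in place, and the first matching event would call through it.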

Signed-off-by: Alan Maguire <[email protected]>
---
kernel/trace/trace_events_filter.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 4b1057ab9d96..65e01c8d48d9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1490,6 +1490,11 @@ static int parse_pred(const char *str, void *data,
else {
pred->fn = select_comparison_fn(pred->op, field->size,
field->is_signed);
+ if (!pred->fn) {
+ parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP,
+ pos + i);
+ goto err_free;
+ }
if (pred->op == OP_NE)
pred->not = 1;
}
--
2.31.1

2022-08-07 22:44:24

by Alan Maguire

Subject: [RFC tracing 4/4] tracing: document > 8 byte numeric filtering support

For values > 8 bytes in size, only == and != filter predicates are
supported; document this.

Signed-off-by: Alan Maguire <[email protected]>
---
Documentation/trace/events.rst | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index c47f381d0c00..318dba2fe3ee 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -186,6 +186,15 @@ The operators available for numeric fields are:

==, !=, <, <=, >, >=, &

+For numeric fields larger than 8 bytes, only
+
+==, !=
+
+...are allowed, and values for comparison must match field size exactly.
+For example, to match the "::1" IPv6 address:
+
+"dst == 0x00000000000000000000000000000001"
+
And for string fields they are:

==, !=, ~
--
2.31.1

2022-08-07 23:02:03

by Steven Rostedt

Subject: Re: [RFC tracing 1/4] tracing: predicate matching trigger crashes for > 8-byte arrays

On Sun, 7 Aug 2022 23:21:20 +0100
Alan Maguire <[email protected]> wrote:

> The following (wrong) use of tracepoint filtering was enough to trigger
> a null-pointer dereference crash:
>
> cd /sys/kernel/debug/tracing
> echo "saddr_v6 == 0x0100007f" > tcp/tcp_receive_reset/filter
> echo 1 > tcp/tcp_receive_reset/enable
> wget https://localhost
>
> This works fine if saddr - a 4-byte array representing the source address -
> is used instead.
>

The patch series is a new feature so it would need to go into the next
merge window. But this patch looks to be a bug fix, so I'll pull this
one in separately, and tag it for stable.

Thanks,

-- Steve


> Fix is to handle case where we encounter an unexpected size.
>
> Signed-off-by: Alan Maguire <[email protected]>
> ---
> kernel/trace/trace_events_filter.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index 4b1057ab9d96..65e01c8d48d9 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -1490,6 +1490,11 @@ static int parse_pred(const char *str, void *data,
> else {
> pred->fn = select_comparison_fn(pred->op, field->size,
> field->is_signed);
> + if (!pred->fn) {
> + parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP,
> + pos + i);
> + goto err_free;
> + }
> if (pred->op == OP_NE)
> pred->not = 1;
> }