2022-01-07 04:50:07

by Pingfan Liu

[permalink] [raw]
Subject: [PATCH 0/3] tracing/filter: make filter_pred_pchar() survive the access to user space

When
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable

The kernel will run into a #PF (see [3/3].
This series resorts to copy_from_user() to cope with the access to user
space.

Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
To: [email protected]


Pingfan Liu (3):
tracing/filter: degrade addr in filter_pred_string() from double
pointer to pointer
tracing/filter: harden the prototype of predicate_parse()
tracing/filter: make filter_pred_pchar() survive the access to user
space

kernel/trace/trace.h | 1 +
kernel/trace/trace_events_filter.c | 38 ++++++++++++++++++++++++------
2 files changed, 32 insertions(+), 7 deletions(-)

--
2.31.1



2022-01-07 04:50:14

by Pingfan Liu

[permalink] [raw]
Subject: [PATCH 1/3] tracing/filter: degrade addr in filter_pred_string() from double pointer to pointer

Since FILTER_PTR_STRING has the type of "char *", it is meaningless to
convert it to "char **". Hence degrading addr from double pointer to
single.

Signed-off-by: Pingfan Liu <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
To: [email protected]
---
kernel/trace/trace_events_filter.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index c9124038b140..264456e1698f 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -670,11 +670,11 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
/* Filter predicate for char * pointers */
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
- char **addr = (char **)(event + pred->offset);
+ char *addr = (char *)(event + pred->offset);
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len = strlen(addr) + 1; /* including tailing '\0' */

- cmp = pred->regex.match(*addr, &pred->regex, len);
+ cmp = pred->regex.match(addr, &pred->regex, len);

match = cmp ^ pred->not;

--
2.31.1


2022-01-07 04:50:14

by Pingfan Liu

[permalink] [raw]
Subject: [PATCH 2/3] tracing/filter: harden the prototype of predicate_parse()

Since the next patch badly relies on the struct 'trace_event_call' to
pass in 'event_call_class info', making the involved functions' prototype
stricter.

There is no functional change in this patch.

Signed-off-by: Pingfan Liu <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
To: [email protected]
---
kernel/trace/trace_events_filter.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 264456e1698f..2a05315127f9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -136,7 +136,8 @@ static void parse_error(struct filter_parse_error *pe, int err, int pos)
pe->lasterr_pos = pos;
}

-typedef int (*parse_pred_fn)(const char *str, void *data, int pos,
+typedef int (*parse_pred_fn)(const char *str, struct trace_event_call *data,
+ int pos,
struct filter_parse_error *pe,
struct filter_pred **pred);

@@ -408,7 +409,7 @@ enum {
*/
static struct prog_entry *
predicate_parse(const char *str, int nr_parens, int nr_preds,
- parse_pred_fn parse_pred, void *data,
+ parse_pred_fn parse_pred, struct trace_event_call *data,
struct filter_parse_error *pe)
{
struct prog_entry *prog_stack;
@@ -1149,7 +1150,7 @@ static filter_pred_fn_t select_comparison_fn(enum filter_op_ids op,
}

/* Called when a predicate is encountered by predicate_parse() */
-static int parse_pred(const char *str, void *data,
+static int parse_pred(const char *str, struct trace_event_call *data,
int pos, struct filter_parse_error *pe,
struct filter_pred **pred_ptr)
{
--
2.31.1


2022-01-07 04:50:18

by Pingfan Liu

[permalink] [raw]
Subject: [PATCH 3/3] tracing/filter: make filter_pred_pchar() survive the access to user space

When
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable

Then the following #PF is observed:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
[72198.027625] #PF: supervisor read access in kernel mode
[72198.028627] #PF: error_code(0x0001) - permissions violation
[72198.029708] PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
[72198.031588] Oops: 0001 [#1] PREEMPT SMP PTI
[72198.032410] CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
[72198.034021] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[72198.035190] RIP: 0010:strlen+0x0/0x20
[72198.035914] Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
[72198.039576] RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
[72198.040593] RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
[72198.041991] RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
[72198.043419] RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
[72198.044800] R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
[72198.046185] R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
[72198.047610] FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
[72198.049206] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[72198.050332] CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
[72198.051760] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[72198.053168] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[72198.054550] PKRU: 55555554
[72198.055114] Call Trace:
[72198.055616] filter_pred_pchar+0x18/0x40
[72198.056421] filter_match_preds+0x31/0x70
[72198.057210] ftrace_syscall_enter+0x27a/0x2c0
[72198.058088] syscall_trace_enter.constprop.0+0x1aa/0x1d0
[72198.059163] do_syscall_64+0x16/0x90
[72198.059898] entry_SYSCALL_64_after_hwframe+0x44/0xae
[72198.060904] RIP: 0033:0x7fa861d88664

Apparently, it is caused by supervisor read access in kernel mode.

To tackle this issue caused by event_class_syscall_enter, using the pair
of user_access_{begin/end}() may be an efficient method, but it means to
stir up _ASM_EXTABLE. Hence this patch picks up the road of
copy_from_user(). This is achieved by introducing a field 'uaccess' in
ftrace_event_field, and run regex.match on the copied buffer.

Signed-off-by: Pingfan Liu <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
To: [email protected]
---
kernel/trace/trace.h | 1 +
kernel/trace/trace_events_filter.c | 29 ++++++++++++++++++++++++++---
2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 38715aa6cfdf..81a263a060e8 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1281,6 +1281,7 @@ struct ftrace_event_field {
int offset;
int size;
int is_signed;
+ int uaccess;
};

struct prog_entry;
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 2a05315127f9..9af268b98c61 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -10,6 +10,7 @@
#include <linux/mutex.h>
#include <linux/perf_event.h>
#include <linux/slab.h>
+#include <linux/syscalls.h>

#include "trace.h"
#include "trace_output.h"
@@ -672,12 +673,30 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char *addr = (char *)(event + pred->offset);
+ char *udata, *cmp_buff;
int cmp, match;
- int len = strlen(addr) + 1; /* including tailing '\0' */
+ int len, poffset;
+
+ if (unlikely(pred->field->uaccess)) {
+ udata = kmalloc(PAGE_SIZE, GFP_KERNEL);
+ if (!udata)
+ return -ENOMEM;
+ poffset = (ulong)addr & (PAGE_SIZE - 1);
+ cmp_buff = udata + poffset;
+ if (copy_from_user(cmp_buff, addr, PAGE_SIZE - poffset)) {
+ kfree(udata);
+ return -EFAULT;
+ }
+ } else {
+ cmp_buff = addr;
+ }
+ len = strlen(cmp_buff) + 1; /* including tailing '\0' */

- cmp = pred->regex.match(addr, &pred->regex, len);
+ cmp = pred->regex.match(cmp_buff, &pred->regex, len);

match = cmp ^ pred->not;
+ if (unlikely(pred->field->uaccess))
+ kfree(udata);

return match;
}
@@ -1220,6 +1239,7 @@ static int parse_pred(const char *str, struct trace_event_call *data,
return -ENOMEM;

pred->field = field;
+ field->uaccess = 0;
pred->offset = field->offset;
pred->op = op;

@@ -1321,8 +1341,11 @@ static int parse_pred(const char *str, struct trace_event_call *data,

} else if (field->filter_type == FILTER_DYN_STRING)
pred->fn = filter_pred_strloc;
- else
+ else {
pred->fn = filter_pred_pchar;
+ if (data->class == &event_class_syscall_enter)
+ pred->field->uaccess = 1;
+ }
/* go past the last quote */
i++;

--
2.31.1


2022-01-07 17:18:47

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 1/3] tracing/filter: degrade addr in filter_pred_string() from double pointer to pointer

On Fri, 7 Jan 2022 12:49:49 +0800
Pingfan Liu <[email protected]> wrote:

> Since FILTER_PTR_STRING has the type of "char *", it is meaningless to
> convert it to "char **". Hence degrading addr from double pointer to
> single.
>
> Signed-off-by: Pingfan Liu <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> To: [email protected]
> ---
> kernel/trace/trace_events_filter.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index c9124038b140..264456e1698f 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -670,11 +670,11 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
> /* Filter predicate for char * pointers */
> static int filter_pred_pchar(struct filter_pred *pred, void *event)
> {
> - char **addr = (char **)(event + pred->offset);
> + char *addr = (char *)(event + pred->offset);

This doesn't look right. The address of the pointer should be in the event.
"event" is an address to the content of the event in the kernel ring buffer.

event + pred->offset

Is then the address of position of the event.

Let's say we have an event record at 0xffff8000, and the pred->offset is at
0x10. And the pointer to the string (in user space) is at 0x70008000.

0xffff8000: <heade>
0xffff8010: 0x70008000

0x70008000: "user space sting"

event + pred->offset gives us 0xffff8010

If we now have that as char *addr, then addr is 0xffff8010


> int cmp, match;
> - int len = strlen(*addr) + 1; /* including tailing '\0' */
> + int len = strlen(addr) + 1; /* including tailing '\0' */

This would give us the addr = 0xffff8010, which is not where the string
exists.

How would this work?

-- Steve


>
> - cmp = pred->regex.match(*addr, &pred->regex, len);
> + cmp = pred->regex.match(addr, &pred->regex, len);
>
> match = cmp ^ pred->not;
>


2022-01-07 22:25:37

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 3/3] tracing/filter: make filter_pred_pchar() survive the access to user space

On Fri, 7 Jan 2022 12:49:51 +0800
Pingfan Liu <[email protected]> wrote:

> When
> echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
> echo 1 > events/syscalls/sys_enter_at/enable

So this is definitely a bug. Thanks for reporting this.

>
> Then the following #PF is observed:
> kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
> [72198.027625] #PF: supervisor read access in kernel mode
> [72198.028627] #PF: error_code(0x0001) - permissions violation
> [72198.029708] PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
> [72198.031588] Oops: 0001 [#1] PREEMPT SMP PTI
> [72198.032410] CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
> [72198.034021] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [72198.035190] RIP: 0010:strlen+0x0/0x20
> [72198.035914] Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
> [72198.039576] RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
> [72198.040593] RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
> [72198.041991] RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
> [72198.043419] RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
> [72198.044800] R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
> [72198.046185] R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
> [72198.047610] FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
> [72198.049206] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [72198.050332] CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
> [72198.051760] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [72198.053168] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [72198.054550] PKRU: 55555554
> [72198.055114] Call Trace:
> [72198.055616] filter_pred_pchar+0x18/0x40
> [72198.056421] filter_match_preds+0x31/0x70
> [72198.057210] ftrace_syscall_enter+0x27a/0x2c0
> [72198.058088] syscall_trace_enter.constprop.0+0x1aa/0x1d0
> [72198.059163] do_syscall_64+0x16/0x90
> [72198.059898] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [72198.060904] RIP: 0033:0x7fa861d88664
>
> Apparently, it is caused by supervisor read access in kernel mode.
>
> To tackle this issue caused by event_class_syscall_enter, using the pair
> of user_access_{begin/end}() may be an efficient method, but it means to
> stir up _ASM_EXTABLE. Hence this patch picks up the road of
> copy_from_user(). This is achieved by introducing a field 'uaccess' in
> ftrace_event_field, and run regex.match on the copied buffer.
>
> Signed-off-by: Pingfan Liu <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> To: [email protected]
> ---
> kernel/trace/trace.h | 1 +
> kernel/trace/trace_events_filter.c | 29 ++++++++++++++++++++++++++---
> 2 files changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 38715aa6cfdf..81a263a060e8 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -1281,6 +1281,7 @@ struct ftrace_event_field {
> int offset;
> int size;
> int is_signed;
> + int uaccess;
> };
>
> struct prog_entry;
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index 2a05315127f9..9af268b98c61 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -10,6 +10,7 @@
> #include <linux/mutex.h>
> #include <linux/perf_event.h>
> #include <linux/slab.h>
> +#include <linux/syscalls.h>
>
> #include "trace.h"
> #include "trace_output.h"
> @@ -672,12 +673,30 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
> static int filter_pred_pchar(struct filter_pred *pred, void *event)
> {
> char *addr = (char *)(event + pred->offset);
> + char *udata, *cmp_buff;
> int cmp, match;
> - int len = strlen(addr) + 1; /* including tailing '\0' */
> + int len, poffset;
> +
> + if (unlikely(pred->field->uaccess)) {
> + udata = kmalloc(PAGE_SIZE, GFP_KERNEL);

This is called within a tracepoint, thus you can not call kmalloc.

> + if (!udata)
> + return -ENOMEM;
> + poffset = (ulong)addr & (PAGE_SIZE - 1);
> + cmp_buff = udata + poffset;
> + if (copy_from_user(cmp_buff, addr, PAGE_SIZE - poffset)) {
> + kfree(udata);
> + return -EFAULT;
> + }
> + } else {
> + cmp_buff = addr;
> + }
> + len = strlen(cmp_buff) + 1; /* including tailing '\0' */
>
> - cmp = pred->regex.match(addr, &pred->regex, len);
> + cmp = pred->regex.match(cmp_buff, &pred->regex, len);
>
> match = cmp ^ pred->not;
> + if (unlikely(pred->field->uaccess))
> + kfree(udata);
>
> return match;
> }
> @@ -1220,6 +1239,7 @@ static int parse_pred(const char *str, struct trace_event_call *data,
> return -ENOMEM;
>
> pred->field = field;
> + field->uaccess = 0;
> pred->offset = field->offset;
> pred->op = op;
>
> @@ -1321,8 +1341,11 @@ static int parse_pred(const char *str, struct trace_event_call *data,
>
> } else if (field->filter_type == FILTER_DYN_STRING)
> pred->fn = filter_pred_strloc;
> - else
> + else {
> pred->fn = filter_pred_pchar;
> + if (data->class == &event_class_syscall_enter)
> + pred->field->uaccess = 1;
> + }
> /* go past the last quote */
> i++;
>

This is not the right way to fix it. Simply check if the string is user
space and copy it into a temp per cpu buffer. In fact, kernel space strings
should be tested as well.

I wrote this up. Can you make sure it works for you?

Thanks,

-- Steve

From: Steven Rostedt <[email protected]>
Subject: [PATCH] tracing: Add test for user space strings when filtering on
string pointers

Pingfan reported that the following causes a fault:

echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable

The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. If the string is not loaded into memory yet, it will trigger a
fault in kernel space:

kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664

To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.

Link: https://lore.kernel.org/all/[email protected]/

Cc: [email protected]
Reported-by: Pingfan Liu <[email protected]>
Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <[email protected]>
---
kernel/trace/trace_events_filter.c | 79 +++++++++++++++++++++++++++++-
1 file changed, 77 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 996920ed1812..cf0fa9a785c7 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -5,6 +5,7 @@
* Copyright (C) 2009 Tom Zanussi <[email protected]>
*/

+#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -654,12 +655,50 @@ DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
DEFINE_EQUALITY_PRED(8);

+/* user space strings temp buffer */
+#define USTRING_BUF_SIZE 512
+
+struct ustring_buffer {
+ char buffer[USTRING_BUF_SIZE];
+};
+
+static __percpu struct ustring_buffer *ustring_per_cpu;
+
+static __always_inline char *test_string(char *str)
+{
+ struct ustring_buffer *ubuf;
+ char __user *ustr;
+ char *kstr;
+
+ if (!ustring_per_cpu)
+ return NULL;
+
+ ubuf = this_cpu_ptr(ustring_per_cpu);
+ kstr = ubuf->buffer;
+
+ if (likely((unsigned long)str >= TASK_SIZE)) {
+ /* For safety, do not trust the string pointer */
+ if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
+ return NULL;
+ } else {
+ /* user space address? */
+ ustr = str;
+ if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
+ return NULL;
+ }
+ return kstr;
+}
+
/* Filter predicate for fixed sized arrays of characters */
static int filter_pred_string(struct filter_pred *pred, void *event)
{
char *addr = (char *)(event + pred->offset);
int cmp, match;

+ addr = test_string(addr);
+ if (!addr)
+ return 0;
+
cmp = pred->regex.match(addr, &pred->regex, pred->regex.field_len);

match = cmp ^ pred->not;
@@ -671,10 +710,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char **addr = (char **)(event + pred->offset);
+ char *str;
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len;
+
+ str = test_string(*addr);
+ if (!str)
+ return 0;

- cmp = pred->regex.match(*addr, &pred->regex, len);
+ len = strlen(str) + 1; /* including tailing '\0' */
+ cmp = pred->regex.match(str, &pred->regex, len);

match = cmp ^ pred->not;

@@ -784,6 +829,10 @@ static int filter_pred_none(struct filter_pred *pred, void *event)

static int regex_match_full(char *str, struct regex *r, int len)
{
+ str = test_string(str);
+ if (!str)
+ return 0;
+
/* len of zero means str is dynamic and ends with '\0' */
if (!len)
return strcmp(str, r->pattern) == 0;
@@ -793,6 +842,10 @@ static int regex_match_full(char *str, struct regex *r, int len)

static int regex_match_front(char *str, struct regex *r, int len)
{
+ str = test_string(str);
+ if (!str)
+ return 0;
+
if (len && len < r->len)
return 0;

@@ -801,6 +854,10 @@ static int regex_match_front(char *str, struct regex *r, int len)

static int regex_match_middle(char *str, struct regex *r, int len)
{
+ str = test_string(str);
+ if (!str)
+ return 0;
+
if (!len)
return strstr(str, r->pattern) != NULL;

@@ -811,6 +868,10 @@ static int regex_match_end(char *str, struct regex *r, int len)
{
int strlen = len - 1;

+ str = test_string(str);
+ if (!str)
+ return 0;
+
if (strlen >= r->len &&
memcmp(str + strlen - r->len, r->pattern, r->len) == 0)
return 1;
@@ -819,6 +880,10 @@ static int regex_match_end(char *str, struct regex *r, int len)

static int regex_match_glob(char *str, struct regex *r, int len __maybe_unused)
{
+ str = test_string(str);
+ if (!str)
+ return 0;
+
if (glob_match(r->pattern, str))
return 1;
return 0;
@@ -1335,6 +1400,13 @@ static int parse_pred(const char *str, void *data,
strncpy(pred->regex.pattern, str + s, len);
pred->regex.pattern[len] = 0;

+ if (!ustring_per_cpu) {
+ /* Once allocated, keep it around for good */
+ ustring_per_cpu = alloc_percpu(struct ustring_buffer);
+ if (!ustring_per_cpu)
+ goto err_mem;
+ }
+
filter_build_regex(pred);

if (field->filter_type == FILTER_COMM) {
@@ -1415,6 +1487,9 @@ static int parse_pred(const char *str, void *data,
err_free:
kfree(pred);
return -EINVAL;
+err_mem:
+ kfree(pred);
+ return -ENOMEM;
}

enum {
--
2.33.0




2022-01-08 06:49:51

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 3/3] tracing/filter: make filter_pred_pchar() survive the access to user space

Hi Pingfan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linux/master]
[also build test ERROR on linus/master v5.16-rc8]
[cannot apply to rostedt-trace/for-next next-20220107]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Pingfan-Liu/tracing-filter-make-filter_pred_pchar-survive-the-access-to-user-space/20220107-125117
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 136057256686de39cc3a07c2e39ef6bc43003ff6
config: arc-randconfig-r016-20220107 (https://download.01.org/0day-ci/archive/20220108/[email protected]/config)
compiler: arceb-elf-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/61177cfa9fa8f64bc8036d7b18c540b7b78306b0
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Pingfan-Liu/tracing-filter-make-filter_pred_pchar-survive-the-access-to-user-space/20220107-125117
git checkout 61177cfa9fa8f64bc8036d7b18c540b7b78306b0
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=arc SHELL=/bin/bash kernel/trace/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

kernel/trace/trace_events_filter.c: In function 'parse_pred':
>> kernel/trace/trace_events_filter.c:1346:45: error: 'event_class_syscall_enter' undeclared (first use in this function)
1346 | if (data->class == &event_class_syscall_enter)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
kernel/trace/trace_events_filter.c:1346:45: note: each undeclared identifier is reported only once for each function it appears in
In file included from include/linux/perf_event.h:25,
from kernel/trace/trace_events_filter.c:11:
At top level:
arch/arc/include/asm/perf_event.h:126:27: warning: 'arc_pmu_cache_map' defined but not used [-Wunused-const-variable=]
126 | static const unsigned int arc_pmu_cache_map[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
| ^~~~~~~~~~~~~~~~~
arch/arc/include/asm/perf_event.h:91:27: warning: 'arc_pmu_ev_hw_map' defined but not used [-Wunused-const-variable=]
91 | static const char * const arc_pmu_ev_hw_map[] = {
| ^~~~~~~~~~~~~~~~~


vim +/event_class_syscall_enter +1346 kernel/trace/trace_events_filter.c

1170
1171 /* Called when a predicate is encountered by predicate_parse() */
1172 static int parse_pred(const char *str, struct trace_event_call *data,
1173 int pos, struct filter_parse_error *pe,
1174 struct filter_pred **pred_ptr)
1175 {
1176 struct trace_event_call *call = data;
1177 struct ftrace_event_field *field;
1178 struct filter_pred *pred = NULL;
1179 char num_buf[24]; /* Big enough to hold an address */
1180 char *field_name;
1181 char q;
1182 u64 val;
1183 int len;
1184 int ret;
1185 int op;
1186 int s;
1187 int i = 0;
1188
1189 /* First find the field to associate to */
1190 while (isspace(str[i]))
1191 i++;
1192 s = i;
1193
1194 while (isalnum(str[i]) || str[i] == '_')
1195 i++;
1196
1197 len = i - s;
1198
1199 if (!len)
1200 return -1;
1201
1202 field_name = kmemdup_nul(str + s, len, GFP_KERNEL);
1203 if (!field_name)
1204 return -ENOMEM;
1205
1206 /* Make sure that the field exists */
1207
1208 field = trace_find_event_field(call, field_name);
1209 kfree(field_name);
1210 if (!field) {
1211 parse_error(pe, FILT_ERR_FIELD_NOT_FOUND, pos + i);
1212 return -EINVAL;
1213 }
1214
1215 while (isspace(str[i]))
1216 i++;
1217
1218 /* Make sure this op is supported */
1219 for (op = 0; ops[op]; op++) {
1220 /* This is why '<=' must come before '<' in ops[] */
1221 if (strncmp(str + i, ops[op], strlen(ops[op])) == 0)
1222 break;
1223 }
1224
1225 if (!ops[op]) {
1226 parse_error(pe, FILT_ERR_INVALID_OP, pos + i);
1227 goto err_free;
1228 }
1229
1230 i += strlen(ops[op]);
1231
1232 while (isspace(str[i]))
1233 i++;
1234
1235 s = i;
1236
1237 pred = kzalloc(sizeof(*pred), GFP_KERNEL);
1238 if (!pred)
1239 return -ENOMEM;
1240
1241 pred->field = field;
1242 field->uaccess = 0;
1243 pred->offset = field->offset;
1244 pred->op = op;
1245
1246 if (ftrace_event_is_function(call)) {
1247 /*
1248 * Perf does things different with function events.
1249 * It only allows an "ip" field, and expects a string.
1250 * But the string does not need to be surrounded by quotes.
1251 * If it is a string, the assigned function as a nop,
1252 * (perf doesn't use it) and grab everything.
1253 */
1254 if (strcmp(field->name, "ip") != 0) {
1255 parse_error(pe, FILT_ERR_IP_FIELD_ONLY, pos + i);
1256 goto err_free;
1257 }
1258 pred->fn = filter_pred_none;
1259
1260 /*
1261 * Quotes are not required, but if they exist then we need
1262 * to read them till we hit a matching one.
1263 */
1264 if (str[i] == '\'' || str[i] == '"')
1265 q = str[i];
1266 else
1267 q = 0;
1268
1269 for (i++; str[i]; i++) {
1270 if (q && str[i] == q)
1271 break;
1272 if (!q && (str[i] == ')' || str[i] == '&' ||
1273 str[i] == '|'))
1274 break;
1275 }
1276 /* Skip quotes */
1277 if (q)
1278 s++;
1279 len = i - s;
1280 if (len >= MAX_FILTER_STR_VAL) {
1281 parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
1282 goto err_free;
1283 }
1284
1285 pred->regex.len = len;
1286 strncpy(pred->regex.pattern, str + s, len);
1287 pred->regex.pattern[len] = 0;
1288
1289 /* This is either a string, or an integer */
1290 } else if (str[i] == '\'' || str[i] == '"') {
1291 char q = str[i];
1292
1293 /* Make sure the op is OK for strings */
1294 switch (op) {
1295 case OP_NE:
1296 pred->not = 1;
1297 fallthrough;
1298 case OP_GLOB:
1299 case OP_EQ:
1300 break;
1301 default:
1302 parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
1303 goto err_free;
1304 }
1305
1306 /* Make sure the field is OK for strings */
1307 if (!is_string_field(field)) {
1308 parse_error(pe, FILT_ERR_EXPECT_DIGIT, pos + i);
1309 goto err_free;
1310 }
1311
1312 for (i++; str[i]; i++) {
1313 if (str[i] == q)
1314 break;
1315 }
1316 if (!str[i]) {
1317 parse_error(pe, FILT_ERR_MISSING_QUOTE, pos + i);
1318 goto err_free;
1319 }
1320
1321 /* Skip quotes */
1322 s++;
1323 len = i - s;
1324 if (len >= MAX_FILTER_STR_VAL) {
1325 parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
1326 goto err_free;
1327 }
1328
1329 pred->regex.len = len;
1330 strncpy(pred->regex.pattern, str + s, len);
1331 pred->regex.pattern[len] = 0;
1332
1333 filter_build_regex(pred);
1334
1335 if (field->filter_type == FILTER_COMM) {
1336 pred->fn = filter_pred_comm;
1337
1338 } else if (field->filter_type == FILTER_STATIC_STRING) {
1339 pred->fn = filter_pred_string;
1340 pred->regex.field_len = field->size;
1341
1342 } else if (field->filter_type == FILTER_DYN_STRING)
1343 pred->fn = filter_pred_strloc;
1344 else {
1345 pred->fn = filter_pred_pchar;
> 1346 if (data->class == &event_class_syscall_enter)
1347 pred->field->uaccess = 1;
1348 }
1349 /* go past the last quote */
1350 i++;
1351
1352 } else if (isdigit(str[i]) || str[i] == '-') {
1353
1354 /* Make sure the field is not a string */
1355 if (is_string_field(field)) {
1356 parse_error(pe, FILT_ERR_EXPECT_STRING, pos + i);
1357 goto err_free;
1358 }
1359
1360 if (op == OP_GLOB) {
1361 parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
1362 goto err_free;
1363 }
1364
1365 if (str[i] == '-')
1366 i++;
1367
1368 /* We allow 0xDEADBEEF */
1369 while (isalnum(str[i]))
1370 i++;
1371
1372 len = i - s;
1373 /* 0xfeedfacedeadbeef is 18 chars max */
1374 if (len >= sizeof(num_buf)) {
1375 parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
1376 goto err_free;
1377 }
1378
1379 strncpy(num_buf, str + s, len);
1380 num_buf[len] = 0;
1381
1382 /* Make sure it is a value */
1383 if (field->is_signed)
1384 ret = kstrtoll(num_buf, 0, &val);
1385 else
1386 ret = kstrtoull(num_buf, 0, &val);
1387 if (ret) {
1388 parse_error(pe, FILT_ERR_ILLEGAL_INTVAL, pos + s);
1389 goto err_free;
1390 }
1391
1392 pred->val = val;
1393
1394 if (field->filter_type == FILTER_CPU)
1395 pred->fn = filter_pred_cpu;
1396 else {
1397 pred->fn = select_comparison_fn(pred->op, field->size,
1398 field->is_signed);
1399 if (pred->op == OP_NE)
1400 pred->not = 1;
1401 }
1402
1403 } else {
1404 parse_error(pe, FILT_ERR_INVALID_VALUE, pos + i);
1405 goto err_free;
1406 }
1407
1408 *pred_ptr = pred;
1409 return i;
1410
1411 err_free:
1412 kfree(pred);
1413 return -EINVAL;
1414 }
1415

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

2022-01-08 08:20:59

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 3/3] tracing/filter: make filter_pred_pchar() survive the access to user space

Hi Pingfan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linux/master]
[also build test ERROR on linus/master v5.16-rc8]
[cannot apply to rostedt-trace/for-next next-20220107]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Pingfan-Liu/tracing-filter-make-filter_pred_pchar-survive-the-access-to-user-space/20220107-125117
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 136057256686de39cc3a07c2e39ef6bc43003ff6
config: i386-randconfig-r014-20220107 (https://download.01.org/0day-ci/archive/20220108/[email protected]/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 32167bfe64a4c5dd4eb3f7a58e24f4cba76f5ac2)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/61177cfa9fa8f64bc8036d7b18c540b7b78306b0
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Pingfan-Liu/tracing-filter-make-filter_pred_pchar-survive-the-access-to-user-space/20220107-125117
git checkout 61177cfa9fa8f64bc8036d7b18c540b7b78306b0
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> kernel/trace/trace_events_filter.c:1346:24: error: use of undeclared identifier 'event_class_syscall_enter'
if (data->class == &event_class_syscall_enter)
^
1 error generated.


vim +/event_class_syscall_enter +1346 kernel/trace/trace_events_filter.c

1170
1171 /* Called when a predicate is encountered by predicate_parse() */
1172 static int parse_pred(const char *str, struct trace_event_call *data,
1173 int pos, struct filter_parse_error *pe,
1174 struct filter_pred **pred_ptr)
1175 {
1176 struct trace_event_call *call = data;
1177 struct ftrace_event_field *field;
1178 struct filter_pred *pred = NULL;
1179 char num_buf[24]; /* Big enough to hold an address */
1180 char *field_name;
1181 char q;
1182 u64 val;
1183 int len;
1184 int ret;
1185 int op;
1186 int s;
1187 int i = 0;
1188
1189 /* First find the field to associate to */
1190 while (isspace(str[i]))
1191 i++;
1192 s = i;
1193
1194 while (isalnum(str[i]) || str[i] == '_')
1195 i++;
1196
1197 len = i - s;
1198
1199 if (!len)
1200 return -1;
1201
1202 field_name = kmemdup_nul(str + s, len, GFP_KERNEL);
1203 if (!field_name)
1204 return -ENOMEM;
1205
1206 /* Make sure that the field exists */
1207
1208 field = trace_find_event_field(call, field_name);
1209 kfree(field_name);
1210 if (!field) {
1211 parse_error(pe, FILT_ERR_FIELD_NOT_FOUND, pos + i);
1212 return -EINVAL;
1213 }
1214
1215 while (isspace(str[i]))
1216 i++;
1217
1218 /* Make sure this op is supported */
1219 for (op = 0; ops[op]; op++) {
1220 /* This is why '<=' must come before '<' in ops[] */
1221 if (strncmp(str + i, ops[op], strlen(ops[op])) == 0)
1222 break;
1223 }
1224
1225 if (!ops[op]) {
1226 parse_error(pe, FILT_ERR_INVALID_OP, pos + i);
1227 goto err_free;
1228 }
1229
1230 i += strlen(ops[op]);
1231
1232 while (isspace(str[i]))
1233 i++;
1234
1235 s = i;
1236
1237 pred = kzalloc(sizeof(*pred), GFP_KERNEL);
1238 if (!pred)
1239 return -ENOMEM;
1240
1241 pred->field = field;
1242 field->uaccess = 0;
1243 pred->offset = field->offset;
1244 pred->op = op;
1245
1246 if (ftrace_event_is_function(call)) {
1247 /*
1248 * Perf does things different with function events.
1249 * It only allows an "ip" field, and expects a string.
1250 * But the string does not need to be surrounded by quotes.
1251 * If it is a string, the assigned function as a nop,
1252 * (perf doesn't use it) and grab everything.
1253 */
1254 if (strcmp(field->name, "ip") != 0) {
1255 parse_error(pe, FILT_ERR_IP_FIELD_ONLY, pos + i);
1256 goto err_free;
1257 }
1258 pred->fn = filter_pred_none;
1259
1260 /*
1261 * Quotes are not required, but if they exist then we need
1262 * to read them till we hit a matching one.
1263 */
1264 if (str[i] == '\'' || str[i] == '"')
1265 q = str[i];
1266 else
1267 q = 0;
1268
1269 for (i++; str[i]; i++) {
1270 if (q && str[i] == q)
1271 break;
1272 if (!q && (str[i] == ')' || str[i] == '&' ||
1273 str[i] == '|'))
1274 break;
1275 }
1276 /* Skip quotes */
1277 if (q)
1278 s++;
1279 len = i - s;
1280 if (len >= MAX_FILTER_STR_VAL) {
1281 parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
1282 goto err_free;
1283 }
1284
1285 pred->regex.len = len;
1286 strncpy(pred->regex.pattern, str + s, len);
1287 pred->regex.pattern[len] = 0;
1288
1289 /* This is either a string, or an integer */
1290 } else if (str[i] == '\'' || str[i] == '"') {
1291 char q = str[i];
1292
1293 /* Make sure the op is OK for strings */
1294 switch (op) {
1295 case OP_NE:
1296 pred->not = 1;
1297 fallthrough;
1298 case OP_GLOB:
1299 case OP_EQ:
1300 break;
1301 default:
1302 parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
1303 goto err_free;
1304 }
1305
1306 /* Make sure the field is OK for strings */
1307 if (!is_string_field(field)) {
1308 parse_error(pe, FILT_ERR_EXPECT_DIGIT, pos + i);
1309 goto err_free;
1310 }
1311
1312 for (i++; str[i]; i++) {
1313 if (str[i] == q)
1314 break;
1315 }
1316 if (!str[i]) {
1317 parse_error(pe, FILT_ERR_MISSING_QUOTE, pos + i);
1318 goto err_free;
1319 }
1320
1321 /* Skip quotes */
1322 s++;
1323 len = i - s;
1324 if (len >= MAX_FILTER_STR_VAL) {
1325 parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
1326 goto err_free;
1327 }
1328
1329 pred->regex.len = len;
1330 strncpy(pred->regex.pattern, str + s, len);
1331 pred->regex.pattern[len] = 0;
1332
1333 filter_build_regex(pred);
1334
1335 if (field->filter_type == FILTER_COMM) {
1336 pred->fn = filter_pred_comm;
1337
1338 } else if (field->filter_type == FILTER_STATIC_STRING) {
1339 pred->fn = filter_pred_string;
1340 pred->regex.field_len = field->size;
1341
1342 } else if (field->filter_type == FILTER_DYN_STRING)
1343 pred->fn = filter_pred_strloc;
1344 else {
1345 pred->fn = filter_pred_pchar;
> 1346 if (data->class == &event_class_syscall_enter)
1347 pred->field->uaccess = 1;
1348 }
1349 /* go past the last quote */
1350 i++;
1351
1352 } else if (isdigit(str[i]) || str[i] == '-') {
1353
1354 /* Make sure the field is not a string */
1355 if (is_string_field(field)) {
1356 parse_error(pe, FILT_ERR_EXPECT_STRING, pos + i);
1357 goto err_free;
1358 }
1359
1360 if (op == OP_GLOB) {
1361 parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
1362 goto err_free;
1363 }
1364
1365 if (str[i] == '-')
1366 i++;
1367
1368 /* We allow 0xDEADBEEF */
1369 while (isalnum(str[i]))
1370 i++;
1371
1372 len = i - s;
1373 /* 0xfeedfacedeadbeef is 18 chars max */
1374 if (len >= sizeof(num_buf)) {
1375 parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
1376 goto err_free;
1377 }
1378
1379 strncpy(num_buf, str + s, len);
1380 num_buf[len] = 0;
1381
1382 /* Make sure it is a value */
1383 if (field->is_signed)
1384 ret = kstrtoll(num_buf, 0, &val);
1385 else
1386 ret = kstrtoull(num_buf, 0, &val);
1387 if (ret) {
1388 parse_error(pe, FILT_ERR_ILLEGAL_INTVAL, pos + s);
1389 goto err_free;
1390 }
1391
1392 pred->val = val;
1393
1394 if (field->filter_type == FILTER_CPU)
1395 pred->fn = filter_pred_cpu;
1396 else {
1397 pred->fn = select_comparison_fn(pred->op, field->size,
1398 field->is_signed);
1399 if (pred->op == OP_NE)
1400 pred->not = 1;
1401 }
1402
1403 } else {
1404 parse_error(pe, FILT_ERR_INVALID_VALUE, pos + i);
1405 goto err_free;
1406 }
1407
1408 *pred_ptr = pred;
1409 return i;
1410
1411 err_free:
1412 kfree(pred);
1413 return -EINVAL;
1414 }
1415

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

2022-01-10 02:58:21

by Pingfan Liu

[permalink] [raw]
Subject: Re: [PATCH 1/3] tracing/filter: degrade addr in filter_pred_string() from double pointer to pointer

On Fri, Jan 07, 2022 at 12:18:42PM -0500, Steven Rostedt wrote:
> On Fri, 7 Jan 2022 12:49:49 +0800
> Pingfan Liu <[email protected]> wrote:
>
> > Since FILTER_PTR_STRING has the type of "char *", it is meaningless to
> > convert it to "char **". Hence degrading addr from double pointer to
> > single.
> >
> > Signed-off-by: Pingfan Liu <[email protected]>
> > Cc: Steven Rostedt <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > To: [email protected]
> > ---
> > kernel/trace/trace_events_filter.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> > index c9124038b140..264456e1698f 100644
> > --- a/kernel/trace/trace_events_filter.c
> > +++ b/kernel/trace/trace_events_filter.c
> > @@ -670,11 +670,11 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
> > /* Filter predicate for char * pointers */
> > static int filter_pred_pchar(struct filter_pred *pred, void *event)
> > {
> > - char **addr = (char **)(event + pred->offset);
> > + char *addr = (char *)(event + pred->offset);
>
> This doesn't look right. The address of the pointer should be in the event.
> "event" is an address to the content of the event in the kernel ring buffer.
>
> event + pred->offset
>
> Is then the address of position of the event.
>
> Let's say we have an event record at 0xffff8000, and the pred->offset is at
> 0x10. And the pointer to the string (in user space) is at 0x70008000.
>
> 0xffff8000: <heade>
> 0xffff8010: 0x70008000
>
> 0x70008000: "user space sting"
>
> event + pred->offset gives us 0xffff8010
>
> If we now have that as char *addr, then addr is 0xffff8010
>
>
> > int cmp, match;
> > - int len = strlen(*addr) + 1; /* including tailing '\0' */
> > + int len = strlen(addr) + 1; /* including tailing '\0' */
>
> This would give us the addr = 0xffff8010, which is not where the string
> exists.
>
> How would this work?
>
No, it can not work.


Thanks,

Pingfan