2019-01-14 12:49:42

by Andreas Ziegler

[permalink] [raw]
Subject: uprobes: bug in comm/string output?

Hi,

I've been playing around with uprobes today and found the following weird
behaviour/output when using more than one string argument (or using the $comm
argument). This was run on a v4.20 mainline build on Ubuntu 18.04.

root@ubuntu1810:~# uname -a
Linux ubuntu1810 4.20.0-042000-generic #201812232030 SMP Mon Dec 24 01:32:58 UTC
2018 x86_64 x86_64 x86_64 GNU/Linux

I'm trying to track calls to dlopen so I'm looking up the correct offset in
libdl.so:

root@ubuntu1810:/sys/kernel/debug/tracing# readelf -s
/lib/x86_64-linux-gnu/libdl-2.28.so | grep dlopen
34: 00000000000012a0 133 FUNC GLOBAL DEFAULT 14 dlopen@@GLIBC_2.2.5

Then I'm creating a uprobe with two prints of $comm and two prints of the first
argument to dlopen, and enable that probe. The '/root/test' program only does a
dlopen("libc.so.6", RTLD_LAZY) in main().

root@ubuntu1810:/sys/kernel/debug/tracing# echo 'p:dlopen
/lib/x86_64-linux-gnu/libdl-2.28.so:0x12a0 $comm $comm +0(%di):string
+0(%di):string' > uprobe_events
root@ubuntu1810:/sys/kernel/debug/tracing# echo 1 > events/uprobes/dlopen/enable
root@ubuntu1810:/sys/kernel/debug/tracing# /root/test

The trace output looks like this:

root@ubuntu1810:/sys/kernel/debug/tracing# cat trace
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
test-1617 [000] d... 1237.959168: dlopen: (0x7fbd5272e2a0)
arg1=(fault) arg2=(fault) arg3="libc.so.6libc.so.6" arg4="libc.so.6"

That's very weird for two reasons:
- fetching $comm seems to fail with an invalid pointer
- arg3 contains the text twice (if I add another print of the argument, arg3
will contain the wanted string three times, arg4 two times and the last argument
will contain it once).

On the "standard" kernel shipped with Ubuntu 18.04 (4.18) printing $comm works
but also "accumulates" (in lack of a better word) the later string arguments in
the earlier output arguments, so for the probe above arg1 would be
"testtestlibc.so.6libc.so.6", arg2 would be "testlibc.so.6libc.so.6" and
arg3/arg4 will be the same as in the output above.

Is there anything in the documentation why multiple string outputs might not
work with uprobes? Am I installing the uprobe wrong? Or is this a bug?

I found that the kprobe selftest at
tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc contains a
similar situation for kprobes (it prints the parameter two times) which works
fine on the same system.

Regards,

Andreas


Attachments:
smime.p7s (5.32 kB)
S/MIME Cryptographic Signature

2019-01-15 15:41:52

by Andreas Ziegler

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

Hi again,

On 1/14/19 1:38 PM, Andreas Ziegler wrote:
> Hi,
>
> I've been playing around with uprobes today and found the following weird behaviour/output when using more than one string argument (or using the $comm argument). This was run on a v4.20 mainline build on Ubuntu 18.04.
>
> root@ubuntu1810:~# uname -a
> Linux ubuntu1810 4.20.0-042000-generic #201812232030 SMP Mon Dec 24 01:32:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>
> I'm trying to track calls to dlopen so I'm looking up the correct offset in libdl.so:
>
> root@ubuntu1810:/sys/kernel/debug/tracing# readelf -s /lib/x86_64-linux-gnu/libdl-2.28.so | grep dlopen
> 34: 00000000000012a0 133 FUNC GLOBAL DEFAULT 14 dlopen@@GLIBC_2.2.5
>
> Then I'm creating a uprobe with two prints of $comm and two prints of the first argument to dlopen, and enable that probe. The '/root/test' program only does a dlopen("libc.so.6", RTLD_LAZY) in main().
>
> root@ubuntu1810:/sys/kernel/debug/tracing# echo 'p:dlopen /lib/x86_64-linux-gnu/libdl-2.28.so:0x12a0 $comm $comm +0(%di):string +0(%di):string' > uprobe_events
> root@ubuntu1810:/sys/kernel/debug/tracing# echo 1 > events/uprobes/dlopen/enable
> root@ubuntu1810:/sys/kernel/debug/tracing# /root/test
>
> The trace output looks like this:
>
> root@ubuntu1810:/sys/kernel/debug/tracing# cat trace
> # tracer: nop
> #
> # _-----=> irqs-off
> # / _----=> need-resched
> # | / _---=> hardirq/softirq
> # || / _--=> preempt-depth
> # ||| / delay
> # TASK-PID CPU# |||| TIMESTAMP FUNCTION
> # | | | |||| | |
> test-1617 [000] d... 1237.959168: dlopen: (0x7fbd5272e2a0) arg1=(fault) arg2=(fault) arg3="libc.so.6libc.so.6" arg4="libc.so.6"
>
> That's very weird for two reasons:
> - fetching $comm seems to fail with an invalid pointer
> - arg3 contains the text twice (if I add another print of the argument, arg3 will contain the wanted string three times, arg4 two times and the last argument will contain it once).

at least for the second problem I think I found the answer, and for the
first problem I have a suspicion (see last paragraph for that).

For this, I installed a uprobe for libdl.so/dlopen once again, the
command would be

'p:dlopen /lib/x86_64-linux-gnu/libdl-2.28.so:0x12a0 $comm $comm'

so it should print the process name twice (using a kernel v4.18 on
Ubuntu 18.10).

The code which prints the collected data (print_type_string, defined by
PRINT_TYPE_FUNC_NAME(string) in kernel/trace/trace_probe.c) is the
following, it simply passes the respective data to trace_seq_printf with
a "%s" format string:

int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, void *data, void *ent)
{
int len = *(u32 *)data >> 16;

if (!len)
trace_seq_puts(s, "(fault)");
else
trace_seq_printf(s, "\"%s\"",
(const char *)get_loc_data(data, ent));
return !trace_seq_has_overflowed(s);
}

I dug into that function with KGDB and found the following: 'data' holds
the size and offset for the member in question, which is 0xf and 0x18,
respectively. 'ent' holds the base address for event. When we print the
string at ent + 0x18, we can see that the output for 'arg1' will be
"update-notifierupdate-notifier"

Thread 511 hit Breakpoint 6, print_type_string (s=0xffff880078fd1090,
name=0xffff880078fe4458 "arg1", data=0xffff88007d01f05c,
ent=0xffff88007d01f04c) at
/build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c:67
67 in /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c
gdb$ p *(uint32_t *) data
$46 = 0xf0018
gdb$ p ent
$47 = (void *) 0xffff88007d01f04c
gdb$ p ((char *)ent + 0x18)
$48 = 0xffff88007d01f064 "update-notifierupdate-notifier"

Moving on printing 'arg2' (e.g., the other $comm string). Here we see
that the string in question is at offset 0x27, and if we print that, we
see the correct "update-notifier".

Thread 511 hit Breakpoint 6, print_type_string (s=0xffff880078fd1090,
name=0xffff880078fe4d70 "arg2", data=0xffff88007d01f060,
ent=0xffff88007d01f04c) at
/build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c:67
67 in /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c
gdb$ p *(uint32_t *) data
$49 = 0xf0027
gdb$ p ((char *)ent + 0x27)
$50 = 0xffff88007d01f073 "update-notifier"

Looking at the bytes in memory and the offsets it becomes clear that
there is no \0 byte at the end of the first entry (which would need to
be at address 0xffff88007d01f064 + 0xf = 0xffff88007d01f073 but instead
that's the start address of the second entry which simply gets consumed
by the (first) "%s" as well.

gdb$ x/32x ent
0xffff88007d01f04c: 0x00010592 0x00002143 0xe83522a0 0x00007f7f
0xffff88007d01f05c: 0x000f0018 0x000f0027 0x61647075 0x6e2d6574
0xffff88007d01f06c: 0x6669746f 0x75726569 0x74616470 0x6f6e2d65
0xffff88007d01f07c: 0x69666974 0x00007265 0x0045feee 0x00010592
0xffff88007d01f08c: 0x00002143 0xe83522a0 0x00007f7f 0x000f0018
0xffff88007d01f09c: 0x000f0027 0x61647075 0x6e2d6574 0x6669746f
0xffff88007d01f0ac: 0x75726569 0x74616470 0x6f6e2d65 0x69666974
0xffff88007d01f0bc: 0x00007265 0x0038806e 0x00010592 0x00002143

Should we simply also store the terminating \0 at the end of the string?
The $comm string is saved by fetch_comm_string (in v4.18) which uses
'strlcpy' and its return value as the size of the respective data...
that value however does NOT include the terminating \0 byte (as it's
simply the return value of a call to strlen). The same holds for
"regular" string arguments where the code uses 'strncpy_from_user' which
has the same return value semantics. Or should we use the information in
'len' to only print that many characters?

As fetch_store_string has changed between v4.18 and v4.20, I could try
to reproduce this with a v4.20 kernel but from looking at the code I
suspect this should be the problem in v4.20 as well.

As for $comm only printing "(fault)" I suspect this has to do with
commit 533059281ee5 ("tracing: probeevent: Introduce new argument
fetching code") as we lost the 'fetch_comm_string' function in that
commit and (I think, didn't have the newer kernel running yet) go
through 'fetch_store_string' now. This calls 'strncpy_from_user' instead
of accessing current->comm directly (and thus does not succeed in
fetching it). I'm adding Masami to Cc: as the author of said patch.

>
> On the "standard" kernel shipped with Ubuntu 18.04 (4.18) printing $comm works but also "accumulates" (in lack of a better word) the later string arguments in the earlier output arguments, so for the probe above arg1 would be "testtestlibc.so.6libc.so.6", arg2 would be "testlibc.so.6libc.so.6" and arg3/arg4 will be the same as in the output above.
>
> Is there anything in the documentation why multiple string outputs might not work with uprobes? Am I installing the uprobe wrong? Or is this a bug?
>
> I found that the kprobe selftest at tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc contains a similar situation for kprobes (it prints the parameter two times) which works fine on the same system.
>
> Regards,
>
> Andreas


Regards,

Andreas

2019-01-16 18:43:40

by Andreas Ziegler

[permalink] [raw]
Subject: [PATCH] tracing/uprobes: Fix output for multiple string arguments

When printing multiple uprobe arguments as strings the output for the
earlier arguments would also include all later string arguments.

This is best explained in an example:

Consider adding a uprobe to a function receiving two strings as
parameters which is at offset 0xa0 in strlib.so and we want to print
both parameters when the uprobe is hit (on x86_64):

$ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \
/sys/kernel/debug/tracing/uprobe_events

When the function is called as func("foo", "bar") and we hit the probe,
the trace file shows a line like the following:

[...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar"

Note the extra "bar" printed as part of arg1. This behaviour stacks up
for additional string arguments.

The strings are stored in a dynamically growing part of the uprobe
buffer by fetch_store_string() after copying them from userspace via
strncpy_from_user(). The return value of strncpy_from_user() is then
directly used as the required size for the string. However, this does
not take the terminating null byte into account as the documentation
for strncpy_from_user() cleary states that it "[...] returns the
length of the string (not including the trailing NUL)" even though the
null byte will be copied to the destination.

Therefore, subsequent calls to fetch_store_string() will overwrite
the terminating null byte of the most recently fetched string with
the first character of the current string, leading to the
"accumulation" of strings in earlier arguments in the output.

Fix this by incrementing the return value of strncpy_from_user() by
one if we did not hit the maximum buffer size.

Signed-off-by: Andreas Ziegler <[email protected]>
---
kernel/trace/trace_uprobe.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index e335576b9411..dfb9bbc7fd82 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -160,6 +160,13 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
if (ret >= 0) {
if (ret == maxlen)
dst[ret - 1] = '\0';
+ else if (ret > 0)
+ /*
+ * Include the terminating null byte. In this case it
+ * was copied by strncpy_from_user but not accounted
+ * for in ret.
+ */
+ ret++;
*(u32 *)dest = make_data_loc(ret, (void *)dst - base);
}

--
2.17.1


2019-01-16 18:57:32

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

On Tue, 15 Jan 2019 14:36:48 +0100
Andreas Ziegler <[email protected]> wrote:

> Hi again,
>
> On 1/14/19 1:38 PM, Andreas Ziegler wrote:
> > Hi,
> >
> > I've been playing around with uprobes today and found the following weird behaviour/output when using more than one string argument (or using the $comm argument). This was run on a v4.20 mainline build on Ubuntu 18.04.
> >
> > root@ubuntu1810:~# uname -a
> > Linux ubuntu1810 4.20.0-042000-generic #201812232030 SMP Mon Dec 24 01:32:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> >
> > I'm trying to track calls to dlopen so I'm looking up the correct offset in libdl.so:
> >
> > root@ubuntu1810:/sys/kernel/debug/tracing# readelf -s /lib/x86_64-linux-gnu/libdl-2.28.so | grep dlopen
> > 34: 00000000000012a0 133 FUNC GLOBAL DEFAULT 14 dlopen@@GLIBC_2.2.5
> >
> > Then I'm creating a uprobe with two prints of $comm and two prints of the first argument to dlopen, and enable that probe. The '/root/test' program only does a dlopen("libc.so.6", RTLD_LAZY) in main().
> >
> > root@ubuntu1810:/sys/kernel/debug/tracing# echo 'p:dlopen /lib/x86_64-linux-gnu/libdl-2.28.so:0x12a0 $comm $comm +0(%di):string +0(%di):string' > uprobe_events
> > root@ubuntu1810:/sys/kernel/debug/tracing# echo 1 > events/uprobes/dlopen/enable
> > root@ubuntu1810:/sys/kernel/debug/tracing# /root/test
> >
> > The trace output looks like this:
> >
> > root@ubuntu1810:/sys/kernel/debug/tracing# cat trace
> > # tracer: nop
> > #
> > # _-----=> irqs-off
> > # / _----=> need-resched
> > # | / _---=> hardirq/softirq
> > # || / _--=> preempt-depth
> > # ||| / delay
> > # TASK-PID CPU# |||| TIMESTAMP FUNCTION
> > # | | | |||| | |
> > test-1617 [000] d... 1237.959168: dlopen: (0x7fbd5272e2a0) arg1=(fault) arg2=(fault) arg3="libc.so.6libc.so.6" arg4="libc.so.6"
> >
> > That's very weird for two reasons:
> > - fetching $comm seems to fail with an invalid pointer
> > - arg3 contains the text twice (if I add another print of the argument, arg3 will contain the wanted string three times, arg4 two times and the last argument will contain it once).
>
> at least for the second problem I think I found the answer, and for the
> first problem I have a suspicion (see last paragraph for that).

OK, this looks broken. Thank you very much for reporting it!

BTW, I tried to reproduce it with kprobe event, but it seems working well. e.g.

# echo 'p ksys_chdir $comm $comm +0(%di):string +0(%di):string' > kprobe_events
# echo 1 > events/kprobes/enable
# cd /sys/kernel/debug/tracing
# cat trace
sh-812 [003] ...1 229.344360: p_ksys_chdir_0: (ksys_chdir+0x0/0xc0) arg1="sh" arg2="sh" arg3="/sys/kernel/debug/tracing" arg4="/sys/kernel/debug/tracing"

So, it might be an issue on uprobe_event.

>
> For this, I installed a uprobe for libdl.so/dlopen once again, the
> command would be
>
> 'p:dlopen /lib/x86_64-linux-gnu/libdl-2.28.so:0x12a0 $comm $comm'
>
> so it should print the process name twice (using a kernel v4.18 on
> Ubuntu 18.10).
>
> The code which prints the collected data (print_type_string, defined by
> PRINT_TYPE_FUNC_NAME(string) in kernel/trace/trace_probe.c) is the
> following, it simply passes the respective data to trace_seq_printf with
> a "%s" format string:
>
> int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, void *data, void *ent)
> {
> int len = *(u32 *)data >> 16;
>
> if (!len)
> trace_seq_puts(s, "(fault)");
> else
> trace_seq_printf(s, "\"%s\"",
> (const char *)get_loc_data(data, ent));
> return !trace_seq_has_overflowed(s);
> }
>
> I dug into that function with KGDB and found the following: 'data' holds
> the size and offset for the member in question, which is 0xf and 0x18,
> respectively. 'ent' holds the base address for event. When we print the
> string at ent + 0x18, we can see that the output for 'arg1' will be
> "update-notifierupdate-notifier"
>
> Thread 511 hit Breakpoint 6, print_type_string (s=0xffff880078fd1090,
> name=0xffff880078fe4458 "arg1", data=0xffff88007d01f05c,
> ent=0xffff88007d01f04c) at
> /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c:67
> 67 in /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c
> gdb$ p *(uint32_t *) data
> $46 = 0xf0018
> gdb$ p ent
> $47 = (void *) 0xffff88007d01f04c
> gdb$ p ((char *)ent + 0x18)
> $48 = 0xffff88007d01f064 "update-notifierupdate-notifier"
>
> Moving on printing 'arg2' (e.g., the other $comm string). Here we see
> that the string in question is at offset 0x27, and if we print that, we
> see the correct "update-notifier".
>
> Thread 511 hit Breakpoint 6, print_type_string (s=0xffff880078fd1090,
> name=0xffff880078fe4d70 "arg2", data=0xffff88007d01f060,
> ent=0xffff88007d01f04c) at
> /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c:67
> 67 in /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c
> gdb$ p *(uint32_t *) data
> $49 = 0xf0027
> gdb$ p ((char *)ent + 0x27)
> $50 = 0xffff88007d01f073 "update-notifier"
>
> Looking at the bytes in memory and the offsets it becomes clear that
> there is no \0 byte at the end of the first entry (which would need to
> be at address 0xffff88007d01f064 + 0xf = 0xffff88007d01f073 but instead
> that's the start address of the second entry which simply gets consumed
> by the (first) "%s" as well.
>
> gdb$ x/32x ent
> 0xffff88007d01f04c: 0x00010592 0x00002143 0xe83522a0 0x00007f7f
> 0xffff88007d01f05c: 0x000f0018 0x000f0027 0x61647075 0x6e2d6574
> 0xffff88007d01f06c: 0x6669746f 0x75726569 0x74616470 0x6f6e2d65
> 0xffff88007d01f07c: 0x69666974 0x00007265 0x0045feee 0x00010592
> 0xffff88007d01f08c: 0x00002143 0xe83522a0 0x00007f7f 0x000f0018
> 0xffff88007d01f09c: 0x000f0027 0x61647075 0x6e2d6574 0x6669746f
> 0xffff88007d01f0ac: 0x75726569 0x74616470 0x6f6e2d65 0x69666974
> 0xffff88007d01f0bc: 0x00007265 0x0038806e 0x00010592 0x00002143
>
> Should we simply also store the terminating \0 at the end of the string?
> The $comm string is saved by fetch_comm_string (in v4.18) which uses
> 'strlcpy' and its return value as the size of the respective data...
> that value however does NOT include the terminating \0 byte (as it's
> simply the return value of a call to strlen). The same holds for
> "regular" string arguments where the code uses 'strncpy_from_user' which
> has the same return value semantics. Or should we use the information in
> 'len' to only print that many characters?
>
> As fetch_store_string has changed between v4.18 and v4.20, I could try
> to reproduce this with a v4.20 kernel but from looking at the code I
> suspect this should be the problem in v4.20 as well.
>
> As for $comm only printing "(fault)" I suspect this has to do with
> commit 533059281ee5 ("tracing: probeevent: Introduce new argument
> fetching code") as we lost the 'fetch_comm_string' function in that
> commit and (I think, didn't have the newer kernel running yet) go
> through 'fetch_store_string' now. This calls 'strncpy_from_user' instead
> of accessing current->comm directly (and thus does not succeed in
> fetching it). I'm adding Masami to Cc: as the author of said patch.

Ah, OK. I have to check fetch_store_string() implementation differences
between trace_kprobe.c and trace_uprobe.c. Well, in the uprobes, we may
need more careful steps.

Anyway, that is my fault. I will fix the issue. Hmm, and I may need to
consider to add some test program for uprobes, which including a target
application to be probed.

Thank you!

--
Masami Hiramatsu <[email protected]>

2019-01-16 19:01:15

by Andreas Ziegler

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

Hi,

thanks for your reply!

On 1/16/19 11:00 AM, Masami Hiramatsu wrote:
> On Tue, 15 Jan 2019 14:36:48 +0100
> Andreas Ziegler <[email protected]> wrote:
>
>> Hi again,
>>
>> On 1/14/19 1:38 PM, Andreas Ziegler wrote:
>>> Hi,
>>>
>>> I've been playing around with uprobes today and found the following weird behaviour/output when using more than one string argument (or using the $comm argument). This was run on a v4.20 mainline build on Ubuntu 18.04.
>>>
>>> root@ubuntu1810:~# uname -a
>>> Linux ubuntu1810 4.20.0-042000-generic #201812232030 SMP Mon Dec 24 01:32:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> I'm trying to track calls to dlopen so I'm looking up the correct offset in libdl.so:
>>>
>>> root@ubuntu1810:/sys/kernel/debug/tracing# readelf -s /lib/x86_64-linux-gnu/libdl-2.28.so | grep dlopen
>>> 34: 00000000000012a0 133 FUNC GLOBAL DEFAULT 14 dlopen@@GLIBC_2.2.5
>>>
>>> Then I'm creating a uprobe with two prints of $comm and two prints of the first argument to dlopen, and enable that probe. The '/root/test' program only does a dlopen("libc.so.6", RTLD_LAZY) in main().
>>>
>>> root@ubuntu1810:/sys/kernel/debug/tracing# echo 'p:dlopen /lib/x86_64-linux-gnu/libdl-2.28.so:0x12a0 $comm $comm +0(%di):string +0(%di):string' > uprobe_events
>>> root@ubuntu1810:/sys/kernel/debug/tracing# echo 1 > events/uprobes/dlopen/enable
>>> root@ubuntu1810:/sys/kernel/debug/tracing# /root/test
>>>
>>> The trace output looks like this:
>>>
>>> root@ubuntu1810:/sys/kernel/debug/tracing# cat trace
>>> # tracer: nop
>>> #
>>> # _-----=> irqs-off
>>> # / _----=> need-resched
>>> # | / _---=> hardirq/softirq
>>> # || / _--=> preempt-depth
>>> # ||| / delay
>>> # TASK-PID CPU# |||| TIMESTAMP FUNCTION
>>> # | | | |||| | |
>>> test-1617 [000] d... 1237.959168: dlopen: (0x7fbd5272e2a0) arg1=(fault) arg2=(fault) arg3="libc.so.6libc.so.6" arg4="libc.so.6"
>>>
>>> That's very weird for two reasons:
>>> - fetching $comm seems to fail with an invalid pointer
>>> - arg3 contains the text twice (if I add another print of the argument, arg3 will contain the wanted string three times, arg4 two times and the last argument will contain it once).
>>
>> at least for the second problem I think I found the answer, and for the
>> first problem I have a suspicion (see last paragraph for that).
>
> OK, this looks broken. Thank you very much for reporting it!
>
> BTW, I tried to reproduce it with kprobe event, but it seems working well. e.g.
>
> # echo 'p ksys_chdir $comm $comm +0(%di):string +0(%di):string' > kprobe_events
> # echo 1 > events/kprobes/enable
> # cd /sys/kernel/debug/tracing
> # cat trace
> sh-812 [003] ...1 229.344360: p_ksys_chdir_0: (ksys_chdir+0x0/0xc0) arg1="sh" arg2="sh" arg3="/sys/kernel/debug/tracing" arg4="/sys/kernel/debug/tracing"
>
> So, it might be an issue on uprobe_event.
>

yes, kprobes work because they use strncpy_from_unsafe which *includes*
the null byte in its return value... the fact that both are called
strncpy_* but behave differently is really annoying...

I just sent a patch for this case half an hour ago which simply adds 1
to the returned value for uprobes if it didn't hit the maximum allowed
length.

>>
>> For this, I installed a uprobe for libdl.so/dlopen once again, the
>> command would be
>>
>> 'p:dlopen /lib/x86_64-linux-gnu/libdl-2.28.so:0x12a0 $comm $comm'
>>
>> so it should print the process name twice (using a kernel v4.18 on
>> Ubuntu 18.10).
>>
>> The code which prints the collected data (print_type_string, defined by
>> PRINT_TYPE_FUNC_NAME(string) in kernel/trace/trace_probe.c) is the
>> following, it simply passes the respective data to trace_seq_printf with
>> a "%s" format string:
>>
>> int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, void *data, void *ent)
>> {
>> int len = *(u32 *)data >> 16;
>>
>> if (!len)
>> trace_seq_puts(s, "(fault)");
>> else
>> trace_seq_printf(s, "\"%s\"",
>> (const char *)get_loc_data(data, ent));
>> return !trace_seq_has_overflowed(s);
>> }
>>
>> I dug into that function with KGDB and found the following: 'data' holds
>> the size and offset for the member in question, which is 0xf and 0x18,
>> respectively. 'ent' holds the base address for event. When we print the
>> string at ent + 0x18, we can see that the output for 'arg1' will be
>> "update-notifierupdate-notifier"
>>
>> Thread 511 hit Breakpoint 6, print_type_string (s=0xffff880078fd1090,
>> name=0xffff880078fe4458 "arg1", data=0xffff88007d01f05c,
>> ent=0xffff88007d01f04c) at
>> /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c:67
>> 67 in /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c
>> gdb$ p *(uint32_t *) data
>> $46 = 0xf0018
>> gdb$ p ent
>> $47 = (void *) 0xffff88007d01f04c
>> gdb$ p ((char *)ent + 0x18)
>> $48 = 0xffff88007d01f064 "update-notifierupdate-notifier"
>>
>> Moving on printing 'arg2' (e.g., the other $comm string). Here we see
>> that the string in question is at offset 0x27, and if we print that, we
>> see the correct "update-notifier".
>>
>> Thread 511 hit Breakpoint 6, print_type_string (s=0xffff880078fd1090,
>> name=0xffff880078fe4d70 "arg2", data=0xffff88007d01f060,
>> ent=0xffff88007d01f04c) at
>> /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c:67
>> 67 in /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c
>> gdb$ p *(uint32_t *) data
>> $49 = 0xf0027
>> gdb$ p ((char *)ent + 0x27)
>> $50 = 0xffff88007d01f073 "update-notifier"
>>
>> Looking at the bytes in memory and the offsets it becomes clear that
>> there is no \0 byte at the end of the first entry (which would need to
>> be at address 0xffff88007d01f064 + 0xf = 0xffff88007d01f073 but instead
>> that's the start address of the second entry which simply gets consumed
>> by the (first) "%s" as well.
>>
>> gdb$ x/32x ent
>> 0xffff88007d01f04c: 0x00010592 0x00002143 0xe83522a0 0x00007f7f
>> 0xffff88007d01f05c: 0x000f0018 0x000f0027 0x61647075 0x6e2d6574
>> 0xffff88007d01f06c: 0x6669746f 0x75726569 0x74616470 0x6f6e2d65
>> 0xffff88007d01f07c: 0x69666974 0x00007265 0x0045feee 0x00010592
>> 0xffff88007d01f08c: 0x00002143 0xe83522a0 0x00007f7f 0x000f0018
>> 0xffff88007d01f09c: 0x000f0027 0x61647075 0x6e2d6574 0x6669746f
>> 0xffff88007d01f0ac: 0x75726569 0x74616470 0x6f6e2d65 0x69666974
>> 0xffff88007d01f0bc: 0x00007265 0x0038806e 0x00010592 0x00002143
>>
>> Should we simply also store the terminating \0 at the end of the string?
>> The $comm string is saved by fetch_comm_string (in v4.18) which uses
>> 'strlcpy' and its return value as the size of the respective data...
>> that value however does NOT include the terminating \0 byte (as it's
>> simply the return value of a call to strlen). The same holds for
>> "regular" string arguments where the code uses 'strncpy_from_user' which
>> has the same return value semantics. Or should we use the information in
>> 'len' to only print that many characters?
>>
>> As fetch_store_string has changed between v4.18 and v4.20, I could try
>> to reproduce this with a v4.20 kernel but from looking at the code I
>> suspect this should be the problem in v4.20 as well.
>>
>> As for $comm only printing "(fault)" I suspect this has to do with
>> commit 533059281ee5 ("tracing: probeevent: Introduce new argument
>> fetching code") as we lost the 'fetch_comm_string' function in that
>> commit and (I think, didn't have the newer kernel running yet) go
>> through 'fetch_store_string' now. This calls 'strncpy_from_user' instead
>> of accessing current->comm directly (and thus does not succeed in
>> fetching it). I'm adding Masami to Cc: as the author of said patch.
>
> Ah, OK. I have to check fetch_store_string() implementation differences
> between trace_kprobe.c and trace_uprobe.c. Well, in the uprobes, we may
> need more careful steps.
>

I went into this a bit deeper today, and right now it is simply failing
to parse the code because there is no FETCH_OP_COMM case in
process_fetch_insn() for uprobes so that will return -EILSEQ, leading to
a make_data_loc(0, ...) in store_trace_args(). If we just add
FETCH_OP_COMM and let val point to current->comm (that's what
trace_kprobe.c does), we get an -EFAULT return value from
fetch_store_string because strncpy_from_user() checks if the argument is
in user space.

So I think we might need a special case for that, something like
FETCH_OP_ST_COMM_STRING which is only used for FETCH_OP_COMM and copies
current->comm over to the dynamic area. The implementation could be
similar to the old fetch_comm_string implementation before your rewrite.

> Anyway, that is my fault. I will fix the issue. Hmm, and I may need to
> consider to add some test program for uprobes, which including a target
> application to be probed.
>
> Thank you!
>

Thanks, let me know if I can help you.

Andreas


Attachments:
smime.p7s (5.32 kB)
S/MIME Cryptographic Signature

2019-01-16 22:06:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] tracing/uprobes: Fix output for multiple string arguments

On Wed, 16 Jan 2019 10:41:12 +0100
Andreas Ziegler <[email protected]> wrote:

> When printing multiple uprobe arguments as strings the output for the
> earlier arguments would also include all later string arguments.
>
> This is best explained in an example:
>
> Consider adding a uprobe to a function receiving two strings as
> parameters which is at offset 0xa0 in strlib.so and we want to print
> both parameters when the uprobe is hit (on x86_64):
>
> $ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \
> /sys/kernel/debug/tracing/uprobe_events
>
> When the function is called as func("foo", "bar") and we hit the probe,
> the trace file shows a line like the following:
>
> [...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar"
>
> Note the extra "bar" printed as part of arg1. This behaviour stacks up
> for additional string arguments.
>
> The strings are stored in a dynamically growing part of the uprobe
> buffer by fetch_store_string() after copying them from userspace via
> strncpy_from_user(). The return value of strncpy_from_user() is then
> directly used as the required size for the string. However, this does
> not take the terminating null byte into account as the documentation
> for strncpy_from_user() cleary states that it "[...] returns the
> length of the string (not including the trailing NUL)" even though the
> null byte will be copied to the destination.
>
> Therefore, subsequent calls to fetch_store_string() will overwrite
> the terminating null byte of the most recently fetched string with
> the first character of the current string, leading to the
> "accumulation" of strings in earlier arguments in the output.
>
> Fix this by incrementing the return value of strncpy_from_user() by
> one if we did not hit the maximum buffer size.
>
> Signed-off-by: Andreas Ziegler <[email protected]>
> ---
> kernel/trace/trace_uprobe.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index e335576b9411..dfb9bbc7fd82 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -160,6 +160,13 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
> if (ret >= 0) {
> if (ret == maxlen)
> dst[ret - 1] = '\0';
> + else if (ret > 0)

Do we need the ret > 0 check? What if the value is ""?

Doesn't that cause the same issue?

-- Steve

> + /*
> + * Include the terminating null byte. In this case it
> + * was copied by strncpy_from_user but not accounted
> + * for in ret.
> + */
> + ret++;
> *(u32 *)dest = make_data_loc(ret, (void *)dst - base);
> }
>


2019-01-16 22:12:20

by Andreas Ziegler

[permalink] [raw]
Subject: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

When printing multiple uprobe arguments as strings the output for the
earlier arguments would also include all later string arguments.

This is best explained in an example:

Consider adding a uprobe to a function receiving two strings as
parameters which is at offset 0xa0 in strlib.so and we want to print
both parameters when the uprobe is hit (on x86_64):

$ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \
/sys/kernel/debug/tracing/uprobe_events

When the function is called as func("foo", "bar") and we hit the probe,
the trace file shows a line like the following:

[...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar"

Note the extra "bar" printed as part of arg1. This behaviour stacks up
for additional string arguments.

The strings are stored in a dynamically growing part of the uprobe
buffer by fetch_store_string() after copying them from userspace via
strncpy_from_user(). The return value of strncpy_from_user() is then
directly used as the required size for the string. However, this does
not take the terminating null byte into account as the documentation
for strncpy_from_user() cleary states that it "[...] returns the
length of the string (not including the trailing NUL)" even though the
null byte will be copied to the destination.

Therefore, subsequent calls to fetch_store_string() will overwrite
the terminating null byte of the most recently fetched string with
the first character of the current string, leading to the
"accumulation" of strings in earlier arguments in the output.

Fix this by incrementing the return value of strncpy_from_user() by
one if we did not hit the maximum buffer size.

Signed-off-by: Andreas Ziegler <[email protected]>
---
v2: removed a wrong check for (ret > 0)

kernel/trace/trace_uprobe.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index e335576b9411..3a1d5ab6b4ba 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -160,6 +160,13 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
if (ret >= 0) {
if (ret == maxlen)
dst[ret - 1] = '\0';
+ else
+ /*
+ * Include the terminating null byte. In this case it
+ * was copied by strncpy_from_user but not accounted
+ * for in ret.
+ */
+ ret++;
*(u32 *)dest = make_data_loc(ret, (void *)dst - base);
}

--
2.17.1


2019-01-16 22:12:47

by Andreas Ziegler

[permalink] [raw]
Subject: Re: [PATCH] tracing/uprobes: Fix output for multiple string arguments

On 1/16/19 2:40 PM, Steven Rostedt wrote:
> On Wed, 16 Jan 2019 10:41:12 +0100
> Andreas Ziegler <[email protected]> wrote:
>
>> When printing multiple uprobe arguments as strings the output for the
>> earlier arguments would also include all later string arguments.
>>
>> This is best explained in an example:
>>
>> Consider adding a uprobe to a function receiving two strings as
>> parameters which is at offset 0xa0 in strlib.so and we want to print
>> both parameters when the uprobe is hit (on x86_64):
>>
>> $ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \
>> /sys/kernel/debug/tracing/uprobe_events
>>
>> When the function is called as func("foo", "bar") and we hit the probe,
>> the trace file shows a line like the following:
>>
>> [...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar"
>>
>> Note the extra "bar" printed as part of arg1. This behaviour stacks up
>> for additional string arguments.
>>
>> The strings are stored in a dynamically growing part of the uprobe
>> buffer by fetch_store_string() after copying them from userspace via
>> strncpy_from_user(). The return value of strncpy_from_user() is then
>> directly used as the required size for the string. However, this does
>> not take the terminating null byte into account as the documentation
>> for strncpy_from_user() cleary states that it "[...] returns the
>> length of the string (not including the trailing NUL)" even though the
>> null byte will be copied to the destination.
>>
>> Therefore, subsequent calls to fetch_store_string() will overwrite
>> the terminating null byte of the most recently fetched string with
>> the first character of the current string, leading to the
>> "accumulation" of strings in earlier arguments in the output.
>>
>> Fix this by incrementing the return value of strncpy_from_user() by
>> one if we did not hit the maximum buffer size.
>>
>> Signed-off-by: Andreas Ziegler <[email protected]>
>> ---
>> kernel/trace/trace_uprobe.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
>> index e335576b9411..dfb9bbc7fd82 100644
>> --- a/kernel/trace/trace_uprobe.c
>> +++ b/kernel/trace/trace_uprobe.c
>> @@ -160,6 +160,13 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
>> if (ret >= 0) {
>> if (ret == maxlen)
>> dst[ret - 1] = '\0';
>> + else if (ret > 0)
>
> Do we need the ret > 0 check? What if the value is ""?
>
> Doesn't that cause the same issue?
>
> -- Steve
>

yes, it does. With this patch an empty string will also print "(fault)",
I missed that, sorry. I'll send a v2.

Thanks,

Andreas

>> + /*
>> + * Include the terminating null byte. In this case it
>> + * was copied by strncpy_from_user but not accounted
>> + * for in ret.
>> + */
>> + ret++;
>> *(u32 *)dest = make_data_loc(ret, (void *)dst - base);
>> }
>>
>


Attachments:
smime.p7s (5.32 kB)
S/MIME Cryptographic Signature

2019-01-16 22:18:55

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

On Wed, 16 Jan 2019 15:16:29 +0100
Andreas Ziegler <[email protected]> wrote:

> When printing multiple uprobe arguments as strings the output for the
> earlier arguments would also include all later string arguments.
>
> This is best explained in an example:
>
> Consider adding a uprobe to a function receiving two strings as
> parameters which is at offset 0xa0 in strlib.so and we want to print
> both parameters when the uprobe is hit (on x86_64):
>
> $ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \
> /sys/kernel/debug/tracing/uprobe_events
>
> When the function is called as func("foo", "bar") and we hit the probe,
> the trace file shows a line like the following:
>
> [...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar"
>
> Note the extra "bar" printed as part of arg1. This behaviour stacks up
> for additional string arguments.
>
> The strings are stored in a dynamically growing part of the uprobe
> buffer by fetch_store_string() after copying them from userspace via
> strncpy_from_user(). The return value of strncpy_from_user() is then
> directly used as the required size for the string. However, this does
> not take the terminating null byte into account as the documentation
> for strncpy_from_user() cleary states that it "[...] returns the
> length of the string (not including the trailing NUL)" even though the
> null byte will be copied to the destination.
>
> Therefore, subsequent calls to fetch_store_string() will overwrite
> the terminating null byte of the most recently fetched string with
> the first character of the current string, leading to the
> "accumulation" of strings in earlier arguments in the output.
>
> Fix this by incrementing the return value of strncpy_from_user() by
> one if we did not hit the maximum buffer size.
>
> Signed-off-by: Andreas Ziegler <[email protected]>

Masami,

Care to give a Reviewed-by on this patch?

-- Steve

> ---
> v2: removed a wrong check for (ret > 0)
>
> kernel/trace/trace_uprobe.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index e335576b9411..3a1d5ab6b4ba 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -160,6 +160,13 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
> if (ret >= 0) {
> if (ret == maxlen)
> dst[ret - 1] = '\0';
> + else
> + /*
> + * Include the terminating null byte. In this case it
> + * was copied by strncpy_from_user but not accounted
> + * for in ret.
> + */
> + ret++;
> *(u32 *)dest = make_data_loc(ret, (void *)dst - base);
> }
>
> --
> 2.17.1


2019-01-17 09:08:44

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

On Wed, 16 Jan 2019 15:16:29 +0100
Andreas Ziegler <[email protected]> wrote:

> When printing multiple uprobe arguments as strings the output for the
> earlier arguments would also include all later string arguments.
>
> This is best explained in an example:
>
> Consider adding a uprobe to a function receiving two strings as
> parameters which is at offset 0xa0 in strlib.so and we want to print
> both parameters when the uprobe is hit (on x86_64):
>
> $ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \
> /sys/kernel/debug/tracing/uprobe_events
>
> When the function is called as func("foo", "bar") and we hit the probe,
> the trace file shows a line like the following:
>
> [...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar"
>
> Note the extra "bar" printed as part of arg1. This behaviour stacks up
> for additional string arguments.
>
> The strings are stored in a dynamically growing part of the uprobe
> buffer by fetch_store_string() after copying them from userspace via
> strncpy_from_user(). The return value of strncpy_from_user() is then
> directly used as the required size for the string. However, this does
> not take the terminating null byte into account as the documentation
> for strncpy_from_user() cleary states that it "[...] returns the
> length of the string (not including the trailing NUL)" even though the
> null byte will be copied to the destination.
>
> Therefore, subsequent calls to fetch_store_string() will overwrite
> the terminating null byte of the most recently fetched string with
> the first character of the current string, leading to the
> "accumulation" of strings in earlier arguments in the output.
>
> Fix this by incrementing the return value of strncpy_from_user() by
> one if we did not hit the maximum buffer size.
>

Yeah, I had eventually same conclusion. However, you also have to increse
the return value of fetch_store_strlen() too. (And I found another issue)

Could you fix fetch_store_strlen in the same patch?

Thank you,

> Signed-off-by: Andreas Ziegler <[email protected]>
> ---
> v2: removed a wrong check for (ret > 0)
>
> kernel/trace/trace_uprobe.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index e335576b9411..3a1d5ab6b4ba 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -160,6 +160,13 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
> if (ret >= 0) {
> if (ret == maxlen)
> dst[ret - 1] = '\0';
> + else
> + /*
> + * Include the terminating null byte. In this case it
> + * was copied by strncpy_from_user but not accounted
> + * for in ret.
> + */
> + ret++;
> *(u32 *)dest = make_data_loc(ret, (void *)dst - base);
> }
>
> --
> 2.17.1
>


--
Masami Hiramatsu <[email protected]>

2019-01-17 09:11:15

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

On Wed, 16 Jan 2019 11:16:07 +0100
Andreas Ziegler <[email protected]> wrote:

>
> I went into this a bit deeper today, and right now it is simply failing
> to parse the code because there is no FETCH_OP_COMM case in
> process_fetch_insn() for uprobes so that will return -EILSEQ, leading to
> a make_data_loc(0, ...) in store_trace_args(). If we just add
> FETCH_OP_COMM and let val point to current->comm (that's what
> trace_kprobe.c does), we get an -EFAULT return value from
> fetch_store_string because strncpy_from_user() checks if the argument is
> in user space.

Correct. I missed to add OP_COMM support. And uprobe's fetch_store_string
is only for user space strings.

> So I think we might need a special case for that, something like
> FETCH_OP_ST_COMM_STRING which is only used for FETCH_OP_COMM and copies
> current->comm over to the dynamic area. The implementation could be
> similar to the old fetch_comm_string implementation before your rewrite.

Hmm, instead, I would like to add current->comm checker and only allows
to copy that. That would be simpler and enough.

Could you test below patch?


tracing: uprobes: Re-enable $comm support for uprobe events

From: Masami Hiramatsu <[email protected]>

Since commit 533059281ee5 ("tracing: probeevent: Introduce new
argument fetching code") dropped the $comm support from uprobe
events, this re-enable it.

For $comm support, use strncpy() instead of strncpy_from_user()
to copy current task's comm. Because it is in the kernel space,
strncpy_from_user() always fails to copy the comm.
This also use strlen() instead of strlen_user() to measure the
length of the comm.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reported-by: Andreas Ziegler <[email protected]>
---
kernel/trace/trace_uprobe.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index e335576b9411..97d134e83e0f 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -156,7 +156,10 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
if (unlikely(!maxlen))
return -ENOMEM;

- ret = strncpy_from_user(dst, src, maxlen);
+ if (addr == (unsigned long)current->comm)
+ ret = strlcpy(dst, current->comm, maxlen);
+ else
+ ret = strncpy_from_user(dst, src, maxlen);
if (ret >= 0) {
if (ret == maxlen)
dst[ret - 1] = '\0';
@@ -173,7 +176,10 @@ fetch_store_strlen(unsigned long addr)
int len;
void __user *vaddr = (void __force __user *) addr;

- len = strnlen_user(vaddr, MAX_STRING_SIZE);
+ if (addr == (unsigned long)current->comm)
+ len = strlen(current->comm);
+ else
+ len = strnlen_user(vaddr, MAX_STRING_SIZE);

return (len > MAX_STRING_SIZE) ? 0 : len;
}
@@ -213,6 +219,9 @@ process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
case FETCH_OP_IMM:
val = code->immediate;
break;
+ case FETCH_OP_COMM:
+ val = (unsigned long)current->comm;
+ break;
case FETCH_OP_FOFFS:
val = translate_user_vaddr(code->immediate);
break;

2019-01-17 09:34:18

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

On Thu, 17 Jan 2019 08:40:05 +0100
Andreas Ziegler <[email protected]> wrote:

> On 17.01.19 07:01, Masami Hiramatsu wrote:
> > On Wed, 16 Jan 2019 15:16:29 +0100
> > Andreas Ziegler <[email protected]> wrote:
> >
> >> When printing multiple uprobe arguments as strings the output for the
> >> earlier arguments would also include all later string arguments.
> >>
> >> This is best explained in an example:
> >>
> >> Consider adding a uprobe to a function receiving two strings as
> >> parameters which is at offset 0xa0 in strlib.so and we want to print
> >> both parameters when the uprobe is hit (on x86_64):
> >>
> >> $ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \
> >> /sys/kernel/debug/tracing/uprobe_events
> >>
> >> When the function is called as func("foo", "bar") and we hit the probe,
> >> the trace file shows a line like the following:
> >>
> >> [...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar"
> >>
> >> Note the extra "bar" printed as part of arg1. This behaviour stacks up
> >> for additional string arguments.
> >>
> >> The strings are stored in a dynamically growing part of the uprobe
> >> buffer by fetch_store_string() after copying them from userspace via
> >> strncpy_from_user(). The return value of strncpy_from_user() is then
> >> directly used as the required size for the string. However, this does
> >> not take the terminating null byte into account as the documentation
> >> for strncpy_from_user() cleary states that it "[...] returns the
> >> length of the string (not including the trailing NUL)" even though the
> >> null byte will be copied to the destination.
> >>
> >> Therefore, subsequent calls to fetch_store_string() will overwrite
> >> the terminating null byte of the most recently fetched string with
> >> the first character of the current string, leading to the
> >> "accumulation" of strings in earlier arguments in the output.
> >>
> >> Fix this by incrementing the return value of strncpy_from_user() by
> >> one if we did not hit the maximum buffer size.
> >>
> >
> > Yeah, I had eventually same conclusion. However, you also have to increse
> > the return value of fetch_store_strlen() too. (And I found another issue)
> >
>
> I don't think we need to increase that since the documentation for
> strnlen_user() says that it "[r]eturns the size of the string
> INCLUDING the terminating NUL." so its return value will already be
> one more than that of strncpy_from_user().

Ah, I got it.
Hmm, in that case, I have to update my patch in the previous mail.
Anyway,

Acked-by: Masami Hiramatsu <[email protected]>

Thank you!!

>
> Thanks,
>
> Andreas
>
> > Could you fix fetch_store_strlen in the same patch?
> >
> > Thank you,
> >
> >> Signed-off-by: Andreas Ziegler <[email protected]>
> >> ---
> >> v2: removed a wrong check for (ret > 0)
> >>
> >> kernel/trace/trace_uprobe.c | 7 +++++++
> >> 1 file changed, 7 insertions(+)
> >>
> >> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> >> index e335576b9411..3a1d5ab6b4ba 100644
> >> --- a/kernel/trace/trace_uprobe.c
> >> +++ b/kernel/trace/trace_uprobe.c
> >> @@ -160,6 +160,13 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
> >> if (ret >= 0) {
> >> if (ret == maxlen)
> >> dst[ret - 1] = '\0';
> >> + else
> >> + /*
> >> + * Include the terminating null byte. In this case it
> >> + * was copied by strncpy_from_user but not accounted
> >> + * for in ret.
> >> + */
> >> + ret++;
> >> *(u32 *)dest = make_data_loc(ret, (void *)dst - base);
> >> }
> >>
> >> --
> >> 2.17.1
> >>
> >
> >
>


--
Masami Hiramatsu <[email protected]>

2019-01-17 09:35:18

by Andreas Ziegler

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

On 17.01.19 09:00, Masami Hiramatsu wrote:
> On Thu, 17 Jan 2019 15:13:09 +0900
> Masami Hiramatsu <[email protected]> wrote:
>
>> On Wed, 16 Jan 2019 11:16:07 +0100
>> Andreas Ziegler <[email protected]> wrote:
>>
>>>
>>> I went into this a bit deeper today, and right now it is simply failing
>>> to parse the code because there is no FETCH_OP_COMM case in
>>> process_fetch_insn() for uprobes so that will return -EILSEQ, leading to
>>> a make_data_loc(0, ...) in store_trace_args(). If we just add
>>> FETCH_OP_COMM and let val point to current->comm (that's what
>>> trace_kprobe.c does), we get an -EFAULT return value from
>>> fetch_store_string because strncpy_from_user() checks if the argument is
>>> in user space.
>>
>> Correct. I missed to add OP_COMM support. And uprobe's fetch_store_string
>> is only for user space strings.
>>
>>> So I think we might need a special case for that, something like
>>> FETCH_OP_ST_COMM_STRING which is only used for FETCH_OP_COMM and copies
>>> current->comm over to the dynamic area. The implementation could be
>>> similar to the old fetch_comm_string implementation before your rewrite.
>>
>> Hmm, instead, I would like to add current->comm checker and only allows
>> to copy that. That would be simpler and enough.
>>
>> Could you test below patch?
>>
>>
>> tracing: uprobes: Re-enable $comm support for uprobe events
>>
>> From: Masami Hiramatsu <[email protected]>
>>
>> Since commit 533059281ee5 ("tracing: probeevent: Introduce new
>> argument fetching code") dropped the $comm support from uprobe
>> events, this re-enable it.

this should read 're-enables'.

>>
>> For $comm support, use strncpy() instead of strncpy_from_user()
^
we're using strlcpy(), not strncpy().

>> to copy current task's comm. Because it is in the kernel space,
>> strncpy_from_user() always fails to copy the comm.
>> This also use strlen() instead of strlen_user() to measure the
^ ^
'uses', and the function should be 'strnlen_user()'.

>> length of the comm.
>>
>> Signed-off-by: Masami Hiramatsu <[email protected]>
>> Reported-by: Andreas Ziegler <[email protected]>
>> ---
>> kernel/trace/trace_uprobe.c | 13 +++++++++++--
>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
>> index e335576b9411..97d134e83e0f 100644
>> --- a/kernel/trace/trace_uprobe.c
>> +++ b/kernel/trace/trace_uprobe.c
>> @@ -156,7 +156,10 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
>> if (unlikely(!maxlen))
>> return -ENOMEM;
>>
>> - ret = strncpy_from_user(dst, src, maxlen);
>> + if (addr == (unsigned long)current->comm)
>> + ret = strlcpy(dst, current->comm, maxlen);
>> + else
>> + ret = strncpy_from_user(dst, src, maxlen);
>> if (ret >= 0) {
>> if (ret == maxlen)
>> dst[ret - 1] = '\0';
>> @@ -173,7 +176,10 @@ fetch_store_strlen(unsigned long addr)
>> int len;
>> void __user *vaddr = (void __force __user *) addr;
>>
>> - len = strnlen_user(vaddr, MAX_STRING_SIZE);
>> + if (addr == (unsigned long)current->comm)
>> + len = strlen(current->comm);
>
> To balance with the strnlen_user, we must increse the len in this block.
> (strlen doesn't count the final '\0', but strnlen_user counts it)
>

yes, we need to add a '+ 1' here.

With the typos and this one fixed, this is

Acked-by: Andreas Ziegler <[email protected]>

> Thank you,
>
>> + else
>> + len = strnlen_user(vaddr, MAX_STRING_SIZE);
>>
>> return (len > MAX_STRING_SIZE) ? 0 : len;
>> }
>> @@ -213,6 +219,9 @@ process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
>> case FETCH_OP_IMM:
>> val = code->immediate;
>> break;
>> + case FETCH_OP_COMM:
>> + val = (unsigned long)current->comm;
>> + break;
>> case FETCH_OP_FOFFS:
>> val = translate_user_vaddr(code->immediate);
>> break;
>
>


2019-01-17 11:43:19

by Andreas Ziegler

[permalink] [raw]
Subject: Re: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

On 17.01.19 07:01, Masami Hiramatsu wrote:
> On Wed, 16 Jan 2019 15:16:29 +0100
> Andreas Ziegler <[email protected]> wrote:
>
>> When printing multiple uprobe arguments as strings the output for the
>> earlier arguments would also include all later string arguments.
>>
>> This is best explained in an example:
>>
>> Consider adding a uprobe to a function receiving two strings as
>> parameters which is at offset 0xa0 in strlib.so and we want to print
>> both parameters when the uprobe is hit (on x86_64):
>>
>> $ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \
>> /sys/kernel/debug/tracing/uprobe_events
>>
>> When the function is called as func("foo", "bar") and we hit the probe,
>> the trace file shows a line like the following:
>>
>> [...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar"
>>
>> Note the extra "bar" printed as part of arg1. This behaviour stacks up
>> for additional string arguments.
>>
>> The strings are stored in a dynamically growing part of the uprobe
>> buffer by fetch_store_string() after copying them from userspace via
>> strncpy_from_user(). The return value of strncpy_from_user() is then
>> directly used as the required size for the string. However, this does
>> not take the terminating null byte into account as the documentation
>> for strncpy_from_user() cleary states that it "[...] returns the
>> length of the string (not including the trailing NUL)" even though the
>> null byte will be copied to the destination.
>>
>> Therefore, subsequent calls to fetch_store_string() will overwrite
>> the terminating null byte of the most recently fetched string with
>> the first character of the current string, leading to the
>> "accumulation" of strings in earlier arguments in the output.
>>
>> Fix this by incrementing the return value of strncpy_from_user() by
>> one if we did not hit the maximum buffer size.
>>
>
> Yeah, I had eventually same conclusion. However, you also have to increse
> the return value of fetch_store_strlen() too. (And I found another issue)
>

I don't think we need to increase that since the documentation for
strnlen_user() says that it "[r]eturns the size of the string
INCLUDING the terminating NUL." so its return value will already be
one more than that of strncpy_from_user().

Thanks,

Andreas

> Could you fix fetch_store_strlen in the same patch?
>
> Thank you,
>
>> Signed-off-by: Andreas Ziegler <[email protected]>
>> ---
>> v2: removed a wrong check for (ret > 0)
>>
>> kernel/trace/trace_uprobe.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
>> index e335576b9411..3a1d5ab6b4ba 100644
>> --- a/kernel/trace/trace_uprobe.c
>> +++ b/kernel/trace/trace_uprobe.c
>> @@ -160,6 +160,13 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
>> if (ret >= 0) {
>> if (ret == maxlen)
>> dst[ret - 1] = '\0';
>> + else
>> + /*
>> + * Include the terminating null byte. In this case it
>> + * was copied by strncpy_from_user but not accounted
>> + * for in ret.
>> + */
>> + ret++;
>> *(u32 *)dest = make_data_loc(ret, (void *)dst - base);
>> }
>>
>> --
>> 2.17.1
>>
>
>


2019-01-17 11:44:01

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

On Thu, 17 Jan 2019 15:13:09 +0900
Masami Hiramatsu <[email protected]> wrote:

> On Wed, 16 Jan 2019 11:16:07 +0100
> Andreas Ziegler <[email protected]> wrote:
>
> >
> > I went into this a bit deeper today, and right now it is simply failing
> > to parse the code because there is no FETCH_OP_COMM case in
> > process_fetch_insn() for uprobes so that will return -EILSEQ, leading to
> > a make_data_loc(0, ...) in store_trace_args(). If we just add
> > FETCH_OP_COMM and let val point to current->comm (that's what
> > trace_kprobe.c does), we get an -EFAULT return value from
> > fetch_store_string because strncpy_from_user() checks if the argument is
> > in user space.
>
> Correct. I missed to add OP_COMM support. And uprobe's fetch_store_string
> is only for user space strings.
>
> > So I think we might need a special case for that, something like
> > FETCH_OP_ST_COMM_STRING which is only used for FETCH_OP_COMM and copies
> > current->comm over to the dynamic area. The implementation could be
> > similar to the old fetch_comm_string implementation before your rewrite.
>
> Hmm, instead, I would like to add current->comm checker and only allows
> to copy that. That would be simpler and enough.
>
> Could you test below patch?
>
>
> tracing: uprobes: Re-enable $comm support for uprobe events
>
> From: Masami Hiramatsu <[email protected]>
>
> Since commit 533059281ee5 ("tracing: probeevent: Introduce new
> argument fetching code") dropped the $comm support from uprobe
> events, this re-enable it.
>
> For $comm support, use strncpy() instead of strncpy_from_user()
> to copy current task's comm. Because it is in the kernel space,
> strncpy_from_user() always fails to copy the comm.
> This also use strlen() instead of strlen_user() to measure the
> length of the comm.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> Reported-by: Andreas Ziegler <[email protected]>
> ---
> kernel/trace/trace_uprobe.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index e335576b9411..97d134e83e0f 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -156,7 +156,10 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
> if (unlikely(!maxlen))
> return -ENOMEM;
>
> - ret = strncpy_from_user(dst, src, maxlen);
> + if (addr == (unsigned long)current->comm)
> + ret = strlcpy(dst, current->comm, maxlen);
> + else
> + ret = strncpy_from_user(dst, src, maxlen);
> if (ret >= 0) {
> if (ret == maxlen)
> dst[ret - 1] = '\0';
> @@ -173,7 +176,10 @@ fetch_store_strlen(unsigned long addr)
> int len;
> void __user *vaddr = (void __force __user *) addr;
>
> - len = strnlen_user(vaddr, MAX_STRING_SIZE);
> + if (addr == (unsigned long)current->comm)
> + len = strlen(current->comm);

To balance with the strnlen_user, we must increse the len in this block.
(strlen doesn't count the final '\0', but strnlen_user counts it)

Thank you,

> + else
> + len = strnlen_user(vaddr, MAX_STRING_SIZE);
>
> return (len > MAX_STRING_SIZE) ? 0 : len;
> }
> @@ -213,6 +219,9 @@ process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
> case FETCH_OP_IMM:
> val = code->immediate;
> break;
> + case FETCH_OP_COMM:
> + val = (unsigned long)current->comm;
> + break;
> case FETCH_OP_FOFFS:
> val = translate_user_vaddr(code->immediate);
> break;


--
Masami Hiramatsu <[email protected]>

2019-01-17 11:46:05

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

On Thu, 17 Jan 2019 09:08:41 +0100
Andreas Ziegler <[email protected]> wrote:

> On 17.01.19 09:00, Masami Hiramatsu wrote:
> > On Thu, 17 Jan 2019 15:13:09 +0900
> > Masami Hiramatsu <[email protected]> wrote:
> >
> >> On Wed, 16 Jan 2019 11:16:07 +0100
> >> Andreas Ziegler <[email protected]> wrote:
> >>
> >>>
> >>> I went into this a bit deeper today, and right now it is simply failing
> >>> to parse the code because there is no FETCH_OP_COMM case in
> >>> process_fetch_insn() for uprobes so that will return -EILSEQ, leading to
> >>> a make_data_loc(0, ...) in store_trace_args(). If we just add
> >>> FETCH_OP_COMM and let val point to current->comm (that's what
> >>> trace_kprobe.c does), we get an -EFAULT return value from
> >>> fetch_store_string because strncpy_from_user() checks if the argument is
> >>> in user space.
> >>
> >> Correct. I missed to add OP_COMM support. And uprobe's fetch_store_string
> >> is only for user space strings.
> >>
> >>> So I think we might need a special case for that, something like
> >>> FETCH_OP_ST_COMM_STRING which is only used for FETCH_OP_COMM and copies
> >>> current->comm over to the dynamic area. The implementation could be
> >>> similar to the old fetch_comm_string implementation before your rewrite.
> >>
> >> Hmm, instead, I would like to add current->comm checker and only allows
> >> to copy that. That would be simpler and enough.
> >>
> >> Could you test below patch?
> >>
> >>
> >> tracing: uprobes: Re-enable $comm support for uprobe events
> >>
> >> From: Masami Hiramatsu <[email protected]>
> >>
> >> Since commit 533059281ee5 ("tracing: probeevent: Introduce new
> >> argument fetching code") dropped the $comm support from uprobe
> >> events, this re-enable it.
>
> this should read 're-enables'.
>
> >>
> >> For $comm support, use strncpy() instead of strncpy_from_user()
> ^
> we're using strlcpy(), not strncpy().
>
> >> to copy current task's comm. Because it is in the kernel space,
> >> strncpy_from_user() always fails to copy the comm.
> >> This also use strlen() instead of strlen_user() to measure the
> ^ ^
> 'uses', and the function should be 'strnlen_user()'.
>
> >> length of the comm.
> >>
> >> Signed-off-by: Masami Hiramatsu <[email protected]>
> >> Reported-by: Andreas Ziegler <[email protected]>
> >> ---
> >> kernel/trace/trace_uprobe.c | 13 +++++++++++--
> >> 1 file changed, 11 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> >> index e335576b9411..97d134e83e0f 100644
> >> --- a/kernel/trace/trace_uprobe.c
> >> +++ b/kernel/trace/trace_uprobe.c
> >> @@ -156,7 +156,10 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
> >> if (unlikely(!maxlen))
> >> return -ENOMEM;
> >>
> >> - ret = strncpy_from_user(dst, src, maxlen);
> >> + if (addr == (unsigned long)current->comm)
> >> + ret = strlcpy(dst, current->comm, maxlen);
> >> + else
> >> + ret = strncpy_from_user(dst, src, maxlen);
> >> if (ret >= 0) {
> >> if (ret == maxlen)
> >> dst[ret - 1] = '\0';
> >> @@ -173,7 +176,10 @@ fetch_store_strlen(unsigned long addr)
> >> int len;
> >> void __user *vaddr = (void __force __user *) addr;
> >>
> >> - len = strnlen_user(vaddr, MAX_STRING_SIZE);
> >> + if (addr == (unsigned long)current->comm)
> >> + len = strlen(current->comm);
> >
> > To balance with the strnlen_user, we must increse the len in this block.
> > (strlen doesn't count the final '\0', but strnlen_user counts it)
> >
>
> yes, we need to add a '+ 1' here.
>
> With the typos and this one fixed, this is
>
> Acked-by: Andreas Ziegler <[email protected]>

Thank you for fixing typo and Ack :)

Thanks you,

>
> > Thank you,
> >
> >> + else
> >> + len = strnlen_user(vaddr, MAX_STRING_SIZE);
> >>
> >> return (len > MAX_STRING_SIZE) ? 0 : len;
> >> }
> >> @@ -213,6 +219,9 @@ process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
> >> case FETCH_OP_IMM:
> >> val = code->immediate;
> >> break;
> >> + case FETCH_OP_COMM:
> >> + val = (unsigned long)current->comm;
> >> + break;
> >> case FETCH_OP_FOFFS:
> >> val = translate_user_vaddr(code->immediate);
> >> break;
> >
> >
>


--
Masami Hiramatsu <[email protected]>

2019-01-17 13:46:15

by Andreas Ziegler

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

Hi,

On 1/17/19 10:47 AM, Masami Hiramatsu wrote:
> On Thu, 17 Jan 2019 09:08:41 +0100
> Andreas Ziegler <[email protected]> wrote:
>
>> On 17.01.19 09:00, Masami Hiramatsu wrote:
>>> On Thu, 17 Jan 2019 15:13:09 +0900
>>> Masami Hiramatsu <[email protected]> wrote:
>>>
>>>> On Wed, 16 Jan 2019 11:16:07 +0100
>>>> Andreas Ziegler <[email protected]> wrote:
>>>>
>>>>>
>>>>> I went into this a bit deeper today, and right now it is simply failing
>>>>> to parse the code because there is no FETCH_OP_COMM case in
>>>>> process_fetch_insn() for uprobes so that will return -EILSEQ, leading to
>>>>> a make_data_loc(0, ...) in store_trace_args(). If we just add
>>>>> FETCH_OP_COMM and let val point to current->comm (that's what
>>>>> trace_kprobe.c does), we get an -EFAULT return value from
>>>>> fetch_store_string because strncpy_from_user() checks if the argument is
>>>>> in user space.
>>>>
>>>> Correct. I missed to add OP_COMM support. And uprobe's fetch_store_string
>>>> is only for user space strings.
>>>>
>>>>> So I think we might need a special case for that, something like
>>>>> FETCH_OP_ST_COMM_STRING which is only used for FETCH_OP_COMM and copies
>>>>> current->comm over to the dynamic area. The implementation could be
>>>>> similar to the old fetch_comm_string implementation before your rewrite.
>>>>
>>>> Hmm, instead, I would like to add current->comm checker and only allows
>>>> to copy that. That would be simpler and enough.
>>>>
>>>> Could you test below patch?
>>>>
>>>>
>>>> tracing: uprobes: Re-enable $comm support for uprobe events
>>>>
>>>> From: Masami Hiramatsu <[email protected]>
>>>>
>>>> Since commit 533059281ee5 ("tracing: probeevent: Introduce new
>>>> argument fetching code") dropped the $comm support from uprobe
>>>> events, this re-enable it.
>>
>> this should read 're-enables'.
>>
>>>>
>>>> For $comm support, use strncpy() instead of strncpy_from_user()
>> ^
>> we're using strlcpy(), not strncpy().
>>
>>>> to copy current task's comm. Because it is in the kernel space,
>>>> strncpy_from_user() always fails to copy the comm.
>>>> This also use strlen() instead of strlen_user() to measure the
>> ^ ^
>> 'uses', and the function should be 'strnlen_user()'.
>>
>>>> length of the comm.
>>>>
>>>> Signed-off-by: Masami Hiramatsu <[email protected]>
>>>> Reported-by: Andreas Ziegler <[email protected]>
>>>> ---
>>>> kernel/trace/trace_uprobe.c | 13 +++++++++++--
>>>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
>>>> index e335576b9411..97d134e83e0f 100644
>>>> --- a/kernel/trace/trace_uprobe.c
>>>> +++ b/kernel/trace/trace_uprobe.c
>>>> @@ -156,7 +156,10 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
>>>> if (unlikely(!maxlen))
>>>> return -ENOMEM;
>>>>
>>>> - ret = strncpy_from_user(dst, src, maxlen);
>>>> + if (addr == (unsigned long)current->comm)
>>>> + ret = strlcpy(dst, current->comm, maxlen);
>>>> + else
>>>> + ret = strncpy_from_user(dst, src, maxlen);
>>>> if (ret >= 0) {
>>>> if (ret == maxlen)
>>>> dst[ret - 1] = '\0';
>>>> @@ -173,7 +176,10 @@ fetch_store_strlen(unsigned long addr)
>>>> int len;
>>>> void __user *vaddr = (void __force __user *) addr;
>>>>
>>>> - len = strnlen_user(vaddr, MAX_STRING_SIZE);
>>>> + if (addr == (unsigned long)current->comm)
>>>> + len = strlen(current->comm);
>>>
>>> To balance with the strnlen_user, we must increse the len in this block.
>>> (strlen doesn't count the final '\0', but strnlen_user counts it)
>>>
>>
>> yes, we need to add a '+ 1' here.
>>
>> With the typos and this one fixed, this is
>>
>> Acked-by: Andreas Ziegler <[email protected]>
>
> Thank you for fixing typo and Ack :)
>
> Thanks you,
>
>>
>>> Thank you,
>>>
>>>> + else
>>>> + len = strnlen_user(vaddr, MAX_STRING_SIZE);
>>>>
>>>> return (len > MAX_STRING_SIZE) ? 0 : len;
>>>> }
>>>> @@ -213,6 +219,9 @@ process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
>>>> case FETCH_OP_IMM:
>>>> val = code->immediate;
>>>> break;
>>>> + case FETCH_OP_COMM:
>>>> + val = (unsigned long)current->comm;
>>>> + break;
>>>> case FETCH_OP_FOFFS:
>>>> val = translate_user_vaddr(code->immediate);
>>>> break;
>>>
>>>
>>
>
>
as the original commit breaking $comm support was merged for v4.20
(which is a stable kernel) and the wrong behaviour with multiple strings
exists in all longterm/stable releases (tested back to v4.4), do you
think this should be going into a stable release once it's merged?

I added Greg as he might know the answer to that.

Thanks,

Andreas


Attachments:
smime.p7s (5.32 kB)
S/MIME Cryptographic Signature

2019-01-17 14:53:48

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

On Thu, 17 Jan 2019 15:29:09 +0100
Andreas Ziegler <[email protected]> wrote:

> From my side it's basically good to go, but I just realized that there
> is no "Fixes: " tag in it but the problem it fixes goes back to
> 5baaa59ef09e ("tracing/probes: Implement 'memory' fetch method for
> uprobes") from 2013 (and is present in all current longterm and stable
> kernels)... should I add a tag and resend a v3?
>

I was just about to ask you to provide the fixes tag. No need to
resend, I'll just add it manually. I asked Masami to resend because it
was a patch series. If it was just a single patch, I would have just
asked what the fixes was for that particular patch. I can handle one
patch, but not more than one ;-)

-- Steve

2019-01-17 15:16:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

On Thu, Jan 17, 2019 at 02:44:41PM +0100, Andreas Ziegler wrote:
> Hi,
>
> On 1/17/19 10:47 AM, Masami Hiramatsu wrote:
> > On Thu, 17 Jan 2019 09:08:41 +0100
> > Andreas Ziegler <[email protected]> wrote:
> >
> > > On 17.01.19 09:00, Masami Hiramatsu wrote:
> > > > On Thu, 17 Jan 2019 15:13:09 +0900
> > > > Masami Hiramatsu <[email protected]> wrote:
> > > >
> > > > > On Wed, 16 Jan 2019 11:16:07 +0100
> > > > > Andreas Ziegler <[email protected]> wrote:
> > > > >
> > > > > >
> > > > > > I went into this a bit deeper today, and right now it is simply failing
> > > > > > to parse the code because there is no FETCH_OP_COMM case in
> > > > > > process_fetch_insn() for uprobes so that will return -EILSEQ, leading to
> > > > > > a make_data_loc(0, ...) in store_trace_args(). If we just add
> > > > > > FETCH_OP_COMM and let val point to current->comm (that's what
> > > > > > trace_kprobe.c does), we get an -EFAULT return value from
> > > > > > fetch_store_string because strncpy_from_user() checks if the argument is
> > > > > > in user space.
> > > > >
> > > > > Correct. I missed to add OP_COMM support. And uprobe's fetch_store_string
> > > > > is only for user space strings.
> > > > >
> > > > > > So I think we might need a special case for that, something like
> > > > > > FETCH_OP_ST_COMM_STRING which is only used for FETCH_OP_COMM and copies
> > > > > > current->comm over to the dynamic area. The implementation could be
> > > > > > similar to the old fetch_comm_string implementation before your rewrite.
> > > > >
> > > > > Hmm, instead, I would like to add current->comm checker and only allows
> > > > > to copy that. That would be simpler and enough.
> > > > >
> > > > > Could you test below patch?
> > > > >
> > > > >
> > > > > tracing: uprobes: Re-enable $comm support for uprobe events
> > > > >
> > > > > From: Masami Hiramatsu <[email protected]>
> > > > >
> > > > > Since commit 533059281ee5 ("tracing: probeevent: Introduce new
> > > > > argument fetching code") dropped the $comm support from uprobe
> > > > > events, this re-enable it.
> > >
> > > this should read 're-enables'.
> > >
> > > > >
> > > > > For $comm support, use strncpy() instead of strncpy_from_user()
> > > ^
> > > we're using strlcpy(), not strncpy().
> > >
> > > > > to copy current task's comm. Because it is in the kernel space,
> > > > > strncpy_from_user() always fails to copy the comm.
> > > > > This also use strlen() instead of strlen_user() to measure the
> > > ^ ^
> > > 'uses', and the function should be 'strnlen_user()'.
> > >
> > > > > length of the comm.
> > > > >
> > > > > Signed-off-by: Masami Hiramatsu <[email protected]>
> > > > > Reported-by: Andreas Ziegler <[email protected]>
> > > > > ---
> > > > > kernel/trace/trace_uprobe.c | 13 +++++++++++--
> > > > > 1 file changed, 11 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> > > > > index e335576b9411..97d134e83e0f 100644
> > > > > --- a/kernel/trace/trace_uprobe.c
> > > > > +++ b/kernel/trace/trace_uprobe.c
> > > > > @@ -156,7 +156,10 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
> > > > > if (unlikely(!maxlen))
> > > > > return -ENOMEM;
> > > > > - ret = strncpy_from_user(dst, src, maxlen);
> > > > > + if (addr == (unsigned long)current->comm)
> > > > > + ret = strlcpy(dst, current->comm, maxlen);
> > > > > + else
> > > > > + ret = strncpy_from_user(dst, src, maxlen);
> > > > > if (ret >= 0) {
> > > > > if (ret == maxlen)
> > > > > dst[ret - 1] = '\0';
> > > > > @@ -173,7 +176,10 @@ fetch_store_strlen(unsigned long addr)
> > > > > int len;
> > > > > void __user *vaddr = (void __force __user *) addr;
> > > > > - len = strnlen_user(vaddr, MAX_STRING_SIZE);
> > > > > + if (addr == (unsigned long)current->comm)
> > > > > + len = strlen(current->comm);
> > > >
> > > > To balance with the strnlen_user, we must increse the len in this block.
> > > > (strlen doesn't count the final '\0', but strnlen_user counts it)
> > > >
> > >
> > > yes, we need to add a '+ 1' here.
> > >
> > > With the typos and this one fixed, this is
> > >
> > > Acked-by: Andreas Ziegler <[email protected]>
> >
> > Thank you for fixing typo and Ack :)
> >
> > Thanks you,
> >
> > >
> > > > Thank you,
> > > >
> > > > > + else
> > > > > + len = strnlen_user(vaddr, MAX_STRING_SIZE);
> > > > > return (len > MAX_STRING_SIZE) ? 0 : len;
> > > > > }
> > > > > @@ -213,6 +219,9 @@ process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
> > > > > case FETCH_OP_IMM:
> > > > > val = code->immediate;
> > > > > break;
> > > > > + case FETCH_OP_COMM:
> > > > > + val = (unsigned long)current->comm;
> > > > > + break;
> > > > > case FETCH_OP_FOFFS:
> > > > > val = translate_user_vaddr(code->immediate);
> > > > > break;
> > > >
> > > >
> > >
> >
> >
> as the original commit breaking $comm support was merged for v4.20 (which is
> a stable kernel) and the wrong behaviour with multiple strings exists in all
> longterm/stable releases (tested back to v4.4), do you think this should be
> going into a stable release once it's merged?
>
> I added Greg as he might know the answer to that.

Sounds like it would be a patch to be backported to the stable kernels
to me.

greg k-h

2019-01-17 15:17:04

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

On Thu, 17 Jan 2019 16:58:07 +0900
Masami Hiramatsu <[email protected]> wrote:

> Ah, I got it.
> Hmm, in that case, I have to update my patch in the previous mail.
> Anyway,
>
> Acked-by: Masami Hiramatsu <[email protected]>

So this patch is good to go? If so, I'll pull it in and start testing
it.

I'm currently traveling (as you probably received my OoO emails), but I
can still do a bit of work remotely.

-- Steve

2019-01-17 17:10:48

by Andreas Ziegler

[permalink] [raw]
Subject: Re: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

On 1/17/19 3:20 PM, Steven Rostedt wrote:
> On Thu, 17 Jan 2019 16:58:07 +0900
> Masami Hiramatsu <[email protected]> wrote:
>
>> Ah, I got it.
>> Hmm, in that case, I have to update my patch in the previous mail.
>> Anyway,
>>
>> Acked-by: Masami Hiramatsu <[email protected]>
>
> So this patch is good to go? If so, I'll pull it in and start testing
> it.
>
> I'm currently traveling (as you probably received my OoO emails), but I
> can still do a bit of work remotely.
>
> -- Steve
>
From my side it's basically good to go, but I just realized that there
is no "Fixes: " tag in it but the problem it fixes goes back to
5baaa59ef09e ("tracing/probes: Implement 'memory' fetch method for
uprobes") from 2013 (and is present in all current longterm and stable
kernels)... should I add a tag and resend a v3?

Thanks,

Andreas

2019-01-17 17:18:13

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

On Thu, 17 Jan 2019 16:14:44 +0100
Andreas Ziegler <[email protected]> wrote:

> One more thing maybe, the code in question was rewritten
> by Masami in 9178412ddf5a ("tracing: probeevent: Return consumed
> bytes of dynamic area”) which is in v4.20 but the problem traces back
> to the original commit mentioned above, so it will need manual so we
> will need a slightly modified version for earlier kernels.

That's fine. When Greg (or other stable maintainer) process the commit,
it will produce a "FAILED" message and Cc all those in the tags
(obviously, you'll be one of them as the author), and then you can
reply to that email with the fixed up version and Greg (or whoever)
will take that patch in.

Thanks!

-- Steve

2019-01-17 18:43:16

by Andreas Ziegler

[permalink] [raw]
Subject: Re: [PATCH v2] tracing/uprobes: Fix output for multiple string arguments

On 17. Jan 2019, at 15:51, Steven Rostedt <[email protected]> wrote:
>
> On Thu, 17 Jan 2019 15:29:09 +0100
> Andreas Ziegler <[email protected]> wrote:
>
>> From my side it's basically good to go, but I just realized that there
>> is no "Fixes: " tag in it but the problem it fixes goes back to
>> 5baaa59ef09e ("tracing/probes: Implement 'memory' fetch method for
>> uprobes") from 2013 (and is present in all current longterm and stable
>> kernels)... should I add a tag and resend a v3?
>>
>
> I was just about to ask you to provide the fixes tag. No need to
> resend, I'll just add it manually. I asked Masami to resend because it
> was a patch series. If it was just a single patch, I would have just
> asked what the fixes was for that particular patch. I can handle one
> patch, but not more than one ;-)
>
> -- Steve

One more thing maybe, the code in question was rewritten
by Masami in 9178412ddf5a ("tracing: probeevent: Return consumed
bytes of dynamic area”) which is in v4.20 but the problem traces back
to the original commit mentioned above, so it will need manual so we
will need a slightly modified version for earlier kernels.

Regards,

Andreas

2019-01-17 23:52:10

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: uprobes: bug in comm/string output?

On Thu, 17 Jan 2019 14:44:41 +0100
Andreas Ziegler <[email protected]> wrote:

> Hi,
>
> On 1/17/19 10:47 AM, Masami Hiramatsu wrote:
> > On Thu, 17 Jan 2019 09:08:41 +0100
> > Andreas Ziegler <[email protected]> wrote:
> >
> >> On 17.01.19 09:00, Masami Hiramatsu wrote:
> >>> On Thu, 17 Jan 2019 15:13:09 +0900
> >>> Masami Hiramatsu <[email protected]> wrote:
> >>>
> >>>> On Wed, 16 Jan 2019 11:16:07 +0100
> >>>> Andreas Ziegler <[email protected]> wrote:
> >>>>
> >>>>>
> >>>>> I went into this a bit deeper today, and right now it is simply failing
> >>>>> to parse the code because there is no FETCH_OP_COMM case in
> >>>>> process_fetch_insn() for uprobes so that will return -EILSEQ, leading to
> >>>>> a make_data_loc(0, ...) in store_trace_args(). If we just add
> >>>>> FETCH_OP_COMM and let val point to current->comm (that's what
> >>>>> trace_kprobe.c does), we get an -EFAULT return value from
> >>>>> fetch_store_string because strncpy_from_user() checks if the argument is
> >>>>> in user space.
> >>>>
> >>>> Correct. I missed to add OP_COMM support. And uprobe's fetch_store_string
> >>>> is only for user space strings.
> >>>>
> >>>>> So I think we might need a special case for that, something like
> >>>>> FETCH_OP_ST_COMM_STRING which is only used for FETCH_OP_COMM and copies
> >>>>> current->comm over to the dynamic area. The implementation could be
> >>>>> similar to the old fetch_comm_string implementation before your rewrite.
> >>>>
> >>>> Hmm, instead, I would like to add current->comm checker and only allows
> >>>> to copy that. That would be simpler and enough.
> >>>>
> >>>> Could you test below patch?
> >>>>
> >>>>
> >>>> tracing: uprobes: Re-enable $comm support for uprobe events
> >>>>
> >>>> From: Masami Hiramatsu <[email protected]>
> >>>>
> >>>> Since commit 533059281ee5 ("tracing: probeevent: Introduce new
> >>>> argument fetching code") dropped the $comm support from uprobe
> >>>> events, this re-enable it.
> >>
> >> this should read 're-enables'.
> >>
> >>>>
> >>>> For $comm support, use strncpy() instead of strncpy_from_user()
> >> ^
> >> we're using strlcpy(), not strncpy().
> >>
> >>>> to copy current task's comm. Because it is in the kernel space,
> >>>> strncpy_from_user() always fails to copy the comm.
> >>>> This also use strlen() instead of strlen_user() to measure the
> >> ^ ^
> >> 'uses', and the function should be 'strnlen_user()'.
> >>
> >>>> length of the comm.
> >>>>
> >>>> Signed-off-by: Masami Hiramatsu <[email protected]>
> >>>> Reported-by: Andreas Ziegler <[email protected]>
> >>>> ---
> >>>> kernel/trace/trace_uprobe.c | 13 +++++++++++--
> >>>> 1 file changed, 11 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> >>>> index e335576b9411..97d134e83e0f 100644
> >>>> --- a/kernel/trace/trace_uprobe.c
> >>>> +++ b/kernel/trace/trace_uprobe.c
> >>>> @@ -156,7 +156,10 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
> >>>> if (unlikely(!maxlen))
> >>>> return -ENOMEM;
> >>>>
> >>>> - ret = strncpy_from_user(dst, src, maxlen);
> >>>> + if (addr == (unsigned long)current->comm)
> >>>> + ret = strlcpy(dst, current->comm, maxlen);
> >>>> + else
> >>>> + ret = strncpy_from_user(dst, src, maxlen);
> >>>> if (ret >= 0) {
> >>>> if (ret == maxlen)
> >>>> dst[ret - 1] = '\0';
> >>>> @@ -173,7 +176,10 @@ fetch_store_strlen(unsigned long addr)
> >>>> int len;
> >>>> void __user *vaddr = (void __force __user *) addr;
> >>>>
> >>>> - len = strnlen_user(vaddr, MAX_STRING_SIZE);
> >>>> + if (addr == (unsigned long)current->comm)
> >>>> + len = strlen(current->comm);
> >>>
> >>> To balance with the strnlen_user, we must increse the len in this block.
> >>> (strlen doesn't count the final '\0', but strnlen_user counts it)
> >>>
> >>
> >> yes, we need to add a '+ 1' here.
> >>
> >> With the typos and this one fixed, this is
> >>
> >> Acked-by: Andreas Ziegler <[email protected]>
> >
> > Thank you for fixing typo and Ack :)
> >
> > Thanks you,
> >
> >>
> >>> Thank you,
> >>>
> >>>> + else
> >>>> + len = strnlen_user(vaddr, MAX_STRING_SIZE);
> >>>>
> >>>> return (len > MAX_STRING_SIZE) ? 0 : len;
> >>>> }
> >>>> @@ -213,6 +219,9 @@ process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
> >>>> case FETCH_OP_IMM:
> >>>> val = code->immediate;
> >>>> break;
> >>>> + case FETCH_OP_COMM:
> >>>> + val = (unsigned long)current->comm;
> >>>> + break;
> >>>> case FETCH_OP_FOFFS:
> >>>> val = translate_user_vaddr(code->immediate);
> >>>> break;
> >>>
> >>>
> >>
> >
> >
> as the original commit breaking $comm support was merged for v4.20
> (which is a stable kernel) and the wrong behaviour with multiple strings
> exists in all longterm/stable releases (tested back to v4.4), do you
> think this should be going into a stable release once it's merged?

Yes, clearly your patch should go to stable to fix the multiple string
support. Mine is only for v4.20.

Thank you!

>
> I added Greg as he might know the answer to that.
>
> Thanks,
>
> Andreas
>


--
Masami Hiramatsu <[email protected]>