Commit ad67b74d2469 ("printk: hash addresses printed with %p")
changed %p to hash pointers in order to avoid leaking
kernel addresses.
This breaks the tool perf probe.
To set a uprobe on a function named inet_pton in libc library,
obtain the address of the symbol inet_pton using command nm and
then use the following command to set the uprobe:
# echo "p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x142060"
> /sys/kernel/debug/tracing/uprobe_events
However the output shows a randomized address:
# cat /sys/kernel/debug/tracing/uprobe_events
p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x000000002d0f8952
The displayed address 0x000000002d0f8952 is incorrect and breaks
tools post processing it:
# linux/tools/perf/perf probe -l
Failed to find debug information for address 2d0f8952
probe_libc:inet_pton (on 0x2d0f8952 in /usr/lib64/libc-2.26.so)
Using the %px printk format string fixes this issue for root
and shows the correct address allowing the perf probe tool
to resolve the address to the symbol:
# echo "p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x142060"
> /sys/kernel/debug/tracing/uprobe_events
# cat /sys/kernel/debug/tracing/uprobe_events
p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x0000000000142060
# linux/tools/perf/perf probe -l
probe_libc:inet_pton (on __inet_pton@resolv/inet_pton.c
in /usr/lib64/libc-2.26.so)
Signed-off-by: Thomas Richter <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
---
kernel/trace/trace_uprobe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 40592e7b3568..268029ae1be6 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -608,7 +608,7 @@ static int probes_seq_show(struct seq_file *m, void *v)
/* Don't print "0x (null)" when offset is 0 */
if (tu->offset) {
- seq_printf(m, "0x%p", (void *)tu->offset);
+ seq_printf(m, "0x%px", (void *)tu->offset);
} else {
switch (sizeof(void *)) {
case 4:
--
2.13.4
On Mon, Dec 18, 2017 at 04:32:25PM +0100, Thomas Richter wrote:
> Commit ad67b74d2469 ("printk: hash addresses printed with %p")
> changed %p to hash pointers in order to avoid leaking
> kernel addresses.
>
> This breaks the tool perf probe.
I also said this likely ends up in a world readable file and probably
isn't right.
> To set a uprobe on a function named inet_pton in libc library,
> obtain the address of the symbol inet_pton using command nm and
> then use the following command to set the uprobe:
>
> # echo "p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x142060"
> > /sys/kernel/debug/tracing/uprobe_events
>
> However the output shows a randomized address:
> # cat /sys/kernel/debug/tracing/uprobe_events
> p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x000000002d0f8952
>
> The displayed address 0x000000002d0f8952 is incorrect and breaks
> tools post processing it:
>
> # linux/tools/perf/perf probe -l
> Failed to find debug information for address 2d0f8952
> probe_libc:inet_pton (on 0x2d0f8952 in /usr/lib64/libc-2.26.so)
>
> Using the %px printk format string fixes this issue for root
> and shows the correct address allowing the perf probe tool
> to resolve the address to the symbol:
>
> # echo "p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x142060"
> > /sys/kernel/debug/tracing/uprobe_events
> # cat /sys/kernel/debug/tracing/uprobe_events
> p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x0000000000142060
> # linux/tools/perf/perf probe -l
> probe_libc:inet_pton (on __inet_pton@resolv/inet_pton.c
> in /usr/lib64/libc-2.26.so)
>
> Signed-off-by: Thomas Richter <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>
> ---
> kernel/trace/trace_uprobe.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index 40592e7b3568..268029ae1be6 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -608,7 +608,7 @@ static int probes_seq_show(struct seq_file *m, void *v)
>
> /* Don't print "0x (null)" when offset is 0 */
> if (tu->offset) {
> - seq_printf(m, "0x%p", (void *)tu->offset);
> + seq_printf(m, "0x%px", (void *)tu->offset);
> } else {
> switch (sizeof(void *)) {
> case 4:
> --
> 2.13.4
>
On Mon, 18 Dec 2017 16:32:25 +0100
Thomas Richter <[email protected]> wrote:
> Commit ad67b74d2469 ("printk: hash addresses printed with %p")
> changed %p to hash pointers in order to avoid leaking
> kernel addresses.
>
> This breaks the tool perf probe.
>
> To set a uprobe on a function named inet_pton in libc library,
> obtain the address of the symbol inet_pton using command nm and
> then use the following command to set the uprobe:
>
> # echo "p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x142060"
> > /sys/kernel/debug/tracing/uprobe_events
>
> However the output shows a randomized address:
> # cat /sys/kernel/debug/tracing/uprobe_events
> p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x000000002d0f8952
>
> The displayed address 0x000000002d0f8952 is incorrect and breaks
> tools post processing it:
Hmm, since the scrambling of %p is to prevent kernel addresses from
leaking, I wonder if it would be OK to make it only scramble the address
if the address is a kernel address. It should be fine to print user
space addresses unscrambled. Shouldn't it?
-- Steve
>
> # linux/tools/perf/perf probe -l
> Failed to find debug information for address 2d0f8952
> probe_libc:inet_pton (on 0x2d0f8952 in /usr/lib64/libc-2.26.so)
>
> Using the %px printk format string fixes this issue for root
> and shows the correct address allowing the perf probe tool
> to resolve the address to the symbol:
>
> # echo "p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x142060"
> > /sys/kernel/debug/tracing/uprobe_events
> # cat /sys/kernel/debug/tracing/uprobe_events
> p:probe_libc/inet_pton /usr/lib64/libc-2.26.so:0x0000000000142060
> # linux/tools/perf/perf probe -l
> probe_libc:inet_pton (on __inet_pton@resolv/inet_pton.c
> in /usr/lib64/libc-2.26.so)
>
> Signed-off-by: Thomas Richter <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>
> ---
> kernel/trace/trace_uprobe.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index 40592e7b3568..268029ae1be6 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -608,7 +608,7 @@ static int probes_seq_show(struct seq_file *m, void *v)
>
> /* Don't print "0x (null)" when offset is 0 */
> if (tu->offset) {
> - seq_printf(m, "0x%p", (void *)tu->offset);
> + seq_printf(m, "0x%px", (void *)tu->offset);
> } else {
> switch (sizeof(void *)) {
> case 4:
> --
> 2.13.4
On Mon, Dec 18, 2017 at 9:59 AM, Steven Rostedt <[email protected]> wrote:
>
> Hmm, since the scrambling of %p is to prevent kernel addresses from
> leaking, I wonder if it would be OK to make it only scramble the address
> if the address is a kernel address. It should be fine to print user
> space addresses unscrambled. Shouldn't it?
So %p itself shouldn't have logic like that, because some of those
addresses can be sensitive even if they aren't strictly kernel
addresses.
For example, anything that prints out sensitive physical addresses
wouldn't look like a kernel virtual address, but it could still expose
very sensitive data.
So that check would have to be done by the user of %p, not by %p itself.
So generally, the fix for "oops %p hashing now breaks xyz" should
generally be to make sure that the protections are correct, and then
it can be turned into %px (when it can't just be removed).
In this particular case, we already have the magical case of "don't
print (null)", and I think _that_ the case that could be just extended
to say "don't print sensitive data" and then the %p can be changed
into a %px.
So one approach would be to simply checking (at _open_ time, not at IO
time! That was one of the things that I absolutely detested about %pK
- getting that fundamentally wrong) whether the opener could write a
kernel address to the file, and if the opener has those permissions,
then it obviously can read the values too.
But in this case I would suggest just making "uprobe_events" be 0600
rather than 0644. Why the hell should a normal user be able to read
whatever events that root has set?
Same goes for "uprobe_profile". Is it really sensible that anybody
can read that file? It sounds insane to me. Why should a random
regular user be able to see what probes are installed?
Honestly, very few of the "let's just change %p to %px" has been
correct. There is pretty much _always_ some permission problem or
other that says "hell no, that data should not have been so widely
available before".
The only time a pure %p -> %px is warranted is likely for oops reports
etc. We did have a couple of those.
Linus
On 12/18/2017 07:16 PM, Linus Torvalds wrote:
> On Mon, Dec 18, 2017 at 9:59 AM, Steven Rostedt <[email protected]> wrote:
>>
>> Hmm, since the scrambling of %p is to prevent kernel addresses from
>> leaking, I wonder if it would be OK to make it only scramble the address
>> if the address is a kernel address. It should be fine to print user
>> space addresses unscrambled. Shouldn't it?
>
> So %p itself shouldn't have logic like that, because some of those
> addresses can be sensitive even if they aren't strictly kernel
> addresses.
>
> For example, anything that prints out sensitive physical addresses
> wouldn't look like a kernel virtual address, but it could still expose
> very sensitive data.
>
> So that check would have to be done by the user of %p, not by %p itself.
>
> So generally, the fix for "oops %p hashing now breaks xyz" should
> generally be to make sure that the protections are correct, and then
> it can be turned into %px (when it can't just be removed).
>
> In this particular case, we already have the magical case of "don't
> print (null)", and I think _that_ the case that could be just extended
> to say "don't print sensitive data" and then the %p can be changed
> into a %px.
>
> So one approach would be to simply checking (at _open_ time, not at IO
> time! That was one of the things that I absolutely detested about %pK
> - getting that fundamentally wrong) whether the opener could write a
> kernel address to the file, and if the opener has those permissions,
> then it obviously can read the values too.
>
> But in this case I would suggest just making "uprobe_events" be 0600
> rather than 0644. Why the hell should a normal user be able to read
> whatever events that root has set?
>
> Same goes for "uprobe_profile". Is it really sensible that anybody
> can read that file? It sounds insane to me. Why should a random
> regular user be able to see what probes are installed?
>
> Honestly, very few of the "let's just change %p to %px" has been
> correct. There is pretty much _always_ some permission problem or
> other that says "hell no, that data should not have been so widely
> available before".
>
> The only time a pure %p -> %px is warranted is likely for oops reports
> etc. We did have a couple of those.
>
> Linus
>
As Steven Rostedt already pointed out, the 2 file systems debugfs and
tracefs are accessible for root only:
[root@s35lp76 ~]# mount | egrep 'debug|trace'
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime)
[root@s35lp76 ~]# ls -ld /sys/kernel/debug/ /sys/kernel/debug/tracing/
drwx------ 14 root root 0 Dec 18 11:34 /sys/kernel/debug/
drwx------ 7 root root 0 Dec 18 11:34 /sys/kernel/debug/tracing/
[root@s35lp76 ~]#
Surely the file permissions for uprobe_events and uprobe_profile
in directory /sys/kernel/debug/tracing
can be changed to be accessible by owner (root) only, but does this really
help?
I have looked at all the other files in this directory and almost
all of them have read permissions for group and others.
--
Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany
--
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294