2022-06-16 15:27:02

by Chuck Lever

[permalink] [raw]
Subject: possible trace_printk() bug in v5.19-rc1

Hi Steve-

Since v5.19-rc1, trace_printk() no longer displays a function
name, but rather a kernel address. I don't have time at the
moment to track this down to a specific commit, just parking
it here in case someone can get to it sooner than me.

<idle>-0 [003] 269.397568: bprint: 0xffffffffc0cccab7: nf=0xffff888172dbfba0 still on the LRU or a dispose list
<idle>-0 [003] 269.397576: bprint: 0xffffffffc0cccafa: nf=0xffff888172dbfba0
<idle>-0 [003] 269.397583: bprint: 0xffffffffc0cccab7: nf=0xffff8881726cd4d8 still on the LRU or a dispose list
<idle>-0 [003] 269.397584: bprint: 0xffffffffc0cccafa: nf=0xffff8881726cd4d8



--
Chuck Lever




2022-06-16 15:42:35

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1



> On Jun 16, 2022, at 11:34 AM, Steven Rostedt <[email protected]> wrote:
>
> On Thu, 16 Jun 2022 15:22:26 +0000
> Chuck Lever III <[email protected]> wrote:
>
>> Hi Steve-
>>
>> Since v5.19-rc1, trace_printk() no longer displays a function
>> name, but rather a kernel address. I don't have time at the
>> moment to track this down to a specific commit, just parking
>> it here in case someone can get to it sooner than me.
>>
>> <idle>-0 [003] 269.397568: bprint: 0xffffffffc0cccab7: nf=0xffff888172dbfba0 still on the LRU or a dispose list
>> <idle>-0 [003] 269.397576: bprint: 0xffffffffc0cccafa: nf=0xffff888172dbfba0
>> <idle>-0 [003] 269.397583: bprint: 0xffffffffc0cccab7: nf=0xffff8881726cd4d8 still on the LRU or a dispose list
>> <idle>-0 [003] 269.397584: bprint: 0xffffffffc0cccafa: nf=0xffff8881726cd4d8
>>
>
> It uses the normal vsprintf() in the kernel (that is, it prints the
> function address just like printk() does with %pS). So if trace_printk() is
> broken, then so is printk().
>
> But one reason for this is if you have a trace_printk() in a module, do the
> record, and then remove the module. The function name will also be removed
> from kallsyms, leaving you with only the function address.
>
> Did you remove any modules before displaying the trace?

I haven't explicitly removed nfsd.ko, and lsmod says it's still there.
And, trace_printk was working as expected on v5.18.


--
Chuck Lever



2022-06-16 15:59:17

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Thu, 16 Jun 2022 15:36:43 +0000
Chuck Lever III <[email protected]> wrote:

> I haven't explicitly removed nfsd.ko, and lsmod says it's still there.
> And, trace_printk was working as expected on v5.18.

If you were to add a printk() with the same format in the same location,
does it work?

That is:

printk("%pS still here\n", __func__);

-- Steve

2022-06-16 16:12:48

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Thu, 16 Jun 2022 15:22:26 +0000
Chuck Lever III <[email protected]> wrote:

> Hi Steve-
>
> Since v5.19-rc1, trace_printk() no longer displays a function
> name, but rather a kernel address. I don't have time at the
> moment to track this down to a specific commit, just parking
> it here in case someone can get to it sooner than me.
>
> <idle>-0 [003] 269.397568: bprint: 0xffffffffc0cccab7: nf=0xffff888172dbfba0 still on the LRU or a dispose list
> <idle>-0 [003] 269.397576: bprint: 0xffffffffc0cccafa: nf=0xffff888172dbfba0
> <idle>-0 [003] 269.397583: bprint: 0xffffffffc0cccab7: nf=0xffff8881726cd4d8 still on the LRU or a dispose list
> <idle>-0 [003] 269.397584: bprint: 0xffffffffc0cccafa: nf=0xffff8881726cd4d8
>

It uses the normal vsprintf() in the kernel (that is, it prints the
function address just like printk() does with %pS). So if trace_printk() is
broken, then so is printk().

But one reason for this is if you have a trace_printk() in a module, do the
record, and then remove the module. The function name will also be removed
from kallsyms, leaving you with only the function address.

Did you remove any modules before displaying the trace?

-- Steve

2022-06-16 18:24:18

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1



> On Jun 16, 2022, at 11:43 AM, Steven Rostedt <[email protected]> wrote:
>
> On Thu, 16 Jun 2022 15:36:43 +0000
> Chuck Lever III <[email protected]> wrote:
>
>> I haven't explicitly removed nfsd.ko, and lsmod says it's still there.
>> And, trace_printk was working as expected on v5.18.
>
> If you were to add a printk() with the same format in the same location,
> does it work?
>
> That is:
>
> printk("%pS still here\n", __func__);

Sorry for the delay; lunch was mandatory-to-implement.

This code:

pr_info("%pS\n", (void *)_THIS_IP_);

generates:

Jun 16 14:05:53 klimt.1015granger.net kernel: nfsd_file_slab_free+0xb6/0x10a [nfsd]

--
Chuck Lever



2022-06-17 20:02:10

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Thu, 16 Jun 2022 15:36:43 +0000
Chuck Lever III <[email protected]> wrote:

> > Did you remove any modules before displaying the trace?
>
> I haven't explicitly removed nfsd.ko, and lsmod says it's still there.
> And, trace_printk was working as expected on v5.18.

Are you sure?

I just checkout v5.19-rc2 and added the below patch. Then I did the
following:

# rmmod bridge

# trace-cmd start -p function -l :mod:bridge

# insmod bridge

# trace-cmd show
# tracer: function
#
# entries-in-buffer/entries-written: 23/23 #P:8
#
# _-----=> irqs-off/BH-disabled
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / _-=> migrate-disable
# |||| / delay
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION
# | | | ||||| | |
modprobe-2155 [005] ...1. 4732.051406: br_init <-do_one_initcall
modprobe-2155 [005] ..... 4732.051408: br_init: here in bridge
modprobe-2155 [005] ...1. 4732.051408: br_fdb_init <-br_init
modprobe-2155 [005] ...1. 4732.051418: br_device_event <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051418: br_device_event: this is an event
modprobe-2155 [005] ...1. 4732.051418: br_device_event <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051418: br_device_event: this is an event
modprobe-2155 [005] ...1. 4732.051418: br_device_event <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051419: br_device_event: this is an event
modprobe-2155 [005] ...1. 4732.051419: br_device_event <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051419: br_device_event: this is an event
modprobe-2155 [005] ...1. 4732.051419: br_device_event <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051419: br_device_event: this is an event
modprobe-2155 [005] ...1. 4732.051419: br_device_event <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051420: br_device_event: this is an event
modprobe-2155 [005] ...1. 4732.051420: br_device_event <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051420: br_device_event: this is an event
modprobe-2155 [005] ...1. 4732.051420: br_device_event <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051420: br_device_event: this is an event
modprobe-2155 [005] ...1. 4732.051420: br_device_event <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051420: br_device_event: this is an event
modprobe-2155 [005] ...1. 4732.051421: br_netlink_init <-br_init
modprobe-2155 [005] ...1. 4732.051421: br_mdb_init <-br_netlink_init

# rmmod bridge

# trace-cmd show
# tracer: function
#
# entries-in-buffer/entries-written: 56/56 #P:8
#
# _-----=> irqs-off/BH-disabled
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / _-=> migrate-disable
# |||| / delay
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION
# | | | ||||| | |
modprobe-2155 [005] ...1. 4732.051406: 0xffffffffc0cc7000 <-do_one_initcall
modprobe-2155 [005] ..... 4732.051408: 0xffffffffc0cc702b: here in bridge
modprobe-2155 [005] ...1. 4732.051408: 0xffffffffc0cc7101 <-0xffffffffc0cc7043
modprobe-2155 [005] ...1. 4732.051418: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051418: 0xffffffffc0d81345: this is an event
modprobe-2155 [005] ...1. 4732.051418: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051418: 0xffffffffc0d81345: this is an event
modprobe-2155 [005] ...1. 4732.051418: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051419: 0xffffffffc0d81345: this is an event
modprobe-2155 [005] ...1. 4732.051419: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051419: 0xffffffffc0d81345: this is an event
modprobe-2155 [005] ...1. 4732.051419: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051419: 0xffffffffc0d81345: this is an event
modprobe-2155 [005] ...1. 4732.051419: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051420: 0xffffffffc0d81345: this is an event
modprobe-2155 [005] ...1. 4732.051420: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051420: 0xffffffffc0d81345: this is an event
modprobe-2155 [005] ...1. 4732.051420: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051420: 0xffffffffc0d81345: this is an event
modprobe-2155 [005] ...1. 4732.051420: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
modprobe-2155 [005] ..... 4732.051420: 0xffffffffc0d81345: this is an event
modprobe-2155 [005] ...1. 4732.051421: 0xffffffffc0cc7132 <-0xffffffffc0cc70a3
modprobe-2155 [005] ...1. 4732.051421: 0xffffffffc0da1500 <-0xffffffffc0cc713e
kworker/u16:2-2023 [004] ...1. 4737.946576: 0xffffffffc0d81340 <-raw_notifier_call_chain
kworker/u16:2-2023 [004] ..... 4737.946578: 0xffffffffc0d81345: this is an event
kworker/u16:2-2023 [004] ...1. 4737.946634: 0xffffffffc0d8e1a0 <-if_nlmsg_size
kworker/u16:2-2023 [004] ...1. 4737.946735: 0xffffffffc0d81340 <-raw_notifier_call_chain
kworker/u16:2-2023 [004] ..... 4737.946735: 0xffffffffc0d81345: this is an event
kworker/u16:2-2023 [004] ...1. 4737.946833: 0xffffffffc0d81340 <-raw_notifier_call_chain
kworker/u16:2-2023 [004] ..... 4737.946833: 0xffffffffc0d81345: this is an event
kworker/u16:2-2023 [004] ...1. 4737.946834: 0xffffffffc0d8e1a0 <-if_nlmsg_size
kworker/u16:2-2023 [004] ...1. 4737.957648: 0xffffffffc0d81190 <-cleanup_net
rmmod-2206 [000] ...1. 5739.459376: 0xffffffffc0d8f880 <-0xffffffffc0da3d1d
rmmod-2206 [000] ...1. 5739.459378: 0xffffffffc0da1570 <-0xffffffffc0d8f88a
rmmod-2206 [000] ...1. 5739.471367: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471368: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.471369: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471369: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.471369: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471369: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.471370: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471370: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.471370: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471370: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.471370: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471370: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.471371: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471371: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.471371: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471371: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.471371: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471372: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.471372: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
rmmod-2206 [000] ..... 5739.471372: 0xffffffffc0d81345: this is an event
rmmod-2206 [000] ...1. 5739.477366: 0xffffffffc0d81190 <-unregister_pernet_operations
rmmod-2206 [000] ...1. 5739.477375: 0xffffffffc0d84910 <-__do_sys_delete_module.constprop.0

Before removing the module, the names were all present. Then after I
removed the module, they did not exist.

Other than that, I cannot reproduce the issue.

-- Steve


diff --git a/net/bridge/br.c b/net/bridge/br.c
index 96e91d69a9a8..4067760d3763 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -36,6 +36,7 @@ static int br_device_event(struct notifier_block *unused, unsigned long event, v
bool changed_addr;
int err;

+ trace_printk("this is an event\n");
if (netif_is_bridge_master(dev)) {
err = br_vlan_bridge_event(dev, event, ptr);
if (err)
@@ -385,6 +386,7 @@ static int __init br_init(void)
return err;
}

+ trace_printk("here in bridge\n");
err = br_fdb_init();
if (err)
goto err_out;

2022-06-18 01:36:43

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1



> On Jun 17, 2022, at 3:50 PM, Steven Rostedt <[email protected]> wrote:
>
> On Thu, 16 Jun 2022 15:36:43 +0000
> Chuck Lever III <[email protected]> wrote:
>
>>> Did you remove any modules before displaying the trace?
>>
>> I haven't explicitly removed nfsd.ko, and lsmod says it's still there.
>> And, trace_printk was working as expected on v5.18.
>
> Are you sure?

Yes, I'm sure. I just checked out v5.18 and built it with the
same CONFIG. trace_printk() on that kernel generates function
names as expected.


> I just checkout v5.19-rc2 and added the below patch. Then I did the
> following:
>
> # rmmod bridge
>
> # trace-cmd start -p function -l :mod:bridge
>
> # insmod bridge
>
> # trace-cmd show
> # tracer: function
> #
> # entries-in-buffer/entries-written: 23/23 #P:8
> #
> # _-----=> irqs-off/BH-disabled
> # / _----=> need-resched
> # | / _---=> hardirq/softirq
> # || / _--=> preempt-depth
> # ||| / _-=> migrate-disable
> # |||| / delay
> # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> # | | | ||||| | |
> modprobe-2155 [005] ...1. 4732.051406: br_init <-do_one_initcall
> modprobe-2155 [005] ..... 4732.051408: br_init: here in bridge
> modprobe-2155 [005] ...1. 4732.051408: br_fdb_init <-br_init
> modprobe-2155 [005] ...1. 4732.051418: br_device_event <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051418: br_device_event: this is an event
> modprobe-2155 [005] ...1. 4732.051418: br_device_event <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051418: br_device_event: this is an event
> modprobe-2155 [005] ...1. 4732.051418: br_device_event <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051419: br_device_event: this is an event
> modprobe-2155 [005] ...1. 4732.051419: br_device_event <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051419: br_device_event: this is an event
> modprobe-2155 [005] ...1. 4732.051419: br_device_event <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051419: br_device_event: this is an event
> modprobe-2155 [005] ...1. 4732.051419: br_device_event <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051420: br_device_event: this is an event
> modprobe-2155 [005] ...1. 4732.051420: br_device_event <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051420: br_device_event: this is an event
> modprobe-2155 [005] ...1. 4732.051420: br_device_event <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051420: br_device_event: this is an event
> modprobe-2155 [005] ...1. 4732.051420: br_device_event <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051420: br_device_event: this is an event
> modprobe-2155 [005] ...1. 4732.051421: br_netlink_init <-br_init
> modprobe-2155 [005] ...1. 4732.051421: br_mdb_init <-br_netlink_init
>
> # rmmod bridge
>
> # trace-cmd show
> # tracer: function
> #
> # entries-in-buffer/entries-written: 56/56 #P:8
> #
> # _-----=> irqs-off/BH-disabled
> # / _----=> need-resched
> # | / _---=> hardirq/softirq
> # || / _--=> preempt-depth
> # ||| / _-=> migrate-disable
> # |||| / delay
> # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> # | | | ||||| | |
> modprobe-2155 [005] ...1. 4732.051406: 0xffffffffc0cc7000 <-do_one_initcall
> modprobe-2155 [005] ..... 4732.051408: 0xffffffffc0cc702b: here in bridge
> modprobe-2155 [005] ...1. 4732.051408: 0xffffffffc0cc7101 <-0xffffffffc0cc7043
> modprobe-2155 [005] ...1. 4732.051418: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051418: 0xffffffffc0d81345: this is an event
> modprobe-2155 [005] ...1. 4732.051418: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051418: 0xffffffffc0d81345: this is an event
> modprobe-2155 [005] ...1. 4732.051418: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051419: 0xffffffffc0d81345: this is an event
> modprobe-2155 [005] ...1. 4732.051419: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051419: 0xffffffffc0d81345: this is an event
> modprobe-2155 [005] ...1. 4732.051419: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051419: 0xffffffffc0d81345: this is an event
> modprobe-2155 [005] ...1. 4732.051419: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051420: 0xffffffffc0d81345: this is an event
> modprobe-2155 [005] ...1. 4732.051420: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051420: 0xffffffffc0d81345: this is an event
> modprobe-2155 [005] ...1. 4732.051420: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051420: 0xffffffffc0d81345: this is an event
> modprobe-2155 [005] ...1. 4732.051420: 0xffffffffc0d81340 <-call_netdevice_register_net_notifiers
> modprobe-2155 [005] ..... 4732.051420: 0xffffffffc0d81345: this is an event
> modprobe-2155 [005] ...1. 4732.051421: 0xffffffffc0cc7132 <-0xffffffffc0cc70a3
> modprobe-2155 [005] ...1. 4732.051421: 0xffffffffc0da1500 <-0xffffffffc0cc713e
> kworker/u16:2-2023 [004] ...1. 4737.946576: 0xffffffffc0d81340 <-raw_notifier_call_chain
> kworker/u16:2-2023 [004] ..... 4737.946578: 0xffffffffc0d81345: this is an event
> kworker/u16:2-2023 [004] ...1. 4737.946634: 0xffffffffc0d8e1a0 <-if_nlmsg_size
> kworker/u16:2-2023 [004] ...1. 4737.946735: 0xffffffffc0d81340 <-raw_notifier_call_chain
> kworker/u16:2-2023 [004] ..... 4737.946735: 0xffffffffc0d81345: this is an event
> kworker/u16:2-2023 [004] ...1. 4737.946833: 0xffffffffc0d81340 <-raw_notifier_call_chain
> kworker/u16:2-2023 [004] ..... 4737.946833: 0xffffffffc0d81345: this is an event
> kworker/u16:2-2023 [004] ...1. 4737.946834: 0xffffffffc0d8e1a0 <-if_nlmsg_size
> kworker/u16:2-2023 [004] ...1. 4737.957648: 0xffffffffc0d81190 <-cleanup_net
> rmmod-2206 [000] ...1. 5739.459376: 0xffffffffc0d8f880 <-0xffffffffc0da3d1d
> rmmod-2206 [000] ...1. 5739.459378: 0xffffffffc0da1570 <-0xffffffffc0d8f88a
> rmmod-2206 [000] ...1. 5739.471367: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471368: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.471369: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471369: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.471369: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471369: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.471370: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471370: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.471370: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471370: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.471370: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471370: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.471371: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471371: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.471371: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471371: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.471371: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471372: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.471372: 0xffffffffc0d81340 <-call_netdevice_unregister_notifiers
> rmmod-2206 [000] ..... 5739.471372: 0xffffffffc0d81345: this is an event
> rmmod-2206 [000] ...1. 5739.477366: 0xffffffffc0d81190 <-unregister_pernet_operations
> rmmod-2206 [000] ...1. 5739.477375: 0xffffffffc0d84910 <-__do_sys_delete_module.constprop.0
>
> Before removing the module, the names were all present. Then after I
> removed the module, they did not exist.
>
> Other than that, I cannot reproduce the issue.
>
> -- Steve
>
>
> diff --git a/net/bridge/br.c b/net/bridge/br.c
> index 96e91d69a9a8..4067760d3763 100644
> --- a/net/bridge/br.c
> +++ b/net/bridge/br.c
> @@ -36,6 +36,7 @@ static int br_device_event(struct notifier_block *unused, unsigned long event, v
> bool changed_addr;
> int err;
>
> + trace_printk("this is an event\n");
> if (netif_is_bridge_master(dev)) {
> err = br_vlan_bridge_event(dev, event, ptr);
> if (err)
> @@ -385,6 +386,7 @@ static int __init br_init(void)
> return err;
> }
>
> + trace_printk("here in bridge\n");
> err = br_fdb_init();
> if (err)
> goto err_out;

--
Chuck Lever



2022-06-20 01:17:29

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

Hi Steve-

> On Jun 17, 2022, at 9:28 PM, Chuck Lever III <[email protected]> wrote:
>
>
>
>> On Jun 17, 2022, at 3:50 PM, Steven Rostedt <[email protected]> wrote:
>>
>> On Thu, 16 Jun 2022 15:36:43 +0000
>> Chuck Lever III <[email protected]> wrote:
>>
>>>> Did you remove any modules before displaying the trace?
>>>
>>> I haven't explicitly removed nfsd.ko, and lsmod says it's still there.
>>> And, trace_printk was working as expected on v5.18.
>>
>> Are you sure?
>
> Yes, I'm sure. I just checked out v5.18 and built it with the
> same CONFIG. trace_printk() on that kernel generates function
> names as expected.

I moved my development work to another system, and bisected. The
result:

91fb02f31505 ("module: Move kallsyms support into a separate file")


--
Chuck Lever



2022-06-24 18:30:14

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1



> On Jun 19, 2022, at 8:57 PM, Chuck Lever III <[email protected]> wrote:
>
> Hi Steve-
>
>> On Jun 17, 2022, at 9:28 PM, Chuck Lever III <[email protected]> wrote:
>>
>>
>>
>>> On Jun 17, 2022, at 3:50 PM, Steven Rostedt <[email protected]> wrote:
>>>
>>> On Thu, 16 Jun 2022 15:36:43 +0000
>>> Chuck Lever III <[email protected]> wrote:
>>>
>>>>> Did you remove any modules before displaying the trace?
>>>>
>>>> I haven't explicitly removed nfsd.ko, and lsmod says it's still there.
>>>> And, trace_printk was working as expected on v5.18.
>>>
>>> Are you sure?
>>
>> Yes, I'm sure. I just checked out v5.18 and built it with the
>> same CONFIG. trace_printk() on that kernel generates function
>> names as expected.
>
> I moved my development work to another system, and bisected. The
> result:
>
> 91fb02f31505 ("module: Move kallsyms support into a separate file")

The function_graph plug-in is kinda whack-ola too (v5.19-rc3):

kworker/u25:2-1372 [000] 3465.543709: funcgraph_entry: | ffffffffc0e8c390() {
kworker/u25:2-1372 [000] 3465.543712: funcgraph_entry: | ktime_get_coarse_with_offset() {
kworker/u25:2-1372 [000] 3465.543714: funcgraph_entry: | rcu_read_lock_sched_held() {
kworker/u25:2-1372 [000] 3465.543716: funcgraph_entry: 1.615 us | rcu_lockdep_current_cpu_online();
kworker/u25:2-1372 [000] 3465.543719: funcgraph_exit: 4.857 us | }
kworker/u25:2-1372 [000] 3465.543721: funcgraph_entry: | rcu_read_lock_sched_held() {
kworker/u25:2-1372 [000] 3465.543722: funcgraph_entry: 1.263 us | rcu_lockdep_current_cpu_online();
kworker/u25:2-1372 [000] 3465.543725: funcgraph_exit: 3.882 us | }
kworker/u25:2-1372 [000] 3465.543726: funcgraph_exit: + 14.095 us | }


--
Chuck Lever



2022-06-24 23:10:23

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Mon, 20 Jun 2022 00:57:23 +0000
Chuck Lever III <[email protected]> wrote:

> > Yes, I'm sure. I just checked out v5.18 and built it with the
> > same CONFIG. trace_printk() on that kernel generates function
> > names as expected.
>
> I moved my development work to another system, and bisected. The
> result:
>
> 91fb02f31505 ("module: Move kallsyms support into a separate file")

Hmm, right below that it says:

"No functional change."

I'm guessing it may not be as reliably reproducible.

And is this an issue with trace-cmd output or reading the trace file?

-- Steve

2022-06-25 17:46:42

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1



> On Jun 24, 2022, at 7:08 PM, Steven Rostedt <[email protected]> wrote:
>
> On Mon, 20 Jun 2022 00:57:23 +0000
> Chuck Lever III <[email protected]> wrote:
>
>>> Yes, I'm sure. I just checked out v5.18 and built it with the
>>> same CONFIG. trace_printk() on that kernel generates function
>>> names as expected.
>>
>> I moved my development work to another system, and bisected. The
>> result:
>>
>> 91fb02f31505 ("module: Move kallsyms support into a separate file")
>
> Hmm, right below that it says:
>
> "No functional change."
>
> I'm guessing it may not be as reliably reproducible.

So far I've seen this misbehavior 100% of the time on two different
systems running 5.19-rc. But, I agree, it's not known how broadly
reproducible this is on the cohort of systems now running 5.19-rc.


> And is this an issue with trace-cmd output or reading the trace file?

The trace_printk() call site in nfsd_file_slab_free() shows up like
this via "trace-cmd show -p":

ksoftirqd/2-28 [002] ..s.. 61.849252: nfsd_file_slab_free: nf=0000000007b6d60b
ksoftirqd/2-28 [002] ..s.. 61.849257: nfsd_file_slab_free: nf=000000004bf47e99
<idle>-0 [005] ..s1. 61.853212: nfsd_file_slab_free: nf=0000000029187f87
<idle>-0 [005] ..s1. 61.853223: nfsd_file_slab_free: nf=000000003cd1db63

So, it looks like the misbehavior appears only with "trace-cmd report".

[root@manet ~]# cat /etc/redhat-release
Fedora release 35 (Thirty Five)
[root@manet ~]# trace-cmd version

trace-cmd version 2.9.2 (not-a-git-repo)

usage:
trace-cmd [COMMAND] ...

commands:
record - record a trace into a trace.dat file
set - set a ftrace configuration parameter

...

check-events - parse trace event formats
dump - read out the meta data from a trace file

[root@manet ~]#

--
Chuck Lever



2022-06-25 18:02:37

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Sat, 25 Jun 2022 17:15:07 +0000
Chuck Lever III <[email protected]> wrote:

> [root@manet ~]# cat /etc/redhat-release
> Fedora release 35 (Thirty Five)
> [root@manet ~]# trace-cmd version
>
> trace-cmd version 2.9.2 (not-a-git-repo)

Ug, that's very old. Fedora should be shipping 3.1.1 soon.

Right John? ;-)

-- Steve


>
> usage:
> trace-cmd [COMMAND] ...
>
> commands:
> record - record a trace into a trace.dat file
> set - set a ftrace configuration parameter
>
> ...
>
> check-events - parse trace event formats
> dump - read out the meta data from a trace file
>
> [root@manet ~]#

2022-06-25 18:31:27

by John 'Warthog9' Hawley

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1


On 6/25/2022 10:45 AM, Steven Rostedt wrote:
> On Sat, 25 Jun 2022 17:15:07 +0000
> Chuck Lever III <[email protected]> wrote:
>
>> [root@manet ~]# cat /etc/redhat-release
>> Fedora release 35 (Thirty Five)
>> [root@manet ~]# trace-cmd version
>>
>> trace-cmd version 2.9.2 (not-a-git-repo)
>
> Ug, that's very old. Fedora should be shipping 3.1.1 soon.
>
> Right John? ;-)

I've got 3.0.2 in there right now (~3mo old) and I've started the builds
on the latest tags (REALLY need to automate this!), probably have latest
tags built/packaged by tonight.

https://copr.fedorainfracloud.org/coprs/warthog9/tracing/

- John "Warthog9" Hawley

2022-06-25 19:10:31

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1



> On Jun 25, 2022, at 1:45 PM, Steven Rostedt <[email protected]> wrote:
>
> On Sat, 25 Jun 2022 17:15:07 +0000
> Chuck Lever III <[email protected]> wrote:
>
>> [root@manet ~]# cat /etc/redhat-release
>> Fedora release 35 (Thirty Five)
>> [root@manet ~]# trace-cmd version
>>
>> trace-cmd version 2.9.2 (not-a-git-repo)
>
> Ug, that's very old. Fedora should be shipping 3.1.1 soon.

Right -- this version doesn't recognize get_sockaddr either.


--
Chuck Lever



2022-06-25 23:37:27

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Sat, 25 Jun 2022 18:50:30 +0000
Chuck Lever III <[email protected]> wrote:

> > On Jun 25, 2022, at 1:45 PM, Steven Rostedt <[email protected]> wrote:
> >
> > On Sat, 25 Jun 2022 17:15:07 +0000
> > Chuck Lever III <[email protected]> wrote:
> >
> >> [root@manet ~]# cat /etc/redhat-release
> >> Fedora release 35 (Thirty Five)
> >> [root@manet ~]# trace-cmd version
> >>
> >> trace-cmd version 2.9.2 (not-a-git-repo)
> >
> > Ug, that's very old. Fedora should be shipping 3.1.1 soon.
>
> Right -- this version doesn't recognize get_sockaddr either.
>

That would be libtraceevent that would do that. What version do you
have installed?

-- Steve

2022-06-26 03:13:27

by John 'Warthog9' Hawley

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On 6/25/22 11:28, John 'Warthog9' Hawley wrote:
>
> On 6/25/2022 10:45 AM, Steven Rostedt wrote:
>> On Sat, 25 Jun 2022 17:15:07 +0000
>> Chuck Lever III <[email protected]> wrote:
>>
>>> [root@manet ~]# cat /etc/redhat-release
>>> Fedora release 35 (Thirty Five)
>>> [root@manet ~]# trace-cmd version
>>>
>>> trace-cmd version 2.9.2 (not-a-git-repo)
>>
>> Ug, that's very old. Fedora should be shipping 3.1.1 soon.
>>
>> Right John? ;-)
>
> I've got 3.0.2 in there right now (~3mo old) and I've started the builds
> on the latest tags (REALLY need to automate this!), probably have latest
> tags built/packaged by tonight.
>
> https://copr.fedorainfracloud.org/coprs/warthog9/tracing/

As, vaguely, promised this morning trace-cmd, and all libraries needed
for everything but centos-9-stream x86_64 (it's having some fit about
not finding trace_seq_vprintf and trace_seq_printf, which is weird that
it finds it on every other build...)

https://copr.fedorainfracloud.org/coprs/warthog9/tracing/build/4564577/

so dnf copr enable warthog9/tracing and dnf update should snag 3.1.1

- John 'Warthog9' Hawley

2022-06-26 17:11:56

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1



> On Jun 25, 2022, at 7:01 PM, Steven Rostedt <[email protected]> wrote:
>
> On Sat, 25 Jun 2022 18:50:30 +0000
> Chuck Lever III <[email protected]> wrote:
>
>>> On Jun 25, 2022, at 1:45 PM, Steven Rostedt <[email protected]> wrote:
>>>
>>> On Sat, 25 Jun 2022 17:15:07 +0000
>>> Chuck Lever III <[email protected]> wrote:
>>>
>>>> [root@manet ~]# cat /etc/redhat-release
>>>> Fedora release 35 (Thirty Five)
>>>> [root@manet ~]# trace-cmd version
>>>>
>>>> trace-cmd version 2.9.2 (not-a-git-repo)
>>>
>>> Ug, that's very old. Fedora should be shipping 3.1.1 soon.
>>
>> Right -- this version doesn't recognize get_sockaddr either.
>>
>
> That would be libtraceevent that would do that. What version do you
> have installed?

[cel@manet lib]$ cd /usr/lib64/
[cel@manet lib64]$ ls libtrace*
libtracecmd.so.1 libtracecmd.so.1.0.0 libtraceevent.so.1 libtraceevent.so.1.2.1 libtracefs.so.1 libtracefs.so.1.1.1
[cel@manet lib64]$


--
Chuck Lever



2022-06-27 15:55:10

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Sun, 26 Jun 2022 17:09:31 +0000
Chuck Lever III <[email protected]> wrote:

> [cel@manet lib]$ cd /usr/lib64/
> [cel@manet lib64]$ ls libtrace*
> libtracecmd.so.1 libtracecmd.so.1.0.0 libtraceevent.so.1 libtraceevent.so.1.2.1 libtracefs.so.1 libtracefs.so.1.1.1
> [cel@manet lib64]$

According to https://trace-cmd.org

The latest libraries are:

libtraceevent 1.6.1
libtracefs 1.4.1

I'm guessing you may have a buggy version.

-- Steve

2022-06-27 16:06:35

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1



> On Jun 27, 2022, at 11:43 AM, Steven Rostedt <[email protected]> wrote:
>
> On Sun, 26 Jun 2022 17:09:31 +0000
> Chuck Lever III <[email protected]> wrote:
>
>> [cel@manet lib]$ cd /usr/lib64/
>> [cel@manet lib64]$ ls libtrace*
>> libtracecmd.so.1 libtracecmd.so.1.0.0 libtraceevent.so.1 libtraceevent.so.1.2.1 libtracefs.so.1 libtracefs.so.1.1.1
>> [cel@manet lib64]$
>
> According to https://trace-cmd.org
>
> The latest libraries are:
>
> libtraceevent 1.6.1
> libtracefs 1.4.1
>
> I'm guessing you may have a buggy version.

I checked with dnf. I have the latest versions provided in
Fedora 35, which is still supported.

I'd like to avoid having to maintain the latest version of
the tool and its libraries by hand on all of my development
systems.

--
Chuck Lever



2022-06-27 16:50:11

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Mon, 27 Jun 2022 15:51:04 +0000
Chuck Lever III <[email protected]> wrote:

> I'd like to avoid having to maintain the latest version of
> the tool and its libraries by hand on all of my development
> systems.

But can you just verify that the new ones fix the issue in one of your
builds?

I'm not a fedora maintainer, so I have no control over what they ship. But
if you can prove that the issues you have are resolved when upgrading, then
there's nothing more I can do. It's up to fedora to update the libraries.

If it's still broken in the latest versions, then I can do something. But
without verification, this is just a fedora bug, not an upstream one.

-- Steve

2022-06-27 17:22:28

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Mon, 27 Jun 2022 17:08:03 +0000
Chuck Lever III <[email protected]> wrote:

> Updated from your copr repo. The problem persists.

The symbol lookup problem still exists. What about the get_sockaddr() not
processing?

Would you be able to send me a trace.dat file that has this issue?

Thanks,

-- Steve

>
> [root@manet ~]# trace-cmd version
>
> trace-cmd version 3.1.1 (not-a-git-repo)
>
> usage:
> trace-cmd [COMMAND] ...
>
> commands:
> record - record a trace into a trace.dat file
> set - set a ftrace configuration parameter
>
> ...
>
> dump - read out the meta data from a trace file
> convert - convert trace file to different version
>
> [root@manet ~]# ls -l /usr/lib64/libtrace*
> lrwxrwxrwx 1 root root 20 Jul 22 2021 /usr/lib64/libtracecmd.so.1 -> libtracecmd.so.1.0.0
> -rwxr-xr-x 1 root root 142032 Jul 22 2021 /usr/lib64/libtracecmd.so.1.0.0
> lrwxrwxrwx 1 root root 22 Jun 25 14:28 /usr/lib64/libtraceevent.so.1 -> libtraceevent.so.1.6.1
> -rwxr-xr-x 1 root root 153944 Jun 25 14:29 /usr/lib64/libtraceevent.so.1.6.1
> lrwxrwxrwx 1 root root 19 Jun 25 17:53 /usr/lib64/libtracefs.so.1 -> libtracefs.so.1.4.1
> -rwxr-xr-x 1 root root 133384 Jun 25 17:53 /usr/lib64/libtracefs.so.1.4.1
> [root@manet ~]#

2022-06-27 17:30:38

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1



> On Jun 25, 2022, at 11:01 PM, John 'Warthog9' Hawley <[email protected]> wrote:
>
> On 6/25/22 11:28, John 'Warthog9' Hawley wrote:
>> On 6/25/2022 10:45 AM, Steven Rostedt wrote:
>>> On Sat, 25 Jun 2022 17:15:07 +0000
>>> Chuck Lever III <[email protected]> wrote:
>>>
>>>> [root@manet ~]# cat /etc/redhat-release
>>>> Fedora release 35 (Thirty Five)
>>>> [root@manet ~]# trace-cmd version
>>>>
>>>> trace-cmd version 2.9.2 (not-a-git-repo)
>>>
>>> Ug, that's very old. Fedora should be shipping 3.1.1 soon.
>>>
>>> Right John? ;-)
>> I've got 3.0.2 in there right now (~3mo old) and I've started the builds on the latest tags (REALLY need to automate this!), probably have latest tags built/packaged by tonight.
>> https://copr.fedorainfracloud.org/coprs/warthog9/tracing/
>
> As, vaguely, promised this morning trace-cmd, and all libraries needed for everything but centos-9-stream x86_64 (it's having some fit about not finding trace_seq_vprintf and trace_seq_printf, which is weird that it finds it on every other build...)
>
> https://copr.fedorainfracloud.org/coprs/warthog9/tracing/build/4564577/
>
> so dnf copr enable warthog9/tracing and dnf update should snag 3.1.1

Updated from your copr repo. The problem persists.

[root@manet ~]# trace-cmd version

trace-cmd version 3.1.1 (not-a-git-repo)

usage:
trace-cmd [COMMAND] ...

commands:
record - record a trace into a trace.dat file
set - set a ftrace configuration parameter

...

dump - read out the meta data from a trace file
convert - convert trace file to different version

[root@manet ~]# ls -l /usr/lib64/libtrace*
lrwxrwxrwx 1 root root 20 Jul 22 2021 /usr/lib64/libtracecmd.so.1 -> libtracecmd.so.1.0.0
-rwxr-xr-x 1 root root 142032 Jul 22 2021 /usr/lib64/libtracecmd.so.1.0.0
lrwxrwxrwx 1 root root 22 Jun 25 14:28 /usr/lib64/libtraceevent.so.1 -> libtraceevent.so.1.6.1
-rwxr-xr-x 1 root root 153944 Jun 25 14:29 /usr/lib64/libtraceevent.so.1.6.1
lrwxrwxrwx 1 root root 19 Jun 25 17:53 /usr/lib64/libtracefs.so.1 -> libtracefs.so.1.4.1
-rwxr-xr-x 1 root root 133384 Jun 25 17:53 /usr/lib64/libtracefs.so.1.4.1
[root@manet ~]#


--
Chuck Lever



2022-06-27 17:47:36

by Chuck Lever

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1


> On Jun 27, 2022, at 1:11 PM, Steven Rostedt <[email protected]> wrote:
>
> On Mon, 27 Jun 2022 17:08:03 +0000
> Chuck Lever III <[email protected]> wrote:
>
>> Updated from your copr repo. The problem persists.
>
> The symbol lookup problem still exists. What about the get_sockaddr() not
> processing?

"trace-cmd report" no longer produces the get_sockaddr warning messages,
but tracepoints that use __get_sockaddr() still FAIL TO PARSE on my
system:

nfsd-1167 [005] 117.853235: nfsd_cb_probe: [FAILED TO PARSE] state=0x1 cl_boot=1656349219 cl_id=3054917767 addr=ARRAY[02, 00, 00, 00, c0, a8, 02, 43, 00, 00, 00, 00, 00, 00, 00, 00]
kworker/u24:2-985 [003] 117.853368: nfsd_cb_setup: [FAILED TO PARSE] cl_boot=1656349219 cl_id=3054917767 authflavor=0x1 addr=ARRAY[02, 00, 00, 00, c0, a8, 02, 43, 00, 00, 00, 00, 00, 00, 00, 00] netid=rdma
kworker/u24:2-985 [003] 117.853370: nfsd_cb_state: [FAILED TO PARSE] state=0x0 cl_boot=1656349219 cl_id=3054917767 addr=ARRAY[02, 00, 00, 00, c0, a8, 02, 43, 00, 00, 00, 00, 00, 00, 00, 00]


> Would you be able to send me a trace.dat file that has this issue?

Sending under separate cover.


--
Chuck Lever


2022-06-27 19:37:10

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Mon, 27 Jun 2022 17:19:18 +0000
Chuck Lever III <[email protected]> wrote:

> > Would you be able to send me a trace.dat file that has this issue?
>
> Sending under separate cover.

I found this is the kallsyms output:

ffffffffac820000 b __brk_early_pgt_alloc
ffffffffac830000 B _end
ffffffffac830000 B __brk_limit
ffffffffc0948205 0944410 t qrtr_alloc_ctrl_packet [qrtr]
ffffffffc0944470 t qrtr_node_enqueue [qrtr]
ffffffffc094d100 b __key.5 [qrtr]
ffffffffc09448b0 t qrtr_reset_ports [qrtr]
ffffffffc094c1c0 d qrtr_ports [qrtr]
ffffffffc094d100 b __key.3 [qrtr]
ffffffffc094d100 b __key.4 [qrtr]

The line with:

ffffffffc0948205 0944410 t qrtr_alloc_ctrl_packet [qrtr]

Causes the parsing to stop, because it is not of a proper format. It has
that extra "0944410" in it, which will break the parsing.

Now the question is, why does that exist? And yes, that's a kernel bug.

The kallsyms code that outputs this file is:

static int s_show(struct seq_file *m, void *p)
{
void *value;
struct kallsym_iter *iter = m->private;

/* Some debugging symbols have no name. Ignore them. */
if (!iter->name[0])
return 0;

value = iter->show_value ? (void *)iter->value : NULL;

if (iter->module_name[0]) {
char type;

/*
* Label it "global" if it is exported,
* "local" if not exported.
*/
type = iter->exported ? toupper(iter->type) :
tolower(iter->type);
seq_printf(m, "%px %c %s\t[%s]\n", value,
type, iter->name, iter->module_name);
} else
seq_printf(m, "%px %c %s\n", value,
iter->type, iter->name);
return 0;
}

So how did it get that strange output.

Hmm, I bet it is because trace-cmd reads it in BUFSIZ blocks, and that the
seq_file code got confused when parsing between two elements, I bet
something got dropped. I'll go see if there was any seq_file updates that
could have caused this.

-- Steve

2022-06-28 04:33:33

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Fri, 24 Jun 2022 19:08:19 -0400
Steven Rostedt <[email protected]> wrote:

> On Mon, 20 Jun 2022 00:57:23 +0000
> Chuck Lever III <[email protected]> wrote:
>
> > > Yes, I'm sure. I just checked out v5.18 and built it with the
> > > same CONFIG. trace_printk() on that kernel generates function
> > > names as expected.
> >
> > I moved my development work to another system, and bisected. The
> > result:
> >
> > 91fb02f31505 ("module: Move kallsyms support into a separate file")
>
> Hmm, right below that it says:
>
> "No functional change."
>
> I'm guessing it may not be as reliably reproducible.

I take this back. I bisected it down to this commit as well.

I found another patch in that series that has a function change, so I do
not trust this. This most definitely has a functional change, as the output
of kallsyms changes with it. I'll report it against that patch submission.

-- Steve


>
> And is this an issue with trace-cmd output or reading the trace file?

2022-06-28 15:34:19

by Steven Rostedt

[permalink] [raw]
Subject: Re: possible trace_printk() bug in v5.19-rc1

On Mon, 27 Jun 2022 17:19:18 +0000
Chuck Lever III <[email protected]> wrote:

> > The symbol lookup problem still exists. What about the get_sockaddr() not
> > processing?
>
> "trace-cmd report" no longer produces the get_sockaddr warning messages,
> but tracepoints that use __get_sockaddr() still FAIL TO PARSE on my
> system:
>
> nfsd-1167 [005] 117.853235: nfsd_cb_probe: [FAILED TO PARSE] state=0x1 cl_boot=1656349219 cl_id=3054917767 addr=ARRAY[02, 00, 00, 00, c0, a8, 02, 43, 00, 00, 00, 00, 00, 00, 00, 00]
> kworker/u24:2-985 [003] 117.853368: nfsd_cb_setup: [FAILED TO PARSE] cl_boot=1656349219 cl_id=3054917767 authflavor=0x1 addr=ARRAY[02, 00, 00, 00, c0, a8, 02, 43, 00, 00, 00, 00, 00, 00, 00, 00] netid=rdma
> kworker/u24:2-985 [003] 117.853370: nfsd_cb_state: [FAILED TO PARSE] state=0x0 cl_boot=1656349219 cl_id=3054917767 addr=ARRAY[02, 00, 00, 00, c0, a8, 02, 43, 00, 00, 00, 00, 00, 00, 00, 00]
>

I remember writing a patch to fix this, but it looks to never have been
applied. And worse yet, I can't find it :-/

Oh well, I guess I just need to write it again.

-- Steve