2022-10-31 07:39:45

by Takashi Iwai

Subject: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing

Hi Steven,

we've got a bug report indicating a NULL dereference in the recent
tracing changes, showing up at the start of KDE. The details, including
the dmesg, can be found at:
https://bugzilla.opensuse.org/show_bug.cgi?id=1204705

It was first reported for 6.0.3, and the problem is confirmed to
persist with 6.1-rc, too.

The culprit seems to be the commit
f3ddb74ad0790030c9592229fb14d8c451f4e9a8
tracing: Wake up ring buffer waiters on closing of the file
and reverting it seems to fix the problem.

Could you take a look?


thanks,

Takashi


2022-10-31 08:51:19

by Steven Noonan

Subject: Re: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing

I hit this same NULL pointer dereference on one of my systems, trying
to boot with either 6.0.6 or 5.15.76. Unfortunately it happens fairly
early in boot during network initialization, which makes it
frustrating to recover from.


On Mon, Oct 31, 2022 at 12:24 AM Takashi Iwai <[email protected]> wrote:
>
> Hi Steven,
>
> we've got a bug report indicating a NULL dereference in the recent
> tracing changes, showing up at the start of KDE. The details, including
> the dmesg, can be found at:
> https://bugzilla.opensuse.org/show_bug.cgi?id=1204705
>
> It was first reported for 6.0.3, and the problem is confirmed to
> persist with 6.1-rc, too.
>
> The culprit seems to be the commit
> f3ddb74ad0790030c9592229fb14d8c451f4e9a8
> tracing: Wake up ring buffer waiters on closing of the file
> and reverting it seems to fix the problem.
>
> Could you take a look?
>
>
> thanks,
>
> Takashi

2022-10-31 10:23:30

by Takashi Iwai

Subject: Re: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing

On Mon, 31 Oct 2022 10:48:37 +0100,
Steven Rostedt wrote:
>
> On Mon, 31 Oct 2022 08:11:28 +0100
> Takashi Iwai <[email protected]> wrote:
>
> > Hi Steven,
> >
> > we've got a bug report indicating a NULL dereference in the recent
> > tracing changes, showing up at the start of KDE. The details, including
> > the dmesg, can be found at:
> > https://bugzilla.opensuse.org/show_bug.cgi?id=1204705
> >
> > It was first reported for 6.0.3, and the problem is confirmed to
> > persist with 6.1-rc, too.
> >
> > The culprit seems to be the commit
> > f3ddb74ad0790030c9592229fb14d8c451f4e9a8
> > tracing: Wake up ring buffer waiters on closing of the file
> > and reverting it seems to fix the problem.
> >
> > Could you take a look?
> >
>
> Thanks for the report. Can you send me your .config?

Here they are, for 6.0.x and 6.1-rc3.


thanks,

Takashi


Attachments:
config-6.0.5 (269.33 kB)
config-6.1-rc3 (270.98 kB)

2022-10-31 10:25:57

by Steven Rostedt

Subject: Re: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing

On Mon, 31 Oct 2022 08:11:28 +0100
Takashi Iwai <[email protected]> wrote:

> Hi Steven,
>
> we've got a bug report indicating a NULL dereference in the recent
> tracing changes, showing up at the start of KDE. The details, including
> the dmesg, can be found at:
> https://bugzilla.opensuse.org/show_bug.cgi?id=1204705
>
> It was first reported for 6.0.3, and the problem is confirmed to
> persist with 6.1-rc, too.
>
> The culprit seems to be the commit
> f3ddb74ad0790030c9592229fb14d8c451f4e9a8
> tracing: Wake up ring buffer waiters on closing of the file
> and reverting it seems to fix the problem.
>
> Could you take a look?
>

Thanks for the report. Can you send me your .config?

Thanks!

-- Steve

2022-10-31 19:02:26

by Steven Rostedt

Subject: Re: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing

On Mon, 31 Oct 2022 08:11:28 +0100
Takashi Iwai <[email protected]> wrote:

> Hi Steven,
>
> we've got a bug report indicating a NULL dereference in the recent
> tracing changes, showing up at the start of KDE. The details, including
> the dmesg, can be found at:
> https://bugzilla.opensuse.org/show_bug.cgi?id=1204705
>
> It was first reported for 6.0.3, and the problem is confirmed to
> persist with 6.1-rc, too.
>
> The culprit seems to be the commit
> f3ddb74ad0790030c9592229fb14d8c451f4e9a8
> tracing: Wake up ring buffer waiters on closing of the file
> and reverting it seems to fix the problem.
>
> Could you take a look?
>
>

Can you apply this to see if it fixes it?

I'm guessing there's a path to the release of the file descriptor where
the ring buffer isn't allocated (and this expected it to be).

I'll investigate further to see if I can find that path.

-- Steve

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 199759c73519..c1c7ce4c6ddb 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -937,6 +937,9 @@ void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu)
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct rb_irq_work *rbwork;
 
+	if (!buffer)
+		return;
+
 	if (cpu == RING_BUFFER_ALL_CPUS) {
 
 		/* Wake up individual ones too. One level recursion */

2022-11-01 09:03:04

by Takashi Iwai

Subject: Re: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing

On Mon, 31 Oct 2022 19:48:50 +0100,
Steven Rostedt wrote:
>
> On Mon, 31 Oct 2022 08:11:28 +0100
> Takashi Iwai <[email protected]> wrote:
>
> > Hi Steven,
> >
> > we've got a bug report indicating a NULL dereference in the recent
> > tracing changes, showing up at the start of KDE. The details, including
> > the dmesg, can be found at:
> > https://bugzilla.opensuse.org/show_bug.cgi?id=1204705
> >
> > It was first reported for 6.0.3, and the problem is confirmed to
> > persist with 6.1-rc, too.
> >
> > The culprit seems to be the commit
> > f3ddb74ad0790030c9592229fb14d8c451f4e9a8
> > tracing: Wake up ring buffer waiters on closing of the file
> > and reverting it seems to fix the problem.
> >
> > Could you take a look?
> >
> >
>
> Can you apply this to see if it fixes it?
>
> I'm guessing there's a path to the release of the file descriptor where
> the ring buffer isn't allocated (and this expected it to be).
>
> I'll investigate further to see if I can find that path.

To avoid confusion: the follow-up post in this thread
https://lore.kernel.org/[email protected]
is from Alex, who is the original bug reporter on openSUSE Bugzilla.

The test result looks negative, unfortunately.


Takashi

>
> -- Steve
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 199759c73519..c1c7ce4c6ddb 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -937,6 +937,9 @@ void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu)
>  	struct ring_buffer_per_cpu *cpu_buffer;
>  	struct rb_irq_work *rbwork;
>
> +	if (!buffer)
> +		return;
> +
>  	if (cpu == RING_BUFFER_ALL_CPUS) {
>
>  		/* Wake up individual ones too. One level recursion */
>

2022-11-02 16:33:05

by Steven Rostedt

Subject: Re: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing

On Wed, 2 Nov 2022 15:57:56 +0000
[email protected] wrote:

> Hello everyone,
>
> I have added lots of debug printks to see what's happening, and I found
> that the "cpu" counter, which is used to index the buffer's array
> elements (cpu_buffer = buffer->buffers[cpu]) in the
> ring_buffer_wake_waiters function, exceeds the total number of cores,
> namely 24 in my case, which means it should only run from 0..23.
> However, upon debugging, it runs up to 31, thus causing a NULL pointer
> dereference (&cpu_buffer->irq_work).
>
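
For illustration only, here is a minimal userspace sketch of the failure
mode described in the quoted report: a per-CPU pointer array with more
slots than allocated buffers, and a wake-up routine that skips empty or
out-of-range slots instead of dereferencing them. All names here
(NR_POSSIBLE_CPUS, ALL_CPUS, init_buffer, wake_waiters, the wakeups
counter) are hypothetical stand-ins rather than kernel code, and this is
not necessarily what the linked patch does.

```c
#include <assert.h>
#include <stddef.h>

#define NR_POSSIBLE_CPUS 32  /* hypothetical: slots the wake-up loop walks */
#define ALL_CPUS (-1)        /* hypothetical stand-in for RING_BUFFER_ALL_CPUS */

struct cpu_buffer {
	int wakeups;         /* stand-in for the per-CPU irq_work */
};

struct trace_buffer {
	int cpus;                                      /* slots in buffers[] */
	struct cpu_buffer *buffers[NR_POSSIBLE_CPUS];  /* NULL if never allocated */
};

/* Attach buffers for CPUs 0..nr_present-1 only; the other slots stay NULL. */
static void init_buffer(struct trace_buffer *b, struct cpu_buffer *storage,
			int nr_present)
{
	assert(nr_present >= 0 && nr_present <= NR_POSSIBLE_CPUS);
	b->cpus = NR_POSSIBLE_CPUS;
	for (int cpu = 0; cpu < NR_POSSIBLE_CPUS; cpu++) {
		if (cpu < nr_present) {
			storage[cpu].wakeups = 0;
			b->buffers[cpu] = &storage[cpu];
		} else {
			b->buffers[cpu] = NULL;
		}
	}
}

/* Returns how many per-CPU buffers were actually woken. */
static int wake_waiters(struct trace_buffer *b, int cpu)
{
	struct cpu_buffer *cb;

	if (!b)
		return 0;

	if (cpu == ALL_CPUS) {
		int woken = 0;

		/* One level of recursion: visit every slot individually. */
		for (int i = 0; i < b->cpus; i++)
			woken += wake_waiters(b, i);
		return woken;
	}

	/* Reject indices outside the array. */
	if (cpu < 0 || cpu >= b->cpus)
		return 0;

	/* A slot in range may still be empty: 24 buffers, 32 slots. */
	cb = b->buffers[cpu];
	if (!cb)
		return 0;

	cb->wakeups++;
	return 1;
}
```

With 24 buffers attached via init_buffer(), wake_waiters(&b, 31) returns
0 rather than touching a NULL slot, and wake_waiters(&b, ALL_CPUS)
reports 24 buffers woken.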

Could you add this patch?

https://lore.kernel.org/all/[email protected]/

Thanks,

-- Steve


2022-11-02 17:12:10

by postix

Subject: Re: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing

On 02.11.22 17:03, Steven Rostedt wrote:
> Could you add this patch?
>
> https://lore.kernel.org/all/[email protected]/


Thanks, this patch fixes the issue for me! Please see the final dmesg
output [1].

[1] https://paste.opensuse.org/e8d4fa46


All the best

--AD

2022-11-02 17:29:30

by Steven Rostedt

Subject: Re: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing

On Wed, 2 Nov 2022 16:36:29 +0000
[email protected] wrote:

> On 02.11.22 17:03, Steven Rostedt wrote:
> > Could you add this patch?
> >
> > https://lore.kernel.org/all/[email protected]/
>
>
> Thanks, this patch fixes the issue for me! Please see the final dmesg
> output [1].
>
> [1] https://paste.opensuse.org/e8d4fa46
>

Yes, that's known too. rasdaemon needs to be updated to use the
libtracefs library, which should fix all this.

-- Steve

2022-11-03 13:08:02

by Thorsten Leemhuis

Subject: Re: [REGRESSION 6.0.x / 6.1.x] NULL dereferencing at tracing #forregzbot

[Note: this mail is primarily sent for documentation purposes and/or for
regzbot, my Linux kernel regression tracking bot. That's why I removed
most or all folks from the list of recipients, but left any that looked
like mailing lists. These mails usually contain '#forregzbot' in the
subject, to make them easy to spot and filter out.]

On 31.10.22 08:11, Takashi Iwai wrote:
>
> we've got a bug report indicating a NULL dereference in the recent
> tracing changes, showing up at the start of KDE. The details, including
> the dmesg, can be found at:
> https://bugzilla.opensuse.org/show_bug.cgi?id=1204705
>
> It was first reported for 6.0.3, and the problem is confirmed to
> persist with 6.1-rc, too.
>
> The culprit seems to be the commit
> f3ddb74ad0790030c9592229fb14d8c451f4e9a8
> tracing: Wake up ring buffer waiters on closing of the file
> and reverting it seems to fix the problem.
>
> Could you take a look?

Just adding this to the tracking:

#regzbot introduced f3ddb74ad07 ^
https://bugzilla.opensuse.org/show_bug.cgi?id=1204705
#regzbot title NULL dereferencing at tracing
#regzbot ignore-activity
#regzbot monitor:
https://lore.kernel.org/all/[email protected]/

Ciao, Thorsten