2022-04-20 00:44:19

by Mukesh Ojha

Subject: [PATCH] devcoredump : Serialize devcd_del work

In the following scenario (see diagram), thread X running dev_coredumpm() adds
a devcd device to the framework, which sends a uevent notification to
userspace, while another thread Y reads this uevent and calls
devcd_data_write(), which eventually tries to delete a delayed-work timer
that has not been initialized or queued yet.

As a result, debugobjects reports a warning. Meanwhile the timer is
initialized and queued from the X path, then reinitialized again from the Y
path, leaving timer->entry.pprev = NULL, so try_to_grab_pending() gets stuck.

To fix this, introduce a mutex to serialize the two paths.

cpu0(X)                                cpu1(Y)

dev_coredumpm()
  device_add() =====================> uevent sent to userspace;
                                      userspace process Y reads the
                                      uevent and writes to the devcd
                                      data file, which results in a
                                      call to

                                      devcd_data_write()
                                        mod_delayed_work()
                                          try_to_grab_pending()
                                            del_timer()
                                              debug_assert_init()
INIT_DELAYED_WORK()
schedule_delayed_work()
                                                debug_object_fixup()
                                                  timer_fixup_assert_init()
                                                    timer_setup()
                                                      do_init_timer()
                                                        ==> reinitializes the
                                                            timer so that
                                                            timer->entry.pprev == NULL
                                              timer_pending()
                                                !hlist_unhashed_lockless(&timer->entry)
                                                  !h->pprev ==> del_timer() finds
                                                                pprev to be NULL and
                                                                try_to_grab_pending()
                                                                gets stuck

Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/base/devcoredump.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index f4d794d..316f566 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -25,6 +25,7 @@ struct devcd_entry {
 	struct device devcd_dev;
 	void *data;
 	size_t datalen;
+	struct mutex mutex;
 	struct module *owner;
 	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
 			void *data, size_t datalen);
@@ -84,7 +85,9 @@ static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
 	struct device *dev = kobj_to_dev(kobj);
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
+	mutex_lock(&devcd->mutex);
 	mod_delayed_work(system_wq, &devcd->del_wk, 0);
+	mutex_unlock(&devcd->mutex);
 
 	return count;
 }
@@ -112,7 +115,9 @@ static int devcd_free(struct device *dev, void *data)
 {
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
+	mutex_lock(&devcd->mutex);
 	flush_delayed_work(&devcd->del_wk);
+	mutex_unlock(&devcd->mutex);
 	return 0;
 }
 
@@ -278,13 +283,14 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 	devcd->read = read;
 	devcd->free = free;
 	devcd->failing_dev = get_device(dev);
-
+	mutex_init(&devcd->mutex);
 	device_initialize(&devcd->devcd_dev);
 
 	dev_set_name(&devcd->devcd_dev, "devcd%d",
 		     atomic_inc_return(&devcd_count));
 	devcd->devcd_dev.class = &devcd_class;
 
+	mutex_lock(&devcd->mutex);
 	if (device_add(&devcd->devcd_dev))
 		goto put_device;
 
@@ -301,10 +307,11 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 
 	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
 	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
-
+	mutex_unlock(&devcd->mutex);
 	return;
  put_device:
 	put_device(&devcd->devcd_dev);
+	mutex_unlock(&devcd->mutex);
  put_module:
 	module_put(owner);
  free:
--
2.7.4


2022-04-22 17:26:22

by Johannes Berg

Subject: Re: [PATCH] devcoredump : Serialize devcd_del work

On Tue, 2022-04-19 at 15:57 +0530, Mukesh Ojha wrote:
> In the following scenario (see diagram), thread X running dev_coredumpm() adds
> a devcd device to the framework, which sends a uevent notification to
> userspace, while another thread Y reads this uevent and calls
> devcd_data_write(), which eventually tries to delete a delayed-work timer
> that has not been initialized or queued yet.
>
> As a result, debugobjects reports a warning. Meanwhile the timer is
> initialized and queued from the X path, then reinitialized again from the Y
> path, leaving timer->entry.pprev = NULL, so try_to_grab_pending() gets stuck.
>
> To fix this, introduce a mutex to serialize the two paths.
>
> cpu0(X)                                cpu1(Y)
>
> dev_coredumpm()
>   device_add() =====================> uevent sent to userspace;
>                                       userspace process Y reads the
>                                       uevent and writes to the devcd
>                                       data file, which results in a
>                                       call to
>
>                                       devcd_data_write()
>                                         mod_delayed_work()
>                                           try_to_grab_pending()
>                                             del_timer()
>                                               debug_assert_init()
> INIT_DELAYED_WORK()
> schedule_delayed_work()
>

Wouldn't it be easier to simply schedule this before adding it to sysfs
and sending the uevent?
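
Concretely, the reordering suggested here might look like the following in
dev_coredumpm() (a sketch only, not a tested patch; the cleanup on the
failure path is an assumption on my part):

	/*
	 * Arm the deletion work before the device, and therefore the
	 * uevent, becomes visible to userspace, so devcd_data_write()
	 * can never observe an uninitialized timer.
	 */
	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);

	if (device_add(&devcd->devcd_dev)) {
		/* assumed cleanup: disarm the work we just scheduled */
		cancel_delayed_work_sync(&devcd->del_wk);
		goto put_device;
	}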

johannes

2022-04-22 18:44:41

by Thomas Gleixner

Subject: Re: [PATCH] devcoredump : Serialize devcd_del work

On Fri, Apr 22 2022 at 15:53, Johannes Berg wrote:
> On Fri, 2022-04-22 at 15:41 +0200, Johannes Berg wrote:
>> On Tue, 2022-04-19 at 15:57 +0530, Mukesh Ojha wrote:
>> > INIT_DELAYED_WORK
>> > schedule_delayed_work
>> >
>>
>> Wouldn't it be easier to simply schedule this before adding it to sysfs
>> and sending the uevent?

Only if it's guaranteed that the timer will not expire before
device_add() succeeds. It's likely, but there is virt....
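
For context, devcd_del() in drivers/base/devcoredump.c reads roughly as
follows (reproduced from the then-current tree; treat as approximate), which
is why a timer firing before device_add() completes would end up calling
device_del() on a device that was never added:

static void devcd_del(struct work_struct *wk)
{
	struct devcd_entry *devcd;

	devcd = container_of(wk, struct devcd_entry, del_wk.work);

	/* tears down the sysfs entry created by device_add() */
	device_del(&devcd->devcd_dev);
	put_device(&devcd->devcd_dev);
}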

> Hm. I think that would solve this problem, but not all of the problems
> here ...
>
> Even with your change, I believe it's still racy wrt. disabled_store(),
> since that flushes the work but devcd_data_write() remains reachable
> (and might in fact be waiting for the mutex after your change), so I
> think we need an additional flag somewhere (in addition to the mutex) to
> serialize all of these things against each other.

Plus there is disabled_store() which iterates over the devices and
reschedules the work. How is that supposed to be protected against a
concurrent devcd_del()?

Why does disabled_store() need to do that at all?
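
For reference, disabled_store() at the time reads approximately as follows
(paraphrased from the then-current tree; the write-once comment is from the
source):

static ssize_t disabled_store(struct class *class, struct class_attribute *attr,
			      const char *buf, size_t count)
{
	long tmp = simple_strtol(buf, NULL, 10);

	/*
	 * This essentially makes the attribute write-once, because you can't
	 * go back to not having it disabled. This is intentional, it serves
	 * as a system lockdown feature.
	 */
	if (tmp != 1)
		return -EINVAL;

	devcd_disabled = true;

	/* runs devcd_free(), and thus flush_delayed_work(), on every devcd */
	class_for_each_device(&devcd_class, NULL, NULL, devcd_free);

	return count;
}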

Thanks,

tglx

2022-04-22 19:24:20

by Greg Kroah-Hartman

Subject: Re: [PATCH] devcoredump : Serialize devcd_del work

On Fri, Apr 22, 2022 at 02:13:47PM +0200, Greg KH wrote:
>
> A: http://en.wikipedia.org/wiki/Top_post
> Q: Where do I find info about this thing called top-posting?
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
>
> A: No.
> Q: Should I include quotations after my reply?
>
> http://daringfireball.net/2007/07/on_top
>
> On Fri, Apr 22, 2022 at 03:33:08PM +0530, Mukesh Ojha wrote:
> >
> > Hi All,
> >
> > Requesting your review comments on the fix for the described problem.
> >
> > -Mukesh
> >
> >
> > On 4/19/2022 3:57 PM, Mukesh Ojha wrote:
>
> You sent this 3 days ago, please be realistic.
>
> If you are concerned about the delay in patch reviews, please help us
> out and review patches sent to the list. Otherwise, don't start to
> worry until at least after 2 weeks.

You also didn't even cc: me, so how could I see this at all?

So, nothing to review in my inbox, nice!

{sigh}

2022-04-22 19:50:17

by Greg Kroah-Hartman

Subject: Re: [PATCH] devcoredump : Serialize devcd_del work


A: http://en.wikipedia.org/wiki/Top_post
Q: Where do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Fri, Apr 22, 2022 at 03:33:08PM +0530, Mukesh Ojha wrote:
>
> Hi All,
>
> Requesting your review comments on the fix for the described problem.
>
> -Mukesh
>
>
> On 4/19/2022 3:57 PM, Mukesh Ojha wrote:

You sent this 3 days ago, please be realistic.

If you are concerned about the delay in patch reviews, please help us
out and review patches sent to the list. Otherwise, don't start to
worry until at least after 2 weeks.

thanks,

greg k-h

2022-04-22 22:25:12

by Mukesh Ojha

Subject: Re: [PATCH] devcoredump : Serialize devcd_del work


Hi All,

Requesting your review comments on the fix for the described problem.

-Mukesh


On 4/19/2022 3:57 PM, Mukesh Ojha wrote:
> In the following scenario (see diagram), thread X running dev_coredumpm() adds
> a devcd device to the framework, which sends a uevent notification to
> userspace, while another thread Y reads this uevent and calls
> devcd_data_write(), which eventually tries to delete a delayed-work timer
> that has not been initialized or queued yet.
>
> As a result, debugobjects reports a warning. Meanwhile the timer is
> initialized and queued from the X path, then reinitialized again from the Y
> path, leaving timer->entry.pprev = NULL, so try_to_grab_pending() gets stuck.
>
> To fix this, introduce a mutex to serialize the two paths.
>
> cpu0(X)                                cpu1(Y)
>
> dev_coredumpm()
>   device_add() =====================> uevent sent to userspace;
>                                       userspace process Y reads the
>                                       uevent and writes to the devcd
>                                       data file, which results in a
>                                       call to
>
>                                       devcd_data_write()
>                                         mod_delayed_work()
>                                           try_to_grab_pending()
>                                             del_timer()
>                                               debug_assert_init()
> INIT_DELAYED_WORK()
> schedule_delayed_work()
>                                                 debug_object_fixup()
>                                                   timer_fixup_assert_init()
>                                                     timer_setup()
>                                                       do_init_timer()
>                                                         ==> reinitializes the
>                                                             timer so that
>                                                             timer->entry.pprev == NULL
>                                               timer_pending()
>                                                 !hlist_unhashed_lockless(&timer->entry)
>                                                   !h->pprev ==> del_timer() finds
>                                                                 pprev to be NULL and
>                                                                 try_to_grab_pending()
>                                                                 gets stuck
>
> Link: https://lore.kernel.org/lkml/[email protected]/
> Signed-off-by: Mukesh Ojha <[email protected]>
> ---
> drivers/base/devcoredump.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
> index f4d794d..316f566 100644
> --- a/drivers/base/devcoredump.c
> +++ b/drivers/base/devcoredump.c
> @@ -25,6 +25,7 @@ struct devcd_entry {
>  	struct device devcd_dev;
>  	void *data;
>  	size_t datalen;
> +	struct mutex mutex;
>  	struct module *owner;
>  	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
>  			void *data, size_t datalen);
> @@ -84,7 +85,9 @@ static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
>  	struct device *dev = kobj_to_dev(kobj);
>  	struct devcd_entry *devcd = dev_to_devcd(dev);
>  
> +	mutex_lock(&devcd->mutex);
>  	mod_delayed_work(system_wq, &devcd->del_wk, 0);
> +	mutex_unlock(&devcd->mutex);
>  
>  	return count;
>  }
> @@ -112,7 +115,9 @@ static int devcd_free(struct device *dev, void *data)
>  {
>  	struct devcd_entry *devcd = dev_to_devcd(dev);
>  
> +	mutex_lock(&devcd->mutex);
>  	flush_delayed_work(&devcd->del_wk);
> +	mutex_unlock(&devcd->mutex);
>  	return 0;
>  }
>  
> @@ -278,13 +283,14 @@ void dev_coredumpm(struct device *dev, struct module *owner,
>  	devcd->read = read;
>  	devcd->free = free;
>  	devcd->failing_dev = get_device(dev);
> -
> +	mutex_init(&devcd->mutex);
>  	device_initialize(&devcd->devcd_dev);
>  
>  	dev_set_name(&devcd->devcd_dev, "devcd%d",
>  		     atomic_inc_return(&devcd_count));
>  	devcd->devcd_dev.class = &devcd_class;
>  
> +	mutex_lock(&devcd->mutex);
>  	if (device_add(&devcd->devcd_dev))
>  		goto put_device;
>  
> @@ -301,10 +307,11 @@ void dev_coredumpm(struct device *dev, struct module *owner,
>  
>  	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
>  	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
> -
> +	mutex_unlock(&devcd->mutex);
>  	return;
>   put_device:
>  	put_device(&devcd->devcd_dev);
> +	mutex_unlock(&devcd->mutex);
>   put_module:
>  	module_put(owner);
>   free:

2022-04-22 22:34:41

by Johannes Berg

Subject: Re: [PATCH] devcoredump : Serialize devcd_del work

On Fri, 2022-04-22 at 15:41 +0200, Johannes Berg wrote:
> On Tue, 2022-04-19 at 15:57 +0530, Mukesh Ojha wrote:
> > In the following scenario (see diagram), thread X running dev_coredumpm() adds
> > a devcd device to the framework, which sends a uevent notification to
> > userspace, while another thread Y reads this uevent and calls
> > devcd_data_write(), which eventually tries to delete a delayed-work timer
> > that has not been initialized or queued yet.
> >
> > As a result, debugobjects reports a warning. Meanwhile the timer is
> > initialized and queued from the X path, then reinitialized again from the Y
> > path, leaving timer->entry.pprev = NULL, so try_to_grab_pending() gets stuck.
> >
> > To fix this, introduce a mutex to serialize the two paths.
> >
> > cpu0(X)                                cpu1(Y)
> >
> > dev_coredumpm()
> >   device_add() =====================> uevent sent to userspace;
> >                                       userspace process Y reads the
> >                                       uevent and writes to the devcd
> >                                       data file, which results in a
> >                                       call to
> >
> >                                       devcd_data_write()
> >                                         mod_delayed_work()
> >                                           try_to_grab_pending()
> >                                             del_timer()
> >                                               debug_assert_init()
> > INIT_DELAYED_WORK()
> > schedule_delayed_work()
> >
>
> Wouldn't it be easier to simply schedule this before adding it to sysfs
> and sending the uevent?
>

Hm. I think that would solve this problem, but not all of the problems
here ...

Even with your change, I believe it's still racy wrt. disabled_store(),
since that flushes the work but devcd_data_write() remains reachable
(and might in fact be waiting for the mutex after your change), so I
think we need an additional flag somewhere (in addition to the mutex) to
serialize all of these things against each other.
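
One shape such a flag could take (a hypothetical sketch; the field name is
invented here): set it under devcd->mutex only once the delayed work has been
armed in dev_coredumpm(), and have devcd_data_write() refuse to touch del_wk
until then:

static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
				struct bin_attribute *bin_attr,
				char *buffer, loff_t offset, size_t count)
{
	struct device *dev = kobj_to_dev(kobj);
	struct devcd_entry *devcd = dev_to_devcd(dev);

	mutex_lock(&devcd->mutex);
	/* "init_completed" is an invented name for the proposed flag */
	if (devcd->init_completed)
		mod_delayed_work(system_wq, &devcd->del_wk, 0);
	mutex_unlock(&devcd->mutex);

	return count;
}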

johannes

2022-04-26 03:12:57

by Mukesh Ojha

Subject: Re: [PATCH] devcoredump : Serialize devcd_del work

On Fri, Apr 22, 2022 at 03:53:35PM +0200, Johannes Berg wrote:
> On Fri, 2022-04-22 at 15:41 +0200, Johannes Berg wrote:
> > On Tue, 2022-04-19 at 15:57 +0530, Mukesh Ojha wrote:
> > > In the following scenario (see diagram), thread X running dev_coredumpm() adds
> > > a devcd device to the framework, which sends a uevent notification to
> > > userspace, while another thread Y reads this uevent and calls
> > > devcd_data_write(), which eventually tries to delete a delayed-work timer
> > > that has not been initialized or queued yet.
> > >
> > > As a result, debugobjects reports a warning. Meanwhile the timer is
> > > initialized and queued from the X path, then reinitialized again from the Y
> > > path, leaving timer->entry.pprev = NULL, so try_to_grab_pending() gets stuck.
> > >
> > > To fix this, introduce a mutex to serialize the two paths.
> > >
> > > cpu0(X)                                cpu1(Y)
> > >
> > > dev_coredumpm()
> > >   device_add() =====================> uevent sent to userspace;
> > >                                       userspace process Y reads the
> > >                                       uevent and writes to the devcd
> > >                                       data file, which results in a
> > >                                       call to
> > >
> > >                                       devcd_data_write()
> > >                                         mod_delayed_work()
> > >                                           try_to_grab_pending()
> > >                                             del_timer()
> > >                                               debug_assert_init()
> > > INIT_DELAYED_WORK()
> > > schedule_delayed_work()
> > >
> >
> > Wouldn't it be easier to simply schedule this before adding it to sysfs
> > and sending the uevent?
> >
>
> Hm. I think that would solve this problem, but not all of the problems
> here ...
>
> Even with your change, I believe it's still racy wrt. disabled_store(),
> since that flushes the work but devcd_data_write() remains reachable
> (and might in fact be waiting for the mutex after your change), so I
> think we need an additional flag somewhere (in addition to the mutex) to
> serialize all of these things against each other.

Can we do something like this in v2?

https://lore.kernel.org/lkml/[email protected]/

Thanks,
-Mukesh

>
> johannes