2009-01-09 18:36:19

by Stefan Richter

[permalink] [raw]
Subject: post 2.6.28 regression: device_initialize() now sleeps, and may fail without recovery strategy

>From commit 2831fe6f9cc4e16c103504ee09a47a084297c0f3, "driver core:
create a private portion of struct device":

void device_initialize(struct device *dev)
{
+ dev->p = kzalloc(sizeof(*dev->p), GFP_KERNEL);
+ if (!dev->p) {
+ WARN_ON(1);
+ return;
+ }
+ dev->p->device = dev;
dev->kobj.kset = devices_kset;
kobject_init(&dev->kobj, &device_ktype);


First of all, this prevents initialization of struct device in atomic
contexts, such as drivers/firewire/fw-device.c::fw_node_event.

This is a bug in current mainline.

We can fix the bug by changing firewire-core, but
a) it'd be more than a one-liner,
b) who knows which other subsystems are affected.

Next, the above code is bogus. In 2.6.28, device_initialize() could
never fail and was thus safe to use as a void-valued function.

How does driver core handle dev->p == NULL in subsequent usages of dev now?
--
Stefan Richter
-=====-==--= ---= -=--=
http://arcgraph.de/sr/


2009-01-09 19:50:26

by Stefan Richter

[permalink] [raw]
Subject: [PATCH post 2.6.28] firewire: core: fix sleep in atomic context due to driver core change

Due to commit 2831fe6f9cc4e16c103504ee09a47a084297c0f3, "driver core:
create a private portion of struct device", device_initialize() can no
longer be called from atomic contexts.

We now defer it until after config ROM probing. This requires changes
to the bus manager code because this may use a device before it was
probed.

Reported-by: Jay Fenlason <[email protected]>
Signed-off-by: Stefan Richter <[email protected]>
---

This still needs a lot of testing since I may have easily missed
something. Even if it works, it complicates firewire-core's lifetime
rules.

Better would be to fix driver core to not sleep in device_initialize
(and besides, never fail in void device_initialize).

drivers/firewire/fw-card.c | 13 +++++++------
drivers/firewire/fw-device.c | 23 +++++++++++++----------
2 files changed, 20 insertions(+), 16 deletions(-)

Index: linux/drivers/firewire/fw-device.c
===================================================================
--- linux.orig/drivers/firewire/fw-device.c
+++ linux/drivers/firewire/fw-device.c
@@ -160,7 +160,8 @@ static void fw_device_release(struct dev

/*
* Take the card lock so we don't set this to NULL while a
- * FW_NODE_UPDATED callback is being handled.
+ * FW_NODE_UPDATED callback is being handled or while the
+ * bus manager work looks at this node.
*/
spin_lock_irqsave(&card->lock, flags);
device->node->data = NULL;
@@ -693,12 +694,13 @@ static void fw_device_init(struct work_s
return;
}

- err = -ENOMEM;
+ device_initialize(&device->device);

fw_device_get(device);
down_write(&fw_device_rwsem);
- if (idr_pre_get(&fw_device_idr, GFP_KERNEL))
- err = idr_get_new(&fw_device_idr, device, &minor);
+ err = idr_pre_get(&fw_device_idr, GFP_KERNEL) ?
+ idr_get_new(&fw_device_idr, device, &minor) :
+ -ENOMEM;
up_write(&fw_device_rwsem);

if (err < 0)
@@ -909,13 +911,14 @@ void fw_node_event(struct fw_card *card,

/*
* Do minimal intialization of the device here, the
- * rest will happen in fw_device_init(). We need the
- * card and node so we can read the config rom and we
- * need to do device_initialize() now so
- * device_for_each_child() in FW_NODE_UPDATED is
- * doesn't freak out.
+ * rest will happen in fw_device_init().
+ *
+ * Attention: A lot of things, even fw_device_get(),
+ * cannot be done before fw_device_init() finished!
+ * You can basically just check device->state and
+ * schedule work until then, but only while holding
+ * card->lock.
*/
- device_initialize(&device->device);
atomic_set(&device->state, FW_DEVICE_INITIALIZING);
device->card = fw_card_get(card);
device->node = fw_node_get(node);
Index: linux/drivers/firewire/fw-card.c
===================================================================
--- linux.orig/drivers/firewire/fw-card.c
+++ linux/drivers/firewire/fw-card.c
@@ -203,6 +203,8 @@ static void fw_card_bm_work(struct work_
unsigned long flags;
int root_id, new_root_id, irm_id, gap_count, generation, grace, rcode;
bool do_reset = false;
+ bool root_device_is_running;
+ bool root_device_is_cmc;
__be32 lock_data[2];

spin_lock_irqsave(&card->lock, flags);
@@ -218,8 +220,9 @@ static void fw_card_bm_work(struct work_

generation = card->generation;
root_device = root_node->data;
- if (root_device)
- fw_device_get(root_device);
+ root_device_is_running = root_device &&
+ atomic_read(&root_device->state) == FW_DEVICE_RUNNING;
+ root_device_is_cmc = root_device && root_device->cmc;
root_id = root_node->node_id;
grace = time_after(jiffies, card->reset_jiffies + DIV_ROUND_UP(HZ, 10));

@@ -302,14 +305,14 @@ static void fw_card_bm_work(struct work_
* config rom. In either case, pick another root.
*/
new_root_id = local_node->node_id;
- } else if (atomic_read(&root_device->state) != FW_DEVICE_RUNNING) {
+ } else if (!root_device_is_running) {
/*
* If we haven't probed this device yet, bail out now
* and let's try again once that's done.
*/
spin_unlock_irqrestore(&card->lock, flags);
goto out;
- } else if (root_device->cmc) {
+ } else if (root_device_is_cmc) {
/*
* FIXME: I suppose we should set the cmstr bit in the
* STATE_CLEAR register of this node, as described in
@@ -356,8 +359,6 @@ static void fw_card_bm_work(struct work_
fw_core_initiate_bus_reset(card, 1);
}
out:
- if (root_device)
- fw_device_put(root_device);
fw_node_put(root_node);
fw_node_put(local_node);
out_put_card:

--
Stefan Richter
-=====-==--= ---= -=--=
http://arcgraph.de/sr/

2009-01-09 20:59:20

by Greg KH

[permalink] [raw]
Subject: Re: post 2.6.28 regression: device_initialize() now sleeps, and may fail without recovery strategy

On Fri, Jan 09, 2009 at 07:35:42PM +0100, Stefan Richter wrote:
> >From commit 2831fe6f9cc4e16c103504ee09a47a084297c0f3, "driver core:
> create a private portion of struct device":
>
> void device_initialize(struct device *dev)
> {
> + dev->p = kzalloc(sizeof(*dev->p), GFP_KERNEL);
> + if (!dev->p) {
> + WARN_ON(1);
> + return;
> + }
> + dev->p->device = dev;
> dev->kobj.kset = devices_kset;
> kobject_init(&dev->kobj, &device_ktype);
>
>
> First of all, this prevents initialization of struct device in atomic
> contexts, such as drivers/firewire/fw-device.c::fw_node_event.

Ick, sorry, I didn't think that any callers ever did this.

> This is a bug in current mainline.
>
> We can fix the bug by changing firewire-core, but
> a) it'd be more than a one-liner,
> b) who knows which other subsystems are affected.

I agree.

I originally looked at changing this to be at device_add time, but I
think there are some code paths that do device_initialize and then do
some operations on the device before calling device_add. But I could be
wrong, let me do some testing first before forcing you to make that big
change to the firewire core.

> Next, the above code is bogus. In 2.6.28, device_initialize() could
> never fail and was thus safe to use as a void-valued function.
>
> How does driver core handle dev->p == NULL in subsequent usages of dev now?

It dies a flaming horrible death, pretty much like the whole rest of the
system if allocating such a small ammount of memory is causing failures
:)

Give me a few hours to test here, your change might not be necessary...

thanks,

greg k-h

2009-01-09 21:14:25

by Stefan Richter

[permalink] [raw]
Subject: Re: post 2.6.28 regression: device_initialize() now sleeps, and may fail without recovery strategy

Greg KH wrote:
> On Fri, Jan 09, 2009 at 07:35:42PM +0100, Stefan Richter wrote:
>> We can fix the bug by changing firewire-core, but
>> a) it'd be more than a one-liner,
>> b) who knows which other subsystems are affected.
>
> I agree.
>
> I originally looked at changing this to be at device_add time, but I
> think there are some code paths that do device_initialize and then do
> some operations on the device before calling device_add.

get_device() and put_device() seem to be about the only things that are
interesting before device_add().

Don't know if a final put_device() in this situation

> But I could be
> wrong, let me do some testing first before forcing you to make that big
> change to the firewire core.

It isn't actually that big. And the added complication could hopefully
be covered by comments about the caveats.

Actually, maybe it would be better for the firewire stack to move the
concerned stuff into a non-atomic context. There are other things we do
in there and atomic context isn't very comfortable for all this. But
that would be a much bigger change.

>> Next, the above code is bogus. In 2.6.28, device_initialize() could
>> never fail and was thus safe to use as a void-valued function.
>>
>> How does driver core handle dev->p == NULL in subsequent usages of dev now?
>
> It dies a flaming horrible death, pretty much like the whole rest of the
> system if allocating such a small ammount of memory is causing failures
> :)

Well, at least code which allocates struct device can check for failure
and handle it, while the allocator of dev->p can't even check. Unless
you change device_initialize() to return error status and add error
handling all over the place...
--
Stefan Richter
-=====-==--= ---= -=--=
http://arcgraph.de/sr/

2009-01-09 21:17:38

by Alan

[permalink] [raw]
Subject: Re: [PATCH post 2.6.28] firewire: core: fix sleep in atomic context due to driver core change

On Fri, 9 Jan 2009 20:49:37 +0100 (CET)
Stefan Richter <[email protected]> wrote:

> Due to commit 2831fe6f9cc4e16c103504ee09a47a084297c0f3, "driver core:
> create a private portion of struct device", device_initialize() can no
> longer be called from atomic contexts.

I don't see why this is neccessary or appropriate - the original commit
needs to be pulled and the private area allocation rethought.

2009-01-09 21:20:55

by Stefan Richter

[permalink] [raw]
Subject: Re: post 2.6.28 regression: device_initialize() now sleeps, and may fail without recovery strategy

Stefan Richter wrote:
> Greg KH wrote:
>> I originally looked at changing this to be at device_add time, but I
>> think there are some code paths that do device_initialize and then do
>> some operations on the device before calling device_add.
>
> get_device() and put_device() seem to be about the only things that are
> interesting before device_add().
>
> Don't know if a final put_device() in this situation

...would require dev->p to be present.
--
Stefan Richter
-=====-==--= ---= -=--=
http://arcgraph.de/sr/

2009-01-09 21:24:21

by Alan

[permalink] [raw]
Subject: Re: post 2.6.28 regression: device_initialize() now sleeps, and may fail without recovery strategy

> It dies a flaming horrible death, pretty much like the whole rest of the
> system if allocating such a small ammount of memory is causing failures
> :)

Now and then I run tests with kmalloc set to randomly fail - the kernel
is suprisingly robust these days.

> Give me a few hours to test here, your change might not be necessary...

The alternative is pretty horrible - mark the function as int and
requiring a return check then start propogating it out and weep at the
gcc warnings...

2009-01-09 21:31:59

by Greg KH

[permalink] [raw]
Subject: Re: post 2.6.28 regression: device_initialize() now sleeps, and may fail without recovery strategy

On Fri, Jan 09, 2009 at 10:13:55PM +0100, Stefan Richter wrote:
> Greg KH wrote:
> > On Fri, Jan 09, 2009 at 07:35:42PM +0100, Stefan Richter wrote:
> >> We can fix the bug by changing firewire-core, but
> >> a) it'd be more than a one-liner,
> >> b) who knows which other subsystems are affected.
> >
> > I agree.
> >
> > I originally looked at changing this to be at device_add time, but I
> > think there are some code paths that do device_initialize and then do
> > some operations on the device before calling device_add.
>
> get_device() and put_device() seem to be about the only things that are
> interesting before device_add().
>
> Don't know if a final put_device() in this situation

Hm, that could be pretty simple to handle. I'd really like to force the
kobject itself to be dynamic, and inside the private portion of the
device structure. If I do that, then get_ and put_ would need to
allocate the object if it wasn't present. But that would mean that
get_device could sleep, which isn't the case today (put_device() can
always sleep, that's not an issue.)

> > But I could be
> > wrong, let me do some testing first before forcing you to make that big
> > change to the firewire core.
>
> It isn't actually that big. And the added complication could hopefully
> be covered by comments about the caveats.
>
> Actually, maybe it would be better for the firewire stack to move the
> concerned stuff into a non-atomic context. There are other things we do
> in there and atomic context isn't very comfortable for all this. But
> that would be a much bigger change.

Well, I'd always recommend doing things in non-atomic context wherever
possible, so I'll not object that hard to your patch either way :)

> >> Next, the above code is bogus. In 2.6.28, device_initialize() could
> >> never fail and was thus safe to use as a void-valued function.
> >>
> >> How does driver core handle dev->p == NULL in subsequent usages of dev now?
> >
> > It dies a flaming horrible death, pretty much like the whole rest of the
> > system if allocating such a small ammount of memory is causing failures
> > :)
>
> Well, at least code which allocates struct device can check for failure
> and handle it, while the allocator of dev->p can't even check. Unless
> you change device_initialize() to return error status and add error
> handling all over the place...

yeah, that would be a much bigger task than I'm really pondering,
although it probably is the correct thing to do...

thanks,

greg k-h

2009-01-09 21:36:00

by Greg KH

[permalink] [raw]
Subject: Re: post 2.6.28 regression: device_initialize() now sleeps, and may fail without recovery strategy

On Fri, Jan 09, 2009 at 10:20:25PM +0100, Stefan Richter wrote:
> Stefan Richter wrote:
> > Greg KH wrote:
> >> I originally looked at changing this to be at device_add time, but I
> >> think there are some code paths that do device_initialize and then do
> >> some operations on the device before calling device_add.
> >
> > get_device() and put_device() seem to be about the only things that are
> > interesting before device_add().
> >
> > Don't know if a final put_device() in this situation
>
> ...would require dev->p to be present.

But that can be easily handled...

give me a short while to test...

thanks,

greg k-h

2009-01-09 21:40:44

by Stefan Richter

[permalink] [raw]
Subject: Re: post 2.6.28 regression: device_initialize() now sleeps, and may fail without recovery strategy

Greg KH wrote:
> On Fri, Jan 09, 2009 at 10:13:55PM +0100, Stefan Richter wrote:
>> get_device() and put_device() seem to be about the only things that are
>> interesting before device_add().
...
> Hm, that could be pretty simple to handle. I'd really like to force the
> kobject itself to be dynamic, and inside the private portion of the
> device structure. If I do that, then get_ and put_ would need to
> allocate the object if it wasn't present. But that would mean that
> get_device could sleep, which isn't the case today

That could be a problem.

> (put_device() can always sleep, that's not an issue.)

Hmm, I wasn't aware of that, need to check my code...

...
>> Well, at least code which allocates struct device can check for failure
>> and handle it, while the allocator of dev->p can't even check. Unless
>> you change device_initialize() to return error status and add error
>> handling all over the place...
>
> yeah, that would be a much bigger task than I'm really pondering,
> although it probably is the correct thing to do...

If you inline struct device_private into struct device, then the problem
is gone. But then it's a little less private then you wanted it to be
of course.
--
Stefan Richter
-=====-==--= ---= -=--=
http://arcgraph.de/sr/

2009-01-09 21:55:39

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH post 2.6.28] firewire: core: fix sleep in atomic context due to driver core change

On Fri, Jan 09, 2009 at 09:17:18PM +0000, Alan Cox wrote:
> On Fri, 9 Jan 2009 20:49:37 +0100 (CET)
> Stefan Richter <[email protected]> wrote:
>
> > Due to commit 2831fe6f9cc4e16c103504ee09a47a084297c0f3, "driver core:
> > create a private portion of struct device", device_initialize() can no
> > longer be called from atomic contexts.
>
> I don't see why this is neccessary or appropriate - the original commit
> needs to be pulled and the private area allocation rethought.

Ugh, I think you're right. I'll revert this (and all of the patches in
the series that were needed to get this working properly), and redo the
patch set, first making device_initialize() able to fail, and auditing
all callers to make sure it's not being called in atomic context.

At first glance, I think it's only firewire that is doing this in atomic
context, so Stefan, I wouldn't mind seeing your patch go in as-is just
to make things simpler overall.

I'll go make up the patchset and send them to Linus...

thanks,

greg k-h

2009-01-09 22:31:22

by Stefan Richter

[permalink] [raw]
Subject: [git pull] FireWire fix

On 9 Jan, Greg KH wrote:
> On Fri, Jan 09, 2009 at 09:17:18PM +0000, Alan Cox wrote:
>> On Fri, 9 Jan 2009 20:49:37 +0100 (CET)
>> Stefan Richter <[email protected]> wrote:
>>
>> > Due to commit 2831fe6f9cc4e16c103504ee09a47a084297c0f3, "driver core:
>> > create a private portion of struct device", device_initialize() can no
>> > longer be called from atomic contexts.
>>
>> I don't see why this is neccessary or appropriate - the original commit
>> needs to be pulled and the private area allocation rethought.
>
> Ugh, I think you're right. I'll revert this (and all of the patches in
> the series that were needed to get this working properly), and redo the
> patch set, first making device_initialize() able to fail, and auditing
> all callers to make sure it's not being called in atomic context.
>
> At first glance, I think it's only firewire that is doing this in atomic
> context, so Stefan, I wouldn't mind seeing your patch go in as-is just
> to make things simpler overall.
>
> I'll go make up the patchset and send them to Linus...

Linus,

please pull from the for-linus branch at

git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6.git for-linus

to receive the following change --- as a hotfix for now, and to expand
Greg's options when he sorts device_initialize() out later.

Stefan Richter (1):
firewire: core: fix sleep in atomic context due to driver core change

drivers/firewire/fw-card.c | 13 +++++++------
drivers/firewire/fw-device.c | 23 +++++++++++++----------
2 files changed, 20 insertions(+), 16 deletions(-)


commit 6230582320b721e6cf2581d048cb688dca97f504
Author: Stefan Richter <[email protected]>
Date: Fri Jan 9 20:49:37 2009 +0100

firewire: core: fix sleep in atomic context due to driver core change

Due to commit 2831fe6f9cc4e16c103504ee09a47a084297c0f3, "driver core:
create a private portion of struct device", device_initialize() can no
longer be called from atomic contexts.

We now defer it until after config ROM probing. This requires changes
to the bus manager code because this may use a device before it was
probed.

Reported-by: Jay Fenlason <[email protected]>
Signed-off-by: Stefan Richter <[email protected]>

diff --git a/drivers/firewire/fw-card.c b/drivers/firewire/fw-card.c
index 799f944..6bd91a1 100644
--- a/drivers/firewire/fw-card.c
+++ b/drivers/firewire/fw-card.c
@@ -209,6 +209,8 @@ fw_card_bm_work(struct work_struct *work)
unsigned long flags;
int root_id, new_root_id, irm_id, gap_count, generation, grace, rcode;
bool do_reset = false;
+ bool root_device_is_running;
+ bool root_device_is_cmc;
__be32 lock_data[2];

spin_lock_irqsave(&card->lock, flags);
@@ -224,8 +226,9 @@ fw_card_bm_work(struct work_struct *work)

generation = card->generation;
root_device = root_node->data;
- if (root_device)
- fw_device_get(root_device);
+ root_device_is_running = root_device &&
+ atomic_read(&root_device->state) == FW_DEVICE_RUNNING;
+ root_device_is_cmc = root_device && root_device->cmc;
root_id = root_node->node_id;
grace = time_after(jiffies, card->reset_jiffies + DIV_ROUND_UP(HZ, 10));

@@ -308,14 +311,14 @@ fw_card_bm_work(struct work_struct *work)
* config rom. In either case, pick another root.
*/
new_root_id = local_node->node_id;
- } else if (atomic_read(&root_device->state) != FW_DEVICE_RUNNING) {
+ } else if (!root_device_is_running) {
/*
* If we haven't probed this device yet, bail out now
* and let's try again once that's done.
*/
spin_unlock_irqrestore(&card->lock, flags);
goto out;
- } else if (root_device->cmc) {
+ } else if (root_device_is_cmc) {
/*
* FIXME: I suppose we should set the cmstr bit in the
* STATE_CLEAR register of this node, as described in
@@ -362,8 +365,6 @@ fw_card_bm_work(struct work_struct *work)
fw_core_initiate_bus_reset(card, 1);
}
out:
- if (root_device)
- fw_device_put(root_device);
fw_node_put(root_node);
fw_node_put(local_node);
out_put_card:
diff --git a/drivers/firewire/fw-device.c b/drivers/firewire/fw-device.c
index c173be3..2af5a8d 100644
--- a/drivers/firewire/fw-device.c
+++ b/drivers/firewire/fw-device.c
@@ -159,7 +159,8 @@ static void fw_device_release(struct device *dev)

/*
* Take the card lock so we don't set this to NULL while a
- * FW_NODE_UPDATED callback is being handled.
+ * FW_NODE_UPDATED callback is being handled or while the
+ * bus manager work looks at this node.
*/
spin_lock_irqsave(&card->lock, flags);
device->node->data = NULL;
@@ -695,12 +696,13 @@ static void fw_device_init(struct work_struct *work)
return;
}

- err = -ENOMEM;
+ device_initialize(&device->device);

fw_device_get(device);
down_write(&fw_device_rwsem);
- if (idr_pre_get(&fw_device_idr, GFP_KERNEL))
- err = idr_get_new(&fw_device_idr, device, &minor);
+ err = idr_pre_get(&fw_device_idr, GFP_KERNEL) ?
+ idr_get_new(&fw_device_idr, device, &minor) :
+ -ENOMEM;
up_write(&fw_device_rwsem);

if (err < 0)
@@ -911,13 +913,14 @@ void fw_node_event(struct fw_card *card, struct fw_node *node, int event)

/*
* Do minimal intialization of the device here, the
- * rest will happen in fw_device_init(). We need the
- * card and node so we can read the config rom and we
- * need to do device_initialize() now so
- * device_for_each_child() in FW_NODE_UPDATED is
- * doesn't freak out.
+ * rest will happen in fw_device_init().
+ *
+ * Attention: A lot of things, even fw_device_get(),
+ * cannot be done before fw_device_init() finished!
+ * You can basically just check device->state and
+ * schedule work until then, but only while holding
+ * card->lock.
*/
- device_initialize(&device->device);
atomic_set(&device->state, FW_DEVICE_INITIALIZING);
device->card = fw_card_get(card);
device->node = fw_node_get(node);



Thanks,
--
Stefan Richter
-=====-==--= ---= -=--=
http://arcgraph.de/sr/