2009-09-16 19:54:17

by Lars Ericsson

[permalink] [raw]
Subject: Oops in drivers\base\firmware_class

Hi,

I have discovered a Oops in the firmware_loading_store function.
At first it looks like a timing issue but after adding a BUG_ON test,
it fails every time.

drivers\base\firmware_class:
------------------------------
541 01c0 F6463401 testb $1,52(%esi)
542 01c4 0F843FFF je .L38
542 FFFF
543 .loc 1 174 0
544 01ca 8B4630 movl 48(%esi),%eax
545 01cd 8B4004 movl 4(%eax),%eax <---- Oops
546 01d0 E8FCFFFF call vfree
546 FF

The code that fails was introduced in commit
6e03a201bbe8137487f340d26aa662110e324b20

Attached you will find the:
- Oops with the vanilla 2.6.31
- The BUG_ON patch
- Oops with the patched 2.6.31

/Lars


Attachments:
Vanilla_Opps.txt (2.32 kB)
BUG_ON_firmware_class.c.patch (652.00 B)
BUG_ON_Oops.txt (2.16 kB)
Download all attachments

2009-09-16 20:57:22

by Frederik Deweerdt

[permalink] [raw]
Subject: Re: Oops in drivers\base\firmware_class

Hello Lars,

On Wed, Sep 16, 2009 at 08:44:43PM +0200, Lars Ericsson wrote:
> Hi,
>
> I have discovered a Oops in the firmware_loading_store function.
> At first it looks like a timing issue but after adding a BUG_ON test,
> it fails every time.
>
> drivers\base\firmware_class:
> ------------------------------
> 541 01c0 F6463401 testb $1,52(%esi)
> 542 01c4 0F843FFF je .L38
> 542 FFFF
> 543 .loc 1 174 0
> 544 01ca 8B4630 movl 48(%esi),%eax
> 545 01cd 8B4004 movl 4(%eax),%eax <---- Oops
> 546 01d0 E8FCFFFF call vfree
> 546 FF
>
> The code that fails was introduced in commit
> 6e03a201bbe8137487f340d26aa662110e324b20
Could you try the following patch?

It seems that we're missing the locking around the accesses to
fw_priv->fw.

Thanks,
Frederik

Signed-off-by: Frederik Deweerdt <[email protected]>

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 8a267c4..49105ed 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -171,12 +171,18 @@ static ssize_t firmware_loading_store(struct device *dev,
break;
case 0:
if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
+ mutex_lock(&fw_lock);
+ if (!fw_priv->fw) {
+ mutex_unlock(&fw_lock);
+ break;
+ }
vfree(fw_priv->fw->data);
fw_priv->fw->data = vmap(fw_priv->pages,
fw_priv->nr_pages,
0, PAGE_KERNEL_RO);
if (!fw_priv->fw->data) {
dev_err(dev, "%s: vmap() failed\n", __func__);
+ mutex_unlock(&fw_lock);
goto err;
}
/* Pages will be freed by vfree() */
@@ -185,6 +191,7 @@ static ssize_t firmware_loading_store(struct device *dev,
fw_priv->nr_pages = 0;
complete(&fw_priv->completion);
clear_bit(FW_STATUS_LOADING, &fw_priv->status);
+ mutex_unlock(&fw_lock);
break;
}
/* fallthrough */
>
> Attached you will find the:
> - Oops with the vanilla 2.6.31
> - The BUG_ON patch
> - Oops with the patched 2.6.31
>
> /Lars

> [ 7.249453] rt61pci 0000:00:09.0: firmware: requesting rt2561s.bin
> [ 7.314512] BUG: unable to handle kernel NULL pointer dereference at 00000004
> [ 7.315639] IP: [<d04d51cd>] firmware_loading_store+0xed/0x160 [firmware_class]
> [ 7.315639] *pde = 00000000
> [ 7.315639] Oops: 0000 [#1] PREEMPT
> [ 7.315639] last sysfs file: /sys/devices/pci0000:00/0000:00:09.0/firmware/0000:00:09.0/loading
> [ 7.315639] Modules linked in: arc4 ecb cryptomgr crypto_blkcipher aead pcompress crypto_hash crypto_algapi rt61pci rt2x00pci rt2x00lib firmware_class mac80211 input_polldev crc_itu_t eeprom_93cx6 cfg80211 ndccanram ndccan ndcser(P) ndcscan(P) eu16550(P) ndccon(P)
> [ 7.315639]
> [ 7.315639] Pid: 1004, comm: echo Tainted: P (2.6.31 #1)
> [ 7.315639] EIP: 0060:[<d04d51cd>] EFLAGS: 00010202 CPU: 0
> [ 7.315639] EIP is at firmware_loading_store+0xed/0x160 [firmware_class]
> [ 7.315639] EAX: 00000000 EBX: d04d50e0 ECX: 0000000a EDX: 00000028
> [ 7.315639] ESI: cf65ca80 EDI: cea29c00 EBP: cea29c08 ESP: cf777f2c
> [ 7.315639] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [ 7.315639] Process echo (pid: 1004, ti=cf776000 task=cf755aa0 task.ti=cf776000)
> [ 7.315639] Stack:
> [ 7.315639] d04d50e0 cea29c00 00000002 c028051d 00000002 c0414c40 ce9c0420 c01c9c98
> [ 7.315639] <0> 00000002 c0414c40 00000002 b7f15000 ce9c1e80 cf44c000 c01c9ce6 ce9c1e94
> [ 7.315639] <0> cf44c000 b7f15000 00000002 c01c9cb0 c018d362 cf777f9c cf44c000 fffffff7
> [ 7.315639] Call Trace:
> [ 7.315639] [<d04d50e0>] ? firmware_loading_store+0x0/0x160 [firmware_class]
> [ 7.315639] [<c028051d>] ? dev_attr_store+0x1d/0x20
> [ 7.315639] [<c01c9c98>] ? flush_write_buffer+0x38/0x50
> [ 7.315639] [<c01c9ce6>] ? sysfs_write_file+0x36/0x60
> [ 7.315639] [<c01c9cb0>] ? sysfs_write_file+0x0/0x60
> [ 7.315639] [<c018d362>] ? vfs_write+0x82/0xf0
> [ 7.315639] [<c018d47c>] ? sys_write+0x3c/0x70
> [ 7.315639] [<c0102ba1>] ? syscall_call+0x7/0xb
> [ 7.315639] Code: 91 c9 ef 3b 5e 3c 7c ed eb b5 8d 74 26 00 83 f8 ff 0f 85 50 ff ff ff e9 6a ff ff ff 89 f6 f6 46 34 01 0f 84 3f ff ff ff 8b 46 30 <8b> 40 04 e8 5b 05 cb ef 31 c9 8b 56 3c 8b 46 38 8b 5e 30 68 61
> [ 7.315639] EIP: [<d04d51cd>] firmware_loading_store+0xed/0x160 [firmware_class] SS:ESP 0068:cf777f2c
> [ 7.315639] CR2: 0000000000000004
> [ 7.540848] ---[ end trace 6ebe83c102d3b046 ]---


> [ 7.242449] ------------[ cut here ]------------
> [ 7.243706] Kernel BUG at d04d523e [verbose debug info unavailable]
> [ 7.243706] invalid opcode: 0000 [#1] PREEMPT
> [ 7.243706] last sysfs file: /sys/devices/pci0000:00/0000:00:09.0/firmware/0000:00:09.0/loading
> [ 7.243706] Modules linked in: arc4 ecb cryptomgr crypto_blkcipher aead pcompress crypto_hash crypto_algapi rt61pci rt2x00pci rt2x00lib firmware_class mac80211 input_polldev crc_itu_t eeprom_93cx6 cfg80211 ndccanram ndccan ndcser(P) ndcscan(P) eu16550(P) ndccon(P)
> [ 7.243706]
> [ 7.243706] Pid: 1004, comm: echo Tainted: P (2.6.31 #1)
> [ 7.243706] EIP: 0060:[<d04d523e>] EFLAGS: 00010246 CPU: 0
> [ 7.243706] EIP is at firmware_loading_store+0x15e/0x170 [firmware_class]
> [ 7.243706] EAX: 00000000 EBX: d04d50e0 ECX: 0000000a EDX: 00000028
> [ 7.243706] ESI: cf4f2a20 EDI: cf4ffc00 EBP: cf4ffc08 ESP: cea37f2c
> [ 7.243706] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [ 7.243706] Process echo (pid: 1004, ti=cea36000 task=cf6c2380 task.ti=cea36000)
> [ 7.243706] Stack:
> [ 7.243706] d04d50e0 cf4ffc00 00000002 c028051d 00000002 c0414c40 cf4fd420 c01c9c98
> [ 7.243706] <0> 00000002 c0414c40 00000002 b7fc8000 cf636e80 cf471980 c01c9ce6 cf636e94
> [ 7.243706] <0> cf471980 b7fc8000 00000002 c01c9cb0 c018d362 cea37f9c cf471980 fffffff7
> [ 7.243706] Call Trace:
> [ 7.243706] [<d04d50e0>] ? firmware_loading_store+0x0/0x170 [firmware_class]
> [ 7.243706] [<c028051d>] ? dev_attr_store+0x1d/0x20
> [ 7.243706] [<c01c9c98>] ? flush_write_buffer+0x38/0x50
> [ 7.243706] [<c01c9ce6>] ? sysfs_write_file+0x36/0x60
> [ 7.243706] [<c01c9cb0>] ? sysfs_write_file+0x0/0x60
> [ 7.243706] [<c018d362>] ? vfs_write+0x82/0xf0
> [ 7.243706] [<c018d47c>] ? sys_write+0x3c/0x70
> [ 7.243706] [<c0102ba1>] ? syscall_call+0x7/0xb
> [ 7.243706] Code: 66 34 fe e9 14 ff ff ff 8b 47 08 68 1c 5f 4d d0 50 89 f8 e8 65 b2 da ef 50 68 c3 5f 4d d0 e8 da 3c c5 ef 83 c4 10 e9 ea fe ff ff <0f> 0b eb fe 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 55 89 cd
> [ 7.243706] EIP: [<d04d523e>] firmware_loading_store+0x15e/0x170 [firmware_class] SS:ESP 0068:cea37f2c
> [ 7.458557] ---[ end trace 604593037057054f ]---

2009-09-18 19:03:38

by Lars Ericsson

[permalink] [raw]
Subject: RE: Oops in drivers\base\firmware_class

> On Wed, Sep 16, 2009 at 08:44:43PM +0200, Lars Ericsson wrote:
> > Hi,
> >
> > I have discovered a Oops in the firmware_loading_store function.
> > At first it looks like a timing issue but after adding a
> BUG_ON test,
> > it fails every time.
> >
> > drivers\base\firmware_class:
> > ------------------------------
> > 541 01c0 F6463401 testb $1,52(%esi)
> > 542 01c4 0F843FFF je .L38
> > 542 FFFF
> > 543 .loc 1 174 0
> > 544 01ca 8B4630 movl 48(%esi),%eax
> > 545 01cd 8B4004 movl 4(%eax),%eax <---- Oops
> > 546 01d0 E8FCFFFF call vfree
> > 546 FF
> >
> > The code that fails was introduced in commit
> > 6e03a201bbe8137487f340d26aa662110e324b20
> Could you try the following patch?

No, problem.
I have tried the patch and it works for me. I will continue my
tests and report any problems.

Should I close the http://bugzilla.kernel.org/show_bug.cgi?id=14185 ,
which I have created before I noticed you reply ?

Thanks for the help !

/Lars

> It seems that we're missing the locking around the accesses to
> fw_priv->fw.
>
> Signed-off-by: Frederik Deweerdt <[email protected]>
>
> diff --git a/drivers/base/firmware_class.c
> b/drivers/base/firmware_class.c
> index 8a267c4..49105ed 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -171,12 +171,18 @@ static ssize_t
> firmware_loading_store(struct device *dev,
> break;
> case 0:
> if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
> + mutex_lock(&fw_lock);
> + if (!fw_priv->fw) {
> + mutex_unlock(&fw_lock);
> + break;
> + }
> vfree(fw_priv->fw->data);
> fw_priv->fw->data = vmap(fw_priv->pages,
> fw_priv->nr_pages,
> 0, PAGE_KERNEL_RO);
> if (!fw_priv->fw->data) {
> dev_err(dev, "%s: vmap()
> failed\n", __func__);
> + mutex_unlock(&fw_lock);
> goto err;
> }
> /* Pages will be freed by vfree() */
> @@ -185,6 +191,7 @@ static ssize_t
> firmware_loading_store(struct device *dev,
> fw_priv->nr_pages = 0;
> complete(&fw_priv->completion);
> clear_bit(FW_STATUS_LOADING, &fw_priv->status);
> + mutex_unlock(&fw_lock);
> break;
> }
> /* fallthrough */
> >
> > Attached you will find the:
> > - Oops with the vanilla 2.6.31
> > - The BUG_ON patch
> > - Oops with the patched 2.6.31
> >
> > /Lars
>
> > [ 7.249453] rt61pci 0000:00:09.0: firmware: requesting
> rt2561s.bin
> > [ 7.314512] BUG: unable to handle kernel NULL pointer
> dereference at 00000004
> > [ 7.315639] IP: [<d04d51cd>]
> firmware_loading_store+0xed/0x160 [firmware_class]
> > [ 7.315639] *pde = 00000000
> > [ 7.315639] Oops: 0000 [#1] PREEMPT
> > [ 7.315639] last sysfs file:
> /sys/devices/pci0000:00/0000:00:09.0/firmware/0000:00:09.0/loading
> > [ 7.315639] Modules linked in: arc4 ecb cryptomgr
> crypto_blkcipher aead pcompress crypto_hash crypto_algapi
> rt61pci rt2x00pci rt2x00lib firmware_class mac80211
> input_polldev crc_itu_t eeprom_93cx6 cfg80211 ndccanram
> ndccan ndcser(P) ndcscan(P) eu16550(P) ndccon(P)
> > [ 7.315639]
> > [ 7.315639] Pid: 1004, comm: echo Tainted: P
> (2.6.31 #1)
> > [ 7.315639] EIP: 0060:[<d04d51cd>] EFLAGS: 00010202 CPU: 0
> > [ 7.315639] EIP is at firmware_loading_store+0xed/0x160
> [firmware_class]
> > [ 7.315639] EAX: 00000000 EBX: d04d50e0 ECX: 0000000a
> EDX: 00000028
> > [ 7.315639] ESI: cf65ca80 EDI: cea29c00 EBP: cea29c08
> ESP: cf777f2c
> > [ 7.315639] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > [ 7.315639] Process echo (pid: 1004, ti=cf776000
> task=cf755aa0 task.ti=cf776000)
> > [ 7.315639] Stack:
> > [ 7.315639] d04d50e0 cea29c00 00000002 c028051d
> 00000002 c0414c40 ce9c0420 c01c9c98
> > [ 7.315639] <0> 00000002 c0414c40 00000002 b7f15000
> ce9c1e80 cf44c000 c01c9ce6 ce9c1e94
> > [ 7.315639] <0> cf44c000 b7f15000 00000002 c01c9cb0
> c018d362 cf777f9c cf44c000 fffffff7
> > [ 7.315639] Call Trace:
> > [ 7.315639] [<d04d50e0>] ?
> firmware_loading_store+0x0/0x160 [firmware_class]
> > [ 7.315639] [<c028051d>] ? dev_attr_store+0x1d/0x20
> > [ 7.315639] [<c01c9c98>] ? flush_write_buffer+0x38/0x50
> > [ 7.315639] [<c01c9ce6>] ? sysfs_write_file+0x36/0x60
> > [ 7.315639] [<c01c9cb0>] ? sysfs_write_file+0x0/0x60
> > [ 7.315639] [<c018d362>] ? vfs_write+0x82/0xf0
> > [ 7.315639] [<c018d47c>] ? sys_write+0x3c/0x70
> > [ 7.315639] [<c0102ba1>] ? syscall_call+0x7/0xb
> > [ 7.315639] Code: 91 c9 ef 3b 5e 3c 7c ed eb b5 8d 74 26
> 00 83 f8 ff 0f 85 50 ff ff ff e9 6a ff ff ff 89 f6 f6 46 34
> 01 0f 84 3f ff ff ff 8b 46 30 <8b> 40 04 e8 5b 05 cb ef 31 c9
> 8b 56 3c 8b 46 38 8b 5e 30 68 61
> > [ 7.315639] EIP: [<d04d51cd>]
> firmware_loading_store+0xed/0x160 [firmware_class] SS:ESP
> 0068:cf777f2c
> > [ 7.315639] CR2: 0000000000000004
> > [ 7.540848] ---[ end trace 6ebe83c102d3b046 ]---
>
>
> > [ 7.242449] ------------[ cut here ]------------
> > [ 7.243706] Kernel BUG at d04d523e [verbose debug info
> unavailable]
> > [ 7.243706] invalid opcode: 0000 [#1] PREEMPT
> > [ 7.243706] last sysfs file:
> /sys/devices/pci0000:00/0000:00:09.0/firmware/0000:00:09.0/loading
> > [ 7.243706] Modules linked in: arc4 ecb cryptomgr
> crypto_blkcipher aead pcompress crypto_hash crypto_algapi
> rt61pci rt2x00pci rt2x00lib firmware_class mac80211
> input_polldev crc_itu_t eeprom_93cx6 cfg80211 ndccanram
> ndccan ndcser(P) ndcscan(P) eu16550(P) ndccon(P)
> > [ 7.243706]
> > [ 7.243706] Pid: 1004, comm: echo Tainted: P
> (2.6.31 #1)
> > [ 7.243706] EIP: 0060:[<d04d523e>] EFLAGS: 00010246 CPU: 0
> > [ 7.243706] EIP is at firmware_loading_store+0x15e/0x170
> [firmware_class]
> > [ 7.243706] EAX: 00000000 EBX: d04d50e0 ECX: 0000000a
> EDX: 00000028
> > [ 7.243706] ESI: cf4f2a20 EDI: cf4ffc00 EBP: cf4ffc08
> ESP: cea37f2c
> > [ 7.243706] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > [ 7.243706] Process echo (pid: 1004, ti=cea36000
> task=cf6c2380 task.ti=cea36000)
> > [ 7.243706] Stack:
> > [ 7.243706] d04d50e0 cf4ffc00 00000002 c028051d
> 00000002 c0414c40 cf4fd420 c01c9c98
> > [ 7.243706] <0> 00000002 c0414c40 00000002 b7fc8000
> cf636e80 cf471980 c01c9ce6 cf636e94
> > [ 7.243706] <0> cf471980 b7fc8000 00000002 c01c9cb0
> c018d362 cea37f9c cf471980 fffffff7
> > [ 7.243706] Call Trace:
> > [ 7.243706] [<d04d50e0>] ?
> firmware_loading_store+0x0/0x170 [firmware_class]
> > [ 7.243706] [<c028051d>] ? dev_attr_store+0x1d/0x20
> > [ 7.243706] [<c01c9c98>] ? flush_write_buffer+0x38/0x50
> > [ 7.243706] [<c01c9ce6>] ? sysfs_write_file+0x36/0x60
> > [ 7.243706] [<c01c9cb0>] ? sysfs_write_file+0x0/0x60
> > [ 7.243706] [<c018d362>] ? vfs_write+0x82/0xf0
> > [ 7.243706] [<c018d47c>] ? sys_write+0x3c/0x70
> > [ 7.243706] [<c0102ba1>] ? syscall_call+0x7/0xb
> > [ 7.243706] Code: 66 34 fe e9 14 ff ff ff 8b 47 08 68 1c
> 5f 4d d0 50 89 f8 e8 65 b2 da ef 50 68 c3 5f 4d d0 e8 da 3c
> c5 ef 83 c4 10 e9 ea fe ff ff <0f> 0b eb fe 8d b4 26 00 00 00
> 00 8d bc 27 00 00 00 00 55 89 cd
> > [ 7.243706] EIP: [<d04d523e>]
> firmware_loading_store+0x15e/0x170 [firmware_class] SS:ESP
> 0068:cea37f2c
> > [ 7.458557] ---[ end trace 604593037057054f ]---
>
> --
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2009-09-21 13:32:27

by Frederik Deweerdt

[permalink] [raw]
Subject: [patch -stable] firware_class oops: fix firmware_loading_store locking

Hi all,
On Fri, Sep 18, 2009 at 07:53:40PM +0200, Lars Ericsson wrote:
> > On Wed, Sep 16, 2009 at 08:44:43PM +0200, Lars Ericsson wrote:
> > > Hi,
> > >
> > > I have discovered a Oops in the firmware_loading_store function.
> > > At first it looks like a timing issue but after adding a
> > BUG_ON test,
> > > it fails every time.
> > >
> > > drivers\base\firmware_class:
> > > ------------------------------
> > > 541 01c0 F6463401 testb $1,52(%esi)
> > > 542 01c4 0F843FFF je .L38
> > > 542 FFFF
> > > 543 .loc 1 174 0
> > > 544 01ca 8B4630 movl 48(%esi),%eax
> > > 545 01cd 8B4004 movl 4(%eax),%eax <---- Oops
> > > 546 01d0 E8FCFFFF call vfree
> > > 546 FF
> > >
> > > The code that fails was introduced in commit
> > > 6e03a201bbe8137487f340d26aa662110e324b20
> > Could you try the following patch?
>
> No, problem.
> I have tried the patch and it works for me. I will continue my
> tests and report any problems.
>
> Should I close the http://bugzilla.kernel.org/show_bug.cgi?id=14185 ,
> which I have created before I noticed you reply ?
>
I'd rather wait someone picks it up for mainline inclusion. I've added
your {Reported,Tested}-by tags.

The bug its vanilla 2.6.31, and should be considered for -stable inclusion.

Regards,
Frederik

----

The code introduced by commit 6e03a201bbe8137487f340d26aa662110e324b20 leads
to a potential null deref. The following patch adds the proper locking
around the accesses to fw_priv->fw.
See http://bugzilla.kernel.org/show_bug.cgi?id=14185 for a full bug report.


Reported-by: Lars Ericsson <[email protected]>
Signed-off-by: Frederik Deweerdt <[email protected]>
Tested-by: Lars Ericsson <[email protected]>


diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 8a267c4..49105ed 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -171,12 +171,18 @@ static ssize_t firmware_loading_store(struct device *dev,
break;
case 0:
if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
+ mutex_lock(&fw_lock);
+ if (!fw_priv->fw) {
+ mutex_unlock(&fw_lock);
+ break;
+ }
vfree(fw_priv->fw->data);
fw_priv->fw->data = vmap(fw_priv->pages,
fw_priv->nr_pages,
0, PAGE_KERNEL_RO);
if (!fw_priv->fw->data) {
dev_err(dev, "%s: vmap() failed\n", __func__);
+ mutex_unlock(&fw_lock);
goto err;
}
/* Pages will be freed by vfree() */
@@ -185,6 +191,7 @@ static ssize_t firmware_loading_store(struct device *dev,
fw_priv->nr_pages = 0;
complete(&fw_priv->completion);
clear_bit(FW_STATUS_LOADING, &fw_priv->status);
+ mutex_unlock(&fw_lock);
break;
}
/* fallthrough */

2009-09-23 16:43:24

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch -stable] firware_class oops: fix firmware_loading_store locking



On Mon, 21 Sep 2009, Frederik Deweerdt wrote:
<
> I'd rather wait someone picks it up for mainline inclusion. I've added
> your {Reported,Tested}-by tags.
>
> The bug its vanilla 2.6.31, and should be considered for -stable inclusion.
>
> Regards,
> Frederik
>
> ----
>
> The code introduced by commit 6e03a201bbe8137487f340d26aa662110e324b20 leads
> to a potential null deref. The following patch adds the proper locking
> around the accesses to fw_priv->fw.
> See http://bugzilla.kernel.org/show_bug.cgi?id=14185 for a full bug report.

I don't think this is correct.

I think you should protect the FW_STATUS_LOADING bit too, shouldn't you?

As it is, it does this:

if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
mutex_lock(&fw_lock);
...
clear_bit(FW_STATUS_LOADING, &fw_priv->status);
mutex_unlock(&fw_lock);
break;
}

and if this code can race (which it obviously can, since your addition of
fw_lock mutex matters), then I think it can race on that FW_STATUS_LOADING
bit too. No?

So my gut feel is that the whole damn function should be protected by the
mutex_lock thing. IOW, the patch would be something like the appended.

UNTESTED. Somebody needs to test this, verify, and send it back to me.

Am I missing something?

Linus

---
drivers/base/firmware_class.c | 9 ++++-----
1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 7376367..1b803df 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -150,13 +150,12 @@ static ssize_t firmware_loading_store(struct device *dev,
int loading = simple_strtol(buf, NULL, 10);
int i;

+ mutex_lock(&fw_lock);
switch (loading) {
case 1:
- mutex_lock(&fw_lock);
- if (!fw_priv->fw) {
- mutex_unlock(&fw_lock);
+ if (!fw_priv->fw)
break;
- }
+
vfree(fw_priv->fw->data);
fw_priv->fw->data = NULL;
for (i = 0; i < fw_priv->nr_pages; i++)
@@ -167,7 +166,6 @@ static ssize_t firmware_loading_store(struct device *dev,
fw_priv->nr_pages = 0;
fw_priv->fw->size = 0;
set_bit(FW_STATUS_LOADING, &fw_priv->status);
- mutex_unlock(&fw_lock);
break;
case 0:
if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
@@ -195,6 +193,7 @@ static ssize_t firmware_loading_store(struct device *dev,
fw_load_abort(fw_priv);
break;
}
+ mutex_unlock(&fw_lock);

return count;
}

2009-09-24 15:14:03

by Sachin Sant

[permalink] [raw]
Subject: Re: [patch -stable] firware_class oops: fix firmware_loading_store locking

Linus Torvalds wrote:
> I don't think this is correct.
>
> I think you should protect the FW_STATUS_LOADING bit too, shouldn't you?
>
> As it is, it does this:
>
> if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
> mutex_lock(&fw_lock);
> ...
> clear_bit(FW_STATUS_LOADING, &fw_priv->status);
> mutex_unlock(&fw_lock);
> break;
> }
>
> and if this code can race (which it obviously can, since your addition of
> fw_lock mutex matters), then I think it can race on that FW_STATUS_LOADING
> bit too. No?
>
> So my gut feel is that the whole damn function should be protected by the
> mutex_lock thing. IOW, the patch would be something like the appended.
>
> UNTESTED. Somebody needs to test this, verify, and send it back to me.
>
I did a quick boot test with this patch and didn't find any issues.

But that said i haven't been able to recreate the problem reported by Lars,
so not sure how relevant would be the test results from me.

Thanks
-Sachin


--

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

2009-09-24 15:26:53

by Frederik Deweerdt

[permalink] [raw]
Subject: Re: [patch -stable] firware_class oops: fix firmware_loading_store locking

On Wed, Sep 23, 2009 at 09:42:41AM -0700, Linus Torvalds wrote:
>
>
> On Mon, 21 Sep 2009, Frederik Deweerdt wrote:
> <
> > I'd rather wait someone picks it up for mainline inclusion. I've added
> > your {Reported,Tested}-by tags.
> >
> > The bug its vanilla 2.6.31, and should be considered for -stable inclusion.
> >
> > Regards,
> > Frederik
> >
> > ----
> >
> > The code introduced by commit 6e03a201bbe8137487f340d26aa662110e324b20 leads
> > to a potential null deref. The following patch adds the proper locking
> > around the accesses to fw_priv->fw.
> > See http://bugzilla.kernel.org/show_bug.cgi?id=14185 for a full bug report.
>
> I don't think this is correct.
>
> I think you should protect the FW_STATUS_LOADING bit too, shouldn't you?
>
> As it is, it does this:
>
> if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
> mutex_lock(&fw_lock);
> ...
> clear_bit(FW_STATUS_LOADING, &fw_priv->status);
> mutex_unlock(&fw_lock);
> break;
> }
>
> and if this code can race (which it obviously can, since your addition of
> fw_lock mutex matters), then I think it can race on that FW_STATUS_LOADING
> bit too. No?
>
> So my gut feel is that the whole damn function should be protected by the
> mutex_lock thing. IOW, the patch would be something like the appended.
>
> UNTESTED. Somebody needs to test this, verify, and send it back to me.
>
> Am I missing something?
You're right, the status must be protected too, but we would want to
keep the check on fw_priv->fw not being NULL (patch follows).

I cannot reproduce the bug here, Lars could you please have a look ?

Regards,
Frederik

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 7376367..21ac040 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -150,13 +150,15 @@ static ssize_t firmware_loading_store(struct device *dev,
int loading = simple_strtol(buf, NULL, 10);
int i;

+
+ mutex_lock(&fw_lock);
+ if (!fw_priv->fw) {
+ mutex_unlock(&fw_lock);
+ return -ENODEV;
+ }
+
switch (loading) {
case 1:
- mutex_lock(&fw_lock);
- if (!fw_priv->fw) {
- mutex_unlock(&fw_lock);
- break;
- }
vfree(fw_priv->fw->data);
fw_priv->fw->data = NULL;
for (i = 0; i < fw_priv->nr_pages; i++)
@@ -167,7 +169,6 @@ static ssize_t firmware_loading_store(struct device *dev,
fw_priv->nr_pages = 0;
fw_priv->fw->size = 0;
set_bit(FW_STATUS_LOADING, &fw_priv->status);
- mutex_unlock(&fw_lock);
break;
case 0:
if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
@@ -195,6 +196,7 @@ static ssize_t firmware_loading_store(struct device *dev,
fw_load_abort(fw_priv);
break;
}
+ mutex_unlock(&fw_lock);

return count;
}