Hi,
A few moments ago a system of mine running 2.6.21 on a P4 with
hyperthreading, 2GB ram, IDE disk, crashed:
[10371.128320] BUG: unable to handle kernel paging request at virtual address 00100100
[10371.128419] printing eip:
[10371.128462] c118ebb3
[10371.128502] *pde = 00000000
[10371.128544] Oops: 0000 [#1]
[10371.128584] SMP
[10371.128691] Modules linked in: tuner tvaudio bttv video_buf ir_common i2c_algo_bit btcx_risc tveeprom wcfxo pl2303 zaptel usbserial pwc nfs w83627hf hwmon_vid eeprom plusb i2c_isa i2c_core usbnet snd_pcm_oss snd_mixer_oss snd_intel8x0 snd_ac97_codec sd_mod snd_pcm scsi_mod snd_timer snd ide_cd soundcore cdrom snd_page_alloc ac97_bus parport_pc parport microcode firmware_class netconsole nfsd exportfs lockd sunrpc ipt_owner ip6table_filter ip6_tables ipv6 ipt_recent xt_limit xt_state act_police sch_ingress cls_u32 sch_sfq sch_cbq ipt_REJECT ipt_MASQUERADE ipt_TOS xt_tcpudp iptable_mangle iptable_filter rtl8150 e1000 3c59x mii iptable_nat ip_tables nf_nat nf_conntrack_ipv4 nf_conntrack nfnetlink x_tables capability commoncap ppp_deflate zlib_deflate zlib_inflate ppp_async crc_ccitt ppp_generic slip slhc genrtc rd
[10371.131693] CPU: 0
[10371.131694] EIP: 0060:[<c118ebb3>] Not tainted VLI
[10371.131696] EFLAGS: 00210046 (2.6.21 #3)
[10371.131825] EIP is at hiddev_send_event+0xa1/0xd3
[10371.131869] eax: 000ffaec ebx: 000ffaec ecx: 00020001 edx: 00100100
[10371.131917] esi: c133ee20 edi: f74042b0 ebp: c133ee18 esp: c133ee04
[10371.131963] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
[10371.132008] Process snmpget (pid: 5471, ti=c133e000 task=ca0de030 task.ti=cc954000)
[10371.132052] Stack: 00000000 f7404280 c2725164 f7671000 c2725164 c133ee48 c118ec41 00000001
[10371.132418] 00000000 00000000 00000005 ffa10003 00000000 00200046 f7404288 f7671000
[10371.132800] c118ebe5 c133ee64 c119df5a 00000000 c27250c0 e9518580 00000000 00000008
[10371.133169] Call Trace:
[10371.133251] [<c1004d53>] show_trace_log_lvl+0x1a/0x30
[10371.133336] [<c1004e0a>] show_stack_log_lvl+0x8d/0xaa
[10371.133416] [<c1005044>] show_registers+0x1cd/0x2cb
[10371.133496] [<c10052a1>] die+0x11b/0x227
[10371.133575] [<c121181f>] do_page_fault+0x319/0x57a
[10371.133662] [<c120fdfc>] error_code+0x7c/0x84
[10371.133746] [<c118ec41>] hiddev_hid_event+0x5c/0x63
[10371.133828] [<c119df5a>] hid_process_event+0x6e/0x7b
[10371.133913] [<c119e0f7>] hid_input_field+0x190/0x328
[10371.134003] [<c119e671>] hid_input_report+0xb3/0xf6
[10371.134086] [<c118cfe7>] hid_irq_in+0x164/0x169
[10371.134164] [<c1173dba>] usb_hcd_giveback_urb+0x47/0xa4
[10371.134243] [<c118b128>] uhci_giveback_urb+0x74/0x130
[10371.134322] [<c118b271>] uhci_scan_qh+0x8d/0x1ed
[10371.134399] [<c118b532>] uhci_scan_schedule+0x81/0x113
[10371.134476] [<c118c215>] uhci_irq+0xb4/0x14d
[10371.134553] [<c1173e3c>] usb_hcd_irq+0x25/0x5d
[10371.134632] [<c104db36>] handle_IRQ_event+0x28/0x5b
[10371.134711] [<c104ecb9>] handle_fasteoi_irq+0x67/0xbe
[10371.134788] [<c1005dd7>] do_IRQ+0x86/0xe5
[10371.134867] =======================
[10371.134908] Code: 00 06 00 00 b9 01 00 02 00 89 54 18 14 8d 83 0c 06 00 00 ba 1d 00 00 00 e8 7c be ee ff 8b 93 14 06 00 00 8d 82 ec f9 ff ff 89 c3 <8b> 80 14 06 00 00 0f 18 00 90 39 fa 75 80 8b 45 f0 b9 01 00 00
[10371.137290] EIP: [<c118ebb3>] hiddev_send_event+0xa1/0xd3 SS:ESP 0068:c133ee04
[10371.137413] Kernel panic - not syncing: Fatal exception in interrupt
[10371.137460] BUG: at arch/i386/kernel/smp.c:546 smp_call_function()
[10371.137504] [<c1004d53>] show_trace_log_lvl+0x1a/0x30
[10371.137583] [<c1004d7b>] show_trace+0x12/0x14
[10371.137662] [<c1004e75>] dump_stack+0x16/0x18
[10371.137740] [<c100df63>] smp_call_function+0x10f/0x114
[10371.137819] [<c100dfb9>] smp_send_stop+0x1e/0x31
[10371.137897] [<c101ca83>] panic+0x50/0xf5
[10371.137976] [<c100539e>] die+0x218/0x227
[10371.138054] [<c121181f>] do_page_fault+0x319/0x57a
[10371.138136] [<c120fdfc>] error_code+0x7c/0x84
[10371.138213] [<c118ec41>] hiddev_hid_event+0x5c/0x63
[10371.138293] [<c119df5a>] hid_process_event+0x6e/0x7b
[10371.138378] [<c119e0f7>] hid_input_field+0x190/0x328
[10371.138458] [<c119e671>] hid_input_report+0xb3/0xf6
[10371.138541] [<c118cfe7>] hid_irq_in+0x164/0x169
[10371.138633] [<c1173dba>] usb_hcd_giveback_urb+0x47/0xa4
[10371.138719] [<c118b128>] uhci_giveback_urb+0x74/0x130
I have also reported other bugs, about circular lock dependencies, whose traces pass through this same show_trace_log_lvl part.
Folkert van Heusden
--
Temperature outside: 23.562500, temperature livingroom: 24.4
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com
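A side note on the fault address, offered as a sketch rather than a diagnosis: 00100100 is the value of LIST_POISON1, which list_del() writes into the next pointer of an entry it has removed. Reading the "Code:" bytes as using a displacement of 0x614 for the list node (an assumption taken from the `14 06 00 00` operands, not from the source), the arithmetic that list_for_each_entry() performs reproduces both eax and the faulting address. The helper names below are hypothetical.

```c
#include <stdint.h>

/* LIST_POISON1 as defined in include/linux/poison.h for kernels of this era. */
#define LIST_POISON1 0x00100100u

/* Assumption: offset of the list node inside struct hiddev_list, read off
 * the displacement in the "Code:" bytes (14 06 00 00 -> 0x614). */
#define NODE_OFFSET 0x614u

/* Step performed by list_for_each_entry(): turn a pointer to the embedded
 * node into a pointer to the containing entry (container_of). */
uint32_t entry_from_node(uint32_t node_ptr)
{
	return node_ptr - NODE_OFFSET;
}

/* Address the iterator reads next in order to fetch entry->node.next. */
uint32_t next_load_address(uint32_t entry)
{
	return entry + NODE_OFFSET;
}
```

With node_ptr = 0x00100100 the first helper gives 0x000ffaec, which matches eax and ebx in the register dump, and the second gives back 0x00100100, the address in the BUG line. That is consistent with the list walk stepping onto an entry that had just been deleted.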
On Sun, 6 May 2007, Folkert van Heusden wrote:
> A few moments ago a system of mine running 2.6.21 on a P4 with
> hyperthreading, 2GB ram, IDE disk, crashed:
> [10371.128320] BUG: unable to handle kernel paging request at virtual address 00100100
> [10371.128419] printing eip:
[...]
> [10371.131825] EIP is at hiddev_send_event+0xa1/0xd3
Hi,
I will look into it. Are you able to reproduce the problem, or did it
happen just randomly? Is there any userspace driver using the hiddev
interface at the moment it crashes?
There have been no changes in the hiddev code for a pretty long time.
Thanks,
--
Jiri Kosina
Hi,
> > A few moments ago a system of mine running 2.6.21 on a P4 with
> > hyperthreading, 2GB ram, IDE disk, crashed:
> > [10371.128320] BUG: unable to handle kernel paging request at virtual address 00100100
> > [10371.128419] printing eip:
> [...]
> > [10371.131825] EIP is at hiddev_send_event+0xa1/0xd3
>
> I will look into it. Are you able to reproduce the problem, or did it
> happen just randomly? Is there any userspace driver using the hiddev
> interface at the moment it crashes?
> There have been no changes in the hiddev code for a pretty long time.
It is the first time this has happened, although with 2.6.21 I get relatively
frequent warnings about circular lock dependencies. This is the first time
one of them was followed by an oops.
Connected via HID are 2 temperature sensors and a UPS. They have been running
fine for ages.
Folkert van Heusden
--
MultiTail is a flexible tool for following log files and for executing
commands. Filtering, adding colors, merging, and a view of the
differences. http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com
On Sun, 6 May 2007, Folkert van Heusden wrote:
> > > A few moments ago a system of mine running 2.6.21 on a P4 with
> > > hyperthreading, 2GB ram, IDE disk, crashed: [10371.128320] BUG:
> > > unable to handle kernel paging request at virtual address 00100100
> > > [10371.128419] printing eip:
> > [...]
> > > [10371.131825] EIP is at hiddev_send_event+0xa1/0xd3
> >
> > I will look into it. Are you able to reproduce the problem, or did it
> > happen just randomly? Is there any userspace driver using the hiddev
> > interface at the moment it crashes?
> > There have been no changes in the hiddev code for a pretty long time.
[...]
> This is the first time it was followed by an oops. Connected via hid are
> 2 temperature sensors and an UPS. Has been running fine for ages.
OK, I think the patch below should solve this. I am pretty surprised,
though, that this bug wasn't triggered or reported by anyone sooner;
it has been there for ages too (but OK, the race window should be pretty
small and hiddev is usually not a high-throughput interface).
Could you please test it and let me know?
From: Jiri Kosina <[email protected]>
USB HID: hiddev - fix race between hiddev_send_event() and hiddev_release()
There is a small race window in which hiddev_release() could corrupt the
list that is being processed for a new event in hiddev_send_event().
Synchronize the operations over this list.
Signed-off-by: Jiri Kosina <[email protected]>
diff --git a/drivers/usb/input/hiddev.c b/drivers/usb/input/hiddev.c
index a8b3d66..488d61b 100644
--- a/drivers/usb/input/hiddev.c
+++ b/drivers/usb/input/hiddev.c
@@ -51,6 +51,7 @@ struct hiddev {
 	wait_queue_head_t wait;
 	struct hid_device *hid;
 	struct list_head list;
+	spinlock_t list_lock;
 };
 
 struct hiddev_list {
@@ -161,7 +162,9 @@ static void hiddev_send_event(struct hid
 {
 	struct hiddev *hiddev = hid->hiddev;
 	struct hiddev_list *list;
+	unsigned long flags;
 
+	spin_lock_irqsave(&hiddev->list_lock, flags);
 	list_for_each_entry(list, &hiddev->list, node) {
 		if (uref->field_index != HID_FIELD_INDEX_NONE ||
 		    (list->flags & HIDDEV_FLAG_REPORT) != 0) {
@@ -171,6 +174,7 @@ static void hiddev_send_event(struct hid
 			kill_fasync(&list->fasync, SIGIO, POLL_IN);
 		}
 	}
+	spin_unlock_irqrestore(&hiddev->list_lock, flags);
 
 	wake_up_interruptible(&hiddev->wait);
 }
@@ -235,9 +239,13 @@ static int hiddev_fasync(int fd, struct
 static int hiddev_release(struct inode * inode, struct file * file)
 {
 	struct hiddev_list *list = file->private_data;
+	unsigned long flags;
 
 	hiddev_fasync(-1, file, 0);
+
+	spin_lock_irqsave(&list->hiddev->list_lock, flags);
 	list_del(&list->node);
+	spin_unlock_irqrestore(&list->hiddev->list_lock, flags);
 
 	if (!--list->hiddev->open) {
 		if (list->hiddev->exist)
@@ -257,6 +265,7 @@ static int hiddev_release(struct inode *
 static int hiddev_open(struct inode *inode, struct file *file)
 {
 	struct hiddev_list *list;
+	unsigned long flags;
 
 	int i = iminor(inode) - HIDDEV_MINOR_BASE;
 
@@ -267,7 +276,11 @@ static int hiddev_open(struct inode *ino
 		return -ENOMEM;
 
 	list->hiddev = hiddev_table[i];
+
+	spin_lock_irqsave(&list->hiddev->list_lock, flags);
 	list_add_tail(&list->node, &hiddev_table[i]->list);
+	spin_unlock_irqrestore(&list->hiddev->list_lock, flags);
+
 	file->private_data = list;
 
 	if (!list->hiddev->open++)
@@ -773,6 +786,7 @@ int hiddev_connect(struct hid_device *hi
 
 	init_waitqueue_head(&hiddev->wait);
 	INIT_LIST_HEAD(&hiddev->list);
+	spin_lock_init(&hiddev->list_lock);
 	hiddev->hid = hid;
 	hiddev->exist = 1;
 
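For anyone who wants to see the locking pattern outside the kernel, here is a minimal userspace sketch of the same idea: every walk, insertion and removal on the list happens under one lock. It is an analogy only. It assumes a pthread mutex as a stand-in for the spinlock (a mutex cannot be taken from interrupt context, which is why the patch uses spin_lock_irqsave()), and the struct and function names (dev, dev_add, dev_del, dev_count) are hypothetical, not taken from hiddev.c.

```c
#include <pthread.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, in the spirit of list_head. */
struct node {
	struct node *next, *prev;
};

struct dev {
	struct node head;          /* plays the role of hiddev->list */
	pthread_mutex_t list_lock; /* plays the role of hiddev->list_lock */
};

void dev_init(struct dev *d)
{
	d->head.next = d->head.prev = &d->head;
	pthread_mutex_init(&d->list_lock, NULL);
}

/* Append under the lock, as hiddev_open() does after the patch. */
void dev_add(struct dev *d, struct node *n)
{
	pthread_mutex_lock(&d->list_lock);
	n->prev = d->head.prev;
	n->next = &d->head;
	d->head.prev->next = n;
	d->head.prev = n;
	pthread_mutex_unlock(&d->list_lock);
}

/* Unlink under the lock, as hiddev_release() does after the patch. */
void dev_del(struct dev *d, struct node *n)
{
	pthread_mutex_lock(&d->list_lock);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL; /* the kernel stores poison values here */
	pthread_mutex_unlock(&d->list_lock);
}

/* Walk under the lock, as hiddev_send_event() does after the patch.
 * Without the lock, a concurrent dev_del() could invalidate n->next
 * between two steps of the loop. */
int dev_count(struct dev *d)
{
	int count = 0;

	pthread_mutex_lock(&d->list_lock);
	for (struct node *n = d->head.next; n != &d->head; n = n->next)
		count++;
	pthread_mutex_unlock(&d->list_lock);
	return count;
}
```

The point of the sketch is only that the reader and the two writers share a single lock, so the walker can never observe a half-unlinked entry.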
On Mon, 7 May 2007, Jiri Kosina wrote:
> OK, I think the patch below should solve this. I am pretty surprised
> though that this bug wasn't triggered/reported by anyone anytime sooner,
> it has been there for ages too (but ok, the race window should be pretty
> small and hiddev is usually not high-throughput interface).
> Could you please test it and let me know?
> From: Jiri Kosina <[email protected]>
> USB HID: hiddev - fix race between hiddev_send_event() and hiddev_release()
Folkert,
did you have time to verify whether this patch fixes the issue for you?
Thanks,
--
Jiri Kosina
Hi,
> > OK, I think the patch below should solve this. I am pretty surprised
> > though that this bug wasn't triggered/reported by anyone anytime sooner,
> > it has been there for ages too (but ok, the race window should be pretty
> > small and hiddev is usually not high-throughput interface).
> > Could you please test it and let me know?
> > From: Jiri Kosina <[email protected]>
> > USB HID: hiddev - fix race between hiddev_send_event() and hiddev_release()
> did you have time to verify whether this patch fixes the issue for you?
It looks like this solves the hid-problem. But I still get the
occasional 'not responding but still forwarding network-traffic(!)' and
'ext3 circular lock failure' problems.
Folkert van Heusden
--
MultiTail is a versatile tool for log files and the output of
commands. It offers: filtering, coloring, merging,
different views. http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com