2009-04-30 13:20:13

by Marc Pignat

[permalink] [raw]
Subject: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi all!

My bluetooth keyboard is not working any more in rc4, but was working in rc3.

Here is the dmesg output, triggered by the first key press on the keyboard, fortunately, this
is 100% reproductible (once per boot).

Attached my .config and the full dmesg output from boot, just in case.

Best regards

Marc


[ 1492.544731] ------------[ cut here ]------------
[ 1492.544741] WARNING: at kernel/workqueue.c:371 flush_cpu_workqueue+0x2d/0x77()
[ 1492.544748] Hardware name: System Product Name
[ 1492.544753] Modules linked in: btusb ipv6 nfs lockd nfs_acl auth_rpcgss sunrpc loop snd_cmipci tuner gameport snd_hda_codec_atihdmi snd_opl3_lib cx88_alsa cx8800 cx88xx snd_mpu401_uart ir_common snd_hda_codec_via tveeprom snd_seq_midi videobuf_dma_sg snd_hda_intel snd_hda_codec psmouse videobuf_core snd_rawmidi snd_hwdep serio_raw btcx_risc wmi pcspkr
[ 1492.544820] Pid: 324, comm: bluetooth Not tainted 2.6.30-rc4 #8
[ 1492.544825] Call Trace:
[ 1492.544837] [<ffffffff80246405>] ? warn_slowpath+0xd8/0x10a
[ 1492.544850] [<ffffffff8037b8a2>] ? vsnprintf+0x3b8/0x3f7
[ 1492.544862] [<ffffffff802cd069>] ? path_lookup_open+0x83/0x91
[ 1492.544872] [<ffffffff8037b9eb>] ? snprintf+0x44/0x4c
[ 1492.544882] [<ffffffff8020f631>] ? __switch_to+0xab/0x25a
[ 1492.544892] [<ffffffff8023f33f>] ? dequeue_entity+0xf/0x11f
[ 1492.544901] [<ffffffff805a5462>] ? _spin_unlock_irq+0x36/0x44
[ 1492.544910] [<ffffffff8023d646>] ? finish_task_switch+0x47/0xd8
[ 1492.544921] [<ffffffff805a3bf8>] ? thread_return+0x4c/0xa3
[ 1492.544931] [<ffffffff8025593d>] ? flush_cpu_workqueue+0x2d/0x77
[ 1492.544940] [<ffffffff8023d646>] ? finish_task_switch+0x47/0xd8
[ 1492.544950] [<ffffffff8022af3f>] ? default_spin_lock_flags+0x5/0xa
[ 1492.544959] [<ffffffff805a5257>] ? _spin_lock_irqsave+0x30/0x3d
[ 1492.544968] [<ffffffff80255b95>] ? flush_workqueue+0x33/0x55
[ 1492.544978] [<ffffffff80580553>] ? add_conn+0x14/0x39
[ 1492.544987] [<ffffffff80255687>] ? worker_thread+0x1c2/0x26a
[ 1492.544996] [<ffffffff802590aa>] ? autoremove_wake_function+0x0/0x2e
[ 1492.545006] [<ffffffff802554c5>] ? worker_thread+0x0/0x26a
[ 1492.545015] [<ffffffff802554c5>] ? worker_thread+0x0/0x26a
[ 1492.545023] [<ffffffff80258c88>] ? kthread+0x54/0x80
[ 1492.545033] [<ffffffff80243b77>] ? schedule_tail+0x2b/0x62
[ 1492.545042] [<ffffffff80211bfa>] ? child_rip+0xa/0x20
[ 1492.545050] [<ffffffff80258c34>] ? kthread+0x0/0x80
[ 1492.545058] [<ffffffff80211bf0>] ? child_rip+0x0/0x20
[ 1492.545064] ---[ end trace e4644449f8ce64bf ]---
[ 1492.613313] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[ 1492.617028] IP: [<ffffffff803120c7>] sysfs_addrm_start+0x25/0xa5
[ 1492.617028] PGD 0
[ 1492.617028] Oops: 0000 [#1] PREEMPT SMP
[ 1492.617028] last sysfs file: /sys/devices/platform/it87.656/pwm1
[ 1492.617028] CPU 3
[ 1492.617028] Modules linked in: btusb ipv6 nfs lockd nfs_acl auth_rpcgss sunrpc loop snd_cmipci tuner gameport snd_hda_codec_atihdmi snd_opl3_lib cx88_alsa cx8800 cx88xx snd_mpu401_uart ir_common snd_hda_codec_via tveeprom snd_seq_midi videobuf_dma_sg snd_hda_intel snd_hda_codec psmouse videobuf_core snd_rawmidi snd_hwdep serio_raw btcx_risc wmi pcspkr
[ 1492.617028] Pid: 3556, comm: hidd Tainted: G W 2.6.30-rc4 #8 System Product Name
[ 1492.617028] RIP: 0010:[<ffffffff803120c7>] [<ffffffff803120c7>] sysfs_addrm_start+0x25/0xa5
[ 1492.617028] RSP: 0018:ffff88010ddd7a48 EFLAGS: 00010286
[ 1492.617028] RAX: ffff8800c59f1898 RBX: 0000000000000000 RCX: 0000000000000000
[ 1492.617028] RDX: fffffffffffffff8 RSI: 0000000000000000 RDI: ffffffff8074f1c0
[ 1492.617028] RBP: ffff88010ddd7a68 R08: 00000000000025c3 R09: ffff88010ddd7a2c
[ 1492.617028] R10: ffff88010cd64360 R11: ffff8801071d9250 R12: 00000000fffffff4
[ 1492.617028] R13: 0000000000000000 R14: ffff88010ddd7ac0 R15: ffff8800ca4710e8
[ 1492.617028] FS: 00007f0ba75f56f0(0000) GS:ffff88002806b000(0000) knlGS:0000000000000000
[ 1492.617028] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1492.617028] CR2: 0000000000000038 CR3: 000000010b4bb000 CR4: 00000000000006e0
[ 1492.617028] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1492.617028] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1492.617028] Process hidd (pid: 3556, threadinfo ffff88010ddd6000, task ffff88010c5747c0)
[ 1492.617028] Stack:
[ 1492.617028] 0000000000000000 ffff8800c59f1898 ffff88010dc385a0 ffffffff80312608
[ 1492.617028] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1492.740068] ffff8800c59f1898 00000000fffffffe 00000000fffffff4 ffff8800c59f0000
[ 1492.740068] Call Trace:
[ 1492.740068] [<ffffffff80312608>] ? create_dir+0x44/0x7c
[ 1492.740068] [<ffffffff80312675>] ? sysfs_create_dir+0x35/0x4a
[ 1492.740068] [<ffffffff805a5560>] ? _spin_unlock+0x2f/0x3d
[ 1492.740068] [<ffffffff803762cf>] ? kobject_add_internal+0xd0/0x186
[ 1492.740068] [<ffffffff80376531>] ? kobject_add+0x74/0x7c
[ 1492.740068] [<ffffffff802be5cb>] ? ____cache_alloc+0x18/0x222
[ 1492.740068] [<ffffffff802be915>] ? __kmalloc+0x140/0x14c
[ 1492.740068] [<ffffffff8042c40e>] ? device_add+0xd1/0x50c
[ 1492.740068] [<ffffffff8057c9b1>] ? hci_get_route+0xae/0xbc
[ 1492.740068] [<ffffffff804ceb24>] ? hid_add_device+0x143/0x159
[ 1492.740068] [<ffffffff8058b74a>] ? hidp_add_connection+0x35d/0x5de
[ 1492.740068] [<ffffffff802c44eb>] ? __rcu_read_unlock+0xe/0x2a
[ 1492.740068] [<ffffffff8058c421>] ? hidp_sock_ioctl+0xf0/0x22b
[ 1492.740068] [<ffffffff805a50c3>] ? _spin_lock+0x1a/0x20
[ 1492.740068] [<ffffffff802a6451>] ? __do_fault+0x32e/0x376
[ 1492.740068] [<ffffffff80505a49>] ? sockfd_lookup_light+0x1a/0x51
[ 1492.740068] [<ffffffff8050573d>] ? sock_ioctl+0x1e7/0x20a
[ 1492.740068] [<ffffffff802cefda>] ? vfs_ioctl+0x21/0x6c
[ 1492.740068] [<ffffffff802cf457>] ? do_vfs_ioctl+0x432/0x46b
[ 1492.740068] [<ffffffff802cf4e1>] ? sys_ioctl+0x51/0x70
[ 1492.740068] [<ffffffff80210b42>] ? system_call_fastpath+0x16/0x1b
[ 1492.740068] Code: 44 89 f0 41 5e c3 55 31 c0 b9 08 00 00 00 48 89 fd 53 48 89 f3 48 83 ec 08 f3 ab 48 89 75 00 48 c7 c7 c0 f1 74 80 e8 3e 20 29 00 <48> 8b 73 38 48 8b 3d a6 18 60 00 48 89 d9 48 c7 c2 ec 1b 31 80
[ 1492.740068] RIP [<ffffffff803120c7>] sysfs_addrm_start+0x25/0xa5
[ 1492.740068] RSP <ffff88010ddd7a48>
[ 1492.740068] CR2: 0000000000000038
[ 1492.741545] ---[ end trace e4644449f8ce64c0 ]---


Attachments:
(No filename) (6.25 kB)
.config (93.53 kB)
kern.log (82.42 kB)
Download all attachments

2009-04-30 22:34:14

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi Jiri,

> > > > My bluetooth keyboard is not working any more in rc4, but was
> > > > working in rc3. Here is the dmesg output, triggered by the first key
> > > > press on the keyboard, fortunately, this is 100% reproductible (once
> > > > per boot).
> > > Does reverting f3784d834c7 fix the problem?
> > we have seen a similar report where reverting f3784d834c7 didn't fix it.
> > And I don't see anything wrong with that patch. Did something important
> > got changed in the work queue code that I am missing?
>
> Calling flush() from work->func() is not safe. That's what the WARN_ON()
> in flush_cpu_workqueue() is there for, right?

I don't know since this got changed in 2.6.30-rc1. I do have this kernel
running and I have seen the WARN_ON() only once. However I have never
seen a NULL pointer because of this patch.

Regards

Marcel



2009-04-30 18:58:57

by Jiri Kosina

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

On Thu, 30 Apr 2009, Marcel Holtmann wrote:

> > > My bluetooth keyboard is not working any more in rc4, but was
> > > working in rc3. Here is the dmesg output, triggered by the first key
> > > press on the keyboard, fortunately, this is 100% reproductible (once
> > > per boot).
> > Does reverting f3784d834c7 fix the problem?
> we have seen a similar report where reverting f3784d834c7 didn't fix it.
> And I don't see anything wrong with that patch. Did something important
> got changed in the work queue code that I am missing?

Calling flush() from work->func() is not safe. That's what the WARN_ON()
in flush_cpu_workqueue() is there for, right?

--
Jiri Kosina
SUSE Labs

2009-04-30 15:03:03

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi Jiri,

> > My bluetooth keyboard is not working any more in rc4, but was working in
> > rc3.
> > Here is the dmesg output, triggered by the first key press on the
> > keyboard, fortunately, this is 100% reproductible (once per boot).
>
> Does reverting f3784d834c7 fix the problem?

we have seen a similar report where reverting f3784d834c7 didn't fix it.
And I don't see anything wrong with that patch. Did something important
got changed in the work queue code that I am missing?

Regards

Marcel



2009-04-30 13:57:08

by Jiri Kosina

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

On Thu, 30 Apr 2009, Marc Pignat wrote:

> My bluetooth keyboard is not working any more in rc4, but was working in
> rc3.
> Here is the dmesg output, triggered by the first key press on the
> keyboard, fortunately, this is 100% reproductible (once per boot).

Does reverting f3784d834c7 fix the problem?

--
Jiri Kosina
SUSE Labs


2009-05-04 07:57:33

by Roger Quadros

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

ext Marcel Holtmann wrote:
> Hi Marc,
>
> you really need to fix your mailer. You keep breaking the threading all
> the time and that is clearly your fault.
>
>>> thanks for testing.
>>> Marc, Roger, can you test this and confirm that it works for you and
>>> doesn't have any other side effects. Then I prepare it work upstream
>>> inclusion.
>> Tested and working!
>
> great. So I queue this up for 2.6.30 inclusion.
>
>> Exercise left for 2.6.31: use the system wide wq ;)
>
> Actually no, the private workqueue is used on purpose. No need to bother
> keventd for a regular task.
>
> Regards
>
> Marcel
>
>

Hi Marcel,

Yes it works for me too. Thanks.

regards,
-roger

2009-05-03 21:18:26

by Marc Pignat

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi Marcel!

On Sunday 03 May 2009 22.17:01 Marcel Holtmann wrote:
> Hi Marc,
>
> you really need to fix your mailer. You keep breaking the threading all
> the time and that is clearly your fault.

Sorry for that, I used a web client, should be fixed now.

...
> great. So I queue this up for 2.6.30 inclusion.

fine!

Regards

Marc

2009-05-03 20:17:01

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi Marc,

you really need to fix your mailer. You keep breaking the threading all
the time and that is clearly your fault.

> > thanks for testing.
>
> > Marc, Roger, can you test this and confirm that it works for you and
> > doesn't have any other side effects. Then I prepare it work upstream
> > inclusion.
>
> Tested and working!

great. So I queue this up for 2.6.30 inclusion.

> Exercise left for 2.6.31: use the system wide wq ;)

Actually no, the private workqueue is used on purpose. No need to bother
keventd for a regular task.

Regards

Marcel



2009-05-03 14:55:33

by Marc Pignat

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

>>> Marcel Holtmann <[email protected]> 05/02/09 11:15 PM >>>
...
> thanks for testing.

> Marc, Roger, can you test this and confirm that it works for you and
> doesn't have any other side effects. Then I prepare it work upstream
> inclusion.

Tested and working!

Exercise left for 2.6.31: use the system wide wq ;)

Best regards

Marc

2009-05-02 21:31:40

by Justin P. Mattock

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

On Sat, 2009-05-02 at 14:14 -0700, Marcel Holtmann wrote:
> Hi Justin,
>
> > > > >> >>> > Subject: bluetooth: Fix serialization when adding/deleting connections in hci_sysfs
> > > > >> >>> >
> > > > >> >>> > add_conn and del_conn should be serialized, but flush_workqueue can't be used
> > > > >> >>> > by the worker thread on it's own queue, so use flush_work to serialize add_conn
> > > > >> >>> > and del_conn against each other.
> > > > >> >>> >
> > > > >> >>> > Signed-off-by: Marc Pignat <[email protected]>
> > > > >> >>>
> > > > >> >>> Acked-by: Jiri Kosina <[email protected]>
> > > > >> >>>
> > > > >> >>> FWIW.
> > > > >> >>
> > > > >> >>nak from my side since I think it is the wrong fix. We really wanna wait
> > > > >> >>for all works to finish here. This includes work from other connection
> > > > >> >>attempts or terminations.
> > > > >> >
> > > > >> > IMHO, there is no need to wait for work currently running, since this is a
> > > > >> > singlethread workqueue.
> > > > >>
> > > > >> Yes, sounds right.
> > > > >>
> > > > >> >
> > > > >> > But it is perhaps simpler to use a lock (mutex or watherver locking primitive).
> > > > >>
> > > > >> I'm here a little bit late. Marcel, I'm quite busy recently, I just
> > > > >> see the commit and then this thread.
> > > > >>
> > > > >> Let me explain why I add two workqueue originally, because workqueue
> > > > >> will be defered, so we must guarantee "connection deletion" finished
> > > > >> before "connection adding with same bt addr", or the "connection
> > > > >> adding" will fail.
> > > > >>
> > > > >> On the other hand flush "adding" workqueue in "connection deletion"
> > > > >> function is not necessary.
> > > > >>
> > > > >> To fix this bug, I think we can just use the two work struct for
> > > > >> add/del, at the same time keeping the original two workqueue.
> > > > >>
> > > > >> Please see following patch for this, (building-test only, I have no
> > > > >> bluetooth device at hand, I can test this the day after tommorrow)
> > > > >
> > > > > so I spent the whole day figuring out what is going on here and we keep
> > > > > making the wrong assumptions over and over again.
> > > > >
> > > > > First of all, we only add the sysfs device when we have a successful
> > > > > connection. And we identify it with the handle. This means that we can
> > > > > NOT have any name clashes anymore since the controller has to make sure
> > > > > a handle is only assigned once. Previously we did this on the BD_ADDR
> > > > > value and that lead to it. That is no longer the case.
> > > > >
> > > > > Second of all the two work queues introduces way too much complexity for
> > > > > a really simple task of adding and removing a sysfs device entry.
> > > > >
> > > > > The real problem we have right now are that we are not initializing the
> > > > > sysfs device when creating the hci_conn. This is just wrong and can lead
> > > > > to all kinds of weird invalid data access. And as a result the adding of
> > > > > the sysfs device should only set the name and add it.
> > > > >
> > > > > We also check device_registered before making sure that device_add has
> > > > > been run. And instead of adding more locking or crazy work queue
> > > > > dependencies, we should use the single thread work queue to ensure the
> > > > > correct order of things.
> > > > >
> > > > > The attached patch introduces a hci_conn_init_sysfs step to make sure we
> > > > > setup the sysfs device correctly. I left the flush_work calls, but I
> > > > > think they are not needed since a del_conn before add_conn is no longer
> > > > > possible now.
> > > >
> > > > well it seems your not the only one
> > > > with a broken bluetooth(latest git pull
> > > > this morning)
> > >
> > > can you try the patch that I attached to the previous email. It should
> > > fix exactly this.
> >
> > nice patch.
> > works like a charm
> > bluetooth is alive again.
>
> thanks for testing.
>
> Marc, Roger, can you test this and confirm that it works for you and
> doesn't have any other side effects. Then I prepare it work upstream
> inclusion.
>
> Regards
>
> Marcel
>
>

no worries,
Thanks for the patch.

regards,

Justin P. Mattock

2009-05-02 21:14:33

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi Justin,

> > > >> >>> > Subject: bluetooth: Fix serialization when adding/deleting connections in hci_sysfs
> > > >> >>> >
> > > >> >>> > add_conn and del_conn should be serialized, but flush_workqueue can't be used
> > > >> >>> > by the worker thread on it's own queue, so use flush_work to serialize add_conn
> > > >> >>> > and del_conn against each other.
> > > >> >>> >
> > > >> >>> > Signed-off-by: Marc Pignat <[email protected]>
> > > >> >>>
> > > >> >>> Acked-by: Jiri Kosina <[email protected]>
> > > >> >>>
> > > >> >>> FWIW.
> > > >> >>
> > > >> >>nak from my side since I think it is the wrong fix. We really wanna wait
> > > >> >>for all works to finish here. This includes work from other connection
> > > >> >>attempts or terminations.
> > > >> >
> > > >> > IMHO, there is no need to wait for work currently running, since this is a
> > > >> > singlethread workqueue.
> > > >>
> > > >> Yes, sounds right.
> > > >>
> > > >> >
> > > >> > But it is perhaps simpler to use a lock (mutex or watherver locking primitive).
> > > >>
> > > >> I'm here a little bit late. Marcel, I'm quite busy recently, I just
> > > >> see the commit and then this thread.
> > > >>
> > > >> Let me explain why I add two workqueue originally, because workqueue
> > > >> will be defered, so we must guarantee "connection deletion" finished
> > > >> before "connection adding with same bt addr", or the "connection
> > > >> adding" will fail.
> > > >>
> > > >> On the other hand flush "adding" workqueue in "connection deletion"
> > > >> function is not necessary.
> > > >>
> > > >> To fix this bug, I think we can just use the two work struct for
> > > >> add/del, at the same time keeping the original two workqueue.
> > > >>
> > > >> Please see following patch for this, (building-test only, I have no
> > > >> bluetooth device at hand, I can test this the day after tommorrow)
> > > >
> > > > so I spent the whole day figuring out what is going on here and we keep
> > > > making the wrong assumptions over and over again.
> > > >
> > > > First of all, we only add the sysfs device when we have a successful
> > > > connection. And we identify it with the handle. This means that we can
> > > > NOT have any name clashes anymore since the controller has to make sure
> > > > a handle is only assigned once. Previously we did this on the BD_ADDR
> > > > value and that lead to it. That is no longer the case.
> > > >
> > > > Second of all the two work queues introduces way too much complexity for
> > > > a really simple task of adding and removing a sysfs device entry.
> > > >
> > > > The real problem we have right now are that we are not initializing the
> > > > sysfs device when creating the hci_conn. This is just wrong and can lead
> > > > to all kinds of weird invalid data access. And as a result the adding of
> > > > the sysfs device should only set the name and add it.
> > > >
> > > > We also check device_registered before making sure that device_add has
> > > > been run. And instead of adding more locking or crazy work queue
> > > > dependencies, we should use the single thread work queue to ensure the
> > > > correct order of things.
> > > >
> > > > The attached patch introduces a hci_conn_init_sysfs step to make sure we
> > > > setup the sysfs device correctly. I left the flush_work calls, but I
> > > > think they are not needed since a del_conn before add_conn is no longer
> > > > possible now.
> > >
> > > well it seems your not the only one
> > > with a broken bluetooth(latest git pull
> > > this morning)
> >
> > can you try the patch that I attached to the previous email. It should
> > fix exactly this.
>
> nice patch.
> works like a charm
> bluetooth is alive again.

thanks for testing.

Marc, Roger, can you test this and confirm that it works for you and
doesn't have any other side effects. Then I prepare it work upstream
inclusion.

Regards

Marcel



2009-05-02 21:07:18

by Justin P. Mattock

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

On Sat, 2009-05-02 at 13:21 -0700, Marcel Holtmann wrote:
> Hi Justin,
>
> > >> >>> > Subject: bluetooth: Fix serialization when adding/deleting connections in hci_sysfs
> > >> >>> >
> > >> >>> > add_conn and del_conn should be serialized, but flush_workqueue can't be used
> > >> >>> > by the worker thread on it's own queue, so use flush_work to serialize add_conn
> > >> >>> > and del_conn against each other.
> > >> >>> >
> > >> >>> > Signed-off-by: Marc Pignat <[email protected]>
> > >> >>>
> > >> >>> Acked-by: Jiri Kosina <[email protected]>
> > >> >>>
> > >> >>> FWIW.
> > >> >>
> > >> >>nak from my side since I think it is the wrong fix. We really wanna wait
> > >> >>for all works to finish here. This includes work from other connection
> > >> >>attempts or terminations.
> > >> >
> > >> > IMHO, there is no need to wait for work currently running, since this is a
> > >> > singlethread workqueue.
> > >>
> > >> Yes, sounds right.
> > >>
> > >> >
> > >> > But it is perhaps simpler to use a lock (mutex or watherver locking primitive).
> > >>
> > >> I'm here a little bit late. Marcel, I'm quite busy recently, I just
> > >> see the commit and then this thread.
> > >>
> > >> Let me explain why I add two workqueue originally, because workqueue
> > >> will be defered, so we must guarantee "connection deletion" finished
> > >> before "connection adding with same bt addr", or the "connection
> > >> adding" will fail.
> > >>
> > >> On the other hand flush "adding" workqueue in "connection deletion"
> > >> function is not necessary.
> > >>
> > >> To fix this bug, I think we can just use the two work struct for
> > >> add/del, at the same time keeping the original two workqueue.
> > >>
> > >> Please see following patch for this, (building-test only, I have no
> > >> bluetooth device at hand, I can test this the day after tommorrow)
> > >
> > > so I spent the whole day figuring out what is going on here and we keep
> > > making the wrong assumptions over and over again.
> > >
> > > First of all, we only add the sysfs device when we have a successful
> > > connection. And we identify it with the handle. This means that we can
> > > NOT have any name clashes anymore since the controller has to make sure
> > > a handle is only assigned once. Previously we did this on the BD_ADDR
> > > value and that lead to it. That is no longer the case.
> > >
> > > Second of all the two work queues introduces way too much complexity for
> > > a really simple task of adding and removing a sysfs device entry.
> > >
> > > The real problem we have right now are that we are not initializing the
> > > sysfs device when creating the hci_conn. This is just wrong and can lead
> > > to all kinds of weird invalid data access. And as a result the adding of
> > > the sysfs device should only set the name and add it.
> > >
> > > We also check device_registered before making sure that device_add has
> > > been run. And instead of adding more locking or crazy work queue
> > > dependencies, we should use the single thread work queue to ensure the
> > > correct order of things.
> > >
> > > The attached patch introduces a hci_conn_init_sysfs step to make sure we
> > > setup the sysfs device correctly. I left the flush_work calls, but I
> > > think they are not needed since a del_conn before add_conn is no longer
> > > possible now.
> >
> > well it seems your not the only one
> > with a broken bluetooth(latest git pull
> > this morning)
>
> can you try the patch that I attached to the previous email. It should
> fix exactly this.
>
> Regards
>
> Marcel
>
>

nice patch.
works like a charm
bluetooth is alive again.

Justin P. Mattock

2009-05-02 20:21:38

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi Justin,

> >> >>> > Subject: bluetooth: Fix serialization when adding/deleting connections in hci_sysfs
> >> >>> >
> >> >>> > add_conn and del_conn should be serialized, but flush_workqueue can't be used
> >> >>> > by the worker thread on it's own queue, so use flush_work to serialize add_conn
> >> >>> > and del_conn against each other.
> >> >>> >
> >> >>> > Signed-off-by: Marc Pignat <[email protected]>
> >> >>>
> >> >>> Acked-by: Jiri Kosina <[email protected]>
> >> >>>
> >> >>> FWIW.
> >> >>
> >> >>nak from my side since I think it is the wrong fix. We really wanna wait
> >> >>for all works to finish here. This includes work from other connection
> >> >>attempts or terminations.
> >> >
> >> > IMHO, there is no need to wait for work currently running, since this is a
> >> > singlethread workqueue.
> >>
> >> Yes, sounds right.
> >>
> >> >
> >> > But it is perhaps simpler to use a lock (mutex or watherver locking primitive).
> >>
> >> I'm here a little bit late. Marcel, I'm quite busy recently, I just
> >> see the commit and then this thread.
> >>
> >> Let me explain why I add two workqueue originally, because workqueue
> >> will be defered, so we must guarantee "connection deletion" finished
> >> before "connection adding with same bt addr", or the "connection
> >> adding" will fail.
> >>
> >> On the other hand flush "adding" workqueue in "connection deletion"
> >> function is not necessary.
> >>
> >> To fix this bug, I think we can just use the two work struct for
> >> add/del, at the same time keeping the original two workqueue.
> >>
> >> Please see following patch for this, (building-test only, I have no
> >> bluetooth device at hand, I can test this the day after tommorrow)
> >
> > so I spent the whole day figuring out what is going on here and we keep
> > making the wrong assumptions over and over again.
> >
> > First of all, we only add the sysfs device when we have a successful
> > connection. And we identify it with the handle. This means that we can
> > NOT have any name clashes anymore since the controller has to make sure
> > a handle is only assigned once. Previously we did this on the BD_ADDR
> > value and that lead to it. That is no longer the case.
> >
> > Second of all the two work queues introduces way too much complexity for
> > a really simple task of adding and removing a sysfs device entry.
> >
> > The real problem we have right now are that we are not initializing the
> > sysfs device when creating the hci_conn. This is just wrong and can lead
> > to all kinds of weird invalid data access. And as a result the adding of
> > the sysfs device should only set the name and add it.
> >
> > We also check device_registered before making sure that device_add has
> > been run. And instead of adding more locking or crazy work queue
> > dependencies, we should use the single thread work queue to ensure the
> > correct order of things.
> >
> > The attached patch introduces a hci_conn_init_sysfs step to make sure we
> > setup the sysfs device correctly. I left the flush_work calls, but I
> > think they are not needed since a del_conn before add_conn is no longer
> > possible now.
>
> well it seems your not the only one
> with a broken bluetooth(latest git pull
> this morning)

can you try the patch that I attached to the previous email. It should
fix exactly this.

Regards

Marcel



2009-05-02 20:19:40

by Justin P. Mattock

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

On Sat, May 2, 2009 at 12:42 PM, Marcel Holtmann <[email protected]> wrot=
e:
> Hi Dave,
>
>> >>> > Subject: bluetooth: Fix serialization when adding/deleting connect=
ions in hci_sysfs
>> >>> >
>> >>> > add_conn and del_conn should be serialized, but flush_workqueue ca=
n't be used
>> >>> > by the worker thread on it's own queue, so use flush_work to seria=
lize add_conn
>> >>> > and del_conn against each other.
>> >>> >
>> >>> > Signed-off-by: Marc Pignat <[email protected]>
>> >>>
>> >>> Acked-by: Jiri Kosina <[email protected]>
>> >>>
>> >>> FWIW.
>> >>
>> >>nak from my side since I think it is the wrong fix. We really wanna wa=
it
>> >>for all works to finish here. This includes work from other connection
>> >>attempts or terminations.
>> >
>> > IMHO, there is no need to wait for work currently running, since this =
is a
>> > singlethread workqueue.
>>
>> Yes, sounds right.
>>
>> >
>> > But it is perhaps simpler to use a lock (mutex or watherver locking pr=
imitive).
>>
>> I'm here a little bit late. Marcel, I'm quite busy recently, I just
>> see the commit and then this thread.
>>
>> Let me explain why I add two workqueue originally, =A0because workqueue
>> will be defered, so =A0 we must guarantee "connection deletion" finished
>> before "connection adding with same bt addr", =A0or the "connection
>> adding" will fail.
>>
>> On the other hand flush "adding" workqueue in "connection deletion"
>> function is not necessary.
>>
>> To fix this bug, I think we can just use the two work struct for
>> add/del, at the same time keeping the original two workqueue.
>>
>> Please see following patch for this, (building-test only, I have no
>> bluetooth device at hand, I can test this the day after tommorrow)
>
> so I spent the whole day figuring out what is going on here and we keep
> making the wrong assumptions over and over again.
>
> First of all, we only add the sysfs device when we have a successful
> connection. And we identify it with the handle. This means that we can
> NOT have any name clashes anymore since the controller has to make sure
> a handle is only assigned once. Previously we did this on the BD_ADDR
> value and that lead to it. That is no longer the case.
>
> Second of all the two work queues introduces way too much complexity for
> a really simple task of adding and removing a sysfs device entry.
>
> The real problem we have right now are that we are not initializing the
> sysfs device when creating the hci_conn. This is just wrong and can lead
> to all kinds of weird invalid data access. And as a result the adding of
> the sysfs device should only set the name and add it.
>
> We also check device_registered before making sure that device_add has
> been run. And instead of adding more locking or crazy work queue
> dependencies, we should use the single thread work queue to ensure the
> correct order of things.
>
> The attached patch introduces a hci_conn_init_sysfs step to make sure we
> setup the sysfs device correctly. I left the flush_work calls, but I
> think they are not needed since a del_conn before add_conn is no longer
> possible now.
>
> Regards
>
> Marcel
>
>

well it seems your not the only one
with a broken bluetooth(latest git pull
this morning)
this is what dmesg shows:

[ 64.484350] ------------[ cut here ]------------
[ 64.484357] WARNING: at kernel/workqueue.c:371
flush_cpu_workqueue+0x26/0x6b()
[ 64.484363] Hardware name: MacBookPro2,2
[ 64.484366] Modules linked in: radeon drm agpgart bnep
snd_hda_codec_idt firewire_ohci firewire_core snd_hda_intel
snd_hda_codec snd_hwdep snd_pcm ohci1394 snd_timer ath9k ieee1394 sky2
ehci_hcd snd_page_alloc i2c_i801 battery joydev evdev ac video
uhci_hcd thermal button hci_uart sco rfcomm btusb hidp l2cap bluetooth
ipmi_watchdog ipmi_msghandler uvcvideo isight_firmware uinput
arpt_mangle arptable_filter arp_tables nf_conntrack_ipv4 nf_conntrack
nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables x_tables
coretemp acpi_cpufreq processor appletouch applesmc
[ 64.484473] Pid: 762, comm: bluetooth Not tainted 2.6.30-rc4 #7
[ 64.484477] Call Trace:
[ 64.484488] [<c0126c01>] warn_slowpath+0x71/0x87
[ 64.484496] [<c011c746>] ? dequeue_entity+0x20/0x206
[ 64.484503] [<c011c746>] ? dequeue_entity+0x20/0x206
[ 64.484511] [<c011cfc6>] ? dequeue_task_fair+0x57/0x5c
[ 64.484520] [<c011950c>] ? dequeue_task+0x12b/0x13e
[ 64.484530] [<c0238f39>] ? _raw_spin_unlock+0x75/0x7a
[ 64.484539] [<c03e45bd>] ? _spin_unlock_irq+0x8/0x10
[ 64.484547] [<c0123433>] ? finish_task_switch+0x4d/0xa2
[ 64.484555] [<c03e29e0>] ? __schedule+0x80d/0x87d
[ 64.484563] [<c0133e85>] flush_cpu_workqueue+0x26/0x6b
[ 64.484572] [<c0238f39>] ? _raw_spin_unlock+0x75/0x7a
[ 64.484579] [<c013405c>] flush_workqueue+0x2b/0x49
[ 64.484602] [<f877e91c>] add_conn+0x10/0x34 [bluetooth]
[ 64.484609] [<c0133c6a>] worker_thread+0x13b/0x1b9
[ 64.484628] [<f877e90c>] ? add_conn+0x0/0x34 [bluetooth]
[ 64.484637] [<c0136d99>] ? autoremove_wake_function+0x0/0x2f
[ 64.484644] [<c0133b2f>] ? worker_thread+0x0/0x1b9
[ 64.484652] [<c0136ac0>] kthread+0x46/0x6a
[ 64.484659] [<c0136a7a>] ? kthread+0x0/0x6a
[ 64.484667] [<c01033a7>] kernel_thread_helper+0x7/0x10
[ 64.484673] ---[ end trace bd97c7b15860436c ]---
[ 64.643667] BUG: unable to handle kernel NULL pointer dereference at 000=
00020
[ 64.643678] IP: [<c01be904>] sysfs_addrm_start+0x21/0x8f
[ 64.643692] *pde =3D 00000000
[ 64.643698] Oops: 0000 [#1] SMP
[ 64.643705] last sysfs file: /sys/devices/platform/applesmc.768/light
[ 64.643711] Modules linked in: radeon drm agpgart bnep
snd_hda_codec_idt firewire_ohci firewire_core snd_hda_intel
snd_hda_codec snd_hwdep snd_pcm ohci1394 snd_timer ath9k ieee1394 sky2
ehci_hcd snd_page_alloc i2c_i801 battery joydev evdev ac video
uhci_hcd thermal button hci_uart sco rfcomm btusb hidp l2cap bluetooth
ipmi_watchdog ipmi_msghandler uvcvideo isight_firmware uinput
arpt_mangle arptable_filter arp_tables nf_conntrack_ipv4 nf_conntrack
nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables x_tables
coretemp acpi_cpufreq processor appletouch applesmc
[ 64.643810]
[ 64.643817] Pid: 1584, comm: bluetoothd Tainted: G W
(2.6.30-rc4 #7) MacBookPro2,2
[ 64.643823] EIP: 0060:[<c01be904>] EFLAGS: 00010286 CPU: 0
[ 64.643829] EIP is at sysfs_addrm_start+0x21/0x8f
[ 64.643834] EAX: f5796000 EBX: 00000000 ECX: 00000000 EDX: c054acc0
[ 64.643840] ESI: f5797c60 EDI: f5797c70 EBP: f5797c54 ESP: f5797c48
[ 64.643845] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 64.643851] Process bluetoothd (pid: 1584, ti=3Df5796000
task=3Df56c9860 task.ti=3Df5796000)
[ 64.643856] Stack:
[ 64.643860] f4936058 f5797c60 00000000 f5797c7c c01bedaf fffffff4
00000000 00000000
[ 64.643874] 00000000 00000000 f4ef4d1c f48129b4 f48129b4 f5797c90
c01bee18 f5797c88
[ 64.643889] f5797c90 f4ef4d1c f5797cac c0230e4a ffffffff fffffffe
f4ef4d1c f48129b4
[ 64.643906] Call Trace:
[ 64.643910] [<c01bedaf>] ? create_dir+0x3a/0x76
[ 64.643918] [<c01bee18>] ? sysfs_create_dir+0x2d/0x3d
[ 64.643926] [<c0230e4a>] ? kobject_add_internal+0xb0/0x15f
[ 64.643936] [<c0230faa>] ? kobject_add_varg+0x31/0x3d
[ 64.643945] [<c0231020>] ? kobject_add+0x43/0x49
[ 64.643952] [<c02b2f0d>] ? device_add+0xd5/0x451
[ 64.643962] [<c0236f9c>] ? kvasprintf+0x38/0x43
[ 64.643971] [<c0230f73>] ? kobject_set_name_vargs+0x46/0x4c
[ 64.643979] [<c0322b1f>] ? hid_add_device+0x12b/0x147
[ 64.643991] [<f87a4091>] ? hidp_add_connection+0x2c0/0x539 [hidp]
[ 64.644006] [<f87a451d>] ? hidp_sock_ioctl+0xe2/0x1e6 [hidp]
[ 64.644020] [<c018141a>] ? check_object+0x136/0x190
[ 64.644029] [<c0208fb2>] ? avc_has_perm+0x3c/0x46
[ 64.644038] [<c020a123>] ? inode_has_perm+0x5b/0x65
[ 64.644048] [<c03413ee>] ? sock_ioctl+0x1b9/0x1dd
[ 64.644057] [<c0341235>] ? sock_ioctl+0x0/0x1dd
[ 64.644064] [<c0190ff7>] ? vfs_ioctl+0x18/0x71
[ 64.644073] [<c01914b1>] ? do_vfs_ioctl+0x461/0x49f
[ 64.644082] [<c020a1a9>] ? file_has_perm+0x7c/0x85
[ 64.644091] [<c0191530>] ? sys_ioctl+0x41/0x61
[ 64.644099] [<c020620e>] ? security_file_ioctl+0x10/0x13
[ 64.644107] [<c0191530>] ? sys_ioctl+0x41/0x61
[ 64.644115] [<c0102914>] ? sysenter_do_call+0x12/0x28
[ 64.644125] Code: 45 f0 8d 65 f4 5b 5e 5f c9 c3 55 b9 04 00 00 00
89 e5 57 89 c7 56 89 c6 53 31 c0 89 d3 f3 ab b8 ac ac 54 c0 89 16 e8
a7 49 22 00 <8b> 53 20 b9 88 e4 1b c0 53 a1 80 0f 68 c0 e8 59 70 fd ff
89 c3
[ 64.644213] EIP: [<c01be904>] sysfs_addrm_start+0x21/0x8f SS:ESP
0068:f5797c48
[ 64.644223] CR2: 0000000000000020
[ 64.644229] ---[ end trace bd97c7b15860436d ]---


--=20
Justin P. Mattock

2009-05-02 19:42:46

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi Dave,

> >>> > Subject: bluetooth: Fix serialization when adding/deleting connections in hci_sysfs
> >>> >
> >>> > add_conn and del_conn should be serialized, but flush_workqueue can't be used
> >>> > by the worker thread on it's own queue, so use flush_work to serialize add_conn
> >>> > and del_conn against each other.
> >>> >
> >>> > Signed-off-by: Marc Pignat <[email protected]>
> >>>
> >>> Acked-by: Jiri Kosina <[email protected]>
> >>>
> >>> FWIW.
> >>
> >>nak from my side since I think it is the wrong fix. We really wanna wait
> >>for all works to finish here. This includes work from other connection
> >>attempts or terminations.
> >
> > IMHO, there is no need to wait for work currently running, since this is a
> > singlethread workqueue.
>
> Yes, sounds right.
>
> >
> > But it is perhaps simpler to use a lock (mutex or watherver locking primitive).
>
> I'm here a little bit late. Marcel, I'm quite busy recently, I just
> see the commit and then this thread.
>
> Let me explain why I add two workqueue originally, because workqueue
> will be defered, so we must guarantee "connection deletion" finished
> before "connection adding with same bt addr", or the "connection
> adding" will fail.
>
> On the other hand flush "adding" workqueue in "connection deletion"
> function is not necessary.
>
> To fix this bug, I think we can just use the two work struct for
> add/del, at the same time keeping the original two workqueue.
>
> Please see following patch for this, (building-test only, I have no
> bluetooth device at hand, I can test this the day after tommorrow)

so I spent the whole day figuring out what is going on here and we keep
making the wrong assumptions over and over again.

First of all, we only add the sysfs device when we have a successful
connection. And we identify it with the handle. This means that we can
NOT have any name clashes anymore since the controller has to make sure
a handle is only assigned once. Previously we did this on the BD_ADDR
value and that lead to it. That is no longer the case.

Second of all the two work queues introduces way too much complexity for
a really simple task of adding and removing a sysfs device entry.

The real problem we have right now are that we are not initializing the
sysfs device when creating the hci_conn. This is just wrong and can lead
to all kinds of weird invalid data access. And as a result the adding of
the sysfs device should only set the name and add it.

We also check device_registered before making sure that device_add has
been run. And instead of adding more locking or crazy work queue
dependencies, we should use the single thread work queue to ensure the
correct order of things.

The attached patch introduces a hci_conn_init_sysfs step to make sure we
setup the sysfs device correctly. I left the flush_work calls, but I
think they are not needed since a del_conn before add_conn is no longer
possible now.

Regards

Marcel


Attachments:
patch-bluetooth-fix-sysfs-workq (4.30 kB)

2009-05-02 17:03:26

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi Marc,

can you please stop breaking the threading here. Your responses are
hanging all over the place and it is hard to keep track of them.

> >> > Subject: bluetooth: Fix serialization when adding/deleting connections in hci_sysfs
> >> >
> >> > add_conn and del_conn should be serialized, but flush_workqueue can't be used
> >> > by the worker thread on it's own queue, so use flush_work to serialize add_conn
> >> > and del_conn against each other.
> >> >
> >> > Signed-off-by: Marc Pignat <[email protected]>
> >>
> >> Acked-by: Jiri Kosina <[email protected]>
> >>
> >> FWIW.
> >
> >nak from my side since I think it is the wrong fix. We really wanna wait
> >for all works to finish here. This includes work from other connection
> >attempts or terminations.
>
> IMHO, there is no need to wait for work currently running, since this is a
> singlethread workqueue.
>
> But it is perhaps simpler to use a lock (mutex or watherver locking primitive).

I think that using a mutex might be a better solution to ensure that
previous work has been finished.

Regards

Marcel



2009-05-02 14:08:50

by Dave Young

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

On Sat, May 2, 2009 at 4:45 PM, Marc Pignat <[email protected]> wrote:
>>>> Marcel Holtmann <[email protected]> 05/02/09 12:57 AM >>>
>>Hi Jiri,
>>
>>> > Subject: bluetooth: Fix serialization when adding/deleting connections in hci_sysfs
>>> >
>>> > add_conn and del_conn should be serialized, but flush_workqueue can't be used
>>> > by the worker thread on it's own queue, so use flush_work to serialize add_conn
>>> > and del_conn against each other.
>>> >
>>> > Signed-off-by: Marc Pignat <[email protected]>
>>>
>>> Acked-by: Jiri Kosina <[email protected]>
>>>
>>> FWIW.
>>
>>nak from my side since I think it is the wrong fix. We really wanna wait
>>for all works to finish here. This includes work from other connection
>>attempts or terminations.
>
> IMHO, there is no need to wait for work currently running, since this is a
> singlethread workqueue.

Yes, sounds right.

>
> But it is perhaps simpler to use a lock (mutex or watherver locking primitive).

I'm here a little bit late. Marcel, I'm quite busy recently, I just
see the commit and then this thread.

Let me explain why I add two workqueue originally, because workqueue
will be defered, so we must guarantee "connection deletion" finished
before "connection adding with same bt addr", or the "connection
adding" will fail.

On the other hand flush "adding" workqueue in "connection deletion"
function is not necessary.

To fix this bug, I think we can just use the two work struct for
add/del, at the same time keeping the original two workqueue.

Please see following patch for this, (building-test only, I have no
bluetooth device at hand, I can test this the day after tommorrow)

--- linux-2.6.orig/net/bluetooth/hci_sysfs.c 2009-04-30 11:35:54.000000000 +0800
+++ linux-2.6/net/bluetooth/hci_sysfs.c 2009-05-02 21:54:40.000000000 +0800
@@ -9,7 +9,8 @@
struct class *bt_class = NULL;
EXPORT_SYMBOL_GPL(bt_class);

-static struct workqueue_struct *bluetooth;
+static struct workqueue_struct *btaddconn;
+static struct workqueue_struct *btdelconn;

static inline char *link_typetostr(int type)
{
@@ -89,8 +90,7 @@
{
struct hci_conn *conn = container_of(work, struct hci_conn, work_add);

- /* ensure previous add/del is complete */
- flush_workqueue(bluetooth);
+ flush_workqueue(btdelconn);

if (device_add(&conn->dev) < 0) {
BT_ERR("Failed to register connection device");
@@ -116,7 +116,7 @@

INIT_WORK(&conn->work_add, add_conn);

- queue_work(bluetooth, &conn->work_add);
+ queue_work(btaddconn, &conn->work_add);
}

/*
@@ -134,9 +134,6 @@
struct hci_conn *conn = container_of(work, struct hci_conn, work_del);
struct hci_dev *hdev = conn->hdev;

- /* ensure previous add/del is complete */
- flush_workqueue(bluetooth);
-
while (1) {
struct device *dev;

@@ -161,7 +158,7 @@

INIT_WORK(&conn->work_del, del_conn);

- queue_work(bluetooth, &conn->work_del);
+ queue_work(btdelconn, &conn->work_del);
}

static inline char *host_typetostr(int type)
@@ -438,13 +435,20 @@

int __init bt_sysfs_init(void)
{
- bluetooth = create_singlethread_workqueue("bluetooth");
- if (!bluetooth)
+ btaddconn = create_singlethread_workqueue("btaddconn");
+ if (!btaddconn)
+ return -ENOMEM;
+
+ btdelconn = create_singlethread_workqueue("btdelconn");
+ if (!btdelconn) {
+ destroy_workqueue(btaddconn);
return -ENOMEM;
+ }

bt_class = class_create(THIS_MODULE, "bluetooth");
if (IS_ERR(bt_class)) {
- destroy_workqueue(bluetooth);
+ destroy_workqueue(btdelconn);
+ destroy_workqueue(btaddconn);
return PTR_ERR(bt_class);
}

@@ -453,7 +457,8 @@

void bt_sysfs_cleanup(void)
{
- destroy_workqueue(bluetooth);
+ destroy_workqueue(btaddconn);
+ destroy_workqueue(btdelconn);

class_destroy(bt_class);
}

2009-05-02 08:45:41

by Marc Pignat

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

>>> Marcel Holtmann <[email protected]> 05/02/09 12:57 AM >>>
>Hi Jiri,
>
>> > Subject: bluetooth: Fix serialization when adding/deleting connections=
in hci_sysfs
>> >=20
>> > add_conn and del_conn should be serialized, but flush_workqueue can't =
be used
>> > by the worker thread on it's own queue, so use flush_work to =
serialize add_conn
>> > and del_conn against each other.
>> >=20
>> > Signed-off-by: Marc Pignat <[email protected]>
>>=20
>> Acked-by: Jiri Kosina <[email protected]>
>>=20
>> FWIW.
>
>nak from my side since I think it is the wrong fix. We really wanna wait
>for all works to finish here. This includes work from other connection
>attempts or terminations.

IMHO, there is no need to wait for work currently running, since this is a
singlethread workqueue.

But it is perhaps simpler to use a lock (mutex or watherver locking =
primitive).

Regards

Marc

2009-05-01 22:56:34

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Hi Jiri,

> > Subject: bluetooth: Fix serialization when adding/deleting connections in hci_sysfs
> >
> > add_conn and del_conn should be serialized, but flush_workqueue can't be used
> > by the worker thread on it's own queue, so use flush_work to serialize add_conn
> > and del_conn against each other.
> >
> > Signed-off-by: Marc Pignat <[email protected]>
>
> Acked-by: Jiri Kosina <[email protected]>
>
> FWIW.

nak from my side since I think it is the wrong fix. We really wanna wait
for all works to finish here. This includes work from other connection
attempts or terminations.

Regards

Marcel



2009-05-01 22:20:22

by Jiri Kosina

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

On Fri, 1 May 2009, Marc Pignat wrote:

> Subject: bluetooth: Fix serialization when adding/deleting connections in hci_sysfs
>
> add_conn and del_conn should be serialized, but flush_workqueue can't be used
> by the worker thread on it's own queue, so use flush_work to serialize add_conn
> and del_conn against each other.
>
> Signed-off-by: Marc Pignat <[email protected]>

Acked-by: Jiri Kosina <[email protected]>

FWIW.

--
Jiri Kosina
SUSE Labs

2009-05-01 00:37:00

by Marc Pignat

[permalink] [raw]
Subject: Re: [BUG] 2.6.30-rc4 hid bluetooth not working

Subject: bluetooth: Fix serialization when adding/deleting connections in =
hci_sysfs

add_conn and del_conn should be serialized, but flush_workqueue can't be =
used
by the worker thread on it's own queue, so use flush_work to serialize =
add_conn
and del_conn against each other.

Signed-off-by: Marc Pignat <[email protected]>
---

patch against 2.6.30-rc4.


diff --git a/net/bluetooth/hci_sysfs.c b/net/bluetooth/hci_sysfs.c
index b7c5108..42695e3 100644
--- a/net/bluetooth/hci_sysfs.c
+++ b/net/bluetooth/hci_sysfs.c
@@ -89,8 +89,8 @@ static void add_conn(struct work_struct *work)
{
struct hci_conn *conn =3D container_of(work, struct hci_conn, =
work_add);
=20
- /* ensure previous add/del is complete */
- flush_workqueue(bluetooth);
+ /* ensure previous del is complete */
+ flush_work(&conn->work_del);
=20
if (device_add(&conn->dev) < 0) {
BT_ERR("Failed to register connection device");
@@ -134,8 +134,8 @@ static void del_conn(struct work_struct *work)
struct hci_conn *conn =3D container_of(work, struct hci_conn, =
work_del);
struct hci_dev *hdev =3D conn->hdev;
=20
- /* ensure previous add/del is complete */
- flush_workqueue(bluetooth);
+ /* ensure previous add is complete */
+ flush_work(&conn->work_add);
=20
while (1) {
struct device *dev;