LinuxLists.cc - [media] cx25821 regression from 3.9: BUG: bad unlock balance detected!

2013-05-16 17:41:50

Subject: [media] cx25821 regression from 3.9: BUG: bad unlock balance detected!

Hi Hans / Mauro,

With 3.10.0-rc1 (including the cx25821 changes from Hans), I get the bug below which wasn't present with 3.9.

--
Sander

[ 53.004968] =====================================
[ 53.004968] [ BUG: bad unlock balance detected! ]
[ 53.004968] 3.10.0-rc1-20130516-jens+ #1 Not tainted
[ 53.004968] -------------------------------------
[ 53.004968] motion/3328 is trying to release lock (&dev->lock) at:
[ 53.004968] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
[ 53.004968] but there are no more locks to release!
[ 53.004968]
[ 53.004968] other info that might help us debug this:
[ 53.004968] 1 lock held by motion/3328:
[ 53.004968] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81156cae>] vm_munmap+0x3e/0x70
[ 53.004968]
[ 53.004968] stack backtrace:
[ 53.004968] CPU: 1 PID: 3328 Comm: motion Not tainted 3.10.0-rc1-20130516-jens+ #1
[ 53.004968] Hardware name: Xen HVM domU, BIOS 4.3-unstable 05/16/2013
[ 53.004968] ffffffff819be5f9 ffff88002ac35c58 ffffffff819b9029 ffff88002ac35c88
[ 53.004968] ffffffff810e615e ffff88002ac35cb8 ffff88002b7c18a8 ffffffff819be5f9
[ 53.004968] 00000000ffffffff ffff88002ac35d28 ffffffff810eb17e ffffffff810e7ba5
[ 53.004968] Call Trace:
[ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
[ 53.004968] [<ffffffff819b9029>] dump_stack+0x19/0x1b
[ 53.004968] [<ffffffff810e615e>] print_unlock_imbalance_bug+0xfe/0x110
[ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
[ 53.004968] [<ffffffff810eb17e>] lock_release_non_nested+0x1ce/0x320
[ 53.004968] [<ffffffff810e7ba5>] ? debug_check_no_locks_freed+0x105/0x1b0
[ 53.353529] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
[ 53.353529] [<ffffffff810eb3cc>] lock_release+0xfc/0x250
[ 53.353529] [<ffffffff819be4b2>] __mutex_unlock_slowpath+0xb2/0x1f0
[ 53.353529] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
[ 53.353529] [<ffffffff81711105>] videobuf_waiton+0x55/0x230
[ 53.353529] [<ffffffff8114d052>] ? tlb_finish_mmu+0x32/0x50
[ 53.353529] [<ffffffff81154a46>] ? unmap_region+0xc6/0x100
[ 53.353529] [<ffffffff81172e05>] ? kmem_cache_free+0x195/0x230
[ 53.353529] [<ffffffff8172d3d9>] cx25821_free_buffer+0x49/0xa0
[ 53.353529] [<ffffffff8172f939>] cx25821_buffer_release+0x9/0x10
[ 53.353529] [<ffffffff81712c35>] videobuf_vm_close+0xc5/0x160
[ 53.353529] [<ffffffff81154aa5>] remove_vma+0x25/0x60
[ 53.353529] [<ffffffff81156b67>] do_munmap+0x307/0x410
[ 53.353529] [<ffffffff81156cbc>] vm_munmap+0x4c/0x70
[ 53.353529] [<ffffffff81157c09>] SyS_munmap+0x9/0x10
[ 53.353529] [<ffffffff819c20a9>] system_call_fastpath+0x16/0x1b

2013-05-17 08:25:47

by Hans Verkuil

[permalink] [raw]

Subject: Re: [media] cx25821 regression from 3.9: BUG: bad unlock balance detected!

On Thu May 16 2013 19:41:42 Sander Eikelenboom wrote:
> Hi Hans / Mauro,
>
> With 3.10.0-rc1 (including the cx25821 changes from Hans), I get the bug below which wasn't present with 3.9.

How do I reproduce this? I've tried to, but I can't make this happen.

Looking at the code I can't see how it could hit this bug anyway.

Regards,

Hans

>
> --
> Sander
>
>
> [ 53.004968] =====================================
> [ 53.004968] [ BUG: bad unlock balance detected! ]
> [ 53.004968] 3.10.0-rc1-20130516-jens+ #1 Not tainted
> [ 53.004968] -------------------------------------
> [ 53.004968] motion/3328 is trying to release lock (&dev->lock) at:
> [ 53.004968] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
> [ 53.004968] but there are no more locks to release!
> [ 53.004968]
> [ 53.004968] other info that might help us debug this:
> [ 53.004968] 1 lock held by motion/3328:
> [ 53.004968] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81156cae>] vm_munmap+0x3e/0x70
> [ 53.004968]
> [ 53.004968] stack backtrace:
> [ 53.004968] CPU: 1 PID: 3328 Comm: motion Not tainted 3.10.0-rc1-20130516-jens+ #1
> [ 53.004968] Hardware name: Xen HVM domU, BIOS 4.3-unstable 05/16/2013
> [ 53.004968] ffffffff819be5f9 ffff88002ac35c58 ffffffff819b9029 ffff88002ac35c88
> [ 53.004968] ffffffff810e615e ffff88002ac35cb8 ffff88002b7c18a8 ffffffff819be5f9
> [ 53.004968] 00000000ffffffff ffff88002ac35d28 ffffffff810eb17e ffffffff810e7ba5
> [ 53.004968] Call Trace:
> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
> [ 53.004968] [<ffffffff819b9029>] dump_stack+0x19/0x1b
> [ 53.004968] [<ffffffff810e615e>] print_unlock_imbalance_bug+0xfe/0x110
> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
> [ 53.004968] [<ffffffff810eb17e>] lock_release_non_nested+0x1ce/0x320
> [ 53.004968] [<ffffffff810e7ba5>] ? debug_check_no_locks_freed+0x105/0x1b0
> [ 53.353529] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
> [ 53.353529] [<ffffffff810eb3cc>] lock_release+0xfc/0x250
> [ 53.353529] [<ffffffff819be4b2>] __mutex_unlock_slowpath+0xb2/0x1f0
> [ 53.353529] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
> [ 53.353529] [<ffffffff81711105>] videobuf_waiton+0x55/0x230
> [ 53.353529] [<ffffffff8114d052>] ? tlb_finish_mmu+0x32/0x50
> [ 53.353529] [<ffffffff81154a46>] ? unmap_region+0xc6/0x100
> [ 53.353529] [<ffffffff81172e05>] ? kmem_cache_free+0x195/0x230
> [ 53.353529] [<ffffffff8172d3d9>] cx25821_free_buffer+0x49/0xa0
> [ 53.353529] [<ffffffff8172f939>] cx25821_buffer_release+0x9/0x10
> [ 53.353529] [<ffffffff81712c35>] videobuf_vm_close+0xc5/0x160
> [ 53.353529] [<ffffffff81154aa5>] remove_vma+0x25/0x60
> [ 53.353529] [<ffffffff81156b67>] do_munmap+0x307/0x410
> [ 53.353529] [<ffffffff81156cbc>] vm_munmap+0x4c/0x70
> [ 53.353529] [<ffffffff81157c09>] SyS_munmap+0x9/0x10
> [ 53.353529] [<ffffffff819c20a9>] system_call_fastpath+0x16/0x1b
>

2013-05-17 09:05:01

by Sander Eikelenboom

[permalink] [raw]

Subject: Re: [media] cx25821 regression from 3.9: BUG: bad unlock balance detected!

Friday, May 17, 2013, 10:25:24 AM, you wrote:

> On Thu May 16 2013 19:41:42 Sander Eikelenboom wrote:
>> Hi Hans / Mauro,
>>
>> With 3.10.0-rc1 (including the cx25821 changes from Hans), I get the bug below which wasn't present with 3.9.

> How do I reproduce this? I've tried to, but I can't make this happen.

> Looking at the code I can't see how it could hit this bug anyway.

I'm using "motion" to grab and process 6 from the video streams of the card i have (card with 8 inputs).
It seems the cx25821 underwent quite some changes between 3.9 and 3.10.

And in the past there have been some more locking issues around mmap and media devices, although they seem to appear as circular locking dependencies and with different devices.
- http://www.mail-archive.com/[email protected]/msg46217.html
- Under kvm: http://www.spinics.net/lists/linux-media/msg63322.html

- Perhaps that running in a VM could have to do with it ?
- The driver on 3.9 occasionaly gives this, probably latency related (but continues to work):
cx25821: cx25821_video_wakeup: 2 buffers handled (should be 1)

Could it be something double unlocking in that path ?

- Is there any extra debugging i could enable that could pinpoint the issue ?

--

Sander

> Regards,

> Hans

>>
>> --
>> Sander
>>
>>
>> [ 53.004968] =====================================
>> [ 53.004968] [ BUG: bad unlock balance detected! ]
>> [ 53.004968] 3.10.0-rc1-20130516-jens+ #1 Not tainted
>> [ 53.004968] -------------------------------------
>> [ 53.004968] motion/3328 is trying to release lock (&dev->lock) at:
>> [ 53.004968] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
>> [ 53.004968] but there are no more locks to release!
>> [ 53.004968]
>> [ 53.004968] other info that might help us debug this:
>> [ 53.004968] 1 lock held by motion/3328:
>> [ 53.004968] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81156cae>] vm_munmap+0x3e/0x70
>> [ 53.004968]
>> [ 53.004968] stack backtrace:
>> [ 53.004968] CPU: 1 PID: 3328 Comm: motion Not tainted 3.10.0-rc1-20130516-jens+ #1
>> [ 53.004968] Hardware name: Xen HVM domU, BIOS 4.3-unstable 05/16/2013
>> [ 53.004968] ffffffff819be5f9 ffff88002ac35c58 ffffffff819b9029 ffff88002ac35c88
>> [ 53.004968] ffffffff810e615e ffff88002ac35cb8 ffff88002b7c18a8 ffffffff819be5f9
>> [ 53.004968] 00000000ffffffff ffff88002ac35d28 ffffffff810eb17e ffffffff810e7ba5
>> [ 53.004968] Call Trace:
>> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
>> [ 53.004968] [<ffffffff819b9029>] dump_stack+0x19/0x1b
>> [ 53.004968] [<ffffffff810e615e>] print_unlock_imbalance_bug+0xfe/0x110
>> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
>> [ 53.004968] [<ffffffff810eb17e>] lock_release_non_nested+0x1ce/0x320
>> [ 53.004968] [<ffffffff810e7ba5>] ? debug_check_no_locks_freed+0x105/0x1b0
>> [ 53.353529] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
>> [ 53.353529] [<ffffffff810eb3cc>] lock_release+0xfc/0x250
>> [ 53.353529] [<ffffffff819be4b2>] __mutex_unlock_slowpath+0xb2/0x1f0
>> [ 53.353529] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
>> [ 53.353529] [<ffffffff81711105>] videobuf_waiton+0x55/0x230
>> [ 53.353529] [<ffffffff8114d052>] ? tlb_finish_mmu+0x32/0x50
>> [ 53.353529] [<ffffffff81154a46>] ? unmap_region+0xc6/0x100
>> [ 53.353529] [<ffffffff81172e05>] ? kmem_cache_free+0x195/0x230
>> [ 53.353529] [<ffffffff8172d3d9>] cx25821_free_buffer+0x49/0xa0
>> [ 53.353529] [<ffffffff8172f939>] cx25821_buffer_release+0x9/0x10
>> [ 53.353529] [<ffffffff81712c35>] videobuf_vm_close+0xc5/0x160
>> [ 53.353529] [<ffffffff81154aa5>] remove_vma+0x25/0x60
>> [ 53.353529] [<ffffffff81156b67>] do_munmap+0x307/0x410
>> [ 53.353529] [<ffffffff81156cbc>] vm_munmap+0x4c/0x70
>> [ 53.353529] [<ffffffff81157c09>] SyS_munmap+0x9/0x10
>> [ 53.353529] [<ffffffff819c20a9>] system_call_fastpath+0x16/0x1b
>>

2013-05-17 09:52:29

by Hans Verkuil

[permalink] [raw]

Subject: Re: [media] cx25821 regression from 3.9: BUG: bad unlock balance detected!

On Fri May 17 2013 11:04:50 Sander Eikelenboom wrote:
>
> Friday, May 17, 2013, 10:25:24 AM, you wrote:
>
> > On Thu May 16 2013 19:41:42 Sander Eikelenboom wrote:
> >> Hi Hans / Mauro,
> >>
> >> With 3.10.0-rc1 (including the cx25821 changes from Hans), I get the bug below which wasn't present with 3.9.
>
> > How do I reproduce this? I've tried to, but I can't make this happen.
>
> > Looking at the code I can't see how it could hit this bug anyway.
>
> I'm using "motion" to grab and process 6 from the video streams of the card i have (card with 8 inputs).
> It seems the cx25821 underwent quite some changes between 3.9 and 3.10.

It did.

> And in the past there have been some more locking issues around mmap and media devices, although they seem to appear as circular locking dependencies and with different devices.
> - http://www.mail-archive.com/[email protected]/msg46217.html
> - Under kvm: http://www.spinics.net/lists/linux-media/msg63322.html

Neither of those are related to this issue.

>
> - Perhaps that running in a VM could have to do with it ?
> - The driver on 3.9 occasionaly gives this, probably latency related (but continues to work):
> cx25821: cx25821_video_wakeup: 2 buffers handled (should be 1)
>
> Could it be something double unlocking in that path ?
>
> - Is there any extra debugging i could enable that could pinpoint the issue ?

Try this patch:

diff --git a/drivers/media/pci/cx25821/cx25821-core.c b/drivers/media/pci/cx25821/cx25821-core.c
index b762c5b..8f8d0e0 100644
--- a/drivers/media/pci/cx25821/cx25821-core.c
+++ b/drivers/media/pci/cx25821/cx25821-core.c
@@ -1208,7 +1208,6 @@ void cx25821_free_buffer(struct videobuf_queue *q, struct cx25821_buffer *buf)
struct videobuf_dmabuf *dma = videobuf_to_dma(&buf->vb);

BUG_ON(in_interrupt());
- videobuf_waiton(q, &buf->vb, 0, 0);
videobuf_dma_unmap(q->dev, dma);
videobuf_dma_free(dma);
btcx_riscmem_free(to_pci_dev(q->dev), &buf->risc);

I don't think the waiton is really needed for this driver.

What really should happen is that videobuf is replaced by videobuf2 in this
driver, but that's a fair amount of work.

Regards,

Hans

>
>
> --
>
> Sander
>
>
>
> > Regards,
>
> > Hans
>
> >>
> >> --
> >> Sander
> >>
> >>
> >> [ 53.004968] =====================================
> >> [ 53.004968] [ BUG: bad unlock balance detected! ]
> >> [ 53.004968] 3.10.0-rc1-20130516-jens+ #1 Not tainted
> >> [ 53.004968] -------------------------------------
> >> [ 53.004968] motion/3328 is trying to release lock (&dev->lock) at:
> >> [ 53.004968] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
> >> [ 53.004968] but there are no more locks to release!
> >> [ 53.004968]
> >> [ 53.004968] other info that might help us debug this:
> >> [ 53.004968] 1 lock held by motion/3328:
> >> [ 53.004968] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81156cae>] vm_munmap+0x3e/0x70
> >> [ 53.004968]
> >> [ 53.004968] stack backtrace:
> >> [ 53.004968] CPU: 1 PID: 3328 Comm: motion Not tainted 3.10.0-rc1-20130516-jens+ #1
> >> [ 53.004968] Hardware name: Xen HVM domU, BIOS 4.3-unstable 05/16/2013
> >> [ 53.004968] ffffffff819be5f9 ffff88002ac35c58 ffffffff819b9029 ffff88002ac35c88
> >> [ 53.004968] ffffffff810e615e ffff88002ac35cb8 ffff88002b7c18a8 ffffffff819be5f9
> >> [ 53.004968] 00000000ffffffff ffff88002ac35d28 ffffffff810eb17e ffffffff810e7ba5
> >> [ 53.004968] Call Trace:
> >> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
> >> [ 53.004968] [<ffffffff819b9029>] dump_stack+0x19/0x1b
> >> [ 53.004968] [<ffffffff810e615e>] print_unlock_imbalance_bug+0xfe/0x110
> >> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
> >> [ 53.004968] [<ffffffff810eb17e>] lock_release_non_nested+0x1ce/0x320
> >> [ 53.004968] [<ffffffff810e7ba5>] ? debug_check_no_locks_freed+0x105/0x1b0
> >> [ 53.353529] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
> >> [ 53.353529] [<ffffffff810eb3cc>] lock_release+0xfc/0x250
> >> [ 53.353529] [<ffffffff819be4b2>] __mutex_unlock_slowpath+0xb2/0x1f0
> >> [ 53.353529] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
> >> [ 53.353529] [<ffffffff81711105>] videobuf_waiton+0x55/0x230
> >> [ 53.353529] [<ffffffff8114d052>] ? tlb_finish_mmu+0x32/0x50
> >> [ 53.353529] [<ffffffff81154a46>] ? unmap_region+0xc6/0x100
> >> [ 53.353529] [<ffffffff81172e05>] ? kmem_cache_free+0x195/0x230
> >> [ 53.353529] [<ffffffff8172d3d9>] cx25821_free_buffer+0x49/0xa0
> >> [ 53.353529] [<ffffffff8172f939>] cx25821_buffer_release+0x9/0x10
> >> [ 53.353529] [<ffffffff81712c35>] videobuf_vm_close+0xc5/0x160
> >> [ 53.353529] [<ffffffff81154aa5>] remove_vma+0x25/0x60
> >> [ 53.353529] [<ffffffff81156b67>] do_munmap+0x307/0x410
> >> [ 53.353529] [<ffffffff81156cbc>] vm_munmap+0x4c/0x70
> >> [ 53.353529] [<ffffffff81157c09>] SyS_munmap+0x9/0x10
> >> [ 53.353529] [<ffffffff819c20a9>] system_call_fastpath+0x16/0x1b
> >>
>
>

2013-05-17 15:52:25

by Sander Eikelenboom

[permalink] [raw]

Subject: Re: [media] cx25821 regression from 3.9: BUG: bad unlock balance detected!

Friday, May 17, 2013, 11:52:17 AM, you wrote:

> On Fri May 17 2013 11:04:50 Sander Eikelenboom wrote:
>>
>> Friday, May 17, 2013, 10:25:24 AM, you wrote:
>>
>> > On Thu May 16 2013 19:41:42 Sander Eikelenboom wrote:
>> >> Hi Hans / Mauro,
>> >>
>> >> With 3.10.0-rc1 (including the cx25821 changes from Hans), I get the bug below which wasn't present with 3.9.
>>
>> > How do I reproduce this? I've tried to, but I can't make this happen.
>>
>> > Looking at the code I can't see how it could hit this bug anyway.
>>
>> I'm using "motion" to grab and process 6 from the video streams of the card i have (card with 8 inputs).
>> It seems the cx25821 underwent quite some changes between 3.9 and 3.10.

> It did.

>> And in the past there have been some more locking issues around mmap and media devices, although they seem to appear as circular locking dependencies and with different devices.
>> - http://www.mail-archive.com/[email protected]/msg46217.html
>> - Under kvm: http://www.spinics.net/lists/linux-media/msg63322.html

> Neither of those are related to this issue.

>>
>> - Perhaps that running in a VM could have to do with it ?
>> - The driver on 3.9 occasionaly gives this, probably latency related (but continues to work):
>> cx25821: cx25821_video_wakeup: 2 buffers handled (should be 1)
>>
>> Could it be something double unlocking in that path ?
>>
>> - Is there any extra debugging i could enable that could pinpoint the issue ?

> Try this patch:

Hmm it seems it's gone after pulling in linuses latest tree, with some workqueue / rcu fixes.
(running without the patch underneath now)

Thx,

Sander

> diff --git a/drivers/media/pci/cx25821/cx25821-core.c b/drivers/media/pci/cx25821/cx25821-core.c
> index b762c5b..8f8d0e0 100644
> --- a/drivers/media/pci/cx25821/cx25821-core.c
> +++ b/drivers/media/pci/cx25821/cx25821-core.c
> @@ -1208,7 +1208,6 @@ void cx25821_free_buffer(struct videobuf_queue *q, struct cx25821_buffer *buf)
> struct videobuf_dmabuf *dma = videobuf_to_dma(&buf->vb);
>
> BUG_ON(in_interrupt());
> - videobuf_waiton(q, &buf->vb, 0, 0);
> videobuf_dma_unmap(q->dev, dma);
> videobuf_dma_free(dma);
> btcx_riscmem_free(to_pci_dev(q->dev), &buf->risc);

> I don't think the waiton is really needed for this driver.

> What really should happen is that videobuf is replaced by videobuf2 in this
> driver, but that's a fair amount of work.

> Regards,

> Hans

>>
>>
>> --
>>
>> Sander
>>
>>
>>
>> > Regards,
>>
>> > Hans
>>
>> >>
>> >> --
>> >> Sander
>> >>
>> >>
>> >> [ 53.004968] =====================================
>> >> [ 53.004968] [ BUG: bad unlock balance detected! ]
>> >> [ 53.004968] 3.10.0-rc1-20130516-jens+ #1 Not tainted
>> >> [ 53.004968] -------------------------------------
>> >> [ 53.004968] motion/3328 is trying to release lock (&dev->lock) at:
>> >> [ 53.004968] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
>> >> [ 53.004968] but there are no more locks to release!
>> >> [ 53.004968]
>> >> [ 53.004968] other info that might help us debug this:
>> >> [ 53.004968] 1 lock held by motion/3328:
>> >> [ 53.004968] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81156cae>] vm_munmap+0x3e/0x70
>> >> [ 53.004968]
>> >> [ 53.004968] stack backtrace:
>> >> [ 53.004968] CPU: 1 PID: 3328 Comm: motion Not tainted 3.10.0-rc1-20130516-jens+ #1
>> >> [ 53.004968] Hardware name: Xen HVM domU, BIOS 4.3-unstable 05/16/2013
>> >> [ 53.004968] ffffffff819be5f9 ffff88002ac35c58 ffffffff819b9029 ffff88002ac35c88
>> >> [ 53.004968] ffffffff810e615e ffff88002ac35cb8 ffff88002b7c18a8 ffffffff819be5f9
>> >> [ 53.004968] 00000000ffffffff ffff88002ac35d28 ffffffff810eb17e ffffffff810e7ba5
>> >> [ 53.004968] Call Trace:
>> >> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
>> >> [ 53.004968] [<ffffffff819b9029>] dump_stack+0x19/0x1b
>> >> [ 53.004968] [<ffffffff810e615e>] print_unlock_imbalance_bug+0xfe/0x110
>> >> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
>> >> [ 53.004968] [<ffffffff810eb17e>] lock_release_non_nested+0x1ce/0x320
>> >> [ 53.004968] [<ffffffff810e7ba5>] ? debug_check_no_locks_freed+0x105/0x1b0
>> >> [ 53.353529] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
>> >> [ 53.353529] [<ffffffff810eb3cc>] lock_release+0xfc/0x250
>> >> [ 53.353529] [<ffffffff819be4b2>] __mutex_unlock_slowpath+0xb2/0x1f0
>> >> [ 53.353529] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
>> >> [ 53.353529] [<ffffffff81711105>] videobuf_waiton+0x55/0x230
>> >> [ 53.353529] [<ffffffff8114d052>] ? tlb_finish_mmu+0x32/0x50
>> >> [ 53.353529] [<ffffffff81154a46>] ? unmap_region+0xc6/0x100
>> >> [ 53.353529] [<ffffffff81172e05>] ? kmem_cache_free+0x195/0x230
>> >> [ 53.353529] [<ffffffff8172d3d9>] cx25821_free_buffer+0x49/0xa0
>> >> [ 53.353529] [<ffffffff8172f939>] cx25821_buffer_release+0x9/0x10
>> >> [ 53.353529] [<ffffffff81712c35>] videobuf_vm_close+0xc5/0x160
>> >> [ 53.353529] [<ffffffff81154aa5>] remove_vma+0x25/0x60
>> >> [ 53.353529] [<ffffffff81156b67>] do_munmap+0x307/0x410
>> >> [ 53.353529] [<ffffffff81156cbc>] vm_munmap+0x4c/0x70
>> >> [ 53.353529] [<ffffffff81157c09>] SyS_munmap+0x9/0x10
>> >> [ 53.353529] [<ffffffff819c20a9>] system_call_fastpath+0x16/0x1b
>> >>
>>
>>

2013-07-12 20:56:54

by Sander Eikelenboom

[permalink] [raw]

Subject: Re: [media] cx25821 regression from 3.9: BUG: bad unlock balance detected!

Friday, May 17, 2013, 11:52:17 AM, you wrote:

> On Fri May 17 2013 11:04:50 Sander Eikelenboom wrote:
>>
>> Friday, May 17, 2013, 10:25:24 AM, you wrote:
>>
>> > On Thu May 16 2013 19:41:42 Sander Eikelenboom wrote:
>> >> Hi Hans / Mauro,
>> >>
>> >> With 3.10.0-rc1 (including the cx25821 changes from Hans), I get the bug below which wasn't present with 3.9.
>>
>> > How do I reproduce this? I've tried to, but I can't make this happen.
>>
>> > Looking at the code I can't see how it could hit this bug anyway.
>>
>> I'm using "motion" to grab and process 6 from the video streams of the card i have (card with 8 inputs).
>> It seems the cx25821 underwent quite some changes between 3.9 and 3.10.

> It did.

>> And in the past there have been some more locking issues around mmap and media devices, although they seem to appear as circular locking dependencies and with different devices.
>> - http://www.mail-archive.com/[email protected]/msg46217.html
>> - Under kvm: http://www.spinics.net/lists/linux-media/msg63322.html

> Neither of those are related to this issue.

>>
>> - Perhaps that running in a VM could have to do with it ?
>> - The driver on 3.9 occasionaly gives this, probably latency related (but continues to work):
>> cx25821: cx25821_video_wakeup: 2 buffers handled (should be 1)
>>
>> Could it be something double unlocking in that path ?
>>
>> - Is there any extra debugging i could enable that could pinpoint the issue ?

> Try this patch:

> diff --git a/drivers/media/pci/cx25821/cx25821-core.c b/drivers/media/pci/cx25821/cx25821-core.c
> index b762c5b..8f8d0e0 100644
> --- a/drivers/media/pci/cx25821/cx25821-core.c
> +++ b/drivers/media/pci/cx25821/cx25821-core.c
> @@ -1208,7 +1208,6 @@ void cx25821_free_buffer(struct videobuf_queue *q, struct cx25821_buffer *buf)
> struct videobuf_dmabuf *dma = videobuf_to_dma(&buf->vb);
>
> BUG_ON(in_interrupt());
> - videobuf_waiton(q, &buf->vb, 0, 0);
> videobuf_dma_unmap(q->dev, dma);
> videobuf_dma_free(dma);
> btcx_riscmem_free(to_pci_dev(q->dev), &buf->risc);

> I don't think the waiton is really needed for this driver.

> What really should happen is that videobuf is replaced by videobuf2 in this
> driver, but that's a fair amount of work.

Hi Hans,

After being busy for quite some time, i do have some spare time now.

Since i'm still having trouble with this driver, is there a patch series for a similar driver
that was converted to videobuf2 ?
I don't know if it is entirely in my league, but i could give it a try when i have a example.

--
Sander

> Regards,

> Hans

>>
>>
>> --
>>
>> Sander
>>
>>
>>
>> > Regards,
>>
>> > Hans
>>
>> >>
>> >> --
>> >> Sander
>> >>
>> >>
>> >> [ 53.004968] =====================================
>> >> [ 53.004968] [ BUG: bad unlock balance detected! ]
>> >> [ 53.004968] 3.10.0-rc1-20130516-jens+ #1 Not tainted
>> >> [ 53.004968] -------------------------------------
>> >> [ 53.004968] motion/3328 is trying to release lock (&dev->lock) at:
>> >> [ 53.004968] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
>> >> [ 53.004968] but there are no more locks to release!
>> >> [ 53.004968]
>> >> [ 53.004968] other info that might help us debug this:
>> >> [ 53.004968] 1 lock held by motion/3328:
>> >> [ 53.004968] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81156cae>] vm_munmap+0x3e/0x70
>> >> [ 53.004968]
>> >> [ 53.004968] stack backtrace:
>> >> [ 53.004968] CPU: 1 PID: 3328 Comm: motion Not tainted 3.10.0-rc1-20130516-jens+ #1
>> >> [ 53.004968] Hardware name: Xen HVM domU, BIOS 4.3-unstable 05/16/2013
>> >> [ 53.004968] ffffffff819be5f9 ffff88002ac35c58 ffffffff819b9029 ffff88002ac35c88
>> >> [ 53.004968] ffffffff810e615e ffff88002ac35cb8 ffff88002b7c18a8 ffffffff819be5f9
>> >> [ 53.004968] 00000000ffffffff ffff88002ac35d28 ffffffff810eb17e ffffffff810e7ba5
>> >> [ 53.004968] Call Trace:
>> >> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
>> >> [ 53.004968] [<ffffffff819b9029>] dump_stack+0x19/0x1b
>> >> [ 53.004968] [<ffffffff810e615e>] print_unlock_imbalance_bug+0xfe/0x110
>> >> [ 53.004968] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
>> >> [ 53.004968] [<ffffffff810eb17e>] lock_release_non_nested+0x1ce/0x320
>> >> [ 53.004968] [<ffffffff810e7ba5>] ? debug_check_no_locks_freed+0x105/0x1b0
>> >> [ 53.353529] [<ffffffff819be5f9>] ? mutex_unlock+0x9/0x10
>> >> [ 53.353529] [<ffffffff810eb3cc>] lock_release+0xfc/0x250
>> >> [ 53.353529] [<ffffffff819be4b2>] __mutex_unlock_slowpath+0xb2/0x1f0
>> >> [ 53.353529] [<ffffffff819be5f9>] mutex_unlock+0x9/0x10
>> >> [ 53.353529] [<ffffffff81711105>] videobuf_waiton+0x55/0x230
>> >> [ 53.353529] [<ffffffff8114d052>] ? tlb_finish_mmu+0x32/0x50
>> >> [ 53.353529] [<ffffffff81154a46>] ? unmap_region+0xc6/0x100
>> >> [ 53.353529] [<ffffffff81172e05>] ? kmem_cache_free+0x195/0x230
>> >> [ 53.353529] [<ffffffff8172d3d9>] cx25821_free_buffer+0x49/0xa0
>> >> [ 53.353529] [<ffffffff8172f939>] cx25821_buffer_release+0x9/0x10
>> >> [ 53.353529] [<ffffffff81712c35>] videobuf_vm_close+0xc5/0x160
>> >> [ 53.353529] [<ffffffff81154aa5>] remove_vma+0x25/0x60
>> >> [ 53.353529] [<ffffffff81156b67>] do_munmap+0x307/0x410
>> >> [ 53.353529] [<ffffffff81156cbc>] vm_munmap+0x4c/0x70
>> >> [ 53.353529] [<ffffffff81157c09>] SyS_munmap+0x9/0x10
>> >> [ 53.353529] [<ffffffff819c20a9>] system_call_fastpath+0x16/0x1b
>> >>
>>
>>

2013-07-14 09:41:36

by Hans Verkuil

[permalink] [raw]

Subject: Re: [media] cx25821 regression from 3.9: BUG: bad unlock balance detected!

Hi Sander,

On 07/12/2013 10:56 PM, Sander Eikelenboom wrote:
>
> Friday, May 17, 2013, 11:52:17 AM, you wrote:
>
>> On Fri May 17 2013 11:04:50 Sander Eikelenboom wrote:
>>>
>>> Friday, May 17, 2013, 10:25:24 AM, you wrote:
>>>
>>>> On Thu May 16 2013 19:41:42 Sander Eikelenboom wrote:
>>>>> Hi Hans / Mauro,
>>>>>
>>>>> With 3.10.0-rc1 (including the cx25821 changes from Hans), I get the bug below which wasn't present with 3.9.
>>>
>>>> How do I reproduce this? I've tried to, but I can't make this happen.
>>>
>>>> Looking at the code I can't see how it could hit this bug anyway.
>>>
>>> I'm using "motion" to grab and process 6 from the video streams of the card i have (card with 8 inputs).
>>> It seems the cx25821 underwent quite some changes between 3.9 and 3.10.
>
>> It did.
>
>>> And in the past there have been some more locking issues around mmap and media devices, although they seem to appear as circular locking dependencies and with different devices.
>>> - http://www.mail-archive.com/[email protected]/msg46217.html
>>> - Under kvm: http://www.spinics.net/lists/linux-media/msg63322.html
>
>> Neither of those are related to this issue.
>
>>>
>>> - Perhaps that running in a VM could have to do with it ?
>>> - The driver on 3.9 occasionaly gives this, probably latency related (but continues to work):
>>> cx25821: cx25821_video_wakeup: 2 buffers handled (should be 1)
>>>
>>> Could it be something double unlocking in that path ?
>>>
>>> - Is there any extra debugging i could enable that could pinpoint the issue ?
>
>> Try this patch:
>
>> diff --git a/drivers/media/pci/cx25821/cx25821-core.c b/drivers/media/pci/cx25821/cx25821-core.c
>> index b762c5b..8f8d0e0 100644
>> --- a/drivers/media/pci/cx25821/cx25821-core.c
>> +++ b/drivers/media/pci/cx25821/cx25821-core.c
>> @@ -1208,7 +1208,6 @@ void cx25821_free_buffer(struct videobuf_queue *q, struct cx25821_buffer *buf)
>> struct videobuf_dmabuf *dma = videobuf_to_dma(&buf->vb);
>>
>> BUG_ON(in_interrupt());
>> - videobuf_waiton(q, &buf->vb, 0, 0);
>> videobuf_dma_unmap(q->dev, dma);
>> videobuf_dma_free(dma);
>> btcx_riscmem_free(to_pci_dev(q->dev), &buf->risc);
>
>> I don't think the waiton is really needed for this driver.
>
>> What really should happen is that videobuf is replaced by videobuf2 in this
>> driver, but that's a fair amount of work.
>
> Hi Hans,
>
> After being busy for quite some time, i do have some spare time now.
>
> Since i'm still having trouble with this driver, is there a patch series for a similar driver
> that was converted to videobuf2 ?
> I don't know if it is entirely in my league, but i could give it a try when i have a example.

The changes done for usb/em28xx might come close. That said, the cx25821 is already in
decent shape to convert to vb2. At least the videobuf data structures are already in the
correct place (they are often stored in a per-filehandle struct, which is wrong).

include/media/videobuf2-core.h gives a reasonable overview of vb2. Like em28xx, you
should use the vb2 helper functions (vb2_fop_* and vb2_ioctl_*) which takes a lot
of the work off your hands.

Converting cx25821-alsa.c may be the most difficult part as it is using some videobuf
internal functions which probably won't translate to vb2 as is. I think videobuf is
being abused here, but I don't know off-hand what the correct approach will be with
vb2.

I would ignore the alsa part for the time being (also the audio/video-upstream code,
that's been disabled and without datasheets of the cx25821 I'm not sure it can be
resurrected).

If you can get cx25821-video.c to work with vb2, then we can take a look at the alsa
code.

Regards,

Hans