2013-09-30 07:02:57

by Madper Xie

[permalink] [raw]
Subject: [Bug] 12.864681 BUG: lock held when returning to user space!

Hi all,
With kernel3.12-rc2 the dmesg shows following logs:
[ 12.864680] ================================================
[ 12.864681] [ BUG: lock held when returning to user space! ]
[ 12.864682] 3.12.0-rc2 #1 Not tainted
[ 12.864683] ------------------------------------------------
[ 12.864684] iprinit/719 is leaving the kernel with locks still held!
[ 12.864685] 1 lock held by iprinit/719:
[ 12.864686] #0: (&sdp->o_sem){.+.+..}, at: [<ffffffffa050de05>] sg_open+0x4b5/0x644 [sg]
[ 12.934954] ath9k 0000:01:00.0: enabling device (0000 -> 0002)
[ 12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x00000000 & 0x00000007 != 0x00000004
[ 12.943125] ath: EEPROM regdomain: 0x60
[ 12.943127] ath: EEPROM indicates we should expect a direct regpair map
[ 12.943129] ath: Country alpha2 being used: 00
[ 12.943130] ath: Regpair used: 0x60
[ 12.960202] r8169 0000:02:00.0 p3p1: link down
[ 12.960236] r8169 0000:02:00.0 p3p1: link down
[ 12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
[ 13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[ 13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xffffc9000bc80000, irq=16
[ 13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
[ 13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 13.055802] Ebtables v2.0 registered
[ 13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 15.906392] r8169 0000:02:00.0 p3p1: link up
[ 15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
[ 17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left

I'm working on finding which version bring this bug in.
--
Best,
Madper Xie.


2013-10-08 05:58:08

by Madper Xie

[permalink] [raw]
Subject: Re: [Bug] 12.864681 BUG: lock held when returning to user space!

Howdy Vaughan Cao,
I can't meet this issue on both 3.11 and 3.11.4. There are only four
patches between 3.11 and 3.12-rc2 and you are the author. Will you
please check them if you have time.

[email protected] writes:

> Hi all,
> With kernel3.12-rc2 the dmesg shows following logs:
> [ 12.864680] ================================================
> [ 12.864681] [ BUG: lock held when returning to user space! ]
> [ 12.864682] 3.12.0-rc2 #1 Not tainted
> [ 12.864683] ------------------------------------------------
> [ 12.864684] iprinit/719 is leaving the kernel with locks still held!
> [ 12.864685] 1 lock held by iprinit/719:
> [ 12.864686] #0: (&sdp->o_sem){.+.+..}, at: [<ffffffffa050de05>] sg_open+0x4b5/0x644 [sg]
> [ 12.934954] ath9k 0000:01:00.0: enabling device (0000 -> 0002)
> [ 12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x00000000 & 0x00000007 != 0x00000004
> [ 12.943125] ath: EEPROM regdomain: 0x60
> [ 12.943127] ath: EEPROM indicates we should expect a direct regpair map
> [ 12.943129] ath: Country alpha2 being used: 00
> [ 12.943130] ath: Regpair used: 0x60
> [ 12.960202] r8169 0000:02:00.0 p3p1: link down
> [ 12.960236] r8169 0000:02:00.0 p3p1: link down
> [ 12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
> [ 13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
> [ 13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xffffc9000bc80000, irq=16
> [ 13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
> [ 13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
> [ 13.055802] Ebtables v2.0 registered
> [ 13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
> [ 15.906392] r8169 0000:02:00.0 p3p1: link up
> [ 15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
> [ 17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left
>
> I'm working on finding which version bring this bug in.


--
Best,
Madper Xie.

2013-10-08 06:42:23

by vaughan

[permalink] [raw]
Subject: Re: [Bug] 12.864681 BUG: lock held when returning to user space!

Hi Madper,

CC to Douglas to get comments.
I use the rw_semaphore o_sem to protect excl open, introduced in commit
15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
Is it forbidden to do like that in kernel?...

Regards,
Vaughan

On 10/08/2013 01:57 PM, Madper Xie wrote:
> Howdy Vaughan Cao,
> I can't meet this issue on both 3.11 and 3.11.4. There are only four
> patches between 3.11 and 3.12-rc2 and you are the author. Will you
> please check them if you have time.
>
> [email protected] writes:
>
>> Hi all,
>> With kernel3.12-rc2 the dmesg shows following logs:
>> [ 12.864680] ================================================
>> [ 12.864681] [ BUG: lock held when returning to user space! ]
>> [ 12.864682] 3.12.0-rc2 #1 Not tainted
>> [ 12.864683] ------------------------------------------------
>> [ 12.864684] iprinit/719 is leaving the kernel with locks still held!
>> [ 12.864685] 1 lock held by iprinit/719:
>> [ 12.864686] #0: (&sdp->o_sem){.+.+..}, at: [<ffffffffa050de05>] sg_open+0x4b5/0x644 [sg]
>> [ 12.934954] ath9k 0000:01:00.0: enabling device (0000 -> 0002)
>> [ 12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x00000000 & 0x00000007 != 0x00000004
>> [ 12.943125] ath: EEPROM regdomain: 0x60
>> [ 12.943127] ath: EEPROM indicates we should expect a direct regpair map
>> [ 12.943129] ath: Country alpha2 being used: 00
>> [ 12.943130] ath: Regpair used: 0x60
>> [ 12.960202] r8169 0000:02:00.0 p3p1: link down
>> [ 12.960236] r8169 0000:02:00.0 p3p1: link down
>> [ 12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
>> [ 13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
>> [ 13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xffffc9000bc80000, irq=16
>> [ 13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
>> [ 13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
>> [ 13.055802] Ebtables v2.0 registered
>> [ 13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
>> [ 15.906392] r8169 0000:02:00.0 p3p1: link up
>> [ 15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
>> [ 17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left
>>
>> I'm working on finding which version bring this bug in.
>

2013-10-08 13:46:18

by Douglas Gilbert

[permalink] [raw]
Subject: Re: [Bug] 12.864681 BUG: lock held when returning to user space!

On 13-10-08 02:44 AM, vaughan wrote:
> Hi Madper,
>
> CC to Douglas to get comments.
> I use the rw_semaphore o_sem to protect excl open, introduced in commit
> 15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
> Is it forbidden to do like that in kernel?...

It appears you can not (allow sg_open() to hold a semaphore
then return to the user space). So you will need to do some
rework on that patch or revert it.

Doug Gilbert

Reference: scsi-linux + kernel lists, title:
[PATCH v6 0/4][SCSI] sg: fix race condition in sg_open
20130828

> On 10/08/2013 01:57 PM, Madper Xie wrote:
>> Howdy Vaughan Cao,
>> I can't meet this issue on both 3.11 and 3.11.4. There are only four
>> patches between 3.11 and 3.12-rc2 and you are the author. Will you
>> please check them if you have time.
>>
>> [email protected] writes:
>>
>>> Hi all,
>>> With kernel3.12-rc2 the dmesg shows following logs:
>>> [ 12.864680] ================================================
>>> [ 12.864681] [ BUG: lock held when returning to user space! ]
>>> [ 12.864682] 3.12.0-rc2 #1 Not tainted
>>> [ 12.864683] ------------------------------------------------
>>> [ 12.864684] iprinit/719 is leaving the kernel with locks still held!
>>> [ 12.864685] 1 lock held by iprinit/719:
>>> [ 12.864686] #0: (&sdp->o_sem){.+.+..}, at: [<ffffffffa050de05>] sg_open+0x4b5/0x644 [sg]
>>> [ 12.934954] ath9k 0000:01:00.0: enabling device (0000 -> 0002)
>>> [ 12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x00000000 & 0x00000007 != 0x00000004
>>> [ 12.943125] ath: EEPROM regdomain: 0x60
>>> [ 12.943127] ath: EEPROM indicates we should expect a direct regpair map
>>> [ 12.943129] ath: Country alpha2 being used: 00
>>> [ 12.943130] ath: Regpair used: 0x60
>>> [ 12.960202] r8169 0000:02:00.0 p3p1: link down
>>> [ 12.960236] r8169 0000:02:00.0 p3p1: link down
>>> [ 12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
>>> [ 13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
>>> [ 13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xffffc9000bc80000, irq=16
>>> [ 13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
>>> [ 13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
>>> [ 13.055802] Ebtables v2.0 registered
>>> [ 13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
>>> [ 15.906392] r8169 0000:02:00.0 p3p1: link up
>>> [ 15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
>>> [ 17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left
>>>
>>> I'm working on finding which version bring this bug in.
>>
>
>

2013-10-16 19:22:18

by James Bottomley

[permalink] [raw]
Subject: Re: [Bug] 12.864681 BUG: lock held when returning to user space!

On Tue, 2013-10-08 at 09:45 -0400, Douglas Gilbert wrote:
> On 13-10-08 02:44 AM, vaughan wrote:
> > Hi Madper,
> >
> > CC to Douglas to get comments.
> > I use the rw_semaphore o_sem to protect excl open, introduced in commit
> > 15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
> > Is it forbidden to do like that in kernel?...
>
> It appears you can not (allow sg_open() to hold a semaphore
> then return to the user space). So you will need to do some
> rework on that patch or revert it.

OK, there being no reply on this, I'll do the revert ... that's all four
patches, correct?

James

2013-10-16 22:41:12

by Douglas Gilbert

[permalink] [raw]
Subject: Re: [Bug] 12.864681 BUG: lock held when returning to user space!

On 13-10-16 09:24 AM, James Bottomley wrote:
> On Tue, 2013-10-08 at 09:45 -0400, Douglas Gilbert wrote:
>> On 13-10-08 02:44 AM, vaughan wrote:
>>> Hi Madper,
>>>
>>> CC to Douglas to get comments.
>>> I use the rw_semaphore o_sem to protect excl open, introduced in commit
>>> 15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
>>> Is it forbidden to do like that in kernel?...
>>
>> It appears you can not (allow sg_open() to hold a semaphore
>> then return to the user space). So you will need to do some
>> rework on that patch or revert it.
>
> OK, there being no reply on this, I'll do the revert ... that's all four
> patches, correct?

That seems to be the case. Vaughan acknowledged the
problem and forwarded it to me 8 days ago. Yes, it
seems to be a "no-no" to hold a any kernel semaphore
when returning to the user space; in this case from
sg_open(). I was hoping a revised patch might
appear from Vaughan but to date that has not been
the case. So with only a few weeks to go before
lk 3.12 is released, reverting the whole 4 patches
in that series seems to be the safest course.

Also without a new patch from Vaughan in the next few
weeks he may also miss the opportunity of getting
his improved O_EXCL logic into the lk 3.13 series.


Thinking about how to solve this problem: a field could
be added to 'struct sg_device' with one of three states:
no_opens, non_excl_opens and excl_open. It could be
manipulated by sg_open() and sg_release() like a
read-write semaphore. And the faulty 'struct
rw_semaphore o_sem' in sg_device could be replaced by a
normal semaphore to protect the manipulation of the new
three-state field.

Doug Gilbert

2013-10-17 02:49:29

by vaughan

[permalink] [raw]
Subject: Re: [Bug] 12.864681 BUG: lock held when returning to user space!

On 10/17/2013 06:41 AM, Douglas Gilbert wrote:
> That seems to be the case. Vaughan acknowledged the
> problem and forwarded it to me 8 days ago. Yes, it
> seems to be a "no-no" to hold a any kernel semaphore
> when returning to the user space; in this case from
> sg_open(). I was hoping a revised patch might
> appear from Vaughan but to date that has not been
> the case. So with only a few weeks to go before
> lk 3.12 is released, reverting the whole 4 patches
> in that series seems to be the safest course.
>
> Also without a new patch from Vaughan in the next few
> weeks he may also miss the opportunity of getting
> his improved O_EXCL logic into the lk 3.13 series.
>
>
> Thinking about how to solve this problem: a field could
> be added to 'struct sg_device' with one of three states:
> no_opens, non_excl_opens and excl_open. It could be
> manipulated by sg_open() and sg_release() like a
> read-write semaphore. And the faulty 'struct
> rw_semaphore o_sem' in sg_device could be replaced by a
> normal semaphore to protect the manipulation of the new
> three-state field.
> And the new three-state field would replace (or expand)
the 'char exclude' field in struct sg_device .
>
> Doug Gilbert
Hi Doug,

Thanks for providing advice on how to fix this.
However, it seems be still awkward somehow. We have to
1. hold a lock (maybe sg_index_lock or a new one)
2. check
a) the new three-state field;
b) if sfp list is empty;
c) sdp->detached field;
if either condition fails, we may link the open process into o_excl_wait
queue and need wakeup.
if satisfied, we go on.
3. then we release at least sg_index_lock to malloc a new sfp and
initialize.
4. try to acquire sg_index_lock again and add this sfp into sfd_siblings
list if possible. <== We still have to check at least sdp->detached field
5. update three-state field to reflect the result of Step 4, and wake up
processes waiting in o_excl_wait.

This uncomfortable is introduced by releasing the sg_index_lock in the
middle of check->malloc->add the new sfp struct.

I wanna ask if it is possible to split the sg_add_sfp() into two
functions, sg_init_sfp() and sg_add_sfp2(). We can do all initialize
work in sg_init_sfp()
without any lock and let sg_add_sfp2() only serve lock-check-add in one
way. It seems more convenient for me to understand.
But there is still some questions on this approach:
1. memory consume can be very large if lots of sg_init_sfp in the same time;
2. some field are initialized according to the fields of scsi device sdp
points to, such as low_dma, sg_tablesize, max_sector, phys_segs.
I know scsi_device_get() would keep the underlying scsi_device
alive, however would these fields change in the gap of our initialize
and add?
The relationship of sg_device and scsi_device like above said
confuse me somehow...

Thanks,
Vaughan