2015-06-09 12:00:33

by Morten Stevens

[permalink] [raw]
Subject: linux 4.1-rc7 deadlock

Hi,

After each restart appears directly this deadlock:

[ 28.177939] ======================================================
[ 28.177959] [ INFO: possible circular locking dependency detected ]
[ 28.177980] 4.1.0-0.rc7.git0.1.fc23.x86_64+debug #1 Tainted: G W
[ 28.178002] -------------------------------------------------------
[ 28.178022] sshd/1764 is trying to acquire lock:
[ 28.178037] (&isec->lock){+.+.+.}, at: [<ffffffff813b52c5>]
inode_doinit_with_dentry+0xc5/0x6a0
[ 28.178078]
but task is already holding lock:
[ 28.178097] (&mm->mmap_sem){++++++}, at: [<ffffffff81216a0f>]
vm_mmap_pgoff+0x8f/0xf0
[ 28.178131]
which lock already depends on the new lock.

[ 28.178157]
the existing dependency chain (in reverse order) is:
[ 28.178180]
-> #2 (&mm->mmap_sem){++++++}:
[ 28.178201] [<ffffffff81114017>] lock_acquire+0xc7/0x2a0
[ 28.178225] [<ffffffff8122853c>] might_fault+0x8c/0xb0
[ 28.178248] [<ffffffff8129af3a>] filldir+0x9a/0x130
[ 28.178269] [<ffffffffa019cfd6>]
xfs_dir2_block_getdents.isra.12+0x1a6/0x1d0 [xfs]
[ 28.178330] [<ffffffffa019dae4>] xfs_readdir+0x1c4/0x360 [xfs]
[ 28.178368] [<ffffffffa01a0a5b>] xfs_file_readdir+0x2b/0x30 [xfs]
[ 28.178404] [<ffffffff8129ad0a>] iterate_dir+0x9a/0x140
[ 28.178425] [<ffffffff8129b241>] SyS_getdents+0x91/0x120
[ 28.178447] [<ffffffff818a016e>] system_call_fastpath+0x12/0x76
[ 28.178471]
-> #1 (&xfs_dir_ilock_class){++++.+}:
[ 28.178494] [<ffffffff81114017>] lock_acquire+0xc7/0x2a0
[ 28.178515] [<ffffffff8110bee7>] down_read_nested+0x57/0xa0
[ 28.178538] [<ffffffffa01b2ed1>] xfs_ilock+0x171/0x390 [xfs]
[ 28.178579] [<ffffffffa01b3168>]
xfs_ilock_attr_map_shared+0x38/0x50 [xfs]
[ 28.178618] [<ffffffffa0145d8d>] xfs_attr_get+0xbd/0x1b0 [xfs]
[ 28.178651] [<ffffffffa01c44ad>] xfs_xattr_get+0x3d/0x80 [xfs]
[ 28.178688] [<ffffffff812b022f>] generic_getxattr+0x4f/0x70
[ 28.178711] [<ffffffff813b5372>]
inode_doinit_with_dentry+0x172/0x6a0
[ 28.178737] [<ffffffff813b68db>] sb_finish_set_opts+0xdb/0x260
[ 28.178759] [<ffffffff813b6ff1>] selinux_set_mnt_opts+0x331/0x670
[ 28.178783] [<ffffffff813b9b47>] superblock_doinit+0x77/0xf0
[ 28.178804] [<ffffffff813b9bd0>] delayed_superblock_init+0x10/0x20
[ 28.178849] [<ffffffff8128691a>] iterate_supers+0xba/0x120
[ 28.178872] [<ffffffff813bef23>] selinux_complete_init+0x33/0x40
[ 28.178897] [<ffffffff813cf313>] security_load_policy+0x103/0x640
[ 28.178920] [<ffffffff813c0a76>] sel_write_load+0xb6/0x790
[ 28.179482] [<ffffffff812821f7>] __vfs_write+0x37/0x110
[ 28.180047] [<ffffffff81282c89>] vfs_write+0xa9/0x1c0
[ 28.180630] [<ffffffff81283a1c>] SyS_write+0x5c/0xd0
[ 28.181168] [<ffffffff818a016e>] system_call_fastpath+0x12/0x76
[ 28.181740]
-> #0 (&isec->lock){+.+.+.}:
[ 28.182808] [<ffffffff81113331>] __lock_acquire+0x1b31/0x1e40
[ 28.183347] [<ffffffff81114017>] lock_acquire+0xc7/0x2a0
[ 28.183897] [<ffffffff8189c10d>] mutex_lock_nested+0x7d/0x460
[ 28.184427] [<ffffffff813b52c5>]
inode_doinit_with_dentry+0xc5/0x6a0
[ 28.184944] [<ffffffff813b58bc>] selinux_d_instantiate+0x1c/0x20
[ 28.185470] [<ffffffff813b07ab>] security_d_instantiate+0x1b/0x30
[ 28.185980] [<ffffffff8129e8c4>] d_instantiate+0x54/0x80
[ 28.186495] [<ffffffff81211edc>] __shmem_file_setup+0xdc/0x250
[ 28.186990] [<ffffffff812164a8>] shmem_zero_setup+0x28/0x70
[ 28.187500] [<ffffffff8123471c>] mmap_region+0x66c/0x680
[ 28.188006] [<ffffffff81234a53>] do_mmap_pgoff+0x323/0x410
[ 28.188500] [<ffffffff81216a30>] vm_mmap_pgoff+0xb0/0xf0
[ 28.189005] [<ffffffff81232bf6>] SyS_mmap_pgoff+0x116/0x2b0
[ 28.189490] [<ffffffff810232bb>] SyS_mmap+0x1b/0x30
[ 28.189975] [<ffffffff818a016e>] system_call_fastpath+0x12/0x76
[ 28.190474]
other info that might help us debug this:

[ 28.191901] Chain exists of:
&isec->lock --> &xfs_dir_ilock_class --> &mm->mmap_sem

[ 28.193327] Possible unsafe locking scenario:

[ 28.194297] CPU0 CPU1
[ 28.194774] ---- ----
[ 28.195254] lock(&mm->mmap_sem);
[ 28.195709] lock(&xfs_dir_ilock_class);
[ 28.196174] lock(&mm->mmap_sem);
[ 28.196654] lock(&isec->lock);
[ 28.197108]
*** DEADLOCK ***

[ 28.198451] 1 lock held by sshd/1764:
[ 28.198900] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81216a0f>]
vm_mmap_pgoff+0x8f/0xf0
[ 28.199370]
stack backtrace:
[ 28.200276] CPU: 2 PID: 1764 Comm: sshd Tainted: G W
4.1.0-0.rc7.git0.1.fc23.x86_64+debug #1
[ 28.200753] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
Desktop Reference Platform, BIOS 6.00 05/20/2014
[ 28.201246] 0000000000000000 00000000eda89a94 ffff8800a86a39c8
ffffffff81896375
[ 28.201771] 0000000000000000 ffffffff82a910d0 ffff8800a86a3a18
ffffffff8110fbd6
[ 28.202275] 0000000000000002 ffff8800a86a3a78 0000000000000001
ffff8800a897b008
[ 28.203099] Call Trace:
[ 28.204237] [<ffffffff81896375>] dump_stack+0x4c/0x65
[ 28.205362] [<ffffffff8110fbd6>] print_circular_bug+0x206/0x280
[ 28.206502] [<ffffffff81113331>] __lock_acquire+0x1b31/0x1e40
[ 28.207650] [<ffffffff81114017>] lock_acquire+0xc7/0x2a0
[ 28.208758] [<ffffffff813b52c5>] ? inode_doinit_with_dentry+0xc5/0x6a0
[ 28.209902] [<ffffffff8189c10d>] mutex_lock_nested+0x7d/0x460
[ 28.211023] [<ffffffff813b52c5>] ? inode_doinit_with_dentry+0xc5/0x6a0
[ 28.212162] [<ffffffff813b52c5>] ? inode_doinit_with_dentry+0xc5/0x6a0
[ 28.213283] [<ffffffff81027e7d>] ? native_sched_clock+0x2d/0xa0
[ 28.214403] [<ffffffff81027ef9>] ? sched_clock+0x9/0x10
[ 28.215514] [<ffffffff813b52c5>] inode_doinit_with_dentry+0xc5/0x6a0
[ 28.216656] [<ffffffff813b58bc>] selinux_d_instantiate+0x1c/0x20
[ 28.217776] [<ffffffff813b07ab>] security_d_instantiate+0x1b/0x30
[ 28.218902] [<ffffffff8129e8c4>] d_instantiate+0x54/0x80
[ 28.219992] [<ffffffff81211edc>] __shmem_file_setup+0xdc/0x250
[ 28.221112] [<ffffffff812164a8>] shmem_zero_setup+0x28/0x70
[ 28.222234] [<ffffffff8123471c>] mmap_region+0x66c/0x680
[ 28.223362] [<ffffffff81234a53>] do_mmap_pgoff+0x323/0x410
[ 28.224493] [<ffffffff81216a0f>] ? vm_mmap_pgoff+0x8f/0xf0
[ 28.225643] [<ffffffff81216a30>] vm_mmap_pgoff+0xb0/0xf0
[ 28.226771] [<ffffffff81232bf6>] SyS_mmap_pgoff+0x116/0x2b0
[ 28.227900] [<ffffffff812996ce>] ? SyS_fcntl+0x5de/0x760
[ 28.229042] [<ffffffff810232bb>] SyS_mmap+0x1b/0x30
[ 28.230156] [<ffffffff818a016e>] system_call_fastpath+0x12/0x76
[ 46.520367] Adjusting tsc more than 11% (5419175 vs 7179037)


Any ideas?

Best regards,

Morten


2015-06-09 14:10:45

by Daniel Wagner

[permalink] [raw]
Subject: Re: linux 4.1-rc7 deadlock

Hi Morten,

On 06/09/2015 01:54 PM, Morten Stevens wrote:
> [ 28.193327] Possible unsafe locking scenario:
>
> [ 28.194297] CPU0 CPU1
> [ 28.194774] ---- ----
> [ 28.195254] lock(&mm->mmap_sem);
> [ 28.195709] lock(&xfs_dir_ilock_class);
> [ 28.196174] lock(&mm->mmap_sem);
> [ 28.196654] lock(&isec->lock);
> [ 28.197108]

[...]

> Any ideas?

I think you hit the same problem many have already reported:

https://lkml.org/lkml/2015/3/30/594

cheers,
daniel

2015-06-09 15:35:11

by Morten Stevens

[permalink] [raw]
Subject: Re: linux 4.1-rc7 deadlock

2015-06-09 16:10 GMT+02:00 Daniel Wagner <[email protected]>:
> Hi Morten,

Hi Daniel,

> On 06/09/2015 01:54 PM, Morten Stevens wrote:
>> [ 28.193327] Possible unsafe locking scenario:
>>
>> [ 28.194297] CPU0 CPU1
>> [ 28.194774] ---- ----
>> [ 28.195254] lock(&mm->mmap_sem);
>> [ 28.195709] lock(&xfs_dir_ilock_class);
>> [ 28.196174] lock(&mm->mmap_sem);
>> [ 28.196654] lock(&isec->lock);
>> [ 28.197108]
>
> [...]
>
>> Any ideas?
>
> I think you hit the same problem many have already reported:
>
> https://lkml.org/lkml/2015/3/30/594

Yes, that sounds very likely. But that was about 1 month ago, so I
thought that it has been fixed in the last weeks?

Best regards,

Morten

2015-06-11 20:06:16

by Hugh Dickins

[permalink] [raw]
Subject: Re: linux 4.1-rc7 deadlock

On Tue, 9 Jun 2015, Morten Stevens wrote:
> 2015-06-09 16:10 GMT+02:00 Daniel Wagner <[email protected]>:
> > On 06/09/2015 01:54 PM, Morten Stevens wrote:
> >> [ 28.193327] Possible unsafe locking scenario:
> >>
> >> [ 28.194297] CPU0 CPU1
> >> [ 28.194774] ---- ----
> >> [ 28.195254] lock(&mm->mmap_sem);
> >> [ 28.195709] lock(&xfs_dir_ilock_class);
> >> [ 28.196174] lock(&mm->mmap_sem);
> >> [ 28.196654] lock(&isec->lock);
> >> [ 28.197108]
> >
> > [...]
> >
> >> Any ideas?
> >
> > I think you hit the same problem many have already reported:
> >
> > https://lkml.org/lkml/2015/3/30/594
>
> Yes, that sounds very likely. But that was about 1 month ago, so I
> thought that it has been fixed in the last weeks?

It's not likely to get fixed without Cc'ing the right people.

This appears to be the same as Prarit reported to linux-mm on
2014/09/10. Dave Chinner thinks it's a shmem bug, I disagree,
but I am hopeful that it can be easily fixed at the shmem end.

Here's the patch I suggested nine months ago: but got no feedback,
so it remains Not-Yet-Signed-off-by. Please, if you find this works
(and does not just delay the lockdep conflict until a little later),
do let me know, then I can add some Tested-bys and send it to Linus.

mm: shmem_zero_setup skip security check and lockdep conflict with XFS

It appears that, at some point last year, XFS made directory handling
changes which bring it into lockdep conflict with shmem_zero_setup():
it is surprising that mmap() can clone an inode while holding mmap_sem,
but that has been so for many years.

Since those few lockdep traces that I've seen all implicated selinux,
I'm hoping that we can use the __shmem_file_setup(,,,S_PRIVATE) which
v3.13's commit c7277090927a ("security: shmem: implement kernel private
shmem inodes") introduced to avoid LSM checks on kernel-internal inodes:
the mmap("/dev/zero") cloned inode is indeed a kernel-internal detail.

This also covers the !CONFIG_SHMEM use of ramfs to support /dev/zero
(and MAP_SHARED|MAP_ANONYMOUS). I thought there were also drivers
which cloned inode in mmap(), but if so, I cannot locate them now.

Reported-by: Prarit Bhargava <[email protected]>
Reported-by: Daniel Wagner <[email protected]>
Reported-by: Morten Stevens <[email protected]>
Not-Yet-Signed-off-by: Hugh Dickins <[email protected]>
---

mm/shmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- 4.1-rc7/mm/shmem.c 2015-04-26 19:16:31.352191298 -0700
+++ linux/mm/shmem.c 2015-06-11 11:08:21.042745594 -0700
@@ -3401,7 +3401,7 @@ int shmem_zero_setup(struct vm_area_stru
struct file *file;
loff_t size = vma->vm_end - vma->vm_start;

- file = shmem_file_setup("dev/zero", size, vma->vm_flags);
+ file = __shmem_file_setup("dev/zero", size, vma->vm_flags, S_PRIVATE);
if (IS_ERR(file))
return PTR_ERR(file);

2015-06-11 22:15:58

by Prarit Bhargava

[permalink] [raw]
Subject: Re: linux 4.1-rc7 deadlock



On 06/11/2015 04:06 PM, Hugh Dickins wrote:
> On Tue, 9 Jun 2015, Morten Stevens wrote:
>> 2015-06-09 16:10 GMT+02:00 Daniel Wagner <[email protected]>:
>>> On 06/09/2015 01:54 PM, Morten Stevens wrote:
>>>> [ 28.193327] Possible unsafe locking scenario:
>>>>
>>>> [ 28.194297] CPU0 CPU1
>>>> [ 28.194774] ---- ----
>>>> [ 28.195254] lock(&mm->mmap_sem);
>>>> [ 28.195709] lock(&xfs_dir_ilock_class);
>>>> [ 28.196174] lock(&mm->mmap_sem);
>>>> [ 28.196654] lock(&isec->lock);
>>>> [ 28.197108]
>>>
>>> [...]
>>>
>>>> Any ideas?
>>>
>>> I think you hit the same problem many have already reported:
>>>
>>> https://lkml.org/lkml/2015/3/30/594
>>
>> Yes, that sounds very likely. But that was about 1 month ago, so I
>> thought that it has been fixed in the last weeks?
>
> It's not likely to get fixed without Cc'ing the right people.
>
> This appears to be the same as Prarit reported to linux-mm on
> 2014/09/10. Dave Chinner thinks it's a shmem bug, I disagree,
> but I am hopeful that it can be easily fixed at the shmem end.
>
> Here's the patch I suggested nine months ago: but got no feedback,
> so it remains Not-Yet-Signed-off-by. Please, if you find this works
> (and does not just delay the lockdep conflict until a little later),
> do let me know, then I can add some Tested-bys and send it to Linus.
>
> mm: shmem_zero_setup skip security check and lockdep conflict with XFS
>
> It appears that, at some point last year, XFS made directory handling
> changes which bring it into lockdep conflict with shmem_zero_setup():
> it is surprising that mmap() can clone an inode while holding mmap_sem,
> but that has been so for many years.
>
> Since those few lockdep traces that I've seen all implicated selinux,
> I'm hoping that we can use the __shmem_file_setup(,,,S_PRIVATE) which
> v3.13's commit c7277090927a ("security: shmem: implement kernel private
> shmem inodes") introduced to avoid LSM checks on kernel-internal inodes:
> the mmap("/dev/zero") cloned inode is indeed a kernel-internal detail.
>
> This also covers the !CONFIG_SHMEM use of ramfs to support /dev/zero
> (and MAP_SHARED|MAP_ANONYMOUS). I thought there were also drivers
> which cloned inode in mmap(), but if so, I cannot locate them now.
>
> Reported-by: Prarit Bhargava <[email protected]>
> Reported-by: Daniel Wagner <[email protected]>
> Reported-by: Morten Stevens <[email protected]>
> Not-Yet-Signed-off-by: Hugh Dickins <[email protected]>
> ---
>
> mm/shmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- 4.1-rc7/mm/shmem.c 2015-04-26 19:16:31.352191298 -0700
> +++ linux/mm/shmem.c 2015-06-11 11:08:21.042745594 -0700
> @@ -3401,7 +3401,7 @@ int shmem_zero_setup(struct vm_area_stru
> struct file *file;
> loff_t size = vma->vm_end - vma->vm_start;
>
> - file = shmem_file_setup("dev/zero", size, vma->vm_flags);
> + file = __shmem_file_setup("dev/zero", size, vma->vm_flags, S_PRIVATE);

Perhaps,

file = shmem_kernel_file_setup("dev/zero", size, vma->vm_flags) ?

Tested-by: Prarit Bhargava <[email protected]>

P.

> if (IS_ERR(file))
> return PTR_ERR(file);
>
>

2015-06-14 16:44:41

by Hugh Dickins

[permalink] [raw]
Subject: Re: linux 4.1-rc7 deadlock

On Thu, 11 Jun 2015, Prarit Bhargava wrote:
> On 06/11/2015 04:06 PM, Hugh Dickins wrote:
> > On Tue, 9 Jun 2015, Morten Stevens wrote:
> >> 2015-06-09 16:10 GMT+02:00 Daniel Wagner <[email protected]>:
> >>> On 06/09/2015 01:54 PM, Morten Stevens wrote:
> >
> > Reported-by: Prarit Bhargava <[email protected]>
> > Reported-by: Daniel Wagner <[email protected]>
> > Reported-by: Morten Stevens <[email protected]>
> > Not-Yet-Signed-off-by: Hugh Dickins <[email protected]>
> > ---
> >
> > mm/shmem.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > --- 4.1-rc7/mm/shmem.c 2015-04-26 19:16:31.352191298 -0700
> > +++ linux/mm/shmem.c 2015-06-11 11:08:21.042745594 -0700
> > @@ -3401,7 +3401,7 @@ int shmem_zero_setup(struct vm_area_stru
> > struct file *file;
> > loff_t size = vma->vm_end - vma->vm_start;
> >
> > - file = shmem_file_setup("dev/zero", size, vma->vm_flags);
> > + file = __shmem_file_setup("dev/zero", size, vma->vm_flags, S_PRIVATE);
>
> Perhaps,
>
> file = shmem_kernel_file_setup("dev/zero", size, vma->vm_flags) ?

Perhaps. I couldn't decide whether this is a proper intended use of
shmem_kernel_file_setup(), or a handy reuse of its flag. Andrew asked
for a comment, so in the end I left that line as is, but refer to
shmem_kernel_file_setup() in the comment. And that forced me to look a
little closer at the security implications: but we do seem to be safe.

>
> Tested-by: Prarit Bhargava <[email protected]>

Thank you: I had been hoping for some corroboration from one of the other
guys (no offence to you, but 33% looks a bit weak!), but now it's Sunday
so I think I'd better send this off in the hope that it makes -rc8.

Hugh

>
> P.
>
> > if (IS_ERR(file))
> > return PTR_ERR(file);