2015-11-16 12:09:49

by Yigal Korman

[permalink] [raw]
Subject: [PATCH] mm, dax: fix DAX deadlocks (COW fault)

DAX handling of COW faults has wrong locking sequence:
dax_fault does i_mmap_lock_read
do_cow_fault does i_mmap_unlock_write

Ross's commit[1] missed a fix[2] that Kirill added to Matthew's
commit[3].

Original COW locking logic was introduced by Matthew here[4].

This should be applied to v4.3 as well.

[1] 0f90cc6609c7 mm, dax: fix DAX deadlocks
[2] 52a2b53ffde6 mm, dax: use i_mmap_unlock_write() in do_cow_fault()
[3] 843172978bb9 dax: fix race between simultaneous faults
[4] 2e4cdab0584f mm: allow page fault handlers to perform the COW

Signed-off-by: Yigal Korman <[email protected]>

Cc: Stable Tree <[email protected]>
Cc: Boaz Harrosh <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Matthew Wilcox <[email protected]>
---
mm/memory.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index c716913..e5071af 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3015,9 +3015,9 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
} else {
/*
* The fault handler has no page to lock, so it holds
- * i_mmap_lock for write to protect against truncate.
+ * i_mmap_lock for read to protect against truncate.
*/
- i_mmap_unlock_write(vma->vm_file->f_mapping);
+ i_mmap_unlock_read(vma->vm_file->f_mapping);
}
goto uncharge_out;
}
@@ -3031,9 +3031,9 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
} else {
/*
* The fault handler has no page to lock, so it holds
- * i_mmap_lock for write to protect against truncate.
+ * i_mmap_lock for read to protect against truncate.
*/
- i_mmap_unlock_write(vma->vm_file->f_mapping);
+ i_mmap_unlock_read(vma->vm_file->f_mapping);
}
return ret;
uncharge_out:
--
1.9.3


2015-11-16 18:16:02

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] mm, dax: fix DAX deadlocks (COW fault)

On Mon, Nov 16, 2015 at 4:09 AM, Yigal Korman <[email protected]> wrote:
> DAX handling of COW faults has wrong locking sequence:
> dax_fault does i_mmap_lock_read
> do_cow_fault does i_mmap_unlock_write
>
> Ross's commit[1] missed a fix[2] that Kirill added to Matthew's
> commit[3].
>
> Original COW locking logic was introduced by Matthew here[4].
>
> This should be applied to v4.3 as well.
>
> [1] 0f90cc6609c7 mm, dax: fix DAX deadlocks
> [2] 52a2b53ffde6 mm, dax: use i_mmap_unlock_write() in do_cow_fault()
> [3] 843172978bb9 dax: fix race between simultaneous faults
> [4] 2e4cdab0584f mm: allow page fault handlers to perform the COW
>
> Signed-off-by: Yigal Korman <[email protected]>
>
> Cc: Stable Tree <[email protected]>
> Cc: Boaz Harrosh <[email protected]>
> Cc: Ross Zwisler <[email protected]>
> Cc: Alexander Viro <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Dave Chinner <[email protected]>
> Cc: Jan Kara <[email protected]>
> Cc: "Kirill A. Shutemov" <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> ---
> mm/memory.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index c716913..e5071af 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3015,9 +3015,9 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> } else {
> /*
> * The fault handler has no page to lock, so it holds
> - * i_mmap_lock for write to protect against truncate.
> + * i_mmap_lock for read to protect against truncate.
> */
> - i_mmap_unlock_write(vma->vm_file->f_mapping);
> + i_mmap_unlock_read(vma->vm_file->f_mapping);
> }
> goto uncharge_out;
> }
> @@ -3031,9 +3031,9 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> } else {
> /*
> * The fault handler has no page to lock, so it holds
> - * i_mmap_lock for write to protect against truncate.
> + * i_mmap_lock for read to protect against truncate.
> */
> - i_mmap_unlock_write(vma->vm_file->f_mapping);
> + i_mmap_unlock_read(vma->vm_file->f_mapping);
> }
> return ret;
> uncharge_out:

Looks good to me. I'll include this with some other DAX fixes I have pending.

2015-11-16 18:34:14

by Ross Zwisler

[permalink] [raw]
Subject: Re: [PATCH] mm, dax: fix DAX deadlocks (COW fault)

On Mon, Nov 16, 2015 at 10:15:56AM -0800, Dan Williams wrote:
> On Mon, Nov 16, 2015 at 4:09 AM, Yigal Korman <[email protected]> wrote:
> > DAX handling of COW faults has wrong locking sequence:
> > dax_fault does i_mmap_lock_read
> > do_cow_fault does i_mmap_unlock_write
> >
> > Ross's commit[1] missed a fix[2] that Kirill added to Matthew's
> > commit[3].
> >
> > Original COW locking logic was introduced by Matthew here[4].
> >
> > This should be applied to v4.3 as well.
> >
> > [1] 0f90cc6609c7 mm, dax: fix DAX deadlocks
> > [2] 52a2b53ffde6 mm, dax: use i_mmap_unlock_write() in do_cow_fault()
> > [3] 843172978bb9 dax: fix race between simultaneous faults
> > [4] 2e4cdab0584f mm: allow page fault handlers to perform the COW
> >
> > Signed-off-by: Yigal Korman <[email protected]>
> >
> > Cc: Stable Tree <[email protected]>
> > Cc: Boaz Harrosh <[email protected]>
> > Cc: Ross Zwisler <[email protected]>
> > Cc: Alexander Viro <[email protected]>
> > Cc: Dan Williams <[email protected]>
> > Cc: Dave Chinner <[email protected]>
> > Cc: Jan Kara <[email protected]>
> > Cc: "Kirill A. Shutemov" <[email protected]>
> > Cc: Matthew Wilcox <[email protected]>
> > ---
> > mm/memory.c | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index c716913..e5071af 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3015,9 +3015,9 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > } else {
> > /*
> > * The fault handler has no page to lock, so it holds
> > - * i_mmap_lock for write to protect against truncate.
> > + * i_mmap_lock for read to protect against truncate.
> > */
> > - i_mmap_unlock_write(vma->vm_file->f_mapping);
> > + i_mmap_unlock_read(vma->vm_file->f_mapping);
> > }
> > goto uncharge_out;
> > }
> > @@ -3031,9 +3031,9 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > } else {
> > /*
> > * The fault handler has no page to lock, so it holds
> > - * i_mmap_lock for write to protect against truncate.
> > + * i_mmap_lock for read to protect against truncate.
> > */
> > - i_mmap_unlock_write(vma->vm_file->f_mapping);
> > + i_mmap_unlock_read(vma->vm_file->f_mapping);
> > }
> > return ret;
> > uncharge_out:
>
> Looks good to me. I'll include this with some other DAX fixes I have pending.

Looks good to me as well. Thanks for catching this.

2015-11-17 10:40:23

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH] mm, dax: fix DAX deadlocks (COW fault)

On 11/16/2015 08:34 PM, Ross Zwisler wrote:
> On Mon, Nov 16, 2015 at 10:15:56AM -0800, Dan Williams wrote:
>> On Mon, Nov 16, 2015 at 4:09 AM, Yigal Korman <[email protected]> wrote:
>>> DAX handling of COW faults has wrong locking sequence:
>>> dax_fault does i_mmap_lock_read
>>> do_cow_fault does i_mmap_unlock_write
>>>
>>> Ross's commit[1] missed a fix[2] that Kirill added to Matthew's
>>> commit[3].
>>>
>>> Original COW locking logic was introduced by Matthew here[4].
>>>
>>> This should be applied to v4.3 as well.
>>>
>>> [1] 0f90cc6609c7 mm, dax: fix DAX deadlocks
>>> [2] 52a2b53ffde6 mm, dax: use i_mmap_unlock_write() in do_cow_fault()
>>> [3] 843172978bb9 dax: fix race between simultaneous faults
>>> [4] 2e4cdab0584f mm: allow page fault handlers to perform the COW
>>>
>>> Signed-off-by: Yigal Korman <[email protected]>
>>>
>>> Cc: Stable Tree <[email protected]>
>>> Cc: Boaz Harrosh <[email protected]>
>>> Cc: Ross Zwisler <[email protected]>
>>> Cc: Alexander Viro <[email protected]>
>>> Cc: Dan Williams <[email protected]>
>>> Cc: Dave Chinner <[email protected]>
>>> Cc: Jan Kara <[email protected]>
>>> Cc: "Kirill A. Shutemov" <[email protected]>
>>> Cc: Matthew Wilcox <[email protected]>
>>> ---
>>> mm/memory.c | 8 ++++----
>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index c716913..e5071af 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -3015,9 +3015,9 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>>> } else {
>>> /*
>>> * The fault handler has no page to lock, so it holds
>>> - * i_mmap_lock for write to protect against truncate.
>>> + * i_mmap_lock for read to protect against truncate.
>>> */
>>> - i_mmap_unlock_write(vma->vm_file->f_mapping);
>>> + i_mmap_unlock_read(vma->vm_file->f_mapping);
>>> }
>>> goto uncharge_out;
>>> }
>>> @@ -3031,9 +3031,9 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>>> } else {
>>> /*
>>> * The fault handler has no page to lock, so it holds
>>> - * i_mmap_lock for write to protect against truncate.
>>> + * i_mmap_lock for read to protect against truncate.
>>> */
>>> - i_mmap_unlock_write(vma->vm_file->f_mapping);
>>> + i_mmap_unlock_read(vma->vm_file->f_mapping);
>>> }
>>> return ret;
>>> uncharge_out:
>>
>> Looks good to me. I'll include this with some other DAX fixes I have pending.
>
> Looks good to me as well. Thanks for catching this.
>

Yes. None of the xfstests catch this. It needs a private-mapping mmap in some combination
of other activity on the file at the same time.

Which the linker of gcc does. We have a test of a git clone Kernel-tree and make. which
catches this in the make phase. For some reason on ext4 it is reliable to crash but on xfs 1/2
the runs go through, go figure.

Thanks Yigal for the fast fix
Boaz