2017-01-10 15:48:06

by Christoph Hellwig

[permalink] [raw]
Subject: fix write synchronization for DAX

While I've fixed both ext4 and XFS to not incorrectly allow parallel
writers when mounting with -o dax ext4 still has this issue after the
iomap conversion.

Patch 1 fixes it, and patch 2 adds a lockdep assert to catch any new
file systems copy and pasting from the direct I/O path.


2017-01-10 15:48:14

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 1/2] ext4: fix DAX write locking

Unlike O_DIRECT DAX is not an optional opt-in feature selected by the
application, so we'll have to provide the traditional synchronіzation
of overlapping writes as we do for buffered writes.

This was broken historically for DAX, but got fixed for ext2 and XFS
as part of the iomap conversion. Fix up ext4 as well.

Signed-off-by: Christoph Hellwig <[email protected]>
---
fs/ext4/file.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index d663d3d..a1e88aa 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -175,7 +175,6 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
{
struct inode *inode = file_inode(iocb->ki_filp);
ssize_t ret;
- bool overwrite = false;

inode_lock(inode);
ret = ext4_write_checks(iocb, from);
@@ -188,16 +187,9 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
if (ret)
goto out;

- if (ext4_overwrite_io(inode, iocb->ki_pos, iov_iter_count(from))) {
- overwrite = true;
- downgrade_write(&inode->i_rwsem);
- }
ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops);
out:
- if (!overwrite)
- inode_unlock(inode);
- else
- inode_unlock_shared(inode);
+ inode_unlock(inode);
if (ret > 0)
ret = generic_write_sync(iocb, ret);
return ret;
--
2.1.4


2017-01-10 15:48:08

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 2/2] dax: assert that i_rwsem is held exclusive for writes

Make sure all callers follow the same locking protocol, given that DAX
transparantly replaced the normal buffered I/O path.

Signed-off-by: Christoph Hellwig <[email protected]>
---
fs/dax.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index 5c74f60..04734da 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1061,8 +1061,12 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
loff_t pos = iocb->ki_pos, ret = 0, done = 0;
unsigned flags = 0;

- if (iov_iter_rw(iter) == WRITE)
+ if (iov_iter_rw(iter) == WRITE) {
+ lockdep_assert_held_exclusive(&inode->i_rwsem);
flags |= IOMAP_WRITE;
+ } else {
+ lockdep_assert_held(&inode->i_rwsem);
+ }

while (iov_iter_count(iter)) {
ret = iomap_apply(inode, pos, iov_iter_count(iter), flags, ops,
--
2.1.4


2017-01-11 09:01:36

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: fix DAX write locking

On Tue 10-01-17 16:48:07, Christoph Hellwig wrote:
> Unlike O_DIRECT DAX is not an optional opt-in feature selected by the
> application, so we'll have to provide the traditional synchronіzation
> of overlapping writes as we do for buffered writes.
>
> This was broken historically for DAX, but got fixed for ext2 and XFS
> as part of the iomap conversion. Fix up ext4 as well.
>
> Signed-off-by: Christoph Hellwig <[email protected]>

Makes sense. You can add:

Reviewed-by: Jan Kara <[email protected]>

Ted, can you please pick it up? Thanks!

Honza

> ---
> fs/ext4/file.c | 10 +---------
> 1 file changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index d663d3d..a1e88aa 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -175,7 +175,6 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
> {
> struct inode *inode = file_inode(iocb->ki_filp);
> ssize_t ret;
> - bool overwrite = false;
>
> inode_lock(inode);
> ret = ext4_write_checks(iocb, from);
> @@ -188,16 +187,9 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
> if (ret)
> goto out;
>
> - if (ext4_overwrite_io(inode, iocb->ki_pos, iov_iter_count(from))) {
> - overwrite = true;
> - downgrade_write(&inode->i_rwsem);
> - }
> ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops);
> out:
> - if (!overwrite)
> - inode_unlock(inode);
> - else
> - inode_unlock_shared(inode);
> + inode_unlock(inode);
> if (ret > 0)
> ret = generic_write_sync(iocb, ret);
> return ret;
> --
> 2.1.4
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2017-01-11 09:02:50

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 2/2] dax: assert that i_rwsem is held exclusive for writes

On Tue 10-01-17 16:48:08, Christoph Hellwig wrote:
> Make sure all callers follow the same locking protocol, given that DAX
> transparantly replaced the normal buffered I/O path.
>
> Signed-off-by: Christoph Hellwig <[email protected]>

Looks good. You can add:

Reviewed-by: Jan Kara <[email protected]>

Probably also for Ted since it depends on the ext4 fix...

Honza
> ---
> fs/dax.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 5c74f60..04734da 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1061,8 +1061,12 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
> loff_t pos = iocb->ki_pos, ret = 0, done = 0;
> unsigned flags = 0;
>
> - if (iov_iter_rw(iter) == WRITE)
> + if (iov_iter_rw(iter) == WRITE) {
> + lockdep_assert_held_exclusive(&inode->i_rwsem);
> flags |= IOMAP_WRITE;
> + } else {
> + lockdep_assert_held(&inode->i_rwsem);
> + }
>
> while (iov_iter_count(iter)) {
> ret = iomap_apply(inode, pos, iov_iter_count(iter), flags, ops,
> --
> 2.1.4
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2017-02-08 17:14:10

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: fix DAX write locking

On Wed, Jan 11, 2017 at 10:01:36AM +0100, Jan Kara wrote:
> On Tue 10-01-17 16:48:07, Christoph Hellwig wrote:
> > Unlike O_DIRECT DAX is not an optional opt-in feature selected by the
> > application, so we'll have to provide the traditional synchronіzation
> > of overlapping writes as we do for buffered writes.
> >
> > This was broken historically for DAX, but got fixed for ext2 and XFS
> > as part of the iomap conversion. Fix up ext4 as well.
> >
> > Signed-off-by: Christoph Hellwig <[email protected]>
>
> Makes sense. You can add:
>
> Reviewed-by: Jan Kara <[email protected]>
>
> Ted, can you please pick it up? Thanks!

Did this get picked up?
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm

2017-02-08 19:38:32

by Theodore Y. Ts'o

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: fix DAX write locking

On Wed, Feb 08, 2017 at 09:14:10AM -0800, Christoph Hellwig wrote:
> On Wed, Jan 11, 2017 at 10:01:36AM +0100, Jan Kara wrote:
> > On Tue 10-01-17 16:48:07, Christoph Hellwig wrote:
> > > Unlike O_DIRECT DAX is not an optional opt-in feature selected by the
> > > application, so we'll have to provide the traditional synchronіzation
> > > of overlapping writes as we do for buffered writes.
> > >
> > > This was broken historically for DAX, but got fixed for ext2 and XFS
> > > as part of the iomap conversion. Fix up ext4 as well.
> > >
> > > Signed-off-by: Christoph Hellwig <[email protected]>
> >
> > Makes sense. You can add:
> >
> > Reviewed-by: Jan Kara <[email protected]>
> >
> > Ted, can you please pick it up? Thanks!
>
> Did this get picked up?

Sorry, I missed Jan's request to pick it up. I had assumed it would
be going in via the dax tree. I'll take a look at it now for the ext4
tree.

- Ted

_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm

2017-02-08 19:41:25

by Theodore Y. Ts'o

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: fix DAX write locking

On Wed, Jan 11, 2017 at 10:01:36AM +0100, Jan Kara wrote:
> On Tue 10-01-17 16:48:07, Christoph Hellwig wrote:
> > Unlike O_DIRECT DAX is not an optional opt-in feature selected by the
> > application, so we'll have to provide the traditional synchronіzation
> > of overlapping writes as we do for buffered writes.
> >
> > This was broken historically for DAX, but got fixed for ext2 and XFS
> > as part of the iomap conversion. Fix up ext4 as well.
> >
> > Signed-off-by: Christoph Hellwig <[email protected]>
>
> Makes sense. You can add:
>
> Reviewed-by: Jan Kara <[email protected]>
>
> Ted, can you please pick it up? Thanks!

Thanks, applied to the ext4 tree.

- Ted
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm

2017-02-08 19:43:34

by Theodore Y. Ts'o

[permalink] [raw]
Subject: Re: [PATCH 2/2] dax: assert that i_rwsem is held exclusive for writes

On Wed, Jan 11, 2017 at 10:02:50AM +0100, Jan Kara wrote:
> On Tue 10-01-17 16:48:08, Christoph Hellwig wrote:
> > Make sure all callers follow the same locking protocol, given that DAX
> > transparantly replaced the normal buffered I/O path.
> >
> > Signed-off-by: Christoph Hellwig <[email protected]>
>
> Looks good. You can add:
>
> Reviewed-by: Jan Kara <[email protected]>
>
> Probably also for Ted since it depends on the ext4 fix...

Thanks, applied to the ext4 tree.

- Ted