2020-04-15 08:33:01

by Ira Weiny

[permalink] [raw]
Subject: [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations

From: Ira Weiny <[email protected]>

Enable the same per file DAX support to ext4 as was done for xfs. This series
builds and depends on the V7 series for xfs.[1]

To summarize:

1. There exists an in-kernel access mode flag S_DAX that is set when
file accesses go directly to persistent memory, bypassing the page
cache. Applications must call statx to discover the current S_DAX
state (STATX_ATTR_DAX).

2. There exists an advisory file inode flag FS_XFLAG_DAX that is
inherited from the parent directory FS_XFLAG_DAX inode flag at file
creation time. This advisory flag can be set or cleared at any
time, but doing so does not immediately affect the S_DAX state.

Unless overridden by mount options (see (3)), if FS_XFLAG_DAX is set
and the fs is on pmem then it will enable S_DAX at inode load time;
if FS_XFLAG_DAX is not set, it will not enable S_DAX.

3. There exists a dax= mount option.

"-o dax=never" means "never set S_DAX, ignore FS_XFLAG_DAX."

"-o dax=always" means "always set S_DAX (at least on pmem),
and ignore FS_XFLAG_DAX."

"-o dax" is an alias for "dax=always".

"-o dax=inode" means "follow FS_XFLAG_DAX" and is the default.

4. There exists an advisory directory inode flag FS_XFLAG_DAX that can
be set or cleared at any time. The flag state is inherited by any files or
subdirectories when they are created within that directory.

5. Programs that require a specific file access mode (DAX or not DAX)
can do one of the following:

(a) Create files in directories that the FS_XFLAG_DAX flag set as
needed; or

(b) Have the administrator set an override via mount option; or

(c) Set or clear the file's FS_XFLAG_DAX flag as needed. Programs
must then cause the kernel to evict the inode from memory. This
can be done by:

i> Closing the file and re-opening the file and using statx to
see if the fs has changed the S_DAX flag; and

ii> If the file still does not have the desired S_DAX access
mode, either unmount and remount the filesystem, or close
the file and use drop_caches.

6. It is expected that users who want to squeeze every last bit of performance
out of the particular rough and tumble bits of their storage will also be
exposed to the difficulties of what happens when the operating system can't
totally virtualize those hardware capabilities. DAX is such a feature.


[1] https://lore.kernel.org/lkml/[email protected]/

To: [email protected]
Cc: "Darrick J. Wong" <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: "Theodore Y. Ts'o" <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]


Ira Weiny (8):
fs/ext4: Narrow scope of DAX check in setflags
fs/ext4: Disallow verity if inode is DAX
fs/ext4: Disallow encryption if inode is DAX
fs/ext4: Introduce DAX inode flag
fs/ext4: Make DAX mount option a tri-state
fs/ext4: Update ext4_should_use_dax()
fs/ext4: Only change S_DAX on inode load
Documentation/dax: Update DAX enablement for ext4

Documentation/filesystems/dax.txt | 13 +------
fs/ext4/ext4.h | 16 ++++++---
fs/ext4/ialloc.c | 2 +-
fs/ext4/inode.c | 22 ++++++++----
fs/ext4/ioctl.c | 28 ++++++++++++---
fs/ext4/super.c | 57 +++++++++++++++++++++++--------
fs/ext4/verity.c | 5 ++-
7 files changed, 99 insertions(+), 44 deletions(-)

--
2.25.1


2020-04-15 08:33:49

by Ira Weiny

[permalink] [raw]
Subject: [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax()

From: Ira Weiny <[email protected]>

Change the logic of ext4_should_use_dax() to support using the inode dax
flag OR the overriding tri-state mount option.

While we are at it change the function to ext4_enable_dax() as this
better reflects the ask.

Signed-off-by: Ira Weiny <[email protected]>
---
fs/ext4/inode.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index fa0ff78dc033..e9d582e516bc 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4383,9 +4383,11 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
}

-static bool ext4_should_use_dax(struct inode *inode)
+static bool ext4_enable_dax(struct inode *inode)
{
- if (!test_opt(inode->i_sb, DAX))
+ unsigned int flags = EXT4_I(inode)->i_flags;
+
+ if (test_opt2(inode->i_sb, NODAX))
return false;
if (!S_ISREG(inode->i_mode))
return false;
@@ -4397,7 +4399,13 @@ static bool ext4_should_use_dax(struct inode *inode)
return false;
if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
return false;
- return true;
+ if (!bdev_dax_supported(inode->i_sb->s_bdev,
+ inode->i_sb->s_blocksize))
+ return false;
+ if (test_opt(inode->i_sb, DAX))
+ return true;
+
+ return (flags & EXT4_DAX_FL) == EXT4_DAX_FL;
}

void ext4_set_inode_flags(struct inode *inode)
@@ -4415,7 +4423,7 @@ void ext4_set_inode_flags(struct inode *inode)
new_fl |= S_NOATIME;
if (flags & EXT4_DIRSYNC_FL)
new_fl |= S_DIRSYNC;
- if (ext4_should_use_dax(inode))
+ if (ext4_enable_dax(inode))
new_fl |= S_DAX;
if (flags & EXT4_ENCRYPT_FL)
new_fl |= S_ENCRYPTED;
--
2.25.1

2020-04-15 08:34:48

by Ira Weiny

[permalink] [raw]
Subject: [PATCH RFC 5/8] fs/ext4: Make DAX mount option a tri-state

From: Ira Weiny <[email protected]>

We add 'always', 'never', and 'inode' (default). '-o dax' continue to
operate the same.

Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT_NODAX and set
it and EXT4_MOUNT_DAX appropriately.

We also force EXT4_MOUNT_NODAX if !CONFIG_FS_DAX.

https://lore.kernel.org/lkml/[email protected]/

Signed-off-by: Ira Weiny <[email protected]>
---
fs/ext4/ext4.h | 1 +
fs/ext4/super.c | 43 +++++++++++++++++++++++++++++++++++++++----
2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 434021fcec88..aadf33d5b37d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1174,6 +1174,7 @@ struct ext4_inode_info {
blocks */
#define EXT4_MOUNT2_HURD_COMPAT 0x00000004 /* Support HURD-castrated
file systems */
+#define EXT4_MOUNT2_NODAX 0x00000008 /* Do not allow Direct Access */

#define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM 0x00000008 /* User explicitly
specified journal checksum */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b14863058115..0175fbfeb2ea 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1510,6 +1510,7 @@ enum {
Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
+ Opt_dax_str,
Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
Opt_nowarn_on_error, Opt_mblk_io_submit,
Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
@@ -1575,6 +1576,7 @@ static const match_table_t tokens = {
{Opt_barrier, "barrier"},
{Opt_nobarrier, "nobarrier"},
{Opt_i_version, "i_version"},
+ {Opt_dax_str, "dax=%s"},
{Opt_dax, "dax"},
{Opt_stripe, "stripe=%u"},
{Opt_delalloc, "delalloc"},
@@ -1772,6 +1774,7 @@ static const struct mount_opts {
{Opt_min_batch_time, 0, MOPT_GTE0},
{Opt_inode_readahead_blks, 0, MOPT_GTE0},
{Opt_init_itable, 0, MOPT_GTE0},
+ {Opt_dax_str, 0, MOPT_STRING},
{Opt_dax, EXT4_MOUNT_DAX, MOPT_SET},
{Opt_stripe, 0, MOPT_GTE0},
{Opt_resuid, 0, MOPT_GTE0},
@@ -2081,13 +2084,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
}
sbi->s_jquota_fmt = m->mount_opt;
#endif
- } else if (token == Opt_dax) {
+ } else if (token == Opt_dax || token == Opt_dax_str) {
#ifdef CONFIG_FS_DAX
- ext4_msg(sb, KERN_WARNING,
- "DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
- sbi->s_mount_opt |= m->mount_opt;
+ char *tmp = match_strdup(&args[0]);
+
+ if (!tmp || !strcmp(tmp, "always")) {
+ ext4_msg(sb, KERN_WARNING,
+ "DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
+ sbi->s_mount_opt |= EXT4_MOUNT_DAX;
+ sbi->s_mount_opt2 &= ~EXT4_MOUNT2_NODAX;
+ } else if (!strcmp(tmp, "never")) {
+ sbi->s_mount_opt2 |= EXT4_MOUNT2_NODAX;
+ sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
+ } else if (!strcmp(tmp, "inode")) {
+ sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
+ sbi->s_mount_opt2 &= ~EXT4_MOUNT2_NODAX;
+ } else {
+ ext4_msg(sb, KERN_WARNING, "DAX invalid option.");
+ kfree(tmp);
+ return -1;
+ }
+
+ kfree(tmp);
#else
ext4_msg(sb, KERN_INFO, "dax option not supported");
+ sbi->s_mount_opt2 |= EXT4_MOUNT2_NODAX;
+ sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
return -1;
#endif
} else if (token == Opt_data_err_abort) {
@@ -2303,6 +2325,13 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
if (DUMMY_ENCRYPTION_ENABLED(sbi))
SEQ_OPTS_PUTS("test_dummy_encryption");

+ if (test_opt2(sb, NODAX))
+ SEQ_OPTS_PUTS("dax=never");
+ else if (test_opt(sb, DAX))
+ SEQ_OPTS_PUTS("dax=always");
+ else
+ SEQ_OPTS_PUTS("dax=inode");
+
ext4_show_quota_options(seq, sb);
return 0;
}
@@ -5424,6 +5453,12 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
sbi->s_mount_opt ^= EXT4_MOUNT_DAX;
}

+ if ((sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_NODAX) {
+ ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
+ "non-dax flag with busy inodes while remounting");
+ sbi->s_mount_opt2 ^= EXT4_MOUNT2_NODAX;
+ }
+
if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
ext4_abort(sb, "Abort forced by user");

--
2.25.1

2020-04-15 08:35:00

by Ira Weiny

[permalink] [raw]
Subject: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX

From: Ira Weiny <[email protected]>

Encryption and DAX are incompatible. Changing the DAX mode due to a
change in Encryption mode is wrong without a corresponding
address_space_operations update.

Make the 2 options mutually exclusive by returning an error if DAX was
set first.

Signed-off-by: Ira Weiny <[email protected]>
---
fs/ext4/super.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 0c7c4adb664e..b14863058115 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1325,7 +1325,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
if (inode->i_ino == EXT4_ROOT_INO)
return -EPERM;

- if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
+ if (WARN_ON_ONCE(IS_DAX(inode)))
return -EINVAL;

res = ext4_convert_inline_data(inode);
@@ -1349,10 +1349,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
ext4_clear_inode_state(inode,
EXT4_STATE_MAY_INLINE_DATA);
- /*
- * Update inode->i_flags - S_ENCRYPTED will be enabled,
- * S_DAX may be disabled
- */
ext4_set_inode_flags(inode);
}
return res;
@@ -1376,10 +1372,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
ctx, len, 0);
if (!res) {
ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
- /*
- * Update inode->i_flags - S_ENCRYPTED will be enabled,
- * S_DAX may be disabled
- */
ext4_set_inode_flags(inode);
res = ext4_mark_inode_dirty(handle, inode);
if (res)
--
2.25.1

2020-04-15 08:36:02

by Ira Weiny

[permalink] [raw]
Subject: [PATCH RFC 8/8] Documentation/dax: Update DAX enablement for ext4

From: Ira Weiny <[email protected]>

Update the document to reflect ext4 and xfs now behave the same.

Signed-off-by: Ira Weiny <[email protected]>
---
Documentation/filesystems/dax.txt | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/Documentation/filesystems/dax.txt b/Documentation/filesystems/dax.txt
index af14c1b330a9..5f28b22d2815 100644
--- a/Documentation/filesystems/dax.txt
+++ b/Documentation/filesystems/dax.txt
@@ -22,18 +22,7 @@ on it as usual. The DAX code currently only supports files with a block
size equal to your kernel's PAGE_SIZE, so you may need to specify a block
size when creating the file system.

-Currently 2 filesystems support DAX, ext4 and xfs. Enabling DAX on them is
-different at this time.
-
-Enabling DAX on ext4
---------------------
-
-When mounting the filesystem, use the "-o dax" option on the command line or
-add 'dax' to the options in /etc/fstab.
-
-
-Enabling DAX on xfs
--------------------
+Currently 2 filesystems support DAX, ext4 and xfs.

Summary
-------
--
2.25.1

2020-04-15 23:09:14

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX

On Mon 13-04-20 21:00:25, [email protected] wrote:
> From: Ira Weiny <[email protected]>
>
> Encryption and DAX are incompatible. Changing the DAX mode due to a
> change in Encryption mode is wrong without a corresponding
> address_space_operations update.
>
> Make the 2 options mutually exclusive by returning an error if DAX was
> set first.
>
> Signed-off-by: Ira Weiny <[email protected]>
> ---
> fs/ext4/super.c | 10 +---------
> 1 file changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 0c7c4adb664e..b14863058115 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1325,7 +1325,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> if (inode->i_ino == EXT4_ROOT_INO)
> return -EPERM;
>
> - if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> + if (WARN_ON_ONCE(IS_DAX(inode)))

Also here I don't think WARN_ON_ONCE() is warranted once we allow per-inode
setting of DAX. It will then become a regular error condition...

Honza

> return -EINVAL;
>
> res = ext4_convert_inline_data(inode);
> @@ -1349,10 +1349,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> ext4_clear_inode_state(inode,
> EXT4_STATE_MAY_INLINE_DATA);
> - /*
> - * Update inode->i_flags - S_ENCRYPTED will be enabled,
> - * S_DAX may be disabled
> - */
> ext4_set_inode_flags(inode);
> }
> return res;
> @@ -1376,10 +1372,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> ctx, len, 0);
> if (!res) {
> ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> - /*
> - * Update inode->i_flags - S_ENCRYPTED will be enabled,
> - * S_DAX may be disabled
> - */
> ext4_set_inode_flags(inode);
> res = ext4_mark_inode_dirty(handle, inode);
> if (res)
> --
> 2.25.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2020-04-15 23:54:31

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH RFC 5/8] fs/ext4: Make DAX mount option a tri-state

On Mon 13-04-20 21:00:27, [email protected] wrote:
> From: Ira Weiny <[email protected]>
>
> We add 'always', 'never', and 'inode' (default). '-o dax' continue to
> operate the same.
>
> Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT_NODAX and set
> it and EXT4_MOUNT_DAX appropriately.
>
> We also force EXT4_MOUNT_NODAX if !CONFIG_FS_DAX.
>
> https://lore.kernel.org/lkml/[email protected]/
>
> Signed-off-by: Ira Weiny <[email protected]>

...

> @@ -2303,6 +2325,13 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
> if (DUMMY_ENCRYPTION_ENABLED(sbi))
> SEQ_OPTS_PUTS("test_dummy_encryption");
>
> + if (test_opt2(sb, NODAX))
> + SEQ_OPTS_PUTS("dax=never");
> + else if (test_opt(sb, DAX))
> + SEQ_OPTS_PUTS("dax=always");
> + else
> + SEQ_OPTS_PUTS("dax=inode");
> +

We try to show only mount options that were explicitely set by the user, or
that are different from defaults - e.g., see how 'data=' mount option
printing is handled.

> ext4_show_quota_options(seq, sb);
> return 0;
> }
> @@ -5424,6 +5453,12 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
> sbi->s_mount_opt ^= EXT4_MOUNT_DAX;
> }
>
> + if ((sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_NODAX) {
> + ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
> + "non-dax flag with busy inodes while remounting");
> + sbi->s_mount_opt2 ^= EXT4_MOUNT2_NODAX;
> + }
> +
> if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
> ext4_abort(sb, "Abort forced by user");

I'd just merge this with the check whether EXT4_MOUNT_DAX changed.

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR

2020-04-16 00:06:25

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax()

On Mon 13-04-20 21:00:28, [email protected] wrote:
> From: Ira Weiny <[email protected]>
>
> Change the logic of ext4_should_use_dax() to support using the inode dax
> flag OR the overriding tri-state mount option.
>
> While we are at it change the function to ext4_enable_dax() as this
> better reflects the ask.

I disagree with the renaming. ext4_enable_dax() suggests it enables
something. It does not. I'd either leave ext4_should_use_dax() or maybe
change it to ext4_should_enable_dax() if you really like the "enable" word
:).

>
> Signed-off-by: Ira Weiny <[email protected]>
> ---
> fs/ext4/inode.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index fa0ff78dc033..e9d582e516bc 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4383,9 +4383,11 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
> !ext4_test_inode_state(inode, EXT4_STATE_XATTR));
> }
>
> -static bool ext4_should_use_dax(struct inode *inode)
> +static bool ext4_enable_dax(struct inode *inode)
> {
> - if (!test_opt(inode->i_sb, DAX))
> + unsigned int flags = EXT4_I(inode)->i_flags;
> +
> + if (test_opt2(inode->i_sb, NODAX))
> return false;
> if (!S_ISREG(inode->i_mode))
> return false;
> @@ -4397,7 +4399,13 @@ static bool ext4_should_use_dax(struct inode *inode)
> return false;
> if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
> return false;
> - return true;
> + if (!bdev_dax_supported(inode->i_sb->s_bdev,
> + inode->i_sb->s_blocksize))
> + return false;
> + if (test_opt(inode->i_sb, DAX))
> + return true;
> +
> + return (flags & EXT4_DAX_FL) == EXT4_DAX_FL;

flags & EXT4_DAX_FL is enough here, isn't it?

Honza

> }
>
> void ext4_set_inode_flags(struct inode *inode)
> @@ -4415,7 +4423,7 @@ void ext4_set_inode_flags(struct inode *inode)
> new_fl |= S_NOATIME;
> if (flags & EXT4_DIRSYNC_FL)
> new_fl |= S_DIRSYNC;
> - if (ext4_should_use_dax(inode))
> + if (ext4_enable_dax(inode))
> new_fl |= S_DAX;
> if (flags & EXT4_ENCRYPT_FL)
> new_fl |= S_ENCRYPTED;
> --
> 2.25.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2020-04-16 00:16:44

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX

On Mon, Apr 13, 2020 at 09:00:25PM -0700, [email protected] wrote:
> From: Ira Weiny <[email protected]>
>
> Encryption and DAX are incompatible. Changing the DAX mode due to a
> change in Encryption mode is wrong without a corresponding
> address_space_operations update.
>
> Make the 2 options mutually exclusive by returning an error if DAX was
> set first.
>
> Signed-off-by: Ira Weiny <[email protected]>

The encryption flag is inherited from the containing directory, and
directories can't have the DAX flag set, so anything we do in
ext4_set_context() will be safety belt / sanity checking in nature.

But we *do* need to figure out what we do with mount -o dax=always
when the file system might have encrypted files. My previous comments
about the verity flag and dax flag applies here.

Also note that encrypted files are read/write so we must never allow
the combination of ENCRPYT_FL and DAX_FL. So that may be something
where we should teach __ext4_iget() to check for this, and declare the
file system as corrupted if it sees this combination. (For VERITY_FL
&& DAX_FL that is a combo that we might want to support in the future,
so that's probably a case where arguably, we should just ignore the
DAX_FL for now.)

- Ted

2020-04-16 00:28:57

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX

On Wed, Apr 15, 2020 at 9:03 AM Theodore Y. Ts'o <[email protected]> wrote:
>
> On Mon, Apr 13, 2020 at 09:00:25PM -0700, [email protected] wrote:
> > From: Ira Weiny <[email protected]>
> >
> > Encryption and DAX are incompatible. Changing the DAX mode due to a
> > change in Encryption mode is wrong without a corresponding
> > address_space_operations update.
> >
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> >
> > Signed-off-by: Ira Weiny <[email protected]>
>
> The encryption flag is inherited from the containing directory, and
> directories can't have the DAX flag set, so anything we do in
> ext4_set_context() will be safety belt / sanity checking in nature.
>
> But we *do* need to figure out what we do with mount -o dax=always
> when the file system might have encrypted files. My previous comments
> about the verity flag and dax flag applies here.
>
> Also note that encrypted files are read/write so we must never allow
> the combination of ENCRPYT_FL and DAX_FL. So that may be something
> where we should teach __ext4_iget() to check for this, and declare the
> file system as corrupted if it sees this combination. (For VERITY_FL
> && DAX_FL that is a combo that we might want to support in the future,
> so that's probably a case where arguably, we should just ignore the
> DAX_FL for now.)

We also have a pending consideration for what MKTME (inline memory
encryption with programmable hardware keys) means for file-encryption
+ dax. Certainly kernel based software encryption is incompatible with
dax, but one of the hallway track discussions I wanted to have at LSF
is whether fscrypt is the right interface for managing inline memory
encryption. For now, disallowing ENCRPYT_FL + DAX_FL seems ok if
ENCRPYT_FL always means software encryption, but this is something to
circle back to once we get MKTME implemented for volume encryption and
start to look at finer grained (per-directory key) encryption.

2020-04-16 01:04:41

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX

On Wed, Apr 15, 2020 at 12:03:07PM -0400, Theodore Y. Ts'o wrote:
> On Mon, Apr 13, 2020 at 09:00:25PM -0700, [email protected] wrote:
> > From: Ira Weiny <[email protected]>
> >
> > Encryption and DAX are incompatible. Changing the DAX mode due to a
> > change in Encryption mode is wrong without a corresponding
> > address_space_operations update.
> >
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> >
> > Signed-off-by: Ira Weiny <[email protected]>
>
> The encryption flag is inherited from the containing directory, and
> directories can't have the DAX flag set,

But they can have FS_XFLAG_DAX set.

> so anything we do in
> ext4_set_context() will be safety belt / sanity checking in nature.
>
> But we *do* need to figure out what we do with mount -o dax=always
> when the file system might have encrypted files. My previous comments
> about the verity flag and dax flag applies here.

:-( agreed.

FWIW without these patches an inode which has encrypt or verity set is already
turning off DAX... So we already have a '-o dax' flag which is not "always".

:-(

Unfortunately the 'always' designation kind of breaks semantically but it is
equal to the current mount option.

>
> Also note that encrypted files are read/write so we must never allow
> the combination of ENCRPYT_FL and DAX_FL. So that may be something
> where we should teach __ext4_iget() to check for this, and declare the
> file system as corrupted if it sees this combination.

ok...

> (For VERITY_FL
> && DAX_FL that is a combo that we might want to support in the future,
> so that's probably a case where arguably, we should just ignore the
> DAX_FL for now.)

ok...

Ira

2020-04-16 01:05:23

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX

On Wed, Apr 15, 2020 at 02:02:41PM +0200, Jan Kara wrote:
> On Mon 13-04-20 21:00:25, [email protected] wrote:
> > From: Ira Weiny <[email protected]>
> >
> > Encryption and DAX are incompatible. Changing the DAX mode due to a
> > change in Encryption mode is wrong without a corresponding
> > address_space_operations update.
> >
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> >
> > Signed-off-by: Ira Weiny <[email protected]>
> > ---
> > fs/ext4/super.c | 10 +---------
> > 1 file changed, 1 insertion(+), 9 deletions(-)
> >
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 0c7c4adb664e..b14863058115 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1325,7 +1325,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> > if (inode->i_ino == EXT4_ROOT_INO)
> > return -EPERM;
> >
> > - if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> > + if (WARN_ON_ONCE(IS_DAX(inode)))
>
> Also here I don't think WARN_ON_ONCE() is warranted once we allow per-inode
> setting of DAX. It will then become a regular error condition...

Removed.
Ira

>
> Honza
>
> > return -EINVAL;
> >
> > res = ext4_convert_inline_data(inode);
> > @@ -1349,10 +1349,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> > ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> > ext4_clear_inode_state(inode,
> > EXT4_STATE_MAY_INLINE_DATA);
> > - /*
> > - * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > - * S_DAX may be disabled
> > - */
> > ext4_set_inode_flags(inode);
> > }
> > return res;
> > @@ -1376,10 +1372,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> > ctx, len, 0);
> > if (!res) {
> > ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> > - /*
> > - * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > - * S_DAX may be disabled
> > - */
> > ext4_set_inode_flags(inode);
> > res = ext4_mark_inode_dirty(handle, inode);
> > if (res)
> > --
> > 2.25.1
> >
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR

2020-04-17 17:20:58

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax()

On Wed, Apr 15, 2020 at 03:58:34PM +0200, Jan Kara wrote:
> On Mon 13-04-20 21:00:28, [email protected] wrote:
> > From: Ira Weiny <[email protected]>
> >
> > Change the logic of ext4_should_use_dax() to support using the inode dax
> > flag OR the overriding tri-state mount option.
> >
> > While we are at it change the function to ext4_enable_dax() as this
> > better reflects the ask.
>
> I disagree with the renaming. ext4_enable_dax() suggests it enables
> something. It does not. I'd either leave ext4_should_use_dax() or maybe
> change it to ext4_should_enable_dax() if you really like the "enable" word
> :).

Ok that does sound better. And I've changed it in the xfs series as well.

but I kept Darrick's review on that patch...

>
> >
> > Signed-off-by: Ira Weiny <[email protected]>
> > ---
> > fs/ext4/inode.c | 16 ++++++++++++----
> > 1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index fa0ff78dc033..e9d582e516bc 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -4383,9 +4383,11 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
> > !ext4_test_inode_state(inode, EXT4_STATE_XATTR));
> > }
> >
> > -static bool ext4_should_use_dax(struct inode *inode)
> > +static bool ext4_enable_dax(struct inode *inode)
> > {
> > - if (!test_opt(inode->i_sb, DAX))
> > + unsigned int flags = EXT4_I(inode)->i_flags;
> > +
> > + if (test_opt2(inode->i_sb, NODAX))
> > return false;
> > if (!S_ISREG(inode->i_mode))
> > return false;
> > @@ -4397,7 +4399,13 @@ static bool ext4_should_use_dax(struct inode *inode)
> > return false;
> > if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
> > return false;
> > - return true;
> > + if (!bdev_dax_supported(inode->i_sb->s_bdev,
> > + inode->i_sb->s_blocksize))
> > + return false;
> > + if (test_opt(inode->i_sb, DAX))
> > + return true;
> > +
> > + return (flags & EXT4_DAX_FL) == EXT4_DAX_FL;
>
> flags & EXT4_DAX_FL is enough here, isn't it?

Yes, changed.

Ira

>
> Honza
>
> > }
> >
> > void ext4_set_inode_flags(struct inode *inode)
> > @@ -4415,7 +4423,7 @@ void ext4_set_inode_flags(struct inode *inode)
> > new_fl |= S_NOATIME;
> > if (flags & EXT4_DIRSYNC_FL)
> > new_fl |= S_DIRSYNC;
> > - if (ext4_should_use_dax(inode))
> > + if (ext4_enable_dax(inode))
> > new_fl |= S_DAX;
> > if (flags & EXT4_ENCRYPT_FL)
> > new_fl |= S_ENCRYPTED;
> > --
> > 2.25.1
> >
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR

2020-04-21 18:42:39

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX

On Wed, Apr 15, 2020 at 12:54:34PM -0700, 'Ira Weiny' wrote:
> On Wed, Apr 15, 2020 at 12:03:07PM -0400, Theodore Y. Ts'o wrote:
> > On Mon, Apr 13, 2020 at 09:00:25PM -0700, [email protected] wrote:

[snip]

> >
> > Also note that encrypted files are read/write so we must never allow
> > the combination of ENCRPYT_FL and DAX_FL. So that may be something
> > where we should teach __ext4_iget() to check for this, and declare the
> > file system as corrupted if it sees this combination.
>
> ok...

After thinking about this...

Do we really want to declare the FS corrupted?

If so, I think we need to return errors when such a configuration is attempted.
If in the future we have an encrypted mode which can co-exist with DAX (such as
Dan mentioned) we can change this.

FWIW I think we should return errors when such a configuration is attempted but
_not_ declare the FS corrupted. That allows users to enable this configuration
later if we can figure out how to support it.

>
> > (For VERITY_FL
> > && DAX_FL that is a combo that we might want to support in the future,
> > so that's probably a case where arguably, we should just ignore the
> > DAX_FL for now.)
>
> ok...

I think this should work the same.

It looks like VERITY_FL and ENCRYPT_FL are _not_ user modifiable? Is that
correct?

You said that ENCRPYT_FL is set from the parent directory? But I'm not seeing
where that occurs?

Similarly I don't see where VERITY_FL is being set either? :-/

I think to make this work correctly we should restrict setting those flags if
DAX_FL is set and vice versa. But I'm not finding where to do that. :-/

Ira

>
> Ira
>

2020-04-21 18:57:48

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX

On Tue, Apr 21, 2020 at 11:41:43AM -0700, Ira Weiny wrote:
> On Wed, Apr 15, 2020 at 12:54:34PM -0700, 'Ira Weiny' wrote:
> > On Wed, Apr 15, 2020 at 12:03:07PM -0400, Theodore Y. Ts'o wrote:
> > > On Mon, Apr 13, 2020 at 09:00:25PM -0700, [email protected] wrote:
>
> [snip]
>
> > >
> > > Also note that encrypted files are read/write so we must never allow
> > > the combination of ENCRPYT_FL and DAX_FL. So that may be something
> > > where we should teach __ext4_iget() to check for this, and declare the
> > > file system as corrupted if it sees this combination.
> >
> > ok...
>
> After thinking about this...
>
> Do we really want to declare the FS corrupted?

Seeing as we're defining the dax inode flag to be advisory (since its
value is ignored if the fs isn't on pmem, or the administrator overrode
with dax=never mount option), I don't see why that's filesystem
corruption.

I can see a case for returning errors if you're trying to change ENCRYPT
or VERITY on a file that's has S_DAX set. We can't encrypt or set
verity data for a file that could be changed behind our backs, so the
kernel cannot satisfy /that/ request.

As for changing FS_DAX_FL on an encrypted/verity'd file, the API says
that it might not have an immedate effect on S_DAX and that programs
have to check S_DAX after changing FS_DAX_FL. It'll never result in
S_DAX being set, but the current spec never guarantees that. ;)

(If FS_DAX_FL were *mandatory* then yes that would be corruption.)

--D

> If so, I think we need to return errors when such a configuration is attempted.
> If in the future we have an encrypted mode which can co-exist with DAX (such as
> Dan mentioned) we can change this.
>
> FWIW I think we should return errors when such a configuration is attempted but
> _not_ declare the FS corrupted. That allows users to enable this configuration
> later if we can figure out how to support it.
>
> >
> > > (For VERITY_FL
> > > && DAX_FL that is a combo that we might want to support in the future,
> > > so that's probably a case where arguably, we should just ignore the
> > > DAX_FL for now.)
> >
> > ok...
>
> I think this should work the same.
>
> It looks like VERITY_FL and ENCRYPT_FL are _not_ user modifiable? Is that
> correct?
>
> You said that ENCRPYT_FL is set from the parent directory? But I'm not seeing
> where that occurs?
>
> Similarly I don't see where VERITY_FL is being set either? :-/
>
> I think to make this work correctly we should restrict setting those flags if
> DAX_FL is set and vice versa. But I'm not finding where to do that. :-/
>
> Ira
>
> >
> > Ira
> >