2018-09-11 16:05:30

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH 0/2] fix sync to flush processor cache for ext4 DAX files

This patchset fixes an issue that sync syscall to an existing DAX file
does not flush processor cache.

Patch 1/2 adds a check to skip the journal inode. It's a bit awkward,
but I could not find a beter way to get the journal inode.

Patch 2/2 fixes the issue by moving up ext4_set_inode_flags() before
ext4_set_aops() in ext4_iget(). This assumes updated i_flags is harmless
in the error cases after the moved-up ext4_set_inode_flags(). Please
review.

---
Toshi Kani (2):
1/2 ext4, dax: update dax check to skip journal inode
2/2 ext4, dax: set ext4_dax_aops for dax files

---
fs/ext4/ext4_jbd2.h | 8 ++++++++
fs/ext4/inode.c | 5 ++++-
2 files changed, 12 insertions(+), 1 deletion(-)


2018-09-11 16:04:04

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH 2/2] ext4, dax: set ext4_dax_aops for dax files

Sync syscall to an existing DAX file needs to flush processor cache,
but it does not currently. This is because 'ext4_da_aops' is set to
address_space_operations of existing DAX files, instead of 'ext4_dax_aops',
since S_DAX flag is set after ext4_set_aops() in the open path.

New file
--------
lookup_open
ext4_create
__ext4_new_inode
ext4_set_inode_flags // Set S_DAX flag
ext4_set_aops // Set aops to ext4_dax_aops

Existing file
-------------
lookup_open
ext4_lookup
ext4_iget
ext4_set_aops // Set aops to ext4_da_aops
ext4_set_inode_flags // Set S_DAX flag

Change ext4_iget() to call ext4_set_inode_flags() before ext4_set_aops().

Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
Signed-off-by: Toshi Kani <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Andreas Dilger <[email protected]>
---
fs/ext4/inode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 775cd9b4af55..93cbbb859c40 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4998,6 +4998,8 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
if (ret)
goto bad_inode;

+ ext4_set_inode_flags(inode);
+
if (S_ISREG(inode->i_mode)) {
inode->i_op = &ext4_file_inode_operations;
inode->i_fop = &ext4_file_operations;
@@ -5043,7 +5045,6 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
goto bad_inode;
}
brelse(iloc.bh);
- ext4_set_inode_flags(inode);

unlock_new_inode(inode);
return inode;

2018-09-11 16:05:59

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH 1/2] ext4, dax: update dax check to skip journal inode

Ext4 mount path calls ext4_iget() to obtain the journal inode. This
inode does not support DAX, and 'ext4_da_aops' needs to be set. It
currently works for the DAX mount case because ext4_iget() always set
'ext4_da_aops' to any regular files.

ext4_fill_super
ext4_load_journal
ext4_get_journal_inode
ext4_iget

In preparation to fix ext4_iget() to set 'ext4_dax_aops' for DAX files,
update ext4_should_use_dax() to return false for the journal inode.

Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
Signed-off-by: Toshi Kani <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Andreas Dilger <[email protected]>
---
fs/ext4/ext4_jbd2.h | 8 ++++++++
fs/ext4/inode.c | 2 ++
2 files changed, 10 insertions(+)

diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 15b6dd733780..9847dc980a0d 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -437,6 +437,14 @@ static inline int ext4_should_writeback_data(struct inode *inode)
return ext4_inode_journal_mode(inode) & EXT4_INODE_WRITEBACK_DATA_MODE;
}

+static inline int ext4_is_journal_inode(struct inode *inode)
+{
+ struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+ unsigned int journal_inum = le32_to_cpu(sbi->s_es->s_journal_inum);
+
+ return journal_inum && (journal_inum == inode->i_ino);
+}
+
/*
* This function controls whether or not we should try to go down the
* dioread_nolock code paths, which makes it safe to avoid taking
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d0dd585add6a..775cd9b4af55 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4686,6 +4686,8 @@ static bool ext4_should_use_dax(struct inode *inode)
return false;
if (ext4_should_journal_data(inode))
return false;
+ if (ext4_is_journal_inode(inode))
+ return false;
if (ext4_has_inline_data(inode))
return false;
if (ext4_encrypted_inode(inode))

2018-09-11 18:00:27

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4, dax: update dax check to skip journal inode

On Tue, Sep 11, 2018 at 8:42 AM, Toshi Kani <[email protected]> wrote:
> Ext4 mount path calls ext4_iget() to obtain the journal inode. This
> inode does not support DAX, and 'ext4_da_aops' needs to be set. It
> currently works for the DAX mount case because ext4_iget() always set
> 'ext4_da_aops' to any regular files.
>
> ext4_fill_super
> ext4_load_journal
> ext4_get_journal_inode
> ext4_iget
>
> In preparation to fix ext4_iget() to set 'ext4_dax_aops' for DAX files,
> update ext4_should_use_dax() to return false for the journal inode.
>
> Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086

Just a minor format recommendation:

Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops")
Cc: <[email protected]>

> Signed-off-by: Toshi Kani <[email protected]>
> Cc: Jan Kara <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: "Theodore Ts'o" <[email protected]>
> Cc: Andreas Dilger <[email protected]>
> ---
> fs/ext4/ext4_jbd2.h | 8 ++++++++
> fs/ext4/inode.c | 2 ++
> 2 files changed, 10 insertions(+)
>
> diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
> index 15b6dd733780..9847dc980a0d 100644
> --- a/fs/ext4/ext4_jbd2.h
> +++ b/fs/ext4/ext4_jbd2.h
> @@ -437,6 +437,14 @@ static inline int ext4_should_writeback_data(struct inode *inode)
> return ext4_inode_journal_mode(inode) & EXT4_INODE_WRITEBACK_DATA_MODE;
> }
>
> +static inline int ext4_is_journal_inode(struct inode *inode)
> +{
> + struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> + unsigned int journal_inum = le32_to_cpu(sbi->s_es->s_journal_inum);
> +
> + return journal_inum && (journal_inum == inode->i_ino);
> +}
> +
> /*
> * This function controls whether or not we should try to go down the
> * dioread_nolock code paths, which makes it safe to avoid taking
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d0dd585add6a..775cd9b4af55 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4686,6 +4686,8 @@ static bool ext4_should_use_dax(struct inode *inode)
> return false;
> if (ext4_should_journal_data(inode))
> return false;
> + if (ext4_is_journal_inode(inode))
> + return false;

This looks good to me.

Reviewed-by: Dan Williams <[email protected]>

2018-09-11 18:12:38

by Kani, Toshimitsu

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4, dax: update dax check to skip journal inode

On Tue, 2018-09-11 at 10:59 -0700, Dan Williams wrote:
> On Tue, Sep 11, 2018 at 8:42 AM, Toshi Kani <[email protected]> wrote:
> > Ext4 mount path calls ext4_iget() to obtain the journal inode. This
> > inode does not support DAX, and 'ext4_da_aops' needs to be set. It
> > currently works for the DAX mount case because ext4_iget() always set
> > 'ext4_da_aops' to any regular files.
> >
> > ext4_fill_super
> > ext4_load_journal
> > ext4_get_journal_inode
> > ext4_iget
> >
> > In preparation to fix ext4_iget() to set 'ext4_dax_aops' for DAX files,
> > update ext4_should_use_dax() to return false for the journal inode.
> >
> > Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
>
> Just a minor format recommendation:
>
> Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops")
> Cc: <[email protected]>

Will do.

>
> This looks good to me.
>
> Reviewed-by: Dan Williams <[email protected]>

Thanks!
-Toshi

2018-09-11 18:16:14

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH 2/2] ext4, dax: set ext4_dax_aops for dax files

On Tue, Sep 11, 2018 at 8:42 AM, Toshi Kani <[email protected]> wrote:
> Sync syscall to an existing DAX file needs to flush processor cache,
> but it does not currently. This is because 'ext4_da_aops' is set to
> address_space_operations of existing DAX files, instead of 'ext4_dax_aops',
> since S_DAX flag is set after ext4_set_aops() in the open path.
>
> New file
> --------
> lookup_open
> ext4_create
> __ext4_new_inode
> ext4_set_inode_flags // Set S_DAX flag
> ext4_set_aops // Set aops to ext4_dax_aops
>
> Existing file
> -------------
> lookup_open
> ext4_lookup
> ext4_iget
> ext4_set_aops // Set aops to ext4_da_aops
> ext4_set_inode_flags // Set S_DAX flag
>
> Change ext4_iget() to call ext4_set_inode_flags() before ext4_set_aops().
>
> Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086

Same format nit:

Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops")
Cc: <[email protected]>


> Signed-off-by: Toshi Kani <[email protected]>
> Cc: Jan Kara <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: "Theodore Ts'o" <[email protected]>
> Cc: Andreas Dilger <[email protected]>
> ---
> fs/ext4/inode.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 775cd9b4af55..93cbbb859c40 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4998,6 +4998,8 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
> if (ret)
> goto bad_inode;
>
> + ext4_set_inode_flags(inode);
> +

Hmm, does this have unintended behavior changes?

I notice that there are some checks for flags "IS_APPEND(inode) ||
IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
didn't look too much deeper at whether those checks are bogus, but it
would seem safer to do something like this for a lower risk fix.

Thoughts?

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d0dd585add6a..1e9ab445c777 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4999,7 +4999,6 @@ struct inode *ext4_iget(struct super_block *sb,
unsigned long ino)
if (S_ISREG(inode->i_mode)) {
inode->i_op = &ext4_file_inode_operations;
inode->i_fop = &ext4_file_operations;
- ext4_set_aops(inode);
} else if (S_ISDIR(inode->i_mode)) {
inode->i_op = &ext4_dir_inode_operations;
inode->i_fop = &ext4_dir_operations;
@@ -5042,6 +5041,12 @@ struct inode *ext4_iget(struct super_block *sb,
unsigned long ino)
}
brelse(iloc.bh);
ext4_set_inode_flags(inode);
+ /*
+ * Now that we have determined whether DAX is enabled, set the
+ * proper address spaces operations
+ */
+ if (S_ISREG(inode->i_mode))
+ ext4_set_aops(inode);

unlock_new_inode(inode);
return inode;

2018-09-11 18:42:29

by Kani, Toshimitsu

[permalink] [raw]
Subject: Re: [PATCH 2/2] ext4, dax: set ext4_dax_aops for dax files

On Tue, 2018-09-11 at 11:15 -0700, Dan Williams wrote:
> On Tue, Sep 11, 2018 at 8:42 AM, Toshi Kani <[email protected]> wrote:
> > Sync syscall to an existing DAX file needs to flush processor cache,
> > but it does not currently. This is because 'ext4_da_aops' is set to
> > address_space_operations of existing DAX files, instead of 'ext4_dax_aops',
> > since S_DAX flag is set after ext4_set_aops() in the open path.
> >
> > New file
> > --------
> > lookup_open
> > ext4_create
> > __ext4_new_inode
> > ext4_set_inode_flags // Set S_DAX flag
> > ext4_set_aops // Set aops to ext4_dax_aops
> >
> > Existing file
> > -------------
> > lookup_open
> > ext4_lookup
> > ext4_iget
> > ext4_set_aops // Set aops to ext4_da_aops
> > ext4_set_inode_flags // Set S_DAX flag
> >
> > Change ext4_iget() to call ext4_set_inode_flags() before ext4_set_aops().
> >
> > Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
>
> Same format nit:
>
> Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops")
> Cc: <[email protected]>

Will do.

> > Signed-off-by: Toshi Kani <[email protected]>
> > Cc: Jan Kara <[email protected]>
> > Cc: Dan Williams <[email protected]>
> > Cc: "Theodore Ts'o" <[email protected]>
> > Cc: Andreas Dilger <[email protected]>
> > ---
> > fs/ext4/inode.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 775cd9b4af55..93cbbb859c40 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -4998,6 +4998,8 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
> > if (ret)
> > goto bad_inode;
> >
> > + ext4_set_inode_flags(inode);
> > +
>
> Hmm, does this have unintended behavior changes?
>
> I notice that there are some checks for flags "IS_APPEND(inode) ||
> IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
> didn't look too much deeper at whether those checks are bogus, but it
> would seem safer to do something like this for a lower risk fix.
>
> Thoughts?

Good catch! Agreed.

Thanks!
-Toshi

2018-09-12 09:24:55

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4, dax: update dax check to skip journal inode

On Tue 11-09-18 09:42:45, Toshi Kani wrote:
> Ext4 mount path calls ext4_iget() to obtain the journal inode. This
> inode does not support DAX, and 'ext4_da_aops' needs to be set. It
> currently works for the DAX mount case because ext4_iget() always set
> 'ext4_da_aops' to any regular files.
>
> ext4_fill_super
> ext4_load_journal
> ext4_get_journal_inode
> ext4_iget
>
> In preparation to fix ext4_iget() to set 'ext4_dax_aops' for DAX files,
> update ext4_should_use_dax() to return false for the journal inode.
>
> Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
> Signed-off-by: Toshi Kani <[email protected]>
> Cc: Jan Kara <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: "Theodore Ts'o" <[email protected]>
> Cc: Andreas Dilger <[email protected]>

The journal inode is similar to any other 'system' inode we have in ext4.
We don't really expect any of the address space operations to be called for
it because we don't use page cache with these inodes. Very similar
situation is with e.g. quota files. So to me it seems this patch is not
really necessary. Or did you observe any bad effects without this patch?

That being said I agree that it would be a good cleanup to define something
like ext4_is_system_inode() and disable DAX for these inodes just because
they are special. But I don't see a need for this as a part of your fix.

Honza

> ---
> fs/ext4/ext4_jbd2.h | 8 ++++++++
> fs/ext4/inode.c | 2 ++
> 2 files changed, 10 insertions(+)
>
> diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
> index 15b6dd733780..9847dc980a0d 100644
> --- a/fs/ext4/ext4_jbd2.h
> +++ b/fs/ext4/ext4_jbd2.h
> @@ -437,6 +437,14 @@ static inline int ext4_should_writeback_data(struct inode *inode)
> return ext4_inode_journal_mode(inode) & EXT4_INODE_WRITEBACK_DATA_MODE;
> }
>
> +static inline int ext4_is_journal_inode(struct inode *inode)
> +{
> + struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> + unsigned int journal_inum = le32_to_cpu(sbi->s_es->s_journal_inum);
> +
> + return journal_inum && (journal_inum == inode->i_ino);
> +}
> +
> /*
> * This function controls whether or not we should try to go down the
> * dioread_nolock code paths, which makes it safe to avoid taking
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d0dd585add6a..775cd9b4af55 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4686,6 +4686,8 @@ static bool ext4_should_use_dax(struct inode *inode)
> return false;
> if (ext4_should_journal_data(inode))
> return false;
> + if (ext4_is_journal_inode(inode))
> + return false;
> if (ext4_has_inline_data(inode))
> return false;
> if (ext4_encrypted_inode(inode))
--
Jan Kara <[email protected]>
SUSE Labs, CR

2018-09-12 09:32:36

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 2/2] ext4, dax: set ext4_dax_aops for dax files

On Tue 11-09-18 11:15:18, Dan Williams wrote:
> On Tue, Sep 11, 2018 at 8:42 AM, Toshi Kani <[email protected]> wrote:
> > Sync syscall to an existing DAX file needs to flush processor cache,
> > but it does not currently. This is because 'ext4_da_aops' is set to
> > address_space_operations of existing DAX files, instead of 'ext4_dax_aops',
> > since S_DAX flag is set after ext4_set_aops() in the open path.
> >
> > New file
> > --------
> > lookup_open
> > ext4_create
> > __ext4_new_inode
> > ext4_set_inode_flags // Set S_DAX flag
> > ext4_set_aops // Set aops to ext4_dax_aops
> >
> > Existing file
> > -------------
> > lookup_open
> > ext4_lookup
> > ext4_iget
> > ext4_set_aops // Set aops to ext4_da_aops
> > ext4_set_inode_flags // Set S_DAX flag
> >
> > Change ext4_iget() to call ext4_set_inode_flags() before ext4_set_aops().
> >
> > Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
>
> Same format nit:
>
> Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops")
> Cc: <[email protected]>
>
>
> > Signed-off-by: Toshi Kani <[email protected]>
> > Cc: Jan Kara <[email protected]>
> > Cc: Dan Williams <[email protected]>
> > Cc: "Theodore Ts'o" <[email protected]>
> > Cc: Andreas Dilger <[email protected]>
> > ---
> > fs/ext4/inode.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 775cd9b4af55..93cbbb859c40 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -4998,6 +4998,8 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
> > if (ret)
> > goto bad_inode;
> >
> > + ext4_set_inode_flags(inode);
> > +
>
> Hmm, does this have unintended behavior changes?
>
> I notice that there are some checks for flags "IS_APPEND(inode) ||
> IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
> didn't look too much deeper at whether those checks are bogus, but it
> would seem safer to do something like this for a lower risk fix.
>
> Thoughts?

Well, safer but it would leave the landmine around for others to hit.
Toshi, please move the ext4_set_inode_flags() call to be just after the
assignment:

ei->i_flags = le32_to_cpu(raw_inode->i_flags);

in ext4_iget(). That way people won't introduce checks for i_flags that can
never hit... And yes, it fixes also other bugs (mostly in sanity checks
AFAICS) than the DAX issue.

Honza

>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d0dd585add6a..1e9ab445c777 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4999,7 +4999,6 @@ struct inode *ext4_iget(struct super_block *sb,
> unsigned long ino)
> if (S_ISREG(inode->i_mode)) {
> inode->i_op = &ext4_file_inode_operations;
> inode->i_fop = &ext4_file_operations;
> - ext4_set_aops(inode);
> } else if (S_ISDIR(inode->i_mode)) {
> inode->i_op = &ext4_dir_inode_operations;
> inode->i_fop = &ext4_dir_operations;
> @@ -5042,6 +5041,12 @@ struct inode *ext4_iget(struct super_block *sb,
> unsigned long ino)
> }
> brelse(iloc.bh);
> ext4_set_inode_flags(inode);
> + /*
> + * Now that we have determined whether DAX is enabled, set the
> + * proper address spaces operations
> + */
> + if (S_ISREG(inode->i_mode))
> + ext4_set_aops(inode);
>
> unlock_new_inode(inode);
> return inode;
--
Jan Kara <[email protected]>
SUSE Labs, CR

2018-09-12 15:47:40

by Kani, Toshimitsu

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4, dax: update dax check to skip journal inode

On Wed, 2018-09-12 at 11:24 +0200, Jan Kara wrote:
> On Tue 11-09-18 09:42:45, Toshi Kani wrote:
> > Ext4 mount path calls ext4_iget() to obtain the journal inode. This
> > inode does not support DAX, and 'ext4_da_aops' needs to be set. It
> > currently works for the DAX mount case because ext4_iget() always set
> > 'ext4_da_aops' to any regular files.
> >
> > ext4_fill_super
> > ext4_load_journal
> > ext4_get_journal_inode
> > ext4_iget
> >
> > In preparation to fix ext4_iget() to set 'ext4_dax_aops' for DAX files,
> > update ext4_should_use_dax() to return false for the journal inode.
> >
> > Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
> > Signed-off-by: Toshi Kani <[email protected]>
> > Cc: Jan Kara <[email protected]>
> > Cc: Dan Williams <[email protected]>
> > Cc: "Theodore Ts'o" <[email protected]>
> > Cc: Andreas Dilger <[email protected]>
>
> The journal inode is similar to any other 'system' inode we have in ext4.
> We don't really expect any of the address space operations to be called for
> it because we don't use page cache with these inodes. Very similar
> situation is with e.g. quota files. So to me it seems this patch is not
> really necessary. Or did you observe any bad effects without this patch?

Yes, without this change, mount fails with the error below. I believe
.bmap operation in ext4_da_aops is necessary for this case.

jbd2_journal_init_inode: Cannot locate journal superblock
EXT4-fs
(pmem1): Could not load journal inode

> That being said I agree that it would be a good cleanup to define something
> like ext4_is_system_inode() and disable DAX for these inodes just because
> they are special. But I don't see a need for this as a part of your fix.

Is there other system inode we need to check for?

Thanks,
-Toshi



2018-09-12 16:10:13

by Kani, Toshimitsu

[permalink] [raw]
Subject: Re: [PATCH 2/2] ext4, dax: set ext4_dax_aops for dax files

On Wed, 2018-09-12 at 11:31 +0200, Jan Kara wrote:
> On Tue 11-09-18 11:15:18, Dan Williams wrote:
> > On Tue, Sep 11, 2018 at 8:42 AM, Toshi Kani <[email protected]> wrote:
> > > Sync syscall to an existing DAX file needs to flush processor cache,
> > > but it does not currently. This is because 'ext4_da_aops' is set to
> > > address_space_operations of existing DAX files, instead of 'ext4_dax_aops',
> > > since S_DAX flag is set after ext4_set_aops() in the open path.
> > >
> > > New file
> > > --------
> > > lookup_open
> > > ext4_create
> > > __ext4_new_inode
> > > ext4_set_inode_flags // Set S_DAX flag
> > > ext4_set_aops // Set aops to ext4_dax_aops
> > >
> > > Existing file
> > > -------------
> > > lookup_open
> > > ext4_lookup
> > > ext4_iget
> > > ext4_set_aops // Set aops to ext4_da_aops
> > > ext4_set_inode_flags // Set S_DAX flag
> > >
> > > Change ext4_iget() to call ext4_set_inode_flags() before ext4_set_aops().
> > >
> > > Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
> >
> > Same format nit:
> >
> > Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops")
> > Cc: <[email protected]>
> >
> >
> > > Signed-off-by: Toshi Kani <[email protected]>
> > > Cc: Jan Kara <[email protected]>
> > > Cc: Dan Williams <[email protected]>
> > > Cc: "Theodore Ts'o" <[email protected]>
> > > Cc: Andreas Dilger <[email protected]>
> > > ---
> > > fs/ext4/inode.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index 775cd9b4af55..93cbbb859c40 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -4998,6 +4998,8 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
> > > if (ret)
> > > goto bad_inode;
> > >
> > > + ext4_set_inode_flags(inode);
> > > +
> >
> > Hmm, does this have unintended behavior changes?
> >
> > I notice that there are some checks for flags "IS_APPEND(inode) ||
> > IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
> > didn't look too much deeper at whether those checks are bogus, but it
> > would seem safer to do something like this for a lower risk fix.
> >
> > Thoughts?
>
> Well, safer but it would leave the landmine around for others to hit.
> Toshi, please move the ext4_set_inode_flags() call to be just after the
> assignment:
>
> ei->i_flags = le32_to_cpu(raw_inode->i_flags);
>
> in ext4_iget(). That way people won't introduce checks for i_flags that can
> never hit... And yes, it fixes also other bugs (mostly in sanity checks
> AFAICS) than the DAX issue.

Sure. Assuming you think the implicit change Dan pointed out is not a
problem, yes, I will go with this cleaner approach.

Thanks!
-Toshi


2018-09-12 16:22:55

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4, dax: update dax check to skip journal inode

On Wed 12-09-18 15:47:12, Kani, Toshi wrote:
> On Wed, 2018-09-12 at 11:24 +0200, Jan Kara wrote:
> > On Tue 11-09-18 09:42:45, Toshi Kani wrote:
> > > Ext4 mount path calls ext4_iget() to obtain the journal inode. This
> > > inode does not support DAX, and 'ext4_da_aops' needs to be set. It
> > > currently works for the DAX mount case because ext4_iget() always set
> > > 'ext4_da_aops' to any regular files.
> > >
> > > ext4_fill_super
> > > ext4_load_journal
> > > ext4_get_journal_inode
> > > ext4_iget
> > >
> > > In preparation to fix ext4_iget() to set 'ext4_dax_aops' for DAX files,
> > > update ext4_should_use_dax() to return false for the journal inode.
> > >
> > > Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
> > > Signed-off-by: Toshi Kani <[email protected]>
> > > Cc: Jan Kara <[email protected]>
> > > Cc: Dan Williams <[email protected]>
> > > Cc: "Theodore Ts'o" <[email protected]>
> > > Cc: Andreas Dilger <[email protected]>
> >
> > The journal inode is similar to any other 'system' inode we have in ext4.
> > We don't really expect any of the address space operations to be called for
> > it because we don't use page cache with these inodes. Very similar
> > situation is with e.g. quota files. So to me it seems this patch is not
> > really necessary. Or did you observe any bad effects without this patch?
>
> Yes, without this change, mount fails with the error below. I believe
> .bmap operation in ext4_da_aops is necessary for this case.
>
> jbd2_journal_init_inode: Cannot locate journal superblock
> EXT4-fs
> (pmem1): Could not load journal inode

Ah, OK, forgot about this. Then how about adding:
.bmap = ext4_bmap,

to ext4_dax_aops? .bmap is completely fine for DAX inodes and there's no
reason to disallow it.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2018-09-12 16:43:37

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH 2/2] ext4, dax: set ext4_dax_aops for dax files

On Wed, Sep 12, 2018 at 9:08 AM, Kani, Toshi <[email protected]> wrote:
> On Wed, 2018-09-12 at 11:31 +0200, Jan Kara wrote:
>> On Tue 11-09-18 11:15:18, Dan Williams wrote:
>> > On Tue, Sep 11, 2018 at 8:42 AM, Toshi Kani <[email protected]> wrote:
>> > > Sync syscall to an existing DAX file needs to flush processor cache,
>> > > but it does not currently. This is because 'ext4_da_aops' is set to
>> > > address_space_operations of existing DAX files, instead of 'ext4_dax_aops',
>> > > since S_DAX flag is set after ext4_set_aops() in the open path.
>> > >
>> > > New file
>> > > --------
>> > > lookup_open
>> > > ext4_create
>> > > __ext4_new_inode
>> > > ext4_set_inode_flags // Set S_DAX flag
>> > > ext4_set_aops // Set aops to ext4_dax_aops
>> > >
>> > > Existing file
>> > > -------------
>> > > lookup_open
>> > > ext4_lookup
>> > > ext4_iget
>> > > ext4_set_aops // Set aops to ext4_da_aops
>> > > ext4_set_inode_flags // Set S_DAX flag
>> > >
>> > > Change ext4_iget() to call ext4_set_inode_flags() before ext4_set_aops().
>> > >
>> > > Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
>> >
>> > Same format nit:
>> >
>> > Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops")
>> > Cc: <[email protected]>
>> >
>> >
>> > > Signed-off-by: Toshi Kani <[email protected]>
>> > > Cc: Jan Kara <[email protected]>
>> > > Cc: Dan Williams <[email protected]>
>> > > Cc: "Theodore Ts'o" <[email protected]>
>> > > Cc: Andreas Dilger <[email protected]>
>> > > ---
>> > > fs/ext4/inode.c | 3 ++-
>> > > 1 file changed, 2 insertions(+), 1 deletion(-)
>> > >
>> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> > > index 775cd9b4af55..93cbbb859c40 100644
>> > > --- a/fs/ext4/inode.c
>> > > +++ b/fs/ext4/inode.c
>> > > @@ -4998,6 +4998,8 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
>> > > if (ret)
>> > > goto bad_inode;
>> > >
>> > > + ext4_set_inode_flags(inode);
>> > > +
>> >
>> > Hmm, does this have unintended behavior changes?
>> >
>> > I notice that there are some checks for flags "IS_APPEND(inode) ||
>> > IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
>> > didn't look too much deeper at whether those checks are bogus, but it
>> > would seem safer to do something like this for a lower risk fix.
>> >
>> > Thoughts?
>>
>> Well, safer but it would leave the landmine around for others to hit.
>> Toshi, please move the ext4_set_inode_flags() call to be just after the
>> assignment:
>>
>> ei->i_flags = le32_to_cpu(raw_inode->i_flags);
>>
>> in ext4_iget(). That way people won't introduce checks for i_flags that can
>> never hit... And yes, it fixes also other bugs (mostly in sanity checks
>> AFAICS) than the DAX issue.
>
> Sure. Assuming you think the implicit change Dan pointed out is not a
> problem, yes, I will go with this cleaner approach.
>

Yes, Jan's proposal looks best to me.

2018-09-12 16:52:45

by Kani, Toshimitsu

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4, dax: update dax check to skip journal inode

On Wed, 2018-09-12 at 18:20 +0200, Jan Kara wrote:
> On Wed 12-09-18 15:47:12, Kani, Toshi wrote:
> > On Wed, 2018-09-12 at 11:24 +0200, Jan Kara wrote:
> > > On Tue 11-09-18 09:42:45, Toshi Kani wrote:
> > > > Ext4 mount path calls ext4_iget() to obtain the journal inode. This
> > > > inode does not support DAX, and 'ext4_da_aops' needs to be set. It
> > > > currently works for the DAX mount case because ext4_iget() always set
> > > > 'ext4_da_aops' to any regular files.
> > > >
> > > > ext4_fill_super
> > > > ext4_load_journal
> > > > ext4_get_journal_inode
> > > > ext4_iget
> > > >
> > > > In preparation to fix ext4_iget() to set 'ext4_dax_aops' for DAX files,
> > > > update ext4_should_use_dax() to return false for the journal inode.
> > > >
> > > > Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
> > > > Signed-off-by: Toshi Kani <[email protected]>
> > > > Cc: Jan Kara <[email protected]>
> > > > Cc: Dan Williams <[email protected]>
> > > > Cc: "Theodore Ts'o" <[email protected]>
> > > > Cc: Andreas Dilger <[email protected]>
> > >
> > > The journal inode is similar to any other 'system' inode we have in ext4.
> > > We don't really expect any of the address space operations to be called for
> > > it because we don't use page cache with these inodes. Very similar
> > > situation is with e.g. quota files. So to me it seems this patch is not
> > > really necessary. Or did you observe any bad effects without this patch?
> >
> > Yes, without this change, mount fails with the error below. I believe
> > .bmap operation in ext4_da_aops is necessary for this case.
> >
> > jbd2_journal_init_inode: Cannot locate journal superblock
> > EXT4-fs
> > (pmem1): Could not load journal inode
>
> Ah, OK, forgot about this. Then how about adding:
> .bmap = ext4_bmap,
>
> to ext4_dax_aops? .bmap is completely fine for DAX inodes and there's no
> reason to disallow it.

Yes, mount -o dax succeeds with the added ext4_bmap in ext4_dax_aops.

Thanks!
-Toshi