2020-04-21 19:22:10

by Ira Weiny

Subject: [PATCH V9 03/11] fs/stat: Define DAX statx attribute

From: Ira Weiny <[email protected]>

To let users determine whether a file is currently operating in the DAX
state (effective DAX), define a statx attribute value and set that
attribute if the effective DAX flag is set.
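
As an illustration only (not part of this patch), userspace could query the attribute roughly as follows. This is a minimal sketch: the `dax_state` enum and `file_dax_state()` helper are hypothetical names, it assumes a glibc that provides the statx() wrapper (2.28 or later), and it relies on the installed headers for the value of STATX_ATTR_DAX rather than hard-coding one. If the headers do not define the flag, the helper simply reports "unknown".

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* statx(), struct statx, STATX_* */

/* Hypothetical result type for this sketch. */
enum dax_state { DAX_ERROR = -1, DAX_OFF = 0, DAX_ON = 1, DAX_UNKNOWN = 2 };

/* Report whether the file at 'path' is currently in the DAX state. */
enum dax_state file_dax_state(const char *path)
{
	struct statx stx;

	/* Request the basic stats; the attribute bits come back with them. */
	if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) != 0)
		return DAX_ERROR;

#ifdef STATX_ATTR_DAX
	/* The bit is set: the file is effectively DAX right now. */
	if (stx.stx_attributes & STATX_ATTR_DAX)
		return DAX_ON;

	/* The bit is clear but the filesystem says it reports it. */
	if (stx.stx_attributes_mask & STATX_ATTR_DAX)
		return DAX_OFF;
#endif
	/* Either the headers or the filesystem do not know the flag. */
	return DAX_UNKNOWN;
}
```

A caller would treat DAX_UNKNOWN conservatively, since a clear bit without the corresponding mask bit does not prove the file is not DAX.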

To go along with this we propose the following addition to the statx man
page:

STATX_ATTR_DAX

The file is in the DAX (CPU direct access) state. DAX state
attempts to minimize software cache effects for both I/O and
memory mappings of this file. It requires a file system which
has been configured to support DAX.

DAX generally assumes all accesses are via CPU load/store
instructions, which can minimize overhead for small accesses but
may adversely affect CPU utilization for large transfers.

File I/O is done directly to/from user-space buffers, and memory
mapped I/O may be performed with direct memory mappings that
bypass the kernel page cache.

While the DAX property tends to result in data being transferred
synchronously, it does not give the same guarantees as O_SYNC,
where data and the necessary metadata are transferred together.

A DAX file may support being mapped with the MAP_SYNC flag,
which enables a program to use CPU cache flush instructions to
persist CPU store operations without an explicit fsync(2). See
mmap(2) for more information.
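
For readers unfamiliar with MAP_SYNC, the mapping described in the last paragraph might be requested as in the sketch below. It is illustrative and not part of the proposed man page text: `map_sync_or_null()` is a hypothetical helper, and the fallback flag values are an assumption taken from the generic Linux ABI for older C libraries that do not expose them. MAP_SYNC has to be combined with MAP_SHARED_VALIDATE so that the kernel rejects the request instead of silently ignoring the flag.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed fallbacks (generic Linux ABI values) for older libc headers. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x080000
#endif

/*
 * Try to map the first 'len' bytes of 'path' with MAP_SYNC.
 * Returns the mapping, or NULL if the file cannot be opened or the
 * filesystem does not support a synchronous mapping for it.
 */
void *map_sync_or_null(const char *path, size_t len)
{
	void *addr;
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return NULL;

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */

	return addr == MAP_FAILED ? NULL : addr;
}
```

On a file that is not in the DAX state the mmap() call is expected to fail, so a NULL return is the normal outcome there and the caller can fall back to a regular MAP_SHARED mapping plus fsync(2).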

Reviewed-by: Dave Chinner <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
Changes from V2:
Update man page text with comments from Darrick, Jan, Dan, and
Dave.
---
fs/stat.c | 3 +++
include/uapi/linux/stat.h | 1 +
2 files changed, 4 insertions(+)

diff --git a/fs/stat.c b/fs/stat.c
index 030008796479..894699c74dde 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -79,6 +79,9 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
 	if (IS_AUTOMOUNT(inode))
 		stat->attributes |= STATX_ATTR_AUTOMOUNT;

+	if (IS_DAX(inode))
+		stat->attributes |= STATX_ATTR_DAX;
+
 	if (inode->i_op->getattr)
 		return inode->i_op->getattr(path, stat, request_mask,
 					    query_flags);
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index ad80a5c885d5..e5f9d5517f6b 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -169,6 +169,7 @@ struct statx {
 #define STATX_ATTR_ENCRYPTED		0x00000800 /* [I] File requires key to decrypt in fs */
 #define STATX_ATTR_AUTOMOUNT		0x00001000 /* Dir: Automount trigger */
 #define STATX_ATTR_VERITY		0x00100000 /* [I] Verity protected file */
+#define STATX_ATTR_DAX			0x00002000 /* [I] File is DAX */


 #endif /* _UAPI_LINUX_STAT_H */
--
2.25.1


2020-04-22 18:53:27

by Ira Weiny

Subject: Re: [PATCH V9 03/11] fs/stat: Define DAX statx attribute

On Wed, Apr 22, 2020 at 09:29:51AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 21, 2020 at 12:17:45PM -0700, [email protected] wrote:
> > From: Ira Weiny <[email protected]>
> >
> > In order for users to determine if a file is currently operating in DAX
> > state (effective DAX). Define a statx attribute value and set that
> > attribute if the effective DAX flag is set.
> >
> > To go along with this we propose the following addition to the statx man
> > page:
> >
> > STATX_ATTR_DAX
> >
> > The file is in the DAX (cpu direct access) state. DAX state
> > attempts to minimize software cache effects for both I/O and
> > memory mappings of this file. It requires a file system which
> > has been configured to support DAX.
> >
> > DAX generally assumes all accesses are via cpu load / store
> > instructions which can minimize overhead for small accesses, but
> > may adversely affect cpu utilization for large transfers.
> >
> > File I/O is done directly to/from user-space buffers and memory
> > mapped I/O may be performed with direct memory mappings that
> > bypass kernel page cache.
> >
> > While the DAX property tends to result in data being transferred
> > synchronously, it does not give the same guarantees of O_SYNC
> > where data and the necessary metadata are transferred together.
> >
> > A DAX file may support being mapped with the MAP_SYNC flag,
> > which enables a program to use CPU cache flush instructions to
> > persist CPU store operations without an explicit fsync(2). See
> > mmap(2) for more information.
>
> One thing I hadn't noticed before -- this is a change to userspace API,
> so please cc this series to [email protected] when you send V10.

Right! Glad you caught me on this because I was just preparing to send V10.

Is there someone I could directly mail who needs to look at this? I guess I
thought we had the important FS people involved for this type of API change.
:-/

>
> Also, I've started to think about commit order sequencing for actually
> landing this series. Usually I try to put vfs and documentation things
> before xfs stuff, which means I came up with:
>
> vfs           xfs             I_DONTCACHE
> 2 3 11        1 4 5 6 7       8 9 10
>
> Note that I separated the DONTCACHE part because it touches VFS
> internals, which implies a higher standard of review (aka Al) and I do
> not wish to hold up the 2-3-11-1-4-5-6-7 patches if the dontcache part
> becomes contentious.
>
> What do you think of that ordering?

I think 1 stands on its own, separate from this series... so I would keep it
first. Moving Documentation up is easy.

I've changed to this order...

prelim  vfs           xfs           I_DONTCACHE
1       2 3 11        4 5 6 7       8 9 10

Which is pretty much the same now that I look at it! ;-)

>
> (Heck, maybe I'll just put patch 1 in the queue for 5.8 right now...)

IMHO, I think 1 and 2 can go.

While patch 2 is in the VFS layer it is very much a DAX thing. Jan and
Christoph approved it. I think even Dave approved the version before I
removed io_is_direct() but I don't recall now.

Dan and I also discussed it internally when I first found the issue. So I'm
very confident in it! :-D

Unfortunately, 3 and 10 are the critical pieces to the feature. So we could
move 3 out later after 8 and 9 are approved. But I don't think it buys us
much to have the tri-state go in without the rest.

Ira

>
> --D
>
> > Reviewed-by: Dave Chinner <[email protected]>
> > Reviewed-by: Jan Kara <[email protected]>
> > Reviewed-by: Darrick J. Wong <[email protected]>
> > Signed-off-by: Ira Weiny <[email protected]>
> >
> > ---
> > Changes from V2:
> > Update man page text with comments from Darrick, Jan, Dan, and
> > Dave.
> > ---
> > fs/stat.c | 3 +++
> > include/uapi/linux/stat.h | 1 +
> > 2 files changed, 4 insertions(+)
> >
> > diff --git a/fs/stat.c b/fs/stat.c
> > index 030008796479..894699c74dde 100644
> > --- a/fs/stat.c
> > +++ b/fs/stat.c
> > @@ -79,6 +79,9 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
> > if (IS_AUTOMOUNT(inode))
> > stat->attributes |= STATX_ATTR_AUTOMOUNT;
> >
> > + if (IS_DAX(inode))
> > + stat->attributes |= STATX_ATTR_DAX;
> > +
> > if (inode->i_op->getattr)
> > return inode->i_op->getattr(path, stat, request_mask,
> > query_flags);
> > diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> > index ad80a5c885d5..e5f9d5517f6b 100644
> > --- a/include/uapi/linux/stat.h
> > +++ b/include/uapi/linux/stat.h
> > @@ -169,6 +169,7 @@ struct statx {
> > #define STATX_ATTR_ENCRYPTED 0x00000800 /* [I] File requires key to decrypt in fs */
> > #define STATX_ATTR_AUTOMOUNT 0x00001000 /* Dir: Automount trigger */
> > #define STATX_ATTR_VERITY 0x00100000 /* [I] Verity protected file */
> > +#define STATX_ATTR_DAX 0x00002000 /* [I] File is DAX */
> >
> >
> > #endif /* _UAPI_LINUX_STAT_H */
> > --
> > 2.25.1
> >