2019-02-21 18:24:40

by Johannes Thumshirn

[permalink] [raw]
Subject: [PATCH] nvmet: disable direct I/O when unavailable

Some file-systems, like tmpfs, do not support direct IO, but file-backed
namespaces default to using direct IO. If direct IO is unavailable fall
back to using buffered IO for the file-backed namespace.

This might not ultimately be a solution for production environments but
for test environments it sometimes is feasible to use tmpfs.

Signed-off-by: Johannes Thumshirn <[email protected]>
---
drivers/nvme/target/io-cmd-file.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index 517522305e5c..8a861cc0160e 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -38,11 +38,21 @@ int nvmet_file_ns_enable(struct nvmet_ns *ns)

ns->file = filp_open(ns->device_path, flags, 0);
if (IS_ERR(ns->file)) {
+ if (ns->file == ERR_PTR(-EINVAL) && (flags & O_DIRECT)) {
+ flags &= ~O_DIRECT;
+ ns->buffered_io = 0;
+ ns->file = filp_open(ns->device_path, flags, 0);
+ if (!IS_ERR(ns->file)) {
+ pr_info("direct I/O unavailable, falling back to buffered I/O\n");
+ goto getattr;
+ }
+ }
pr_err("failed to open file %s: (%ld)\n",
ns->device_path, PTR_ERR(ns->file));
return PTR_ERR(ns->file);
}

+getattr:
ret = vfs_getattr(&ns->file->f_path,
&stat, STATX_SIZE, AT_STATX_FORCE_SYNC);
if (ret)
--
2.16.4



2019-02-22 00:43:17

by Chaitanya Kulkarni

[permalink] [raw]
Subject: Re: [PATCH] nvmet: disable direct I/O when unavailable

On 02/21/2019 10:23 AM, Johannes Thumshirn wrote:
> Some file-systems, like tmpfs, do not support direct IO, but file-backed
> namespaces default to using direct IO. If direct IO is unavailable fall
> back to using buffered IO for the file-backed namespace.
>
> This might not ultimately be a solution for production environments but
> for test environments it sometimes is feasible to use tmpfs.
>
> Signed-off-by: Johannes Thumshirn <[email protected]>
> ---
> drivers/nvme/target/io-cmd-file.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
> index 517522305e5c..8a861cc0160e 100644
> --- a/drivers/nvme/target/io-cmd-file.c
> +++ b/drivers/nvme/target/io-cmd-file.c
> @@ -38,11 +38,21 @@ int nvmet_file_ns_enable(struct nvmet_ns *ns)
>
> ns->file = filp_open(ns->device_path, flags, 0);
> if (IS_ERR(ns->file)) {
> + if (ns->file == ERR_PTR(-EINVAL) && (flags & O_DIRECT)) {
> + flags &= ~O_DIRECT;
> + ns->buffered_io = 0;
This needs to be set to 1 when we enable buffered I/O.
> + ns->file = filp_open(ns->device_path, flags, 0);
> + if (!IS_ERR(ns->file)) {
> + pr_info("direct I/O unavailable, falling back to buffered I/O\n");
> + goto getattr;
> + }
> + }
> pr_err("failed to open file %s: (%ld)\n",
> ns->device_path, PTR_ERR(ns->file));
> return PTR_ERR(ns->file);
> }
>
> +getattr:
> ret = vfs_getattr(&ns->file->f_path,
> &stat, STATX_SIZE, AT_STATX_FORCE_SYNC);
> if (ret)
> kk

As per specified in the patch, this is only useful for testing, then we
should modify the test scripts so that on creation of the ctrl we switch
to the buffered I/O before running fio.

OR

Similar result can be achieved by setting buffered I/O flag
buffered_io=1 before enabling the name-space in the test script.



2019-02-22 05:56:15

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: [PATCH] nvmet: disable direct I/O when unavailable

On 22/02/2019 01:41, Chaitanya Kulkarni wrote:
[...]
>
> As per specified in the patch, this is only useful for testing, then we
> should modify the test scripts so that on creation of the ctrl we switch
> to the buffered I/O before running fio.

Or on any other file-system that does not support DIO..

>
> OR
>
> Similar result can be achieved by setting buffered I/O flag
> buffered_io=1 before enabling the name-space in the test script.

Frankly, we have a ton of testing related special cases in the kernel.

This one is a) simple and small, only 10 LoC, b) far away from the fast
path or any other place where it could have any impact on legitimate
users and c) it prints an informal message showing you what happened.

Sorry but this is a https://xkcd.com/386/ moment.

Byte,
Johannes
--
Johannes Thumshirn SUSE Labs Filesystems
[email protected] +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

2019-02-24 10:56:41

by Max Gurtovoy

[permalink] [raw]
Subject: Re: [PATCH] nvmet: disable direct I/O when unavailable


On 2/22/2019 7:55 AM, Johannes Thumshirn wrote:
> On 22/02/2019 01:41, Chaitanya Kulkarni wrote:
> [...]
>> As per specified in the patch, this is only useful for testing, then we
>> should modify the test scripts so that on creation of the ctrl we switch
>> to the buffered I/O before running fio.
> Or on any other file-system that does not support DIO..

Do we really want to support these kind of filesystems for fabrics
backend device ? only for testing ?

What is the status of iSCSI/SRP targets in this case ?


>
>> OR
>>
>> Similar result can be achieved by setting buffered I/O flag
>> buffered_io=1 before enabling the name-space in the test script.
> Frankly, we have a ton of testing related special cases in the kernel.
>
> This one is a) simple and small, only 10 LoC, b) far away from the fast
> path or any other place where it could have any impact on legitimate
> users and c) it prints an informal message showing you what happened.
>
> Sorry but this is a https://xkcd.com/386/ moment.
>
> Byte,
> Johannes

2019-02-25 09:38:54

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: [PATCH] nvmet: disable direct I/O when unavailable

On 24/02/2019 11:54, Max Gurtovoy wrote:
>
> On 2/22/2019 7:55 AM, Johannes Thumshirn wrote:
>> On 22/02/2019 01:41, Chaitanya Kulkarni wrote:
>> [...]
>>> As per specified in the patch, this is only useful for testing, then we
>>> should modify the test scripts so that on creation of the ctrl we switch
>>> to the buffered I/O before running fio.
>> Or on any other file-system that does not support DIO..
>
> Do we really want to support these kind of filesystems for fabrics
> backend device ? only for testing ?
>
> What is the status of iSCSI/SRP targets in this case ?

iSCSI/SRP passes in the following:

/*
* Use O_DSYNC by default instead of O_SYNC to forgo syncing
* of pure timestamp updates.
*/
flags = O_RDWR | O_CREAT | O_LARGEFILE | O_DSYNC;

Meaning it uses O_DSYNC, so for instance if I do a dd to tmpfs with O_DSYNC:
$ dd if=/dev/zero of=/dev/shm/foo oflag=dsync bs=4k count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000175094 s, 23.4 MB/s

Same with O_DIRECT:
$ dd if=/dev/zero of=/dev/shm/foo oflag=direct bs=4k count=1
dd: failed to open '/dev/shm/foo': Invalid argument


--
Johannes Thumshirn SUSE Labs Filesystems
[email protected] +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

2019-02-25 14:50:30

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] nvmet: disable direct I/O when unavailable

On 2/25/19 1:37 AM, Johannes Thumshirn wrote:
> On 24/02/2019 11:54, Max Gurtovoy wrote:
>> What is the status of iSCSI/SRP targets in this case ?
>
> iSCSI/SRP passes in the following:
>
> /*
> * Use O_DSYNC by default instead of O_SYNC to forgo syncing
> * of pure timestamp updates.
> */
> flags = O_RDWR | O_CREAT | O_LARGEFILE | O_DSYNC;

That code fragment comes from the LIO file backend. There is no
requirement for a SCSI target core file backend to use O_DSYNC. SCST
allows users to choose whether or not O_DSYNC should be used:

if (virt_dev->wt_flag && !virt_dev->nv_cache)
open_flags |= O_DSYNC;

Bart.


2019-02-25 15:39:21

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: [PATCH] nvmet: disable direct I/O when unavailable

On Mon, Feb 25, 2019 at 06:49:04AM -0800, Bart Van Assche wrote:
> On 2/25/19 1:37 AM, Johannes Thumshirn wrote:
> > On 24/02/2019 11:54, Max Gurtovoy wrote:
> > > What is the status of iSCSI/SRP targets in this case ?
> >
> > iSCSI/SRP passes in the following:
> >
> > /*
> > * Use O_DSYNC by default instead of O_SYNC to forgo syncing
> > * of pure timestamp updates.
> > */
> > flags = O_RDWR | O_CREAT | O_LARGEFILE | O_DSYNC;
>
> That code fragment comes from the LIO file backend. There is no requirement
> for a SCSI target core file backend to use O_DSYNC. SCST allows users to
> choose whether or not O_DSYNC should be used:

Yes, LIO file backend is the in-tree equivalent to NVMe's file backend. That's
why I copied it here.


>
> if (virt_dev->wt_flag && !virt_dev->nv_cache)
> open_flags |= O_DSYNC;
>

OK. Do your open_flags include O_DIRECT per default as well?

Byte,
Johannes
--
Johannes Thumshirn SUSE Labs Filesystems
[email protected] +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N?rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

2019-02-25 16:11:27

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] nvmet: disable direct I/O when unavailable

On Mon, 2019-02-25 at 16:38 +-0100, Johannes Thumshirn wrote:
+AD4 On Mon, Feb 25, 2019 at 06:49:04AM -0800, Bart Van Assche wrote:
+AD4 +AD4 On 2/25/19 1:37 AM, Johannes Thumshirn wrote:
+AD4 +AD4 +AD4 On 24/02/2019 11:54, Max Gurtovoy wrote:
+AD4 +AD4 +AD4 +AD4 What is the status of iSCSI/SRP targets in this case ?
+AD4 +AD4 +AD4
+AD4 +AD4 +AD4 iSCSI/SRP passes in the following:
+AD4 +AD4 +AD4
+AD4 +AD4 +AD4 /+ACo
+AD4 +AD4 +AD4 +ACo Use O+AF8-DSYNC by default instead of O+AF8-SYNC to forgo syncing
+AD4 +AD4 +AD4 +ACo of pure timestamp updates.
+AD4 +AD4 +AD4 +ACo-/
+AD4 +AD4 +AD4 flags +AD0 O+AF8-RDWR +AHw O+AF8-CREAT +AHw O+AF8-LARGEFILE +AHw O+AF8-DSYNC+ADs
+AD4 +AD4
+AD4 +AD4 That code fragment comes from the LIO file backend. There is no requirement
+AD4 +AD4 for a SCSI target core file backend to use O+AF8-DSYNC. SCST allows users to
+AD4 +AD4 choose whether or not O+AF8-DSYNC should be used:
+AD4
+AD4 Yes, LIO file backend is the in-tree equivalent to NVMe's file backend. That's
+AD4 why I copied it here.
+AD4
+AD4
+AD4 +AD4
+AD4 +AD4 if (virt+AF8-dev-+AD4-wt+AF8-flag +ACYAJg +ACE-virt+AF8-dev-+AD4-nv+AF8-cache)
+AD4 +AD4 open+AF8-flags +AHwAPQ O+AF8-DSYNC+ADs
+AD4 +AD4
+AD4
+AD4 OK. Do your open+AF8-flags include O+AF8-DIRECT per default as well?

I don't think it has ever been supported to pass O+AF8-DIRECT to filp+AF8-open().
As one can see in build+AF8-open+AF8-flags() O+AF8-DIRECT is ignored. The only way I
know of to submit direct I/O from kernel context is by setting the
IOCB+AF8-DIRECT flag.

Bart.

2019-02-25 22:01:25

by Sagi Grimberg

[permalink] [raw]
Subject: Re: [PATCH] nvmet: disable direct I/O when unavailable


> Frankly, we have a ton of testing related special cases in the kernel.
>
> This one is a) simple and small, only 10 LoC, b) far away from the fast
> path or any other place where it could have any impact on legitimate
> users and c) it prints an informal message showing you what happened.
>
> Sorry but this is a https://xkcd.com/386/ moment.

How about we just fail it with a proper error message and let the
user set buffered_io instead of trying to get smart here...

2019-03-08 13:28:19

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] nvmet: disable direct I/O when unavailable

On Mon, Feb 25, 2019 at 01:18:10PM -0800, Sagi Grimberg wrote:
>
>> Frankly, we have a ton of testing related special cases in the kernel.
>>
>> This one is a) simple and small, only 10 LoC, b) far away from the fast
>> path or any other place where it could have any impact on legitimate
>> users and c) it prints an informal message showing you what happened.
>>
>> Sorry but this is a https://xkcd.com/386/ moment.
>
> How about we just fail it with a proper error message and let the
> user set buffered_io instead of trying to get smart here...

Yes, would be my preference.