2020-03-12 11:16:25

by Sahitya Tummala

Subject: [PATCH] f2fs: fix long latency due to discard during umount

F2FS already has a default timeout of 5 secs for discards that
can be issued during umount, but it can take more than the 5 sec
timeout if the underlying UFS device queue is already full and there
are no more available free tags to be used. In that case, submit_bio()
will wait for the already queued discard requests to complete to get
a free tag, which can potentially take way more than 5 sec.

Fix this by submitting the discard requests with REQ_NOWAIT
flags during umount. This will return -EAGAIN for UFS queue/tag full
scenario without waiting in the context of submit_bio(). The FS can
then handle these requests by retrying again within the stipulated
discard timeout period to avoid long latencies.

Signed-off-by: Sahitya Tummala <[email protected]>
---
fs/f2fs/segment.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index fb3e531..a06bbac 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
&(dcc->fstrim_list) : &(dcc->wait_list);
- int flag = dpolicy->sync ? REQ_SYNC : 0;
+ int flag;
block_t lstart, start, len, total_len;
int err = 0;

+ flag = dpolicy->sync ? REQ_SYNC : 0;
+ flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
+
if (dc->state != D_PREP)
return 0;

@@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
bio->bi_end_io = f2fs_submit_discard_endio;
bio->bi_opf |= flag;
submit_bio(bio);
+ if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
+ dc->state = D_PREP;
+ err = dc->error;
+ break;
+ }

atomic_inc(&dcc->issued_discard);

@@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
}

__submit_discard_cmd(sbi, dpolicy, dc, &issued);
+ if (dc->error == -EAGAIN) {
+ congestion_wait(BLK_RW_ASYNC, HZ/50);
+ __relocate_discard_cmd(dcc, dc);
+ }

if (issued >= dpolicy->max_requests)
break;
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
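
The idea in the commit message above (fail fast with -EAGAIN when no tag is free, then retry inside a fixed time budget) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel code: `tag_pool`, `try_submit()`, `device_tick()` and `issue_with_budget()` are hypothetical names, and the budget is counted in abstract ticks rather than jiffies.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a device queue with a fixed number of tags. */
struct tag_pool {
	int depth;      /* total tags the device offers */
	int in_flight;  /* tags currently held by queued requests */
};

/* Non-blocking submit: fail fast with -EAGAIN when no tag is free. */
static int try_submit(struct tag_pool *q)
{
	assert(q->in_flight >= 0 && q->in_flight <= q->depth);
	if (q->in_flight == q->depth)
		return -EAGAIN;
	q->in_flight++;
	return 0;
}

/* Model one tick of device progress: at most one request completes. */
static void device_tick(struct tag_pool *q)
{
	if (q->in_flight > 0)
		q->in_flight--;
}

/*
 * Try to issue `want` requests while at least one tick of budget remains.
 * Each -EAGAIN costs one tick of back-off; the caller gives up once the
 * budget is spent instead of sleeping inside the submit path.
 * Returns the number of requests actually issued.
 */
static int issue_with_budget(struct tag_pool *q, int want, int budget)
{
	int issued = 0;

	while (issued < want && budget > 0) {
		if (try_submit(q) == -EAGAIN) {
			device_tick(q);  /* back off and let the device drain */
			budget--;
			continue;
		}
		issued++;
	}
	return issued;
}
```

With a full two-tag pool and a budget of two ticks, a request for three submissions issues only one before the budget runs out, which is the bounded-latency behaviour the patch is after.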


2020-03-12 17:05:15

by Jaegeuk Kim

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On 03/12, Sahitya Tummala wrote:
> F2FS already has a default timeout of 5 secs for discards that
> can be issued during umount, but it can take more than the 5 sec
> timeout if the underlying UFS device queue is already full and there
> are no more available free tags to be used. In that case, submit_bio()
> will wait for the already queued discard requests to complete to get
> a free tag, which can potentially take way more than 5 sec.
>
> Fix this by submitting the discard requests with REQ_NOWAIT
> flags during umount. This will return -EAGAIN for UFS queue/tag full
> scenario without waiting in the context of submit_bio(). The FS can
> then handle these requests by retrying again within the stipulated
> discard timeout period to avoid long latencies.
>
> Signed-off-by: Sahitya Tummala <[email protected]>
> ---
> fs/f2fs/segment.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index fb3e531..a06bbac 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> &(dcc->fstrim_list) : &(dcc->wait_list);
> - int flag = dpolicy->sync ? REQ_SYNC : 0;
> + int flag;
> block_t lstart, start, len, total_len;
> int err = 0;
>
> + flag = dpolicy->sync ? REQ_SYNC : 0;
> + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> +
> if (dc->state != D_PREP)
> return 0;
>
> @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> bio->bi_end_io = f2fs_submit_discard_endio;
> bio->bi_opf |= flag;
> submit_bio(bio);
> + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
> + dc->state = D_PREP;
> + err = dc->error;
> + break;
> + }
>
> atomic_inc(&dcc->issued_discard);
>
> @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> }
>
> __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> + if (dc->error == -EAGAIN) {
> + congestion_wait(BLK_RW_ASYNC, HZ/50);

--> needs to be DEFAULT_IO_TIMEOUT

> + __relocate_discard_cmd(dcc, dc);

It seems that, in __submit_discard_cmd(), we need to submit the bio first and
then move dc to wait_list only if there's no error.

> + }
>
> if (issued >= dpolicy->max_requests)
> break;
> --
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.

2020-03-13 01:27:22

by Sahitya Tummala

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On Thu, Mar 12, 2020 at 10:02:42AM -0700, Jaegeuk Kim wrote:
> On 03/12, Sahitya Tummala wrote:
> > F2FS already has a default timeout of 5 secs for discards that
> > can be issued during umount, but it can take more than the 5 sec
> > timeout if the underlying UFS device queue is already full and there
> > are no more available free tags to be used. In that case, submit_bio()
> > will wait for the already queued discard requests to complete to get
> > a free tag, which can potentially take way more than 5 sec.
> >
> > Fix this by submitting the discard requests with REQ_NOWAIT
> > flags during umount. This will return -EAGAIN for UFS queue/tag full
> > scenario without waiting in the context of submit_bio(). The FS can
> > then handle these requests by retrying again within the stipulated
> > discard timeout period to avoid long latencies.
> >
> > Signed-off-by: Sahitya Tummala <[email protected]>
> > ---
> > fs/f2fs/segment.c | 14 +++++++++++++-
> > 1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index fb3e531..a06bbac 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> > struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> > &(dcc->fstrim_list) : &(dcc->wait_list);
> > - int flag = dpolicy->sync ? REQ_SYNC : 0;
> > + int flag;
> > block_t lstart, start, len, total_len;
> > int err = 0;
> >
> > + flag = dpolicy->sync ? REQ_SYNC : 0;
> > + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> > +
> > if (dc->state != D_PREP)
> > return 0;
> >
> > @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > bio->bi_end_io = f2fs_submit_discard_endio;
> > bio->bi_opf |= flag;
> > submit_bio(bio);
> > + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
> > + dc->state = D_PREP;
> > + err = dc->error;
> > + break;
> > + }
> >
> > atomic_inc(&dcc->issued_discard);
> >
> > @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> > }
> >
> > __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> > + if (dc->error == -EAGAIN) {
> > + congestion_wait(BLK_RW_ASYNC, HZ/50);
>
> --> need to be DEFAULT_IO_TIMEOUT

Yes, I will update it.

>
> > + __relocate_discard_cmd(dcc, dc);
>
> It seems we need to submit bio first, and then move dc to wait_list, if there's
> no error, in __submit_discard_cmd().

Yes, that is not changed. It still happens for the failed request that is
re-queued here, once it gets submitted again later.

I am requeuing the discard request that failed with -EAGAIN from wait_list
back to dcc->pend_list[]. The next call to __submit_discard_cmd() will call
submit_bio() for this request and move it to wait_list again. Please let me
know if I am missing anything.

Thanks,

>
> > + }
> >
> > if (issued >= dpolicy->max_requests)
> > break;
> > --
> > Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.

--
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
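
The requeue flow described in the reply above can be modelled with a small userspace sketch. It is only an approximation under stated assumptions: `struct cmd`, `submit()`, `relocate()` and `issue()` are hypothetical stand-ins for the discard command and the f2fs helpers, list membership is reduced to an enum, and clearing the error on relocation is a simplification made for this sketch rather than something the thread states.

```c
#include <assert.h>
#include <errno.h>

/* Which list the command currently sits on (simplified to a tag). */
enum where { ON_PEND, ON_WAIT };

struct cmd {
	enum where list;
	int error;
	int attempts;   /* how many times submit() has been tried */
};

/*
 * Hypothetical submit: the command is moved to the wait list before the
 * request goes out, and the result is -EAGAIN while the device is busy.
 */
static int submit(struct cmd *c, int device_busy)
{
	assert(c->list == ON_PEND);  /* only pending commands are submitted */
	c->attempts++;
	c->list = ON_WAIT;
	c->error = device_busy ? -EAGAIN : 0;
	return c->error;
}

/* Put a command that could not be queued back on the pending list. */
static void relocate(struct cmd *c)
{
	c->list = ON_PEND;
	c->error = 0;  /* assumption of this sketch: start the retry clean */
}

/* One pass of the issue loop for a single command. */
static int issue(struct cmd *c, int device_busy)
{
	int err = submit(c, device_busy);

	if (err == -EAGAIN)
		relocate(c);
	return err;
}
```

A first pass against a busy device leaves the command on the pending list; a later pass against an idle device submits it again and leaves it on the wait list.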

2020-03-13 01:46:37

by Jaegeuk Kim

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On 03/13, Sahitya Tummala wrote:
> On Thu, Mar 12, 2020 at 10:02:42AM -0700, Jaegeuk Kim wrote:
> > On 03/12, Sahitya Tummala wrote:
> > > F2FS already has a default timeout of 5 secs for discards that
> > > can be issued during umount, but it can take more than the 5 sec
> > > timeout if the underlying UFS device queue is already full and there
> > > are no more available free tags to be used. In that case, submit_bio()
> > > will wait for the already queued discard requests to complete to get
> > > a free tag, which can potentially take way more than 5 sec.
> > >
> > > Fix this by submitting the discard requests with REQ_NOWAIT
> > > flags during umount. This will return -EAGAIN for UFS queue/tag full
> > > scenario without waiting in the context of submit_bio(). The FS can
> > > then handle these requests by retrying again within the stipulated
> > > discard timeout period to avoid long latencies.
> > >
> > > Signed-off-by: Sahitya Tummala <[email protected]>
> > > ---
> > > fs/f2fs/segment.c | 14 +++++++++++++-
> > > 1 file changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > > index fb3e531..a06bbac 100644
> > > --- a/fs/f2fs/segment.c
> > > +++ b/fs/f2fs/segment.c
> > > @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > > struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> > > struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> > > &(dcc->fstrim_list) : &(dcc->wait_list);
> > > - int flag = dpolicy->sync ? REQ_SYNC : 0;
> > > + int flag;
> > > block_t lstart, start, len, total_len;
> > > int err = 0;
> > >
> > > + flag = dpolicy->sync ? REQ_SYNC : 0;
> > > + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> > > +
> > > if (dc->state != D_PREP)
> > > return 0;
> > >
> > > @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > > bio->bi_end_io = f2fs_submit_discard_endio;
> > > bio->bi_opf |= flag;
> > > submit_bio(bio);
> > > + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
> > > + dc->state = D_PREP;
> > > + err = dc->error;
> > > + break;
> > > + }
> > >
> > > atomic_inc(&dcc->issued_discard);
> > >
> > > @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> > > }
> > >
> > > __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> > > + if (dc->error == -EAGAIN) {
> > > + congestion_wait(BLK_RW_ASYNC, HZ/50);
> >
> > --> need to be DEFAULT_IO_TIMEOUT
>
> Yes, i will update it.
>
> >
> > > + __relocate_discard_cmd(dcc, dc);
> >
> > It seems we need to submit bio first, and then move dc to wait_list, if there's
> > no error, in __submit_discard_cmd().
>
> Yes, that is not changed and it still happens for the failed request
> that is re-queued here too when it gets submitted again later.
>
> I am requeuing the discard request failed with -EAGAIN error back to
> dcc->pend_list[] from wait_list. It will call submit_bio() for this request
> and also move to wait_list when it calls __submit_discard_cmd() again next
> time. Please let me know if I am missing anything?

This patch itself has no problem, but I'm thinking that __submit_discard_cmd()
needs to return a value that preserves the assumption that the wait list only
holds submitted commands.

>
> Thanks,
>
> >
> > > + }
> > >
> > > if (issued >= dpolicy->max_requests)
> > > break;
> > > --
> > > Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
> > > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
>
> --
> --
> Sent by a consultant of the Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

2020-03-13 02:20:55

by Chao Yu

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On 2020/3/12 19:14, Sahitya Tummala wrote:
> F2FS already has a default timeout of 5 secs for discards that
> can be issued during umount, but it can take more than the 5 sec
> timeout if the underlying UFS device queue is already full and there
> are no more available free tags to be used. In that case, submit_bio()
> will wait for the already queued discard requests to complete to get
> a free tag, which can potentially take way more than 5 sec.
>
> Fix this by submitting the discard requests with REQ_NOWAIT
> flags during umount. This will return -EAGAIN for UFS queue/tag full
> scenario without waiting in the context of submit_bio(). The FS can
> then handle these requests by retrying again within the stipulated
> discard timeout period to avoid long latencies.
>
> Signed-off-by: Sahitya Tummala <[email protected]>
> ---
> fs/f2fs/segment.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index fb3e531..a06bbac 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> &(dcc->fstrim_list) : &(dcc->wait_list);
> - int flag = dpolicy->sync ? REQ_SYNC : 0;
> + int flag;
> block_t lstart, start, len, total_len;
> int err = 0;
>
> + flag = dpolicy->sync ? REQ_SYNC : 0;
> + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> +
> if (dc->state != D_PREP)
> return 0;
>
> @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> bio->bi_end_io = f2fs_submit_discard_endio;
> bio->bi_opf |= flag;
> submit_bio(bio);
> + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {

If we want to update dc->state, we need to cover it with dc->lock.

> + dc->state = D_PREP;

BTW, one dc can be referenced by multiple bios, so dc->state could be updated to
D_DONE later by f2fs_submit_discard_endio(). However, we just relocate it to the
pending list, which leaves it in an inconsistent state.

Thanks,

> + err = dc->error;
> + break;
> + }
>
> atomic_inc(&dcc->issued_discard);
>
> @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> }
>
> __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> + if (dc->error == -EAGAIN) {
> + congestion_wait(BLK_RW_ASYNC, HZ/50);
> + __relocate_discard_cmd(dcc, dc);
> + }
>
> if (issued >= dpolicy->max_requests)
> break;
>

2020-03-13 03:40:29

by Sahitya Tummala

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On Fri, Mar 13, 2020 at 10:20:04AM +0800, Chao Yu wrote:
> On 2020/3/12 19:14, Sahitya Tummala wrote:
> > F2FS already has a default timeout of 5 secs for discards that
> > can be issued during umount, but it can take more than the 5 sec
> > timeout if the underlying UFS device queue is already full and there
> > are no more available free tags to be used. In that case, submit_bio()
> > will wait for the already queued discard requests to complete to get
> > a free tag, which can potentially take way more than 5 sec.
> >
> > Fix this by submitting the discard requests with REQ_NOWAIT
> > flags during umount. This will return -EAGAIN for UFS queue/tag full
> > scenario without waiting in the context of submit_bio(). The FS can
> > then handle these requests by retrying again within the stipulated
> > discard timeout period to avoid long latencies.
> >
> > Signed-off-by: Sahitya Tummala <[email protected]>
> > ---
> > fs/f2fs/segment.c | 14 +++++++++++++-
> > 1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index fb3e531..a06bbac 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> > struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> > &(dcc->fstrim_list) : &(dcc->wait_list);
> > - int flag = dpolicy->sync ? REQ_SYNC : 0;
> > + int flag;
> > block_t lstart, start, len, total_len;
> > int err = 0;
> >
> > + flag = dpolicy->sync ? REQ_SYNC : 0;
> > + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> > +
> > if (dc->state != D_PREP)
> > return 0;
> >
> > @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > bio->bi_end_io = f2fs_submit_discard_endio;
> > bio->bi_opf |= flag;
> > submit_bio(bio);
> > + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
>
> If we want to update dc->state, we need to cover it with dc->lock.

Sure, will update it.

>
> > + dc->state = D_PREP;
>
> BTW, one dc can be referenced by multiple bios, so dc->state could be updated to
> D_DONE later by f2fs_submit_discard_endio(), however we just relocate it to
> pending list... which is inconsistent status.

In that case, dc->bio_ref will reflect it, and dc->state will not be updated to
D_DONE in f2fs_submit_discard_endio() until it becomes 0, right?

Thanks,

>
> Thanks,
>
> > + err = dc->error;
> > + break;
> > + }
> >
> > atomic_inc(&dcc->issued_discard);
> >
> > @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> > }
> >
> > __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> > + if (dc->error == -EAGAIN) {
> > + congestion_wait(BLK_RW_ASYNC, HZ/50);
> > + __relocate_discard_cmd(dcc, dc);
> > + }
> >
> > if (issued >= dpolicy->max_requests)
> > break;
> >

--
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

2020-03-13 05:13:42

by Sahitya Tummala

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On Thu, Mar 12, 2020 at 06:45:35PM -0700, Jaegeuk Kim wrote:
> On 03/13, Sahitya Tummala wrote:
> > On Thu, Mar 12, 2020 at 10:02:42AM -0700, Jaegeuk Kim wrote:
> > > On 03/12, Sahitya Tummala wrote:
> > > > F2FS already has a default timeout of 5 secs for discards that
> > > > can be issued during umount, but it can take more than the 5 sec
> > > > timeout if the underlying UFS device queue is already full and there
> > > > are no more available free tags to be used. In that case, submit_bio()
> > > > will wait for the already queued discard requests to complete to get
> > > > a free tag, which can potentially take way more than 5 sec.
> > > >
> > > > Fix this by submitting the discard requests with REQ_NOWAIT
> > > > flags during umount. This will return -EAGAIN for UFS queue/tag full
> > > > scenario without waiting in the context of submit_bio(). The FS can
> > > > then handle these requests by retrying again within the stipulated
> > > > discard timeout period to avoid long latencies.
> > > >
> > > > Signed-off-by: Sahitya Tummala <[email protected]>
> > > > ---
> > > > fs/f2fs/segment.c | 14 +++++++++++++-
> > > > 1 file changed, 13 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > > > index fb3e531..a06bbac 100644
> > > > --- a/fs/f2fs/segment.c
> > > > +++ b/fs/f2fs/segment.c
> > > > @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > > > struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> > > > struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> > > > &(dcc->fstrim_list) : &(dcc->wait_list);
> > > > - int flag = dpolicy->sync ? REQ_SYNC : 0;
> > > > + int flag;
> > > > block_t lstart, start, len, total_len;
> > > > int err = 0;
> > > >
> > > > + flag = dpolicy->sync ? REQ_SYNC : 0;
> > > > + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> > > > +
> > > > if (dc->state != D_PREP)
> > > > return 0;
> > > >
> > > > @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > > > bio->bi_end_io = f2fs_submit_discard_endio;
> > > > bio->bi_opf |= flag;
> > > > submit_bio(bio);
> > > > + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
> > > > + dc->state = D_PREP;
> > > > + err = dc->error;
> > > > + break;
> > > > + }
> > > >
> > > > atomic_inc(&dcc->issued_discard);
> > > >
> > > > @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> > > > }
> > > >
> > > > __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> > > > + if (dc->error == -EAGAIN) {
> > > > + congestion_wait(BLK_RW_ASYNC, HZ/50);
> > >
> > > --> need to be DEFAULT_IO_TIMEOUT
> >
> > Yes, i will update it.
> >
> > >
> > > > + __relocate_discard_cmd(dcc, dc);
> > >
> > > It seems we need to submit bio first, and then move dc to wait_list, if there's
> > > no error, in __submit_discard_cmd().
> >
> > Yes, that is not changed and it still happens for the failed request
> > that is re-queued here too when it gets submitted again later.
> >
> > I am requeuing the discard request failed with -EAGAIN error back to
> > dcc->pend_list[] from wait_list. It will call submit_bio() for this request
> > and also move to wait_list when it calls __submit_discard_cmd() again next
> > time. Please let me know if I am missing anything?
>
> This patch has no problem, but I'm thinking that __submit_discard_cmd() needs
> to return with any values by assumption where the waiting list should have
> submitted commands.

I think dc->queued will indicate that dc has been moved to wait_list. This can
be used along with the return value to take the right action. Can you check if
this works?

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index a06bbac..91df060 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1478,7 +1478,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
struct list_head *pend_list;
struct discard_cmd *dc, *tmp;
struct blk_plug plug;
- int i, issued = 0;
+ int i, err, issued = 0;
bool io_interrupted = false;

if (dpolicy->timeout != 0)
@@ -1517,8 +1517,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
break;
}

- __submit_discard_cmd(sbi, dpolicy, dc, &issued);
- if (dc->error == -EAGAIN) {
+ err = __submit_discard_cmd(sbi, dpolicy, dc, &issued);
+ if (err && err != -EAGAIN) {
+ __remove_discard_cmd(sbi, dc);
+ } else if (err == -EAGAIN && dc->queued) {
congestion_wait(BLK_RW_ASYNC, HZ/50);
__relocate_discard_cmd(dcc, dc);
}

thanks,
>
> >
> > Thanks,
> >
> > >
> > > > + }
> > > >
> > > > if (issued >= dpolicy->max_requests)
> > > > break;
> > > > --
> > > > Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
> > > > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
> >
> > --
> > --
> > Sent by a consultant of the Qualcomm Innovation Center, Inc.
> > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

--
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
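
The branch structure of the follow-up diff above can be restated as a small pure function, which makes the three outcomes easier to see. This is a sketch only: `enum action` and `after_submit()` are hypothetical names, and the real code acts on the command directly instead of returning a decision.

```c
#include <assert.h>
#include <errno.h>

/* What the issue loop does with a command after trying to submit it. */
enum action {
	NO_ACTION,           /* success, or -EAGAIN before it was queued */
	REMOVE_CMD,          /* hard error: drop the command */
	REQUEUE_TO_PENDING   /* -EAGAIN after it was queued: wait, retry later */
};

/*
 * err:    return value of the submit step (0 or a negative errno)
 * queued: non-zero if the command had already been moved to the wait list
 */
static enum action after_submit(int err, int queued)
{
	assert(err <= 0);  /* errors are negative errno values */

	if (err && err != -EAGAIN)
		return REMOVE_CMD;
	if (err == -EAGAIN && queued)
		return REQUEUE_TO_PENDING;
	return NO_ACTION;
}
```

Checking `queued` together with the return value is what keeps a command that never reached the wait list from being relocated.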

2020-03-13 06:32:26

by Chao Yu

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On 2020/3/13 11:39, Sahitya Tummala wrote:
> On Fri, Mar 13, 2020 at 10:20:04AM +0800, Chao Yu wrote:
>> On 2020/3/12 19:14, Sahitya Tummala wrote:
>>> F2FS already has a default timeout of 5 secs for discards that
>>> can be issued during umount, but it can take more than the 5 sec
>>> timeout if the underlying UFS device queue is already full and there
>>> are no more available free tags to be used. In that case, submit_bio()
>>> will wait for the already queued discard requests to complete to get
>>> a free tag, which can potentially take way more than 5 sec.
>>>
>>> Fix this by submitting the discard requests with REQ_NOWAIT
>>> flags during umount. This will return -EAGAIN for UFS queue/tag full
>>> scenario without waiting in the context of submit_bio(). The FS can
>>> then handle these requests by retrying again within the stipulated
>>> discard timeout period to avoid long latencies.
>>>
>>> Signed-off-by: Sahitya Tummala <[email protected]>
>>> ---
>>> fs/f2fs/segment.c | 14 +++++++++++++-
>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>>> index fb3e531..a06bbac 100644
>>> --- a/fs/f2fs/segment.c
>>> +++ b/fs/f2fs/segment.c
>>> @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
>>> struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
>>> struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
>>> &(dcc->fstrim_list) : &(dcc->wait_list);
>>> - int flag = dpolicy->sync ? REQ_SYNC : 0;
>>> + int flag;
>>> block_t lstart, start, len, total_len;
>>> int err = 0;
>>>
>>> + flag = dpolicy->sync ? REQ_SYNC : 0;
>>> + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
>>> +
>>> if (dc->state != D_PREP)
>>> return 0;
>>>
>>> @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
>>> bio->bi_end_io = f2fs_submit_discard_endio;
>>> bio->bi_opf |= flag;
>>> submit_bio(bio);
>>> + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
>>
>> If we want to update dc->state, we need to cover it with dc->lock.
>
> Sure, will update it.
>
>>
>>> + dc->state = D_PREP;
>>
>> BTW, one dc can be referenced by multiple bios, so dc->state could be updated to
>> D_DONE later by f2fs_submit_discard_endio(), however we just relocate it to
>> pending list... which is inconsistent status.
>
> In that case dc->bio_ref will reflect it and until it becomes 0, the dc->state
> will not be updated to D_DONE in f2fs_submit_discard_endio()?

__submit_discard_cmd()
 lock()
 dc->state = D_SUBMIT;
 dc->bio_ref++;
 unlock()
 ...
 submit_bio()
                                f2fs_submit_discard_endio()
                                 dc->error = -EAGAIN;
                                 lock()
                                 dc->bio_ref--;

 dc->state = D_PREP;

                                 dc->state = D_DONE;
                                 unlock()

So in the end, dc's state is D_DONE while it is still on the wait list, and it
will then be relocated to the pending list.

>
> Thanks,
>
>>
>> Thanks,
>>
>>> + err = dc->error;
>>> + break;
>>> + }
>>>
>>> atomic_inc(&dcc->issued_discard);
>>>
>>> @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>>> }
>>>
>>> __submit_discard_cmd(sbi, dpolicy, dc, &issued);
>>> + if (dc->error == -EAGAIN) {
>>> + congestion_wait(BLK_RW_ASYNC, HZ/50);
>>> + __relocate_discard_cmd(dcc, dc);
>>> + }
>>>
>>> if (issued >= dpolicy->max_requests)
>>> break;
>>>
>
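
The ordering in the timeline above can be replayed deterministically in userspace to show why the final state ends up as D_DONE. This is a sketch under assumptions, not the f2fs code: the completion handler is split into two hypothetical halves (`endio_begin()` and `endio_finish()`) purely so that the unlocked write to D_PREP can be placed between them, and no real locking or concurrency is involved.

```c
#include <assert.h>
#include <errno.h>

enum state { D_PREP, D_SUBMIT, D_DONE };

struct cmd {
	enum state state;
	int bio_ref;
	int error;
};

/* Submit side, under the lock: mark submitted and take a bio reference. */
static void submit_prepare(struct cmd *c)
{
	c->state = D_SUBMIT;
	c->bio_ref++;
}

/* First half of completion: record the error and drop the reference. */
static int endio_begin(struct cmd *c, int error)
{
	assert(c->bio_ref > 0);
	c->error = error;
	c->bio_ref--;
	/* Decide, while the state is still D_SUBMIT, to mark the command done. */
	return c->bio_ref == 0 && c->state == D_SUBMIT;
}

/* Second half of completion: carry out the decision made above. */
static void endio_finish(struct cmd *c, int mark_done)
{
	if (mark_done)
		c->state = D_DONE;
}

/* Submit side again, without the lock: reset the state after -EAGAIN. */
static void unlocked_reset(struct cmd *c)
{
	if (c->error == -EAGAIN)
		c->state = D_PREP;
}

/* Replay the interleaving from the timeline and report the final state. */
static enum state replay_interleaving(void)
{
	struct cmd c = { .state = D_PREP, .bio_ref = 0, .error = 0 };
	int mark_done;

	submit_prepare(&c);                    /* D_SUBMIT, bio_ref = 1 */
	mark_done = endio_begin(&c, -EAGAIN);  /* bio_ref = 0 */
	unlocked_reset(&c);                    /* D_PREP written without the lock */
	endio_finish(&c, mark_done);           /* overwritten with D_DONE */
	return c.state;
}
```

In this replay the unlocked write is lost: the command is meant to be pending again, yet its state reads D_DONE.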

2020-03-13 11:11:06

by Sahitya Tummala

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On Fri, Mar 13, 2020 at 02:30:55PM +0800, Chao Yu wrote:
> On 2020/3/13 11:39, Sahitya Tummala wrote:
> > On Fri, Mar 13, 2020 at 10:20:04AM +0800, Chao Yu wrote:
> >> On 2020/3/12 19:14, Sahitya Tummala wrote:
> >>> F2FS already has a default timeout of 5 secs for discards that
> >>> can be issued during umount, but it can take more than the 5 sec
> >>> timeout if the underlying UFS device queue is already full and there
> >>> are no more available free tags to be used. In that case, submit_bio()
> >>> will wait for the already queued discard requests to complete to get
> >>> a free tag, which can potentially take way more than 5 sec.
> >>>
> >>> Fix this by submitting the discard requests with REQ_NOWAIT
> >>> flags during umount. This will return -EAGAIN for UFS queue/tag full
> >>> scenario without waiting in the context of submit_bio(). The FS can
> >>> then handle these requests by retrying again within the stipulated
> >>> discard timeout period to avoid long latencies.
> >>>
> >>> Signed-off-by: Sahitya Tummala <[email protected]>
> >>> ---
> >>> fs/f2fs/segment.c | 14 +++++++++++++-
> >>> 1 file changed, 13 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >>> index fb3e531..a06bbac 100644
> >>> --- a/fs/f2fs/segment.c
> >>> +++ b/fs/f2fs/segment.c
> >>> @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> >>> struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> >>> struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> >>> &(dcc->fstrim_list) : &(dcc->wait_list);
> >>> - int flag = dpolicy->sync ? REQ_SYNC : 0;
> >>> + int flag;
> >>> block_t lstart, start, len, total_len;
> >>> int err = 0;
> >>>
> >>> + flag = dpolicy->sync ? REQ_SYNC : 0;
> >>> + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> >>> +
> >>> if (dc->state != D_PREP)
> >>> return 0;
> >>>
> >>> @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> >>> bio->bi_end_io = f2fs_submit_discard_endio;
> >>> bio->bi_opf |= flag;
> >>> submit_bio(bio);
> >>> + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
> >>
> >> If we want to update dc->state, we need to cover it with dc->lock.
> >
> > Sure, will update it.
> >
> >>
> >>> + dc->state = D_PREP;
> >>
> >> BTW, one dc can be referenced by multiple bios, so dc->state could be updated to
> >> D_DONE later by f2fs_submit_discard_endio(), however we just relocate it to
> >> pending list... which is inconsistent status.
> >
> > In that case dc->bio_ref will reflect it and until it becomes 0, the dc->state
> > will not be updated to D_DONE in f2fs_submit_discard_endio()?
>
> __submit_discard_cmd()
>  lock()
>  dc->state = D_SUBMIT;
>  dc->bio_ref++;
>  unlock()
>  ...
>  submit_bio()
>                                 f2fs_submit_discard_endio()
>                                  dc->error = -EAGAIN;
>                                  lock()
>                                  dc->bio_ref--;
>
>  dc->state = D_PREP;
>
>                                  dc->state = D_DONE;
>                                  unlock()
>
> So finally, dc's state is D_DONE, and it's in wait list, then will be relocated
> to pending list.

In the queue-full case, f2fs_submit_discard_endio() is not called
asynchronously. It is called in the context of submit_bio() itself. So by the
time submit_bio() returns, dc->error will be -EAGAIN and dc->state will be
D_DONE.

submit_bio()
->blk_mq_make_request
->blk_mq_get_request()
->bio_wouldblock_error() (called due to queue full)
->bio_endio()

Thanks,
>
> >
> > Thanks,
> >
> >>
> >> Thanks,
> >>
> >>> + err = dc->error;
> >>> + break;
> >>> + }
> >>>
> >>> atomic_inc(&dcc->issued_discard);
> >>>
> >>> @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> >>> }
> >>>
> >>> __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> >>> + if (dc->error == -EAGAIN) {
> >>> + congestion_wait(BLK_RW_ASYNC, HZ/50);
> >>> + __relocate_discard_cmd(dcc, dc);
> >>> + }
> >>>
> >>> if (issued >= dpolicy->max_requests)
> >>> break;
> >>>
> >

--
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
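
The point made in the reply above, that a queue-full rejection completes the bio inside the submit call itself, can be illustrated with a callback invoked inline. This is a userspace sketch only: `struct fake_bio`, `fake_submit()` and `discard_endio()` are hypothetical stand-ins for the block-layer structures and the f2fs completion handler, and locking is left out.

```c
#include <assert.h>
#include <errno.h>

enum state { D_PREP, D_SUBMIT, D_DONE };

struct cmd {
	enum state state;
	int bio_ref;
	int error;
};

/* Minimal stand-in for a bio: a completion callback plus its context. */
struct fake_bio {
	void (*end_io)(struct fake_bio *bio);
	struct cmd *priv;
	int status;
};

/* Completion handler: record the status and finish the command. */
static void discard_endio(struct fake_bio *bio)
{
	struct cmd *c = bio->priv;

	assert(c->bio_ref > 0);
	c->error = bio->status;
	c->bio_ref--;
	if (c->bio_ref == 0 && c->state == D_SUBMIT)
		c->state = D_DONE;
}

/*
 * Non-blocking submit. When the queue is full, the bio is completed with
 * -EAGAIN before this function returns; otherwise it stays in flight and
 * would be completed later by the device.
 */
static void fake_submit(struct fake_bio *bio, int queue_full)
{
	if (queue_full) {
		bio->status = -EAGAIN;
		bio->end_io(bio);  /* synchronous completion in the caller's context */
	}
}
```

After `fake_submit()` returns in the queue-full case, the caller can already observe `error == -EAGAIN` and `state == D_DONE`, with no second context involved.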

2020-03-13 15:40:52

by Jaegeuk Kim

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On 03/13, Sahitya Tummala wrote:
> On Thu, Mar 12, 2020 at 06:45:35PM -0700, Jaegeuk Kim wrote:
> > On 03/13, Sahitya Tummala wrote:
> > > On Thu, Mar 12, 2020 at 10:02:42AM -0700, Jaegeuk Kim wrote:
> > > > On 03/12, Sahitya Tummala wrote:
> > > > > F2FS already has a default timeout of 5 secs for discards that
> > > > > can be issued during umount, but it can take more than the 5 sec
> > > > > timeout if the underlying UFS device queue is already full and there
> > > > > are no more available free tags to be used. In that case, submit_bio()
> > > > > will wait for the already queued discard requests to complete to get
> > > > > a free tag, which can potentially take way more than 5 sec.
> > > > >
> > > > > Fix this by submitting the discard requests with REQ_NOWAIT
> > > > > flags during umount. This will return -EAGAIN for UFS queue/tag full
> > > > > scenario without waiting in the context of submit_bio(). The FS can
> > > > > then handle these requests by retrying again within the stipulated
> > > > > discard timeout period to avoid long latencies.
> > > > >
> > > > > Signed-off-by: Sahitya Tummala <[email protected]>
> > > > > ---
> > > > > fs/f2fs/segment.c | 14 +++++++++++++-
> > > > > 1 file changed, 13 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > > > > index fb3e531..a06bbac 100644
> > > > > --- a/fs/f2fs/segment.c
> > > > > +++ b/fs/f2fs/segment.c
> > > > > @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > > > > struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> > > > > struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> > > > > &(dcc->fstrim_list) : &(dcc->wait_list);
> > > > > - int flag = dpolicy->sync ? REQ_SYNC : 0;
> > > > > + int flag;
> > > > > block_t lstart, start, len, total_len;
> > > > > int err = 0;
> > > > >
> > > > > + flag = dpolicy->sync ? REQ_SYNC : 0;
> > > > > + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> > > > > +
> > > > > if (dc->state != D_PREP)
> > > > > return 0;
> > > > >
> > > > > @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > > > > bio->bi_end_io = f2fs_submit_discard_endio;
> > > > > bio->bi_opf |= flag;
> > > > > submit_bio(bio);
> > > > > + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
> > > > > + dc->state = D_PREP;
> > > > > + err = dc->error;
> > > > > + break;
> > > > > + }
> > > > >
> > > > > atomic_inc(&dcc->issued_discard);
> > > > >
> > > > > @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> > > > > }
> > > > >
> > > > > __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> > > > > + if (dc->error == -EAGAIN) {
> > > > > + congestion_wait(BLK_RW_ASYNC, HZ/50);
> > > >
> > > > --> need to be DEFAULT_IO_TIMEOUT
> > >
> > > Yes, i will update it.
> > >
> > > >
> > > > > + __relocate_discard_cmd(dcc, dc);
> > > >
> > > > It seems we need to submit bio first, and then move dc to wait_list, if there's
> > > > no error, in __submit_discard_cmd().
> > >
> > > Yes, that is not changed and it still happens for the failed request
> > > that is re-queued here too when it gets submitted again later.
> > >
> > > I am requeuing the discard request failed with -EAGAIN error back to
> > > dcc->pend_list[] from wait_list. It will call submit_bio() for this request
> > > and also move to wait_list when it calls __submit_discard_cmd() again next
> > > time. Please let me know if I am missing anything?
> >
> > This patch has no problem, but I'm thinking that __submit_discard_cmd() needs
> > to return with any values by assumption where the waiting list should have
> > submitted commands.
>
> I think dc->queued will indicate that dc is moved to wait_list. This can be
> used along with return value to take right action. Can you check if this
> works?

I mean, why can't we do this *in* __submit_discard_cmd()? Otherwise, existing and
future callers would have to handle the errors every time.
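
As a rough illustration of that idea (a sketch only, not the actual f2fs code),
the submit helper itself could decide where the command ends up, so that a
caller only looks at the return value. The sketch_* names and the
submit_result parameter, which injects the outcome of the submission, are
hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical placement of a command; not the f2fs list heads. */
enum sketch_list { SKETCH_PEND, SKETCH_WAIT, SKETCH_FREED };

struct sketch_cmd {
	enum sketch_list where;
	int error;
};

/*
 * Submit helper that owns all list handling.
 * submit_result stands in for the outcome of the real submission.
 */
static int sketch_submit(struct sketch_cmd *cmd, int submit_result)
{
	assert(cmd->where == SKETCH_PEND);	/* only pending cmds are submitted */

	cmd->error = submit_result;
	if (submit_result == 0)
		cmd->where = SKETCH_WAIT;	/* issued: wait for completion */
	else if (submit_result == -EAGAIN)
		cmd->where = SKETCH_PEND;	/* queue full: keep it for a retry */
	else
		cmd->where = SKETCH_FREED;	/* hard error: drop the command */

	return submit_result;
}
```

A caller would then only need something like
"if (sketch_submit(cmd, res) == -EAGAIN) back off and retry later",
without touching any list itself.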

>
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index a06bbac..91df060 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -1478,7 +1478,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> struct list_head *pend_list;
> struct discard_cmd *dc, *tmp;
> struct blk_plug plug;
> - int i, issued = 0;
> + int i, err, issued = 0;
> bool io_interrupted = false;
>
> if (dpolicy->timeout != 0)
> @@ -1517,8 +1517,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> break;
> }
>
> - __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> - if (dc->error == -EAGAIN) {
> + err = __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> + if (err && err != -EAGAIN) {
> + __remove_discard_cmd(sbi, dc);
> + } else if (err == -EAGAIN && dc->queued) {
> congestion_wait(BLK_RW_ASYNC, HZ/50);
> __relocate_discard_cmd(dcc, dc);
> }
>
> thanks,
> >
> > >
> > > Thanks,
> > >
> > > >
> > > > > + }
> > > > >
> > > > > if (issued >= dpolicy->max_requests)
> > > > > break;
> > > > > --
> > > > > Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
> > > > > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.

2020-03-16 00:53:31

by Chao Yu

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

On 2020/3/13 19:08, Sahitya Tummala wrote:
> On Fri, Mar 13, 2020 at 02:30:55PM +0800, Chao Yu wrote:
>> On 2020/3/13 11:39, Sahitya Tummala wrote:
>>> On Fri, Mar 13, 2020 at 10:20:04AM +0800, Chao Yu wrote:
>>>> On 2020/3/12 19:14, Sahitya Tummala wrote:
>>>>> F2FS already has a default timeout of 5 secs for discards that
>>>>> can be issued during umount, but it can take more than the 5 sec
>>>>> timeout if the underlying UFS device queue is already full and there
>>>>> are no more available free tags to be used. In that case, submit_bio()
>>>>> will wait for the already queued discard requests to complete to get
>>>>> a free tag, which can potentially take way more than 5 sec.
>>>>>
>>>>> Fix this by submitting the discard requests with REQ_NOWAIT
>>>>> flags during umount. This will return -EAGAIN for UFS queue/tag full
>>>>> scenario without waiting in the context of submit_bio(). The FS can
>>>>> then handle these requests by retrying again within the stipulated
>>>>> discard timeout period to avoid long latencies.
>>>>>
>>>>> Signed-off-by: Sahitya Tummala <[email protected]>
>>>>> ---
>>>>> fs/f2fs/segment.c | 14 +++++++++++++-
>>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>>>>> index fb3e531..a06bbac 100644
>>>>> --- a/fs/f2fs/segment.c
>>>>> +++ b/fs/f2fs/segment.c
>>>>> @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
>>>>> struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
>>>>> struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
>>>>> &(dcc->fstrim_list) : &(dcc->wait_list);
>>>>> - int flag = dpolicy->sync ? REQ_SYNC : 0;
>>>>> + int flag;
>>>>> block_t lstart, start, len, total_len;
>>>>> int err = 0;
>>>>>
>>>>> + flag = dpolicy->sync ? REQ_SYNC : 0;
>>>>> + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
>>>>> +
>>>>> if (dc->state != D_PREP)
>>>>> return 0;
>>>>>
>>>>> @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
>>>>> bio->bi_end_io = f2fs_submit_discard_endio;
>>>>> bio->bi_opf |= flag;
>>>>> submit_bio(bio);
>>>>> + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
>>>>
>>>> If we want to update dc->state, we need to cover it with dc->lock.
>>>
>>> Sure, will update it.
>>>
>>>>
>>>>> + dc->state = D_PREP;
>>>>
>>>> BTW, one dc can be referenced by multiple bios, so dc->state could be updated to
>>>> D_DONE later by f2fs_submit_discard_endio(), however we just relocate it to
>>>> pending list... which is inconsistent status.
>>>
>>> In that case dc->bio_ref will reflect it and until it becomes 0, the dc->state
>>> will not be updated to D_DONE in f2fs_submit_discard_endio()?
>>
>> __submit_discard_cmd()
>> lock()
>> dc->state = D_SUBMIT;
>> dc->bio_ref++;
>> unlock()
>> ...
>> submit_bio()
>> f2fs_submit_discard_endio()
>> dc->error = -EAGAIN;
>> lock()
>> dc->bio_ref--;
>>
>> dc->state = D_PREP;
>>
>> dc->state = D_DONE;
>> unlock()
>>
>> So finally, dc's state is D_DONE, and it's in wait list, then will be relocated
>> to pending list.
>
> In case of queue full, f2fs_submit_discard_endio() will not be called

I guess the case is that multiple bios are related to one dc: the callback of
some bios is called asynchronously while the callback of another is called
synchronously, so the race condition could happen.
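
A single-threaded replay of that interleaving may make it easier to see. This
is a simplified model that follows the sequence sketched earlier in this
thread, not the kernel code: locking is omitted, and the toy_* names and the
on_pend_list flag are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum toy_state { TOY_PREP, TOY_SUBMIT, TOY_DONE };

struct toy_dc {
	enum toy_state state;
	int bio_ref;		/* bios still referencing this dc */
	int error;
	bool on_pend_list;	/* true once the dc is relocated for a retry */
};

/* Per-bio bookkeeping done before each submission. */
static void toy_issue_bio(struct toy_dc *dc)
{
	dc->state = TOY_SUBMIT;
	dc->bio_ref++;
}

/* Simplified completion callback: the last reference marks the dc done. */
static void toy_endio(struct toy_dc *dc, int error)
{
	assert(dc->bio_ref > 0);
	if (!dc->error)
		dc->error = error;
	dc->bio_ref--;
	if (dc->bio_ref == 0)
		dc->state = TOY_DONE;
}

/* Replays: bio A in flight, bio B rejected synchronously, A completes late. */
static void toy_replay_race(struct toy_dc *dc)
{
	toy_issue_bio(dc);		/* bio A accepted, completes later */
	toy_issue_bio(dc);		/* bio B hits a full queue ...      */
	toy_endio(dc, -EAGAIN);		/* ... and fails inside submit      */

	if (dc->error == -EAGAIN) {	/* submitter requeues the dc */
		dc->state = TOY_PREP;
		dc->on_pend_list = true;
	}

	toy_endio(dc, 0);		/* bio A's asynchronous completion */
}
```

After toy_replay_race(), the dc is marked done while it also sits on the
pending list, which is the inconsistent status in question.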

Thanks,

> asynchronously. It will be called in the context of submit_bio() itself.
> So by the time submit_bio() returns, dc->error will be -EAGAIN and dc->state
> will be D_DONE.
>
> submit_bio()
> ->blk_mq_make_request
> ->blk_mq_get_request()
> ->bio_wouldblock_error() (called due to queue full)
> ->bio_endio()
>
> Thanks,
>>
>>>
>>> Thanks,
>>>
>>>>
>>>> Thanks,
>>>>
>>>>> + err = dc->error;
>>>>> + break;
>>>>> + }
>>>>>
>>>>> atomic_inc(&dcc->issued_discard);
>>>>>
>>>>> @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>>>>> }
>>>>>
>>>>> __submit_discard_cmd(sbi, dpolicy, dc, &issued);
>>>>> + if (dc->error == -EAGAIN) {
>>>>> + congestion_wait(BLK_RW_ASYNC, HZ/50);
>>>>> + __relocate_discard_cmd(dcc, dc);
>>>>> + }
>>>>>
>>>>> if (issued >= dpolicy->max_requests)
>>>>> break;
>>>>>
>>>
>

2020-03-16 03:53:22

by Sahitya Tummala

Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount

Hi Chao,

On Mon, Mar 16, 2020 at 08:52:25AM +0800, Chao Yu wrote:
> On 2020/3/13 19:08, Sahitya Tummala wrote:
> > On Fri, Mar 13, 2020 at 02:30:55PM +0800, Chao Yu wrote:
> >> On 2020/3/13 11:39, Sahitya Tummala wrote:
> >>> On Fri, Mar 13, 2020 at 10:20:04AM +0800, Chao Yu wrote:
> >>>> On 2020/3/12 19:14, Sahitya Tummala wrote:
> >>>>> F2FS already has a default timeout of 5 secs for discards that
> >>>>> can be issued during umount, but it can take more than the 5 sec
> >>>>> timeout if the underlying UFS device queue is already full and there
> >>>>> are no more available free tags to be used. In that case, submit_bio()
> >>>>> will wait for the already queued discard requests to complete to get
> >>>>> a free tag, which can potentially take way more than 5 sec.
> >>>>>
> >>>>> Fix this by submitting the discard requests with REQ_NOWAIT
> >>>>> flags during umount. This will return -EAGAIN for UFS queue/tag full
> >>>>> scenario without waiting in the context of submit_bio(). The FS can
> >>>>> then handle these requests by retrying again within the stipulated
> >>>>> discard timeout period to avoid long latencies.
> >>>>>
> >>>>> Signed-off-by: Sahitya Tummala <[email protected]>
> >>>>> ---
> >>>>> fs/f2fs/segment.c | 14 +++++++++++++-
> >>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >>>>> index fb3e531..a06bbac 100644
> >>>>> --- a/fs/f2fs/segment.c
> >>>>> +++ b/fs/f2fs/segment.c
> >>>>> @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> >>>>> struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> >>>>> struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> >>>>> &(dcc->fstrim_list) : &(dcc->wait_list);
> >>>>> - int flag = dpolicy->sync ? REQ_SYNC : 0;
> >>>>> + int flag;
> >>>>> block_t lstart, start, len, total_len;
> >>>>> int err = 0;
> >>>>>
> >>>>> + flag = dpolicy->sync ? REQ_SYNC : 0;
> >>>>> + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> >>>>> +
> >>>>> if (dc->state != D_PREP)
> >>>>> return 0;
> >>>>>
> >>>>> @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> >>>>> bio->bi_end_io = f2fs_submit_discard_endio;
> >>>>> bio->bi_opf |= flag;
> >>>>> submit_bio(bio);
> >>>>> + if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
> >>>>
> >>>> If we want to update dc->state, we need to cover it with dc->lock.
> >>>
> >>> Sure, will update it.
> >>>
> >>>>
> >>>>> + dc->state = D_PREP;
> >>>>
> >>>> BTW, one dc can be referenced by multiple bios, so dc->state could be updated to
> >>>> D_DONE later by f2fs_submit_discard_endio(), however we just relocate it to
> >>>> pending list... which is inconsistent status.
> >>>
> >>> In that case dc->bio_ref will reflect it and until it becomes 0, the dc->state
> >>> will not be updated to D_DONE in f2fs_submit_discard_endio()?
> >>
> >> __submit_discard_cmd()
> >> lock()
> >> dc->state = D_SUBMIT;
> >> dc->bio_ref++;
> >> unlock()
> >> ...
> >> submit_bio()
> >> f2fs_submit_discard_endio()
> >> dc->error = -EAGAIN;
> >> lock()
> >> dc->bio_ref--;
> >>
> >> dc->state = D_PREP;
> >>
> >> dc->state = D_DONE;
> >> unlock()
> >>
> >> So finally, dc's state is D_DONE, and it's in wait list, then will be relocated
> >> to pending list.
> >
> > In case of queue full, f2fs_submit_discard_endio() will not be called
>
> I guess the case is that multiple bios are related to one dc: the callback of
> some bios is called asynchronously while the callback of another is called
> synchronously, so the race condition could happen.

You are right. Let me review that case and try to fix it.

Thanks,

>
> Thanks,
>
> > asynchronously. It will be called in the context of submit_bio() itself.
> > So by the time submit_bio() returns, dc->error will be -EAGAIN and dc->state
> > will be D_DONE.
> >
> > submit_bio()
> > ->blk_mq_make_request
> > ->blk_mq_get_request()
> > ->bio_wouldblock_error() (called due to queue full)
> > ->bio_endio()
> >
> > Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>> + err = dc->error;
> >>>>> + break;
> >>>>> + }
> >>>>>
> >>>>> atomic_inc(&dcc->issued_discard);
> >>>>>
> >>>>> @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> >>>>> }
> >>>>>
> >>>>> __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> >>>>> + if (dc->error == -EAGAIN) {
> >>>>> + congestion_wait(BLK_RW_ASYNC, HZ/50);
> >>>>> + __relocate_discard_cmd(dcc, dc);
> >>>>> + }
> >>>>>
> >>>>> if (issued >= dpolicy->max_requests)
> >>>>> break;
> >>>>>
> >>>
> >
