Hi,
We found an issue in the Android OTA scenario where many readahead BIOs
go through FEC, causing the system to get stuck. These two patches try to
fix this issue.
Thanks
Wu Bo (2):
  dm verity: init fec io before cleaning it
  dm verity: don't verity if readahead failed
 drivers/md/dm-verity-target.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
--
2.25.1
We found an issue in the Android OTA scenario where many BIOs go through
FEC even though the data under dm-verity is 100% intact, with no corruption.
Android OTA has many dm-block layers, from upper to lower:
dm-verity
dm-snapshot
dm-origin & dm-cow
dm-linear
ufs
The dm tables have to change twice during the Android OTA merge process.
While a table is being changed, dm-snapshot is suspended for a while.
During this interval, we found that many readahead IOs are submitted to
dm-verity from the filesystem. The kverity workers are then busy doing
FEC, which takes far too long to finish the dm-verity IO and causes the
system to get stuck.
We added some debug logging and found that each readahead IO needs around
10s to finish when this situation occurs, because there is an IO
amplification:
dm-snapshot suspend
erofs_readahead     // 300+ IOs are submitted
  dm_submit_bio (dm_verity)
    dm_submit_bio (dm_snapshot)
    bio returns EIO
    bio got nothing, it's empty
  verity_end_io
    verity_verify_io
      forloop range(0, io->n_blocks)    // each io->n_blocks ~= 20
        verity_fec_decode
          fec_decode_rsb
            fec_read_bufs
              forloop range(0, v->fec->rsn) // v->fec->rsn = 253
                new_read
                  submit_bio (dm_snapshot)
              end loop
      end loop
dm-snapshot resume
Readahead BIOs get no data while dm-snapshot is suspended, so all of
them go through FEC.
Each readahead BIO has to verify io->n_blocks ~= 20 blocks.
Each block goes through FEC, and each FEC decode issues
v->fec->rsn = 253 reads.
So during the suspend interval (~200ms), 300 readahead BIOs generate
300*20*253 (about 1.5 million) IOs on dm-snapshot.
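The amplification estimate above can be reproduced with a small user-space
sketch. It is only a back-of-the-envelope model: the function name
fec_reads_issued() and its parameters are hypothetical, and the inputs are
the approximate figures quoted in this message, not values read from a
running kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough model of the read amplification: every failed readahead BIO
 * verifies each of its blocks, and each block's FEC decode issues one
 * read per RS block (v->fec->rsn in the trace above).
 */
uint64_t fec_reads_issued(uint64_t readahead_bios,
                          uint64_t blocks_per_bio,
                          uint64_t reads_per_block)
{
    return readahead_bios * blocks_per_bio * reads_per_block;
}

/* Usage with the figures from this message: 300 BIOs, ~20 blocks, 253 reads. */
void amplification_example(void)
{
    assert(fec_reads_issued(300, 20, 253) == 1518000);
}
```

With these inputs the model gives 1,518,000 reads issued to dm-snapshot
within the ~200ms suspend window.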
Readahead IO is not required by user space, so to fix this issue I think
it is better to skip FEC for a failed readahead BIO and pass the error to
the upper layer to handle.
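As a rough illustration of the decision this patch changes, the check in
verity_end_io() can be modelled as a pure predicate. This is a user-space
sketch, not the kernel code: struct end_io_state and finish_without_fec()
are hypothetical names, and each field merely stands in for the kernel
expression named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state the real check consults. */
struct end_io_state {
    bool bio_failed;     /* models bio->bi_status != 0 */
    bool fec_enabled;    /* models verity_fec_is_enabled(io->v) */
    bool shutting_down;  /* models verity_is_system_shutting_down() */
    bool is_readahead;   /* models bio->bi_opf & REQ_RAHEAD */
};

/*
 * Returns true when a failed BIO should be completed with its error
 * right away, without attempting FEC. The readahead term is the
 * condition added by this patch.
 */
bool finish_without_fec(const struct end_io_state *s)
{
    assert(s != NULL);
    return s->bio_failed &&
           (!s->fec_enabled || s->shutting_down || s->is_readahead);
}
```

In this model a failed readahead BIO is finished immediately even when FEC
is enabled, while a failed non-readahead BIO still goes on to verification
and FEC as before.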
Signed-off-by: Wu Bo <[email protected]>
---
drivers/md/dm-verity-target.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 42b2483eb08c..d242e50ec869 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -668,7 +668,9 @@ static void verity_end_io(struct bio *bio)
verity_fec_init_io(io);
if (bio->bi_status &&
- (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
+ (!verity_fec_is_enabled(io->v) ||
+ verity_is_system_shutting_down() ||
+ (bio->bi_opf & REQ_RAHEAD))) {
verity_finish_io(io, bio->bi_status);
return;
}
--
2.25.1
If a BIO fails, verity_end_io() may call verity_finish_io() before
verity_fec_init_io() has run. Therefore fec_io->rs is not initialized,
and freeing it in verity_fec_finish_io() may crash.
Crash call stack:
die+0x90/0x2b8
__do_kernel_fault+0x260/0x298
do_bad_area+0x2c/0xdc
do_translation_fault+0x3c/0x54
do_mem_abort+0x54/0x118
el1_abort+0x38/0x5c
el1h_64_sync_handler+0x50/0x90
el1h_64_sync+0x64/0x6c
free_rs+0x18/0xac
fec_rs_free+0x10/0x24
mempool_free+0x58/0x148
verity_fec_finish_io+0x4c/0xb0
verity_end_io+0xb8/0x150
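The ordering problem behind this crash can be sketched in user space. This
is a simplified model under stated assumptions, not the dm-verity code:
struct fec_io_model and both functions are hypothetical, and the model
assumes that the init step leaves the decoder pointer NULL so that the
later free is a no-op when FEC never ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the per-IO FEC state. */
struct fec_io_model {
    void *rs;  /* models fec_io->rs */
};

/* Models the init step: must run before any path that can reach finish. */
void model_fec_init_io(struct fec_io_model *fio)
{
    assert(fio != NULL);
    fio->rs = NULL;
}

/*
 * Models the finish step: releases the decoder if one was allocated.
 * Returns true if something was actually released. If init was skipped,
 * fio->rs would hold garbage and the free() below would be undefined
 * behaviour, which is the failure mode in the call stack above.
 */
bool model_fec_finish_io(struct fec_io_model *fio)
{
    bool had_rs;

    assert(fio != NULL);
    had_rs = fio->rs != NULL;
    free(fio->rs);  /* free(NULL) is a no-op */
    fio->rs = NULL;
    return had_rs;
}
```

Calling model_fec_init_io() at the start of the completion path makes every
later call to model_fec_finish_io() safe, whether or not a decoder was ever
allocated.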
Signed-off-by: Wu Bo <[email protected]>
---
drivers/md/dm-verity-target.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index e115fcfe723c..42b2483eb08c 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -642,7 +642,6 @@ static void verity_work(struct work_struct *w)
io->in_tasklet = false;
- verity_fec_init_io(io);
verity_finish_io(io, errno_to_blk_status(verity_verify_io(io)));
}
@@ -667,6 +666,7 @@ static void verity_end_io(struct bio *bio)
{
struct dm_verity_io *io = bio->bi_private;
+ verity_fec_init_io(io);
if (bio->bi_status &&
(!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
verity_finish_io(io, bio->bi_status);
--
2.25.1
On Tue, Nov 21, 2023 at 01:55:28AM -0700, Wu Bo wrote:
> If BIO error, it may goto verity_finish_io() before
> verity_fec_init_io(). Therefor, the fec_io->rs is not initialized and
> may crash when doing memory freeing in verity_fec_finish_io().
>
> Crash call stack:
> die+0x90/0x2b8
> __do_kernel_fault+0x260/0x298
> do_bad_area+0x2c/0xdc
> do_translation_fault+0x3c/0x54
> do_mem_abort+0x54/0x118
> el1_abort+0x38/0x5c
> el1h_64_sync_handler+0x50/0x90
> el1h_64_sync+0x64/0x6c
> free_rs+0x18/0xac
> fec_rs_free+0x10/0x24
> mempool_free+0x58/0x148
> verity_fec_finish_io+0x4c/0xb0
> verity_end_io+0xb8/0x150
>
> Signed-off-by: Wu Bo <[email protected]>
What commit introduced this bug? It looks like it was commit 5721d4e5a9cd ("dm
verity: Add optional "try_verify_in_tasklet" feature"), which moved the call to
verity_fec_init_io() from verity_map() to verity_work(). Do you agree? Should
the call to verity_fec_init_io() perhaps be moved back to verity_map(), instead
of putting it in verity_end_io()? Or maybe another call to verity_fec_init_io()
should be added only just before the call to verity_finish_io() in
verity_end_io(), given that it's not needed in the tasklet case?
In any case, please also add a Fixes tag with the appropriate commit, as well as
a "Cc stable" tag.
- Eric
On Tue, Nov 21, 2023 at 01:55:29AM -0700, Wu Bo wrote:
> We found an issue under Android OTA scenario that many BIOs have to do
> FEC where the data under dm-verity is 100% complete and no corruption.
>
> Android OTA has many dm-block layers, from upper to lower:
> dm-verity
> dm-snapshot
> dm-origin & dm-cow
> dm-linear
> ufs
>
> Dm tables have to change 2 times during Android OTA merging process.
> When doing table change, the dm-snapshot will be suspended for a while.
> During this interval, we found there are many readahead IOs are
> submitted to dm_verity from filesystem. Then the kverity works are busy
> doing FEC process which cost too much time to finish dm-verity IO. And
> cause system stuck.
>
> We add some debug log and find that each readahead IO need around 10s to
> finish when this situation occurred. Because here has a IO
> amplification:
>
> dm-snapshot suspend
> erofs_readahead // 300+ io is submitted
> dm_submit_bio (dm_verity)
> dm_submit_bio (dm_snapshot)
> bio return EIO
> bio got nothing, it's empty
> verity_end_io
> verity_verify_io
> forloop range(0, io->n_blocks) // each io->nblocks ~= 20
> verity_fec_decode
> fec_decode_rsb
> fec_read_bufs
> forloop range(0, v->fec->rsn) // v->fec->rsn = 253
> new_read
> submit_bio (dm_snapshot)
> end loop
> end loop
> dm-snapshot resume
>
> Readahead BIO got nothing during dm-snapshot suspended. So all of them
> will do FEC.
> Each readahead BIO need to do io->n_blocks ~= 20 times verify.
> Each block need to do fec, and every block need to do v->fec->rsn = 253
> times read.
> So during the suspend interval(~200ms), 300 readahead BIO make
> 300*20*253 IOs on dm-snapshot.
>
> As readahead IO is not required by user space, and to fix this issue,
> I think it would be better to pass it to upper layer to handle it.
>
> Signed-off-by: Wu Bo <[email protected]>
> ---
> drivers/md/dm-verity-target.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> index 42b2483eb08c..d242e50ec869 100644
> --- a/drivers/md/dm-verity-target.c
> +++ b/drivers/md/dm-verity-target.c
> @@ -668,7 +668,9 @@ static void verity_end_io(struct bio *bio)
>
> verity_fec_init_io(io);
> if (bio->bi_status &&
> - (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
> + (!verity_fec_is_enabled(io->v) ||
> + verity_is_system_shutting_down() ||
> + (bio->bi_opf & REQ_RAHEAD))) {
> verity_finish_io(io, bio->bi_status);
> return;
> }
Thanks, this seems reasonable to me. As with your previous patch: what commit
introduced this issue? To me this looks like a longstanding issue, maybe dating
back to the original addition of FEC support to dm-verity by commit a739ff3f543a
("dm verity: add support for forward error correction"); do you agree? Can you
please add Fixes and "Cc stable" tags to your patch? Thanks!
- Eric