2021-01-09 15:19:35

by Ignat Korchagin

[permalink] [raw]
Subject: [PATCH] dm crypt: do not call bio_endio() from the dm-crypt tasklet

Sometimes, when dm-crypt executes decryption in a tasklet, we may get the
following on a kasan-enabled kernel:

[ 397.613621][ C74] ==================================================================
[ 397.621794][ C74] BUG: KASAN: use-after-free in tasklet_action_common.constprop.0+0x151/0x2b0
[ 397.630759][ C74] Write of size 8 at addr ffff88a139054648 by task swapper/74/0
[ 397.638465][ C74]
[ 397.640725][ C74] CPU: 74 PID: 0 Comm: swapper/74 Not tainted 5.10.5-cloudflare-kasan-2021.1.3 #1
[ 397.650048][ C74] Hardware name: does not matter
[ 397.658834][ C74] Call Trace:
[ 397.662073][ C74] <IRQ>
[ 397.664862][ C74] dump_stack+0x7d/0xa3
[ 397.668994][ C74] print_address_description.constprop.0+0x1c/0x210
[ 397.675632][ C74] ? _raw_spin_lock_irqsave+0x87/0xe0
[ 397.681017][ C74] ? _raw_write_unlock_bh+0x60/0x60
[ 397.686221][ C74] ? tasklet_action_common.constprop.0+0x151/0x2b0
[ 397.692772][ C74] ? tasklet_action_common.constprop.0+0x151/0x2b0
[ 397.699324][ C74] kasan_report.cold+0x1f/0x37
[ 397.704086][ C74] ? tasklet_action_common.constprop.0+0x151/0x2b0
[ 397.710634][ C74] check_memory_region+0x13c/0x180
[ 397.715752][ C74] tasklet_action_common.constprop.0+0x151/0x2b0
[ 397.722122][ C74] __do_softirq+0x1a0/0x667
[ 397.732905][ C74] asm_call_irq_on_stack+0x12/0x20
[ 397.744322][ C74] </IRQ>
[ 397.753537][ C74] do_softirq_own_stack+0x37/0x40
[ 397.764745][ C74] irq_exit_rcu+0x110/0x1b0
[ 397.775248][ C74] common_interrupt+0x74/0x120
[ 397.785851][ C74] asm_common_interrupt+0x1e/0x40
[ 397.796568][ C74] RIP: 0010:cpuidle_enter_state+0x188/0xab0
[ 397.808040][ C74] Code: 89 c4 0f 1f 44 00 00 31 ff e8 d4 bc a4 fe 80 3c 24 00 74 12 9c 58 f6 c4 02 0f 85 2f 06 00 00 31 ff e8 2c 9d b6 fe fb 45 85 f6 <0f> 88 8d 03 00 00 4d 63 ee 49 83 fd 09 0f 87 e1 06 00 00 49 6b c5
[ 397.845062][ C74] RSP: 0018:ffff88a05325fe10 EFLAGS: 00000202
[ 397.857116][ C74] RAX: dffffc0000000000 RBX: ffff88a068850800 RCX: 000000000000001f
[ 397.871232][ C74] RDX: 1ffff115f9f76634 RSI: 0000000037a6f674 RDI: ffff88afcfbb31a0
[ 397.885178][ C74] RBP: ffffffff969c0920 R08: 0000011bef59a0e8 R09: ffff88afcfbb3383
[ 397.899043][ C74] R10: ffffed15f9f76670 R11: 0000005c1545d831 R12: 0000005c939dd540
[ 397.912779][ C74] R13: 0000000000000002 R14: 0000000000000002 R15: ffffffff969c0a48
[ 397.926511][ C74] cpuidle_enter+0x4a/0xa0
[ 397.936547][ C74] ? call_cpuidle+0x3c/0xb0
[ 397.946522][ C74] do_idle+0x2bd/0x320
[ 397.955903][ C74] ? arch_cpu_idle_exit+0x40/0x40
[ 397.966195][ C74] ? swake_up_locked+0x72/0x190
[ 397.976193][ C74] cpu_startup_entry+0x19/0x20
[ 397.985918][ C74] secondary_startup_64_no_verify+0xb0/0xbb
[ 397.996679][ C74]
[ 398.003698][ C74] Allocated by task 0:
[ 398.012533][ C74] (stack is not available)
[ 398.021729][ C74]
[ 398.028707][ C74] Freed by task 0:
[ 398.037155][ C74] kasan_save_stack+0x20/0x50
[ 398.046457][ C74] kasan_set_track+0x1c/0x30
[ 398.055610][ C74] kasan_set_free_info+0x1b/0x30
[ 398.065170][ C74] __kasan_slab_free+0x110/0x150
[ 398.074676][ C74] slab_free_freelist_hook+0x66/0x120
[ 398.084586][ C74] kmem_cache_free+0x104/0x480
[ 398.093834][ C74] dec_pending+0x1ed/0x8f0 [dm_mod]
[ 398.103486][ C74] clone_endio+0x215/0x870 [dm_mod]
[ 398.113058][ C74] tasklet_action_common.constprop.0+0x20d/0x2b0
[ 398.123838][ C74] __do_softirq+0x1a0/0x667
[ 398.132768][ C74]
[ 398.139448][ C74] The buggy address belongs to the object at ffff88a139054600
[ 398.139448][ C74] which belongs to the cache bio-5 of size 1144
[ 398.162373][ C74] The buggy address is located 72 bytes inside of
[ 398.162373][ C74] 1144-byte region [ffff88a139054600, ffff88a139054a78)
[ 398.185131][ C74] The buggy address belongs to the page:
[ 398.195384][ C74] page:0000000020f4c3cc refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2139050
[ 398.214988][ C74] head:0000000020f4c3cc order:3 compound_mapcount:0 compound_pincount:0
[ 398.228183][ C74] flags: 0xaffff800010200(slab|head)
[ 398.238146][ C74] raw: 00affff800010200 dead000000000100 dead000000000122 ffff88a13b99da00
[ 398.251691][ C74] raw: 0000000000000000 0000000080190019 00000001ffffffff 0000000000000000
[ 398.265140][ C74] page dumped because: kasan: bad access detected
[ 398.276422][ C74]
[ 398.283502][ C74] Memory state around the buggy address:
[ 398.293892][ C74] ffff88a139054500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 398.306895][ C74] ffff88a139054580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 398.319841][ C74] >ffff88a139054600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 398.332767][ C74] ^
[ 398.344064][ C74] ffff88a139054680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 398.357159][ C74] ffff88a139054700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 398.370160][ C74] ==================================================================

When the decryption fully completes in the tasklet, dm-crypt will call
bio_endio(), which in turn will call clone_endio() from dm.c core code. That
function frees the resources associated with the bio, including per bio private
structures. For dm-crypt it will free the current struct dm_crypt_io, which
contains our tasklet object, causing use-after-free, when the tasklet is being
dequeued by the kernel.

To avoid this, do not call bio_endio() from the current tasklet context, but
delay its execution to the dm-crypt IO workqueue.

Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues")
Cc: <[email protected]> # v5.9+
Signed-off-by: Ignat Korchagin <[email protected]>
---
drivers/md/dm-crypt.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 53791138d78b..90f109d8a4a7 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1691,6 +1691,12 @@ static void crypt_inc_pending(struct dm_crypt_io *io)
atomic_inc(&io->io_pending);
}

+static void kcryptd_io_bio_endio(struct work_struct *work)
+{
+ struct dm_crypt_io *io = container_of(work, struct dm_crypt_io, work);
+ bio_endio(io->base_bio);
+}
+
/*
* One of the bios was finished. Check for completion of
* the whole request and correctly clean up the buffer.
@@ -1713,7 +1719,23 @@ static void crypt_dec_pending(struct dm_crypt_io *io)
kfree(io->integrity_metadata);

base_bio->bi_status = error;
- bio_endio(base_bio);
+
+ /*
+ * If we are running this function from our tasklet,
+ * we can't call bio_endio() here, because it will call
+ * clone_endio() from dm.c, which in turn will
+ * free the current struct dm_crypt_io structure with
+ * our tasklet. In this case we need to delay bio_endio()
+ * execution to after the tasklet is done and dequeued.
+ */
+ if (tasklet_trylock(&io->tasklet)) {
+ tasklet_unlock(&io->tasklet);
+ bio_endio(base_bio);
+ return;
+ }
+
+ INIT_WORK(&io->work, kcryptd_io_bio_endio);
+ queue_work(cc->io_queue, &io->work);
}

/*
--
2.20.1