From: Ignat Korchagin
Date: Thu, 24 Dec 2020 18:46:18 +0000
Subject: Re: dm-crypt with no_read_workqueue and no_write_workqueue + btrfs scrub = BUG()
To: Herbert Xu
Cc: "Maciej S. Szmigiero", Alasdair G Kergon, Mike Snitzer, device-mapper development, dm-crypt@saout.de, linux-kernel, Eric Biggers, Damien Le Moal, Mikulas Patocka, kernel-team, Nobuto Murata, Chris Mason, Josef Bacik, David Sterba, linux-btrfs@vger.kernel.org, linux-crypto
References: <16ffadab-42ba-f9c7-8203-87fda3dc9b44@maciej.szmigiero.name> <74c7129b-a437-ebc4-1466-7fb9f034e006@maciej.szmigiero.name> <20201223205642.GA19817@gondor.apana.org.au>
In-Reply-To: <20201223205642.GA19817@gondor.apana.org.au>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 23, 2020 at 8:57 PM Herbert Xu wrote:
>
> On Wed, Dec 23, 2020 at 04:37:34PM +0100, Maciej S. Szmigiero wrote:
> >
> > It looks like to me that the skcipher API might not be safe to
> > call from a softirq context, after all.
>
> skcipher is safe to use in a softirq. The problem is only in
> dm-crypt where it tries to allocate memory with GFP_NOIO.

Hm..
After eliminating GFP_NOIO (as well as some other sleeping paths) from
the dm-crypt softirq code, I still hit an occasional crash in my extreme
setup (QEMU with 1 CPU and cryptd_max_cpu_qlen set to 1). The trace
below is decoded with stacktrace_decode.sh:

[   89.324723] BUG: kernel NULL pointer dereference, address: 0000000000000008
[   89.325713] #PF: supervisor write access in kernel mode
[   89.326460] #PF: error_code(0x0002) - not-present page
[   89.327211] PGD 0 P4D 0
[   89.327589] Oops: 0002 [#1] PREEMPT SMP PTI
[   89.328200] CPU: 0 PID: 21 Comm: kworker/0:1 Not tainted 5.10.0+ #79
[   89.329109] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[   89.330284] Workqueue: cryptd cryptd_queue_worker
[   89.330999] RIP: 0010:crypto_dequeue_request (/cfsetup_build/./include/linux/list.h:112 /cfsetup_build/./include/linux/list.h:135 /cfsetup_build/./include/linux/list.h:146 /cfsetup_build/crypto/algapi.c:957)
[   89.331757] Code: e9 c9 d0 a8 48 c7 c7 f9 c9 d0 a8 e8 c2 88 fe ff 4c 8b 23 48 c7 c6 e9 c9 d0 a8 48 c7 c7 f9 c9 d0 a8 49 8b 14 24 49 8b 44 24 08 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 49 89 04 24 48
All code
========
   0: e9 c9 d0 a8 48          jmpq   0x48a8d0ce
   5: c7 c7 f9 c9 d0 a8       mov    $0xa8d0c9f9,%edi
   b: e8 c2 88 fe ff          callq  0xfffffffffffe88d2
  10: 4c 8b 23                mov    (%rbx),%r12
  13: 48 c7 c6 e9 c9 d0 a8    mov    $0xffffffffa8d0c9e9,%rsi
  1a: 48 c7 c7 f9 c9 d0 a8    mov    $0xffffffffa8d0c9f9,%rdi
  21: 49 8b 14 24             mov    (%r12),%rdx
  25: 49 8b 44 24 08          mov    0x8(%r12),%rax
  2a:* 48 89 42 08            mov    %rax,0x8(%rdx)   <-- trapping instruction
  2e: 48 89 10                mov    %rdx,(%rax)
  31: 48 b8 00 01 00 00 00    movabs $0xdead000000000100,%rax
  38: 00 ad de
  3b: 49 89 04 24             mov    %rax,(%r12)
  3f: 48                      rex.W

Code starting with the faulting instruction
===========================================
   0: 48 89 42 08             mov    %rax,0x8(%rdx)
   4: 48 89 10                mov    %rdx,(%rax)
   7: 48 b8 00 01 00 00 00    movabs $0xdead000000000100,%rax
   e: 00 ad de
  11: 49 89 04 24             mov    %rax,(%r12)
  15: 48                      rex.W
[   89.334414] RSP: 0018:ffffba64c00bbe68 EFLAGS: 00010246
[   89.335165] RAX: 0000000000000000 RBX: ffff9b9d6fc28d88 RCX: 0000000000000000
[   89.336182] RDX: 0000000000000000 RSI: ffffffffa8d0c9e9 RDI: ffffffffa8d0c9f9
[   89.337204] RBP: 0000000000000000 R08: ffffffffa906e708 R09: 0000000000000058
[   89.338208] R10: ffffffffa9068720 R11: 00000000fffffc00 R12: ffff9b9a43797478
[   89.339216] R13: 0000000000000020 R14: ffff9b9d6fc28e00 R15: 0000000000000000
[   89.340231] FS:  0000000000000000(0000) GS:ffff9b9d6fc00000(0000) knlGS:0000000000000000
[   89.341376] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   89.342207] CR2: 0000000000000008 CR3: 000000014cd76002 CR4: 0000000000170ef0
[   89.343238] Call Trace:
[   89.343609]  cryptd_queue_worker (/cfsetup_build/crypto/cryptd.c:172)
[   89.344218]  process_one_work (/cfsetup_build/./arch/x86/include/asm/preempt.h:26 /cfsetup_build/kernel/workqueue.c:2284)
[   89.344821]  ? rescuer_thread (/cfsetup_build/kernel/workqueue.c:2364)
[   89.345399]  worker_thread (/cfsetup_build/./include/linux/list.h:282 /cfsetup_build/kernel/workqueue.c:2422)
[   89.345923]  ? rescuer_thread (/cfsetup_build/kernel/workqueue.c:2364)
[   89.346504]  kthread (/cfsetup_build/kernel/kthread.c:292)
[   89.346986]  ? kthread_create_worker_on_cpu (/cfsetup_build/kernel/kthread.c:245)
[   89.347713]  ret_from_fork (/cfsetup_build/arch/x86/entry/entry_64.S:302)
[   89.348255] Modules linked in:
[   89.348708] CR2: 0000000000000008
[   89.349197] ---[ end trace b7e9618b4122ed3b ]---
[   89.349863] RIP: 0010:crypto_dequeue_request (/cfsetup_build/./include/linux/list.h:112 /cfsetup_build/./include/linux/list.h:135 /cfsetup_build/./include/linux/list.h:146 /cfsetup_build/crypto/algapi.c:957)
[   89.350606] Code: e9 c9 d0 a8 48 c7 c7 f9 c9 d0 a8 e8 c2 88 fe ff 4c 8b 23 48 c7 c6 e9 c9 d0 a8 48 c7 c7 f9 c9 d0 a8 49 8b 14 24 49 8b 44 24 08 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 49 89 04 24 48
All code
========
   0: e9 c9 d0 a8 48          jmpq   0x48a8d0ce
   5: c7 c7 f9 c9 d0 a8       mov    $0xa8d0c9f9,%edi
   b: e8 c2 88 fe ff          callq  0xfffffffffffe88d2
  10: 4c 8b 23                mov    (%rbx),%r12
  13: 48 c7 c6 e9 c9 d0 a8    mov    $0xffffffffa8d0c9e9,%rsi
  1a: 48 c7 c7 f9 c9 d0 a8    mov    $0xffffffffa8d0c9f9,%rdi
  21: 49 8b 14 24             mov    (%r12),%rdx
  25: 49 8b 44 24 08          mov    0x8(%r12),%rax
  2a:* 48 89 42 08            mov    %rax,0x8(%rdx)   <-- trapping instruction
  2e: 48 89 10                mov    %rdx,(%rax)
  31: 48 b8 00 01 00 00 00    movabs $0xdead000000000100,%rax
  38: 00 ad de
  3b: 49 89 04 24             mov    %rax,(%r12)
  3f: 48                      rex.W

Code starting with the faulting instruction
===========================================
   0: 48 89 42 08             mov    %rax,0x8(%rdx)
   4: 48 89 10                mov    %rdx,(%rax)
   7: 48 b8 00 01 00 00 00    movabs $0xdead000000000100,%rax
   e: 00 ad de
  11: 49 89 04 24             mov    %rax,(%r12)
  15: 48                      rex.W
[   89.353266] RSP: 0018:ffffba64c00bbe68 EFLAGS: 00010246
[   89.354003] RAX: 0000000000000000 RBX: ffff9b9d6fc28d88 RCX: 0000000000000000
[   89.355048] RDX: 0000000000000000 RSI: ffffffffa8d0c9e9 RDI: ffffffffa8d0c9f9
[   89.356063] RBP: 0000000000000000 R08: ffffffffa906e708 R09: 0000000000000058
[   89.357082] R10: ffffffffa9068720 R11: 00000000fffffc00 R12: ffff9b9a43797478
[   89.358088] R13: 0000000000000020 R14: ffff9b9d6fc28e00 R15: 0000000000000000
[   89.359127] FS:  0000000000000000(0000) GS:ffff9b9d6fc00000(0000) knlGS:0000000000000000
[   89.360296] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   89.361129] CR2: 0000000000000008 CR3: 000000014cd76002 CR4: 0000000000170ef0
[   89.362160] Kernel panic - not syncing: Fatal exception in interrupt
[   89.363145] Kernel Offset: 0x26000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   89.364730] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

This happens when running dm-crypt with no_read_workqueue on top of an
emulated NVMe device in QEMU (the NVMe driver "completes" IO in IRQ
context). It seems that sending decryption requests to cryptd from
softirq context somehow corrupts the crypto queue.

Regards,
Ignat

> Cheers,
> --
> Email: Herbert Xu
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
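
A note on reading the register dump above: the trapping instruction
`mov %rax,0x8(%rdx)` runs with RDX = 0 and RAX = 0, and CR2 is 0x8. That
is what a list unlink would produce if the dequeued entry had NULL
`next` and `prev` pointers, since the first store goes to `next->prev`.
Below is a minimal userspace sketch of that arithmetic, not kernel code
and not a diagnosis of the root cause. `struct list_head` mirrors the
kernel layout; `unlink_first_store_addr()` and `list_del_checked()` are
hypothetical helpers named here for illustration only. It assumes a
64-bit build, where `prev` sits at offset 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in with the same two-pointer layout as the kernel's
 * struct list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Make an empty circular list: the head points at itself. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry in just before head, i.e. at the tail of the queue. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    assert(entry && head && head->prev);
    entry->next = head;
    entry->prev = head->prev;
    head->prev->next = entry;
    head->prev = entry;
}

/* Hypothetical helper: the address that the first store of an unlink
 * ("next->prev = prev") would write to. Nothing is dereferenced here,
 * so it is safe to call on a corrupted entry. */
static uintptr_t unlink_first_store_addr(const struct list_head *entry)
{
    return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}

/* Hypothetical helper: unlink only if the entry looks properly linked.
 * Returns false instead of faulting on NULL or inconsistent pointers. */
static bool list_del_checked(struct list_head *entry)
{
    if (!entry->next || !entry->prev)
        return false;
    if (entry->next->prev != entry || entry->prev->next != entry)
        return false;
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = NULL;
    entry->prev = NULL;
    return true;
}
```

For an entry whose pointers are both NULL, `unlink_first_store_addr()`
returns `offsetof(struct list_head, prev)`, which is 8 on a 64-bit
build and matches the faulting address in the oops. The checked variant
shows one way such an entry could be detected before the store; whether
a check of this kind belongs anywhere in the real code path is a
separate question.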