Date: Thu, 16 Nov 2023 22:13:54 -0800 (PST)
From: Hugh Dickins
To: José Pekkarinen
cc: Hugh Dickins, Matthew Wilcox, akpm@linux-foundation.org, skhan@linuxfoundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-kernel-mentees@lists.linux.dev, syzbot+89edd67979b52675ddec@syzkaller.appspotmail.com, Jann Horn
Subject: Re: [PATCH] mm/pgtable: return null if no ptl in __pte_offset_map_lock
In-Reply-To: <3cd8b7048ee38f5c5e6f9f6c5dab2deb@foxhound.fi>
Message-ID: <74a866a0-3211-7e31-1dc3-7c96da340332@google.com>
References: <20231115065506.19780-1-jose.pekkarinen@foxhound.fi> <1c4cb1959829ecf4f0c59691d833618c@foxhound.fi> <515cb9c1-abcd-c3f3-cc0d-c3cd248b9d6f@google.com> <3cd8b7048ee38f5c5e6f9f6c5dab2deb@foxhound.fi>

On Thu, 16 Nov 2023, José Pekkarinen wrote:
> On 2023-11-16 07:23, Hugh Dickins wrote:
> > On Wed, 15 Nov 2023, Matthew Wilcox wrote:
> >> On Wed, Nov 15, 2023 at 06:05:30PM +0200, José Pekkarinen wrote:
> >>
> >> > > I don't think we should be changing ptlock_ptr().
> >> >
> >> > This is where the null ptr dereference originates, so the only
> >> > alternative I can think of is to protect the life cycle of the ptdesc
> >> > to prevent it to die between the pte check and the spin_unlock of
> >> > __pte_offset_map_lock. Would that work for you?
> >
> > Thanks for pursuing this, José, but I agree with Matthew: I don't
> > think your patch is right at all. The change in ptlock_ptr() did not
> > make sense to me, and the change in __pte_offset_map_lock() leaves us
> > still wondering what has gone wrong (and misses an rcu_read_unlock()).
> >
> > You mentioned "I tested the syzbot reproducer in x86 and it doesn't
> > produce this kasan report anymore": are you saying that you were able
> > to reproduce the issue on x86 (without your patch)? That would be very
> > interesting (and I think would disprove my hypothesis below). I ought
> > to try on x86 if you managed to reproduce on it, but it's not worth
> > the effort if you did not. If you have an x86 stack and registers,
> > please show (though I'm uncertain how much KASAN spoils the output).
>
> Hi,
>
> Yes, I have a local setup based in [1], where I can spin a small
> vm, build the reproducer and run it in. The only thing I took from
> the webpage is the kernel config file, and the image I made it locally
> by debootstrapping and running the modifications in create-image.sh
> manually, the kasan report follows:
>
> [ 111.408746][ T8885] general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [ 111.413181][ T8885] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
> [ 111.413181][ T8885] CPU: 1 PID: 8885 Comm: handle_kernel_p Not tainted 6.7.0-rc1-00007-ge612cb00e200 #6
> [ 111.413181][ T8885] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 111.413181][ T8885] RIP: 0010:__pte_offset_map_lock+0xfa/0x310
> [ 111.423642][ T8885] Code: 48 c1 e8 03 80 3c 10 00 0f 85 12 02 00 00 4c 03 3d db 92 cf 0b 48 b8 00 00 00 00 00 fc ff df 49 8d 7f 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 e2 01 00 00 4d 8b 7f 28 4c 89 ff e8 f0 a1 3a 09
> [ 111.423642][ T8885] RSP: 0018:ffffc90005baf738 EFLAGS: 00010216
> [ 111.423642][ T8885] RAX: dffffc0000000000 RBX: 0005800000000067 RCX: ffffffff81ada02e
> [ 111.423642][ T8885] RDX: 0000000000000005 RSI: ffffffff81ad9f0f RDI: 0000000000000028
> [ 111.423642][ T8885] RBP: ffff8880224c4800 R08: 0000000000000007 R09: 0000000000000000
> [ 111.423642][ T8885] R10: 0000000000000000 R11: 0000000000000000 R12: 0005088000000a80
> [ 111.423642][ T8885] R13: 1ffff92000b75ee9 R14: ffffc90005bafa88 R15: 0000000000000000
> [ 111.423642][ T8885] FS: 00007f8d3972c6c0(0000) GS:ffff888069700000(0000) knlGS:0000000000000000
> [ 111.423642][ T8885] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 111.423642][ T8885] CR2: 00007f8d3970af78 CR3: 00000000224d6000 CR4: 00000000000006f0
> [ 111.423642][ T8885] Call Trace:
> [ 111.423642][ T8885] <TASK>
> [ 111.423642][ T8885] ? show_regs+0x8f/0xa0
> [ 111.423642][ T8885] ? die_addr+0x4f/0xd0
> [ 111.423642][ T8885] ? exc_general_protection+0x150/0x220
> [ 111.423642][ T8885] ? asm_exc_general_protection+0x26/0x30
> [ 111.423642][ T8885] ? __pte_offset_map_lock+0x1de/0x310
> [ 111.423642][ T8885] ? __pte_offset_map_lock+0xbf/0x310
> [ 111.423642][ T8885] ? __pte_offset_map_lock+0xfa/0x310
> [ 111.423642][ T8885] ? __pte_offset_map_lock+0xbf/0x310
> [ 111.423642][ T8885] ? __pfx___pte_offset_map_lock+0x10/0x10
> [ 111.423642][ T8885] filemap_map_pages+0x336/0x13b0
> [ 111.423642][ T8885] ? __pfx_filemap_map_pages+0x10/0x10
> [ 111.423642][ T8885] ? rcu_read_unlock+0x33/0xb0
> [ 111.423642][ T8885] do_fault+0x86a/0x1350
> [ 111.423642][ T8885] __handle_mm_fault+0xe53/0x23a0
> [ 111.423642][ T8885] ? __pfx___handle_mm_fault+0x10/0x10
> [ 111.483413][ T8885] handle_mm_fault+0x369/0x890
> [ 111.483413][ T8885] __get_user_pages+0x46d/0x15d0
> [ 111.483413][ T8885] ? __pfx___get_user_pages+0x10/0x10
> [ 111.483413][ T8885] populate_vma_page_range+0x2de/0x420
> [ 111.483413][ T8885] ? __pfx_populate_vma_page_range+0x10/0x10
> [ 111.483413][ T8885] ? __pfx_find_vma_intersection+0x10/0x10
> [ 111.483413][ T8885] ? vm_mmap_pgoff+0x299/0x3c0
> [ 111.483413][ T8885] __mm_populate+0x1da/0x380
> [ 111.483413][ T8885] ? __pfx___mm_populate+0x10/0x10
> [ 111.483413][ T8885] ? up_write+0x1b3/0x520
> [ 111.483413][ T8885] vm_mmap_pgoff+0x2d1/0x3c0
> [ 111.483413][ T8885] ? __pfx_vm_mmap_pgoff+0x10/0x10
> [ 111.483413][ T8885] ksys_mmap_pgoff+0x7d/0x5b0
> [ 111.483413][ T8885] __x64_sys_mmap+0x125/0x190
> [ 111.483413][ T8885] do_syscall_64+0x45/0xf0
> [ 111.483413][ T8885] entry_SYSCALL_64_after_hwframe+0x6e/0x76
> [ 111.483413][ T8885] RIP: 0033:0x7f8d39831559
> [ 111.483413][ T8885] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 77 08 0d 00 f7 d8 64 89 01 48
> [ 111.483413][ T8885] RSP: 002b:00007f8d3972be78 EFLAGS: 00000216 ORIG_RAX: 0000000000000009
> [ 111.483413][ T8885] RAX: ffffffffffffffda RBX: 00007f8d3972c6c0 RCX: 00007f8d39831559
> [ 111.483413][ T8885] RDX: b635773f07ebbeea RSI: 0000000000b36000 RDI: 0000000020000000
> [ 111.483413][ T8885] RBP: 00007f8d3972bea0 R08: 00000000ffffffff R09: 0000000000000000
> [ 111.483413][ T8885] R10: 0000000000008031 R11: 0000000000000216 R12: ffffffffffffff80
> [ 111.483413][ T8885] R13: 0000000000000000 R14: 00007fffcef921d0 R15: 00007f8d3970c000
> [ 111.483413][ T8885] </TASK>
> [ 111.483413][ T8885] Modules linked in:
> [ 111.763549][ T8885] ---[ end trace 0000000000000000 ]---
> [ 111.773557][ T8885] RIP: 0010:__pte_offset_map_lock+0xfa/0x310
> [ 111.776045][ T8885] Code: 48 c1 e8 03 80 3c 10 00 0f 85 12 02 00 00 4c 03 3d db 92 cf 0b 48 b8 00 00 00 00 00 fc ff df 49 8d 7f 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 e2 01 00 00 4d 8b 7f 28 4c 89 ff e8 f0 a1 3a 09
> [ 111.805040][ T8885] RSP: 0018:ffffc90005baf738 EFLAGS: 00010216
> [ 111.820041][ T8885] RAX: dffffc0000000000 RBX: 0005800000000067 RCX: ffffffff81ada02e
> [ 111.837884][ T8885] RDX: 0000000000000005 RSI: ffffffff81ad9f0f RDI: 0000000000000028
> [ 111.855313][ T8885] RBP: ffff8880224c4800 R08: 0000000000000007 R09: 0000000000000000
> [ 111.878314][ T8885] R10: 0000000000000000 R11: 0000000000000000 R12: 0005088000000a80
> [ 111.910624][ T8885] R13: 1ffff92000b75ee9 R14: ffffc90005bafa88 R15: 0000000000000000
> [ 111.923627][ T8885] FS: 00007f8d3972c6c0(0000) GS:ffff888069700000(0000) knlGS:0000000000000000
> [ 111.932017][ T8885] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 111.941166][ T8885] CR2: 00007fa26ac38178 CR3: 00000000224d6000 CR4: 00000000000006f0
> [ 111.950619][ T8885] Kernel panic - not syncing: Fatal exception
> [ 111.953981][ T8885] Kernel Offset: disabled
> [ 111.953981][ T8885] Rebooting in 86400 seconds..
>
> I can test some patches for you if it helps finding out
> the issue.

Thanks a lot, and you'll see that I've just asked syzbot to try what I
now believe is the correct fix: over in the other thread, since it
didn't recognize yesterday's when I sent from this thread. Please give
that a try yourself, if you have time - thanks.

It turned out that all that I needed was your assurance that you had
the repro working on x86 - I guess I'm simply too x86-centric, and had
assumed that syzbot's arm64 report implied something special on arm,
such as the subtler barriers there.

I gave repro a try on bare metal x86, and it reproduced within a
minute: though in my case not quite the stack trace you and syzbot
reported, but a more obvious oops in pmd_install().

Depending on one's "memory model", the macro pfn_to_page() can be more
or less strict: in my case it was strict, and pmd_install() oopsed
right there in pmd_populate(); whereas in your case pmd_populate()
uncomplainingly puts something silly into the pmd entry, leaving
__pte_offset_map_lock() to stumble on that immediately afterwards.
(Neither KASAN nor lockdep required - though lockdep's spinlock pointer
probably helps to make the badness more obvious, if pmd_install() did
not crash already.)

The problem is simply that filemap_map_pmd() assumed that prealloc_pte
is supplied with a preallocated page table whenever pmd_none(); but if
it has racily become pmd_none() since the preallocation decision, then
the oops.
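
To make that ordering concrete, here is a minimal user-space sketch in C. It is not kernel code: struct pgtable, struct pmd_slot, struct fault and the model_* helpers are hypothetical stand-ins invented for illustration, the race is replayed single-threaded, and the "checked" variant is only one plausible way to close the window, not necessarily the fix sent to syzbot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; not the real definitions. */
struct pgtable { int unused; };               /* a page table page */
struct pmd_slot { struct pgtable *table; };   /* NULL models pmd_none() */

struct fault {
    struct pmd_slot *pmd;
    struct pgtable *prealloc_pte;   /* NULL when nothing was preallocated */
};

static bool model_pmd_none(const struct pmd_slot *pmd)
{
    return pmd->table == NULL;
}

/* Models the install step, which needs a real table to work with. */
static void model_pmd_install(struct fault *f)
{
    assert(f->prealloc_pte != NULL);   /* a NULL here is the modelled oops */
    f->pmd->table = f->prealloc_pte;
    f->prealloc_pte = NULL;
}

/* True when "pmd_none() implies a preallocated table" does not hold. */
static bool assumption_violated(const struct fault *f)
{
    return model_pmd_none(f->pmd) && f->prealloc_pte == NULL;
}

/* Defensive variant: install only if a table is actually on hand. */
static bool install_checked(struct fault *f)
{
    if (model_pmd_none(f->pmd) && f->prealloc_pte != NULL) {
        model_pmd_install(f);
        return true;
    }
    return false;
}

/*
 * Replays the racy ordering in a single thread. Returns true when the
 * unchecked assumption was violated and the checked variant declined
 * to install anything.
 */
static bool replay_race(void)
{
    static struct pgtable spare;
    struct pgtable existing = {0};
    struct pmd_slot pmd = { .table = &existing };
    struct fault f = { .pmd = &pmd, .prealloc_pte = NULL };

    /* Step 1: the pmd is populated, so no table gets preallocated. */
    if (model_pmd_none(f.pmd))
        f.prealloc_pte = &spare;

    /* Step 2: a concurrent actor empties the pmd behind our back. */
    pmd.table = NULL;

    /* Step 3: the pmd is now none, but there is nothing to install. */
    bool violated = assumption_violated(&f);
    bool installed = install_checked(&f);

    return violated && !installed;
}
```

In this model replay_race() returns true: the unchecked assumption breaks at step 3, while install_checked() simply declines to install and leaves the caller to retry or fall back.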

My changes have certainly provided an easy way to get that race, but
if I'm not mistaken, there was already another such race, with the
possible bug going back to 5.12.

I'll work on the commit message while waiting to hear from syzbot.

Hugh