X-Mailing-List: linux-kernel@vger.kernel.org
References: <202401291043.e62e89dc-oliver.sang@intel.com>
 <20240129120125.605e97af@gandalf.local.home>
 <20240129152600.7587d1aa@gandalf.local.home>
 <20240129172200.1725f01b@gandalf.local.home>
 <20240129174950.5a17a86c@gandalf.local.home>
 <20240129193549.265f32c8@gandalf.local.home>
In-Reply-To: <20240129193549.265f32c8@gandalf.local.home>
From: Linus Torvalds
Date: Mon, 29 Jan 2024 17:50:38 -0800
Subject: Re: [linus:master] [eventfs] 852e46e239: BUG:unable_to_handle_page_fault_for_address
To: Steven Rostedt
Cc: kernel test robot, oe-lkp@lists.linux.dev, lkp@intel.com,
 linux-kernel@vger.kernel.org, Masami Hiramatsu, Mark Rutland,
 Mathieu Desnoyers, Christian Brauner, Al Viro, Ajay Kaher,
 linux-trace-kernel@vger.kernel.org

On Mon, 29 Jan 2024 at 16:35, Steven Rostedt wrote:
>
>  # echo 'p:sched schedule' >> /sys/kernel/tracing/kprobe_events
>  # ls -l events/kprobes/
>  ls: cannot access 'events/kprobes/': No such file or directory
>
> Where it should now exist but doesn't. But the lookup code never triggered.
>
> If the lookup fails, does it cache the result?

I think you end up still having negative dentries around.

The old code then tried to compensate for that by trying to remember
the old dentry with 'ei->dentry' and 'ei->d_children[]', and would at
lookup time try to use the *old* dentry instead of the new one.

And because dentries are just caches and can go away, it then had that
odd dance with '.d_iput', so that when a dentry was removed, it would
be removed from the 'ei->dentry' and 'ei->d_children[]' array too.

Except that d_iput() of an old dentry isn't actually serialized with
->d_lookup() in any way, so you end up with the whole race that I
already talked about earlier, where you could still have an
'ei->dentry' that pointed to something that had already been unhashed,
but d_iput() hadn't been called *yet*, so d_lookup() is called with a
new dentry, but the tracefs code then desperately tries to use the old
dentry pointer that just isn't _valid_ any more, but it doesn't know
that because d_iput() hasn't been called yet...

And as I *also* pointed out when I described that originally, you'll
practically never hit this race, because you just need to be *very*
unlucky with the whole "dentry is freed due to memory pressure".

But basically, this is why I absolutely *HATE* that "ei->dentry"
backpointer. It's truly fundamentally broken.
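
The race window described above can be replayed deterministically in a small userspace model. This is only a sketch under loud assumptions: `dentry_model`, `ei_model`, `evict_begin()`, `evict_finish()` and `lookup_old_style()` are hypothetical stand-ins invented for illustration, not the kernel's types or the actual eventfs code. It just fixes the ordering "unhash, then lookup, then d_iput" and shows that the lookup hands back the already-unhashed dentry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's real structures. */
struct dentry_model {
    bool hashed;     /* still reachable through the dentry cache? */
    bool iput_done;  /* has the d_iput callback run for it yet? */
};

struct ei_model {
    struct dentry_model *dentry;  /* back-pointer with no reference held */
};

/* Eviction, step 1: the dentry is unhashed, d_iput has not run yet. */
static void evict_begin(struct dentry_model *d)
{
    d->hashed = false;
}

/* Eviction, step 2: d_iput finally clears the back-pointer. */
static void evict_finish(struct ei_model *ei, struct dentry_model *d)
{
    d->iput_done = true;
    if (ei->dentry == d)
        ei->dentry = NULL;
}

/* Old-style lookup: prefer the remembered dentry over the new one. */
static struct dentry_model *lookup_old_style(struct ei_model *ei,
                                             struct dentry_model *fresh)
{
    if (ei->dentry)
        return ei->dentry;  /* may be stale: nothing pins it */
    ei->dentry = fresh;
    return fresh;
}

/* Replays the bad ordering; true if lookup returned an unhashed dentry. */
static bool race_window_hit(void)
{
    struct dentry_model old = { .hashed = true, .iput_done = false };
    struct dentry_model fresh = { .hashed = true, .iput_done = false };
    struct ei_model ei = { .dentry = &old };
    struct dentry_model *got;
    bool stale;

    evict_begin(&old);                    /* memory pressure unhashes it */
    got = lookup_old_style(&ei, &fresh);  /* lookup lands in the window */
    stale = (got == &old) && !got->hashed;
    evict_finish(&ei, &old);              /* d_iput arrives too late */
    return stale;
}
```

In the model the stale pointer still refers to live stack memory, so nothing crashes; in the real situation the old dentry can already be on its way to being freed, which is what makes the back-pointer unsafe.
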
You can't reference-count it, since the whole point of your current
tracefs scheme is to *not* keep dentries and inodes around forever,
and doing a "dget()" on that 'ei->dentry' would thus fundamentally
screw that up.

But you also cannot keep it in sync with dentries being released due
to memory pressure, because of the above thing.

See why I've tried to tell you that the back-pointer is basically a
100% sign of a bug. The *only* time you can have a valid dentry
pointer is when you have also taken a ref to it with dget(), and you
can't do that.

So then you have all that completely broken code that _tries_ to
maintain consistency with ->d_children[] etc, and it works 99.9% in
practice, because the race is just so hard to hit because dentries
only normally get evicted either synchronously (which you do under the
eventfs_mutex) or under memory pressure (which is basically never
going to be something you can test).

And yes, my lookup patch removed all the band-aids for "if I have an
ei->dentry, I'll reuse it". So I think it ends up exposing all the
previous bugs that the old "let's reuse the old dentry" code tried to
hide.

But, as mentioned, that ei->dentry pointer really REALLY is broken.

Now, having looked at this a lot, I think I have a way forward.
Because there is actually *one* case where you actually *do* do the
whole "dget()" to get a stable dentry pointer. And that's exactly the
"events" directory creation (ie eventfs_create_events_dir()).

So what I propose is that

 - ei->dentry and ei->d_children[] need to die. Really. They are
   buggy. There is no way to save them. There never was.

 - but we *can* introduce a new 'ei->events_dir' pointer that is
   *only* set by eventfs_create_events_dir(), and which is stable
   exactly because that function also does a dget() on it, so now the
   dentry will actually continue to exist reliably

I think that works.
The only thing that actually *needs* the existing 'ei->dentry' is
literally the eventfs_remove_events_dir() that gets rid of the stable
events directory. It's undoing eventfs_create_events_dir(), and it
will do the final dput() too.

I will try to make a patch for this.

I do think it means that every time we do that

    dentry->d_fsdata = ei;

we need to also do proper reference counting of said 'ei'. Because we
can't release 'ei' early when we have dentries that point to it.

Let me see how painful this will be.

               Linus
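
For illustration only, the "reference-count 'ei' whenever a dentry points to it" idea might look roughly like the userspace sketch below. Everything in it is an assumption: `ei_model`, `dentry_model`, `ei_get()`, `ei_put()`, `dentry_set_ei()` and `dentry_release()` are hypothetical names, a plain `int` stands in for whatever counter the kernel code would use, and this is not the patch mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; NOT the kernel's real structures. */
struct ei_model {
    int refs;        /* plain counter standing in for a kernel refcount */
    bool *released;  /* test hook: set when the last reference drops */
};

struct dentry_model {
    void *d_fsdata;
};

static struct ei_model *ei_alloc(bool *released)
{
    struct ei_model *ei = malloc(sizeof(*ei));

    if (!ei)
        return NULL;
    ei->refs = 1;  /* the creator's reference */
    ei->released = released;
    *released = false;
    return ei;
}

static struct ei_model *ei_get(struct ei_model *ei)
{
    ei->refs++;
    return ei;
}

static void ei_put(struct ei_model *ei)
{
    if (--ei->refs == 0) {
        *ei->released = true;
        free(ei);
    }
}

/* The "dentry->d_fsdata = ei" assignment, paired with taking a ref. */
static void dentry_set_ei(struct dentry_model *d, struct ei_model *ei)
{
    d->d_fsdata = ei_get(ei);
}

/* Dentry teardown drops the reference dentry_set_ei() took. */
static void dentry_release(struct dentry_model *d)
{
    ei_put(d->d_fsdata);
    d->d_fsdata = NULL;
}

/*
 * Removes the event while a dentry still points at 'ei'. Returns true
 * if 'ei' stayed alive until the dentry let go, and was freed after.
 */
static bool ei_outlives_removal(void)
{
    bool released;
    bool alive_while_referenced;
    struct dentry_model d = { NULL };
    struct ei_model *ei = ei_alloc(&released);

    if (!ei)
        return false;
    dentry_set_ei(&d, ei);
    ei_put(ei);  /* event removed: creator's reference dropped */
    alive_while_referenced = !released;
    dentry_release(&d);
    return alive_while_referenced && released;
}
```

The point of the pairing is that the lifetime of 'ei' is then driven by whoever still points to it, so a dentry never ends up with a d_fsdata pointer to freed memory.
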