Received: by 2002:a05:6a11:4021:0:0:0:0 with SMTP id ky33csp2844290pxb; Tue, 21 Sep 2021 08:51:46 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwbyxW2MT0kIhzCqgkAGy84PVtnHF8IMICuIOHzam1OM5miFGrUEE1TCJgGwLhiJ5BKFDo4 X-Received: by 2002:a50:e10f:: with SMTP id h15mr4320119edl.73.1632239506657; Tue, 21 Sep 2021 08:51:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1632239506; cv=none; d=google.com; s=arc-20160816; b=ODB8wKGi1aAQHO9bZF4ZxDEQZB9KIggyW14sQmu8kHs3H7ams1Pplvuf3UIFBOmSqf dWhaQhmnTNLUCXPgSdKPO3+nF4ik5umxJSqDUXOI643/dQPTgwMB7fWCoMHLQuDVdRwy 1KDdChCx+55t7NMQ20JXGL9JgTkQIT/GZPCR7qmaeKnJDQS5sKZiO+Wgn5XwZktsgWTq 0D+9odi6FXYC13MGu19jzly973a3THfGgPSpPHWAPz8/9AggWqjPGp5HtTu9k+z4bKU0 mIQEkCzR2Q6z0LBs0m9jfHul1PFhZLTaEDLYsZbTvrQ213erCXwgrQzn0RyHcpeK67// g4/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=H0YQY3wMw1kDzOg2VFayycF0up9tHxX4yN9RrrihYWU=; b=Bdo/Ho1QE1vbOA9qBs8DG9wH9zb8ZpFxrkL8F73MUUAu01WZfJsZangUc0CTSSDFF0 LCvFlGQcRtYw6j3DL/xfoXtTlbl/rOzwSLgaW9C52cqkKwiqPx/OQ4/Ge6cH/roNK19Q YYKEWJ76OBtBdflqsiM9HLHAiYfDnoJnFbnGHvJtNdKBfdiV3YmICwbTsKHAb4AVwdzw eOzjq10OGq3scfFxvuU/Qpwq0olWYpMvBMGA3ip9z7oQ8CM7Ajs4SIineH3Qf9xKBvyH 7gAFySWmCwXuqVW+fHOx1RPn8A86xl14gEfDJAHVmynsqUShzSKC0YKOQyuEKPhq2vSS 5VLA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=UaaVRIqs; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id j5si19946545ejm.413.2021.09.21.08.51.19; Tue, 21 Sep 2021 08:51:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=UaaVRIqs; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234219AbhIUPul (ORCPT + 99 others); Tue, 21 Sep 2021 11:50:41 -0400 Received: from mail.kernel.org ([198.145.29.99]:38636 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229804AbhIUPuk (ORCPT ); Tue, 21 Sep 2021 11:50:40 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 2B3B061183; Tue, 21 Sep 2021 15:49:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1632239352; bh=SYPIsUeB0QpNl9iOqz8TrUCBGO94e+noRg1ECZdTJME=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=UaaVRIqswzz+NS3S8fQEaKMO9RtdOXsyeprnTEIheXJKQGHNj08P7eyrd/PQOFBCR AN5tBBs86O+nP3GdbvOj2N+vfIACcMw2gY7GyzZmcnfLHUMHWEqqTO8tVeryaAcPBC wa3PUemimJFndCXNOCuGMsRHqw57UEYG+Vl8QmekwFARdVX0/7ufVsMqf8T+9g7Szp diWZkbbph3gXn35UaYmr5cUyleLHlYXnKdZMqMkRG/D9zQZoy6oFBj2a59MNnxAHRK YIYKWUHNZvNhOBYpWGFXq5g7hiH0GMlpMxwr02jYDVngebiKxCzBjK5YB3wvzgzdsQ QI6X04sAKlF5w== Received: by mail-qt1-f178.google.com with SMTP id x9so6771427qtv.0; Tue, 21 Sep 2021 08:49:12 -0700 (PDT) X-Gm-Message-State: AOAM530DTdfrLatLheUDRErjR0q716Etht84WYtht7AKXxWF0VYHTMnt NVdMaemE2/1nYxdiR7t9kXytp6X1M9drt3Q8/zI= X-Received: by 2002:ac8:4113:: with SMTP id q19mr29065258qtl.108.1632239341024; Tue, 21 Sep 2021 08:49:01 -0700 (PDT) MIME-Version: 1.0 References: <20210917194709.3562413-1-mcgrof@kernel.org> <20210917194709.3562413-10-mcgrof@kernel.org> In-Reply-To: From: Luis Chamberlain Date: Tue, 21 Sep 2021 08:48:49 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v7 09/12] sysfs: fix deadlock race with module removal To: David Laight Cc: "tj@kernel.org" , "gregkh@linuxfoundation.org" , "akpm@linux-foundation.org" , "minchan@kernel.org" , "jeyu@kernel.org" , "shuah@kernel.org" , "rdunlap@infradead.org" , "rafael@kernel.org" , "masahiroy@kernel.org" , "ndesaulniers@google.com" , "yzaikin@google.com" , "nathan@kernel.org" , "ojeda@kernel.org" , "vitor@massaru.org" , "elver@google.com" , "jarkko@kernel.org" , "glider@google.com" , "rf@opensource.cirrus.com" , "stephen@networkplumber.org" , "bvanassche@acm.org" , "jolsa@kernel.org" , "andriy.shevchenko@linux.intel.com" , "trishalfonso@google.com" , "andreyknvl@gmail.com" , "jikos@kernel.org" , "mbenes@suse.com" , "ngupta@vflare.org" , "sergey.senozhatsky.work@gmail.com" , "reinette.chatre@intel.com" , "fenghua.yu@intel.com" , "bp@alien8.de" , "x86@kernel.org" , "hpa@zytor.com" , "lizefan.x@bytedance.com" , "hannes@cmpxchg.org" , "daniel.vetter@ffwll.ch" , "bhelgaas@google.com" , "kw@linux.com" , "dan.j.williams@intel.com" , "senozhatsky@chromium.org" , "hch@lst.de" , "joe@perches.com" , "hkallweit1@gmail.com" , "axboe@kernel.dk" , "jpoimboe@redhat.com" , "tglx@linutronix.de" , "keescook@chromium.org" , "rostedt@goodmis.org" , "peterz@infradead.org" , "linux-spdx@vger.kernel.org" , "linux-doc@vger.kernel.org" , "linux-block@vger.kernel.org" , "linux-fsdevel@vger.kernel.org" , "linux-kselftest@vger.kernel.org" , "cgroups@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "copyleft-next@lists.fedorahosted.org" , Tetsuo Handa Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Sep 21, 2021 at 1:24 AM David Laight wrote: > > From: Luis Chamberlain > > Sent: 17 September 2021 20:47 > > > > When sysfs attributes use a lock also used on module removal we can > > race to deadlock. This happens when for instance a sysfs file on > > a driver is used, then at the same time we have module removal call > > trigger. The module removal call code holds a lock, and then the sysfs > > file entry waits for the same lock. While holding the lock the module > > removal tries to remove the sysfs entries, but these cannot be removed > > yet as one is waiting for a lock. This won't complete as the lock is > > already held. Likewise module removal cannot complete, and so we deadlock. > > Isn't the real problem the race between a sysfs file action and the > removal of the sysfs node? Nope, that is taken care of by kernfs. > This isn't really related to module unload - except that may > well remove some sysfs nodes. Nope, the issue is a deadlock that can happen due to a shared lock on module removal and a driver sysfs operation. > This is the same problem as removing any other kind of driver callback. > There are three basic solutions: > 1) Use a global lock - not usually useful. > 2) Have the remove call sleep until any callbacks are complete. > 3) Have the remove just request removal and have a final > callback (from a different context). Kernfs already does a sort of combination of 1) and 2) but 1) is using atomic reference counts. > If the remove can sleep (as in 2) then there is a requirement > on the driver code to not hold any locks across the 'remove' > that can be acquired during the callbacks. And this is the part that kernfs has no control over since the removal and sysfs operation are implementation specific. > Now, for sysfs, you probably only want to sleep the remove code > while a read/write is in progress - not just because the node > is open. > That probably requires marking an open node 'invalid' and > deferring delete to close. This is already done by kernfs. > None of this requires a reference count on the module. You are missing the point to the other aspect of the try_module_get(), it lets you also check if module exit has been entered. By using try_module_get() you let the module exit trump proceeding with an operation, therefore also preventing any potential use of a shared lock on module exit and the driver specific sysfs operation. Luis