Received: by 2002:a05:6a11:4021:0:0:0:0 with SMTP id ky33csp2386898pxb; Mon, 20 Sep 2021 21:01:47 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxUkYxnTh921XjxJD03IbfJIFVamB3WPsPFRi92lWTVfgvCktbgRBg01/rk2nK8KYZoSVaB X-Received: by 2002:a50:8161:: with SMTP id 88mr32446582edc.54.1632196907180; Mon, 20 Sep 2021 21:01:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1632196907; cv=none; d=google.com; s=arc-20160816; b=lW+QP4bpdvN0oHKMFdxCfDz8OyGdzMWTGA2y/YClI4obcaNDhVcEIGpEuqvvz7EOi5 hUVY05X0bLA4mZqlW/R4kw+JoGHFMaDCA69UqeQEi2B6GtL5Ap4JZLa0DTaBG5mx0e7Z Su7vb47pe56TCAMgA59cGCa7yQPWpVulOxjkpPynIRPJ3aN0fWGvkSlubVE3DQ2UJ93H lfphmUZRmBJY2gTWp96jzmWr3IjDtR1KQw982V2c32uBH52gac+Tr8uXCBtUl+6tVoqw nAtG1qdAhWOyF+aXJvD6Mw7iu+agiF1738UthPwMvdRSJjN5aSZp8x7IeU7fdd7Qkbf0 M+uw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=amBkTo8paIO6KKoyr0sD3I13KXEgBWAgALAI9bRUOrY=; b=r+6G5g0ftpk4PSpwbBDiaA1y8omi9P5ObyeVufSO+FrACJFgD+eFBYjM1S1rFsn8FJ yFCk9Y13RUdlGYOU+7YbRE2T9as4KrejbfU3xtQRu+kh5bdP77ehsxd5vZeBiAgHl0Bc dhB5fNpt8xQo5sjxTx2A8/Z5Odpe3YiMXPDsblx/Cem+MITxOGjWpDYAY/0ZGVgvllQ6 X6cVqpM7gdrD1kfpTShhkOMibu47cZIhjJmPlerxmfAQviJpS6Rj8fAP7WcZHD9Wsxc6 WzE4FfYcm29uCyg/zz5i24GzHn1amvutkcRwXTKMkyAP1bHOUU7QNBdSh13sBRUnJ0iA EibQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel-com.20210112.gappssmtp.com header.s=20210112 header.b=XL3OFLcR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b4si17900539eds.333.2021.09.20.21.01.23; Mon, 20 Sep 2021 21:01:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@intel-com.20210112.gappssmtp.com header.s=20210112 header.b=XL3OFLcR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236395AbhIUCGd (ORCPT + 99 others); Mon, 20 Sep 2021 22:06:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33454 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236316AbhIUBuE (ORCPT ); Mon, 20 Sep 2021 21:50:04 -0400 Received: from mail-pf1-x42b.google.com (mail-pf1-x42b.google.com [IPv6:2607:f8b0:4864:20::42b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 41EEAC06AB0B for ; Mon, 20 Sep 2021 14:55:22 -0700 (PDT) Received: by mail-pf1-x42b.google.com with SMTP id w14so7017087pfu.2 for ; Mon, 20 Sep 2021 14:55:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel-com.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=amBkTo8paIO6KKoyr0sD3I13KXEgBWAgALAI9bRUOrY=; b=XL3OFLcRkLeLa9eXRmHa7cKmBwEZybbc9mdJnrjPUsNmyhIp9vj8F3iNcHNSqXTb1K L9OEBMzu/TTeVT/s4PldqIkTZ0HgtlJEgIwwbfok6X1NLdtuYuyvrzXuuS1QyCQHhDf3 FY2aphwijxMkpTX43r3twUk5IdFvYJi1wu5sZUQ0IQCbDLg80wgE3Q0dm575Bea3zf1f JfQJ5ifnA2ISAORhBlZA82ZPIyQ2zFi5MELZoyko92jJrrmD7LWhkltsG8XSY3PQ3mqC YdMiNP+ggNC5rUbR8bzVM/xfxgJFK13EPKdACqQQqNx9Bpea47bYvjeXMJ6UV3MCOnZ/ JrFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=amBkTo8paIO6KKoyr0sD3I13KXEgBWAgALAI9bRUOrY=; b=StEXRZwlhXanWYrmlNPC2SzK6jpmq80ZLqI3wwx3xfBj4pKsiX70TB6Bg0RD4FRZWG dPVKjfcVkfwN6380VW0bTopU39FPJpXuPVEDfdOBWXbrUe4tYbmhTJ53+tWvkzEQW8mX I8D5B3HprwI/VQt7IHQn9LPAIB7UgrAKLAm66ZceftwUhflXPmP79+oEdHUA3O/ESFZy U/FouzZOE3OYP3KAj3NM4pw2dgktX69RwgRPVZiAXkWkDcJlcvr1h8L4NutiZRJJO9bH C71xzPmQvcb6bBk4/J+xMy5lsm8XaeQP5NOTeexoHqypg1n7lhF2vh2TDVZ9YMSf6LmU R1Eg== X-Gm-Message-State: AOAM5301L4cYE6pDvdGh76eTDcIzdd7zocUwHKgDfbjFQl4g1uW0CWPp iJgzG+wPY1fvi3YTTAcPSKoctwoNpTdR0p9b68Jkug== X-Received: by 2002:a63:1e0e:: with SMTP id e14mr25231633pge.5.1632174921576; Mon, 20 Sep 2021 14:55:21 -0700 (PDT) MIME-Version: 1.0 References: <20210918050430.3671227-1-mcgrof@kernel.org> <20210918050430.3671227-10-mcgrof@kernel.org> In-Reply-To: From: Dan Williams Date: Mon, 20 Sep 2021 14:55:10 -0700 Message-ID: Subject: Re: [PATCH v7 09/12] sysfs: fix deadlock race with module removal To: Luis Chamberlain Cc: Tejun Heo , Greg KH , Andrew Morton , Minchan Kim , jeyu@kernel.org, shuah , Randy Dunlap , "Rafael J. Wysocki" , Masahiro Yamada , Nick Desaulniers , yzaikin@google.com, Nathan Chancellor , ojeda@kernel.org, Tetsuo Handa , vitor@massaru.org, elver@google.com, Jarkko Sakkinen , Alexander Potapenko , rf@opensource.cirrus.com, Stephen Hemminger , David Laight , bvanassche@acm.org, jolsa@kernel.org, Andy Shevchenko , trishalfonso@google.com, andreyknvl@gmail.com, Jiri Kosina , mbenes@suse.com, Nitin Gupta , Sergey Senozhatsky , Reinette Chatre , Fenghua Yu , Borislav Petkov , X86 ML , "H. Peter Anvin" , lizefan.x@bytedance.com, Johannes Weiner , Daniel Vetter , Bjorn Helgaas , =?UTF-8?Q?Krzysztof_Wilczy=C5=84ski?= , senozhatsky@chromium.org, Christoph Hellwig , Joe Perches , hkallweit1@gmail.com, Jens Axboe , Josh Poimboeuf , Thomas Gleixner , Kees Cook , Steven Rostedt , Peter Zijlstra , linux-spdx@vger.kernel.org, Linux Doc Mailing List , linux-block@vger.kernel.org, linux-fsdevel , linux-kselftest@vger.kernel.org, cgroups@vger.kernel.org, Linux Kernel Mailing List , copyleft-next@lists.fedorahosted.org Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Sep 20, 2021 at 2:17 PM Luis Chamberlain wrote: > > On Mon, Sep 20, 2021 at 01:52:21PM -0700, Dan Williams wrote: > > On Fri, Sep 17, 2021 at 10:05 PM Luis Chamberlain wrote: > > > This deadlock was first reported with the zram driver, however the live > > > patching folks have acknowledged they have observed this as well with > > > live patching, when a live patch is removed. I was then able to > > > reproduce easily by creating a dedicated selftests. > > > > > > A sketch of how this can happen follows: > > > > > > CPU A CPU B > > > whatever_store() > > > module_unload > > > mutex_lock(foo) > > > mutex_lock(foo) > > > del_gendisk(zram->disk); > > > device_del() > > > device_remove_groups() > > > > This flow seems possible to trigger with: > > > > echo $dev > /sys/bus/$bus/drivers/$driver/unbind > > > > I am missing why module pinning > > The aspect of try_module_get() which comes to value to prevent the > deadlock is it ensures kernfs ops do not run once exit is on the way. > > > is part of the solution when it's the > > device_del() path that is racing? > > But its not, the device_del() path will yield until the kernfs op > completes. It is fine to wait there. > > The deadlock happens if a module exit routine uses a lock which is > also used on a sysfs op. If the lock was first held by module exit, > and module exit is waiting for the kernfs op to complete, and the > kernfs op is waiting to hold the same lock then the exit will wait > forever. > > > Module removal is just a more coarse > > grained way to trigger unbind => device_del(). > > Right, but the device_del() path is not sharing a lock with the sysfs op. The deadlock in the example comes from holding a lock over device_del() that is also taken in a kernfs op. For example, the code above looks like something that runs from driver.remove(), not exclusively module_exit(). Yes, module_exit() may trigger driver.remove() via driver_unregister(), but there's other ways to trigger driver.remove() that do not involve module_exit(). > > Isn't the above a bug > > in the driver, not missing synchronization in kernfs? > > We can certainly take the position as an alternative: > > "thou shalt not use a lock on exit which is also used on a syfs op" > > However that seems counter intuitive, specially if we can resolve the > issue easily with a try_module_get(). Again, I don't see how try_module_get() can affect the ABBA failure case of holding a lock over device_del() that is also held inside sysfs op. I agree that the problem is subtle. Does lockdep not complain about this case? If it's going to be avoided in the core it seems try_module_get() does not completely cover the hole that unsuspecting driver writers might fall into.