Date: Tue, 6 Apr 2021 00:29:09 +0000
From: Luis Chamberlain
To: Minchan Kim
Cc: keescook@chromium.org, dhowells@redhat.com, hch@infradead.org,
    mbenes@suse.com, gregkh@linuxfoundation.org, ngupta@vflare.org,
    sergey.senozhatsky.work@gmail.com, axboe@kernel.dk,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
Message-ID: <20210406002909.GY4332@42.do-not-panic.com>
References: <20210312183238.GW4332@42.do-not-panic.com>
 <20210319190924.GK4332@42.do-not-panic.com>
 <20210322204156.GM4332@42.do-not-panic.com>
 <20210401235925.GR4332@42.do-not-panic.com>
 <20210405190023.GX4332@42.do-not-panic.com>

On Mon, Apr 05, 2021 at 12:58:05PM -0700, Minchan Kim wrote:
> On Mon, Apr 05, 2021 at 07:00:23PM +0000, Luis Chamberlain wrote:
> > On Mon, Apr 05, 2021 at 10:07:24AM -0700, Minchan Kim wrote:
> > > On Thu, Apr 01, 2021 at 11:59:25PM +0000, Luis Chamberlain wrote:
> > > > And come to think of it the last patch I had sent with a new
> > > > DECLARE_RWSEM(zram_unload) also has this same issue making most
> > > > sysfs attributes rather fragile.
> > >
> > > Thanks for looking the way. I agree the single zram_index_rwlock is
> > > not the right approach to fix it. However, I still hope we find more
> > > generic solution to fix them at once since I see it's zram instance
> > > racing problem.
> >
> > They are 3 separate different problems. Related, but different.
>
> What are 3 different problems? I am asking since I remember only two:
> one for CPU multistate and the other one for sysfs during rmmod.

The third one is the race between use of the sysfs attributes and those
routines then dereferencing the gendisk private_data.

> > If the idea then is to busy out rmmod if a sysfs attribute is being
> > read, that could then mean rmmod can sometimes never complete. Hogging
> > up / busying out sysfs attributes means the module cannot be removed.
>
> It's true but is it a big problem? There are many cases that system
> just return error if it's busy and rely on the admin. IMHO, rmmod should
> be part of them.

It depends on the existing userspace scripts used for testing and the
expectations they set. Consider the existing tests; you would know
better, and since you are the maintainer, you decide. I at least know
that for many other types of device drivers an rmmod is a sledgehammer.
I just thought it would be good to highlight the effect now rather than
us considering it later.

> > Which is why the *try_module_get()* I think is much more suitable, as
> > it will always fail if we're already going down.
>
> How does the try_module_get solved the problem?

The try_module_get() part only resolves the deadlock. The bdget() /
bdput() part resolves the race when accessing the gendisk private_data.
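To make the deadlock side of that concrete, below is a rough, untested
sketch of what I mean. The "foo" attribute and its body are made up
purely for illustration; this is not the actual zram patch:

#include <linux/module.h>
#include <linux/device.h>

/*
 * Rough sketch: guard a sysfs attribute handler with a module reference
 * so the read bails out once the module is already going away.  The
 * "foo" attribute is hypothetical, not an actual zram attribute.
 */
static ssize_t foo_show(struct device *dev,
			struct device_attribute *attr, char *buf)
{
	ssize_t ret;

	/* Fails once module exit has begun, so we never block rmmod. */
	if (!try_module_get(THIS_MODULE))
		return -ENODEV;

	ret = sprintf(buf, "%d\n", 42);	/* placeholder for the real work */

	module_put(THIS_MODULE);
	return ret;
}
static DEVICE_ATTR_RO(foo);

With that, reading the attribute simply returns -ENODEV once module
removal has started, instead of pinning the module.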
> > > I see one of the problems is how I could make new zram object's
> > > attribute group for zram knobs under /sys/block/zram0 since block
> > > layer already made zram0 kobject via device_add_disk.
> >
> > Right.. well the sysfs attribute races uncovered here actually do
> > apply to any block driver as well. And which is why I was aiming
> > for something generic if possible.
>
> It would be great but that's not the one we have atm so want to
> proceed to fix anyway.

What is not the one we have atm? I *do* have a proposed generic
solution for 2 of the 3 issues we have been discussing:

  a) the deadlock on sysfs access
  b) the gendisk private_data race

But so far Greg does not see enough justification for a), so we can
either show how widespread this issue is (which I can do using
coccinelle), or we just open code the try_module_get() / put on each
driver that needs it for now. Either way it would resolve the issue.

As for b), given that I think even you had missed my attempt to
generalize the bdget() / bdput() solution for any attribute type (did
you see my dev_type_get() and dev_type_put() proposed changes?), I
don't think this problem is yet well defined in a generic way for us
to rule it out. It is however easier to simply open code this per
driver that needs it for now, given that I don't think Greg is yet
convinced the deadlock is a widespread issue. I however am pretty sure
both races *do* exist outside of zram in many places.

> > I am not sure if you missed the last hunks of the generic solution,
> > but that would resolve the issue you noted. Here is the same approach
> > but in a non-generic solution, specific to just one attribute so far
> > and to zram:
>
> So idea is refcount of the block_device's inode

Yes, that by itself prevents races where the gendisk private_data goes
invalid underneath us. Why? Because a bdget() would not be successful
after del_gendisk():

  del_gendisk()
    --> invalidate_partition()
      --> __invalidate_device()
        --> invalidate_inodes()

> and module_exit path
> checks also the inode refcount to make rmmod failure?

The try_module_get() approach resolves the deadlock race, but it does
so in a lazy way. I mean lazy in that rmmod then wins over the sysfs
knobs. So touching sysfs knobs won't make an rmmod fail. I think that's
the more typically expected behaviour. Why? Because I find it odd that
looping forever touching sysfs attributes should prevent a module
removal. But that's a personal preference.

  Luis
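P.S. For completeness, the bdget() / bdput() side could look roughly
like the sketch below. Again this is only an illustration and untested;
the "bar" attribute is made up, and the real change would either be per
attribute or go through the generic dev_type_get() / dev_type_put()
helpers mentioned above:

#include <linux/fs.h>
#include <linux/genhd.h>
#include "zram_drv.h"	/* for struct zram; assumes this lives in the zram driver */

/*
 * Rough sketch: pin the block_device before dereferencing the gendisk
 * private_data.  Per the del_gendisk() chain above, bdget_disk() should
 * no longer succeed once the disk is being torn down.  The "bar"
 * attribute is hypothetical, not an actual zram attribute.
 */
static ssize_t bar_show(struct device *dev,
			struct device_attribute *attr, char *buf)
{
	struct gendisk *disk = dev_to_disk(dev);
	struct block_device *bdev;
	struct zram *zram;
	ssize_t ret;

	bdev = bdget_disk(disk, 0);
	if (!bdev)
		return -ENODEV;

	zram = disk->private_data;	/* used only while we hold the bdev reference */
	ret = sprintf(buf, "%llu\n", (unsigned long long)zram->disksize);

	bdput(bdev);
	return ret;
}
static DEVICE_ATTR_RO(bar);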