Received: by 2002:a05:6a10:9848:0:0:0:0 with SMTP id x8csp1522527pxf; Fri, 12 Mar 2021 11:32:33 -0800 (PST) X-Google-Smtp-Source: ABdhPJz8aJ9nd2agF0ax+vu5ewZDioix0CWhH1XudWKj+Dg9EOX1bYMnyPUHZH7T3CzpqakmwpZv X-Received: by 2002:a50:fd15:: with SMTP id i21mr15858598eds.384.1615577553771; Fri, 12 Mar 2021 11:32:33 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1615577553; cv=none; d=google.com; s=arc-20160816; b=Na8/fA4sByY2Zh2p1enHuSkWwH7p0pCHAtoXVdAKwghoP53d6zB8gvj9jcVNZY6jX1 A4g7634f8MXj1gbiHq2f6X3inPE6OK9MvQG47DGLhF3OXL2/r8Yn6Op0m8vQ92c1h8XH da8RsVU2EDbMdrwSjY9nf+dQ2Mw3sh0vkjtiuaYlLbJVRXAN2uzfwHyAu40H2CyzCanv Sqi3Rjhd+/MiqR3w5ZXaRQduo7OEBf6OLFK/32mxdAnHmIA2QSwjojuFp3iAqC9eMH2I SqNyxOSqFLzJzGsr7oTFkKcwHKHpkhrZxJ3b5bG495zqRD+atbjgPyLtFxOfGJivt64w RqZg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:sender:dkim-signature; bh=xsXCUAe7GzyEFcIbzRGPi+QWjqVbRvQwSdbS7Lm1E4M=; b=cXJSfxeF9loUo36XsmnT05/DKnav5HBIifYU/MktAHb2YoAOeX+24OdmzHKsQsoHsw 0urIBqpgi47oMJzqu324hDmlWnB0PHq7fD1dNi7bcxZf6ztOtTxEJ7A7N6IPYkEFtyBB R+aIvFZ9uaix9VftZ7GnK9n2M0+e9FeX3Ay7shrnBGVN09YoWFu++Peksqskp5+h9w27 Dnd5aczb5IvutU7e7JZXsOz/iHKXTEPn99sUsqOYeP3DSPeFCSxqiNnaA8XJHtaoOaZu /IK3fGdH7V9CSPsWtA7zNcFuE7MEI8JnSqfFiVqjlMgG8RDraYIgq/NrEc8INauxs+4h RFtQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=myFMEhn8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id j10si5086114edj.128.2021.03.12.11.32.10; Fri, 12 Mar 2021 11:32:33 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=myFMEhn8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233912AbhCLT25 (ORCPT + 99 others); Fri, 12 Mar 2021 14:28:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39264 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233260AbhCLT2Z (ORCPT ); Fri, 12 Mar 2021 14:28:25 -0500 Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27046C061574; Fri, 12 Mar 2021 11:28:25 -0800 (PST) Received: by mail-pl1-x636.google.com with SMTP id z5so12408686plg.3; Fri, 12 Mar 2021 11:28:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=xsXCUAe7GzyEFcIbzRGPi+QWjqVbRvQwSdbS7Lm1E4M=; b=myFMEhn81hANx0heDy1GieIbR1tsA44TdlU0yulRozgxvyj5qQ7FTmE1fCBxSfcWE7 cthXXk+g9MsuaAlmFNRgv6U5/VYUVPrmCIDlaKiMBbxvP5YsmpSABPa/W5wsDpF9Al1A NIUyOkqi3aP5gX7p3HcPsV1rvdcOw3ffJHZ9lHAvHil1Y+wUklRd5/6lFTorMIxR4IkC jn9qXU1yP2Rh4trZX5VPSPXXcFSt44P+Rk58R/k3exJsMf6P7TX9L4sZn/QskmENj/7j A4pImP9Z6TQajenglJM/VSAWR+mGoM446jHKvevW1INOeokZsIpFgTYGaUQfhnkuyySz KpvA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to; bh=xsXCUAe7GzyEFcIbzRGPi+QWjqVbRvQwSdbS7Lm1E4M=; b=lGVUpbv+v1uqOnVQBq0Am/yDtDa7kKytlbj7XcqyChFBc1EyCIXIwdbfhvJFQtLm4A DlcCCr5+dtPSWELwgxBvMw9Q1QUK38Ht8II2cAObUj3V/qJrDOewFCUDqctvqG5/z25J BMBPu0VO1DTn1FLLHqwkpgnd8TJyOG7jp+EJ8aggnkoCJ7LBOedS3l1FiaAdSFrQvYEf lu0rAaKjpiTfQGWRNnucQv9zGQOLcPZyv13O0HJbLSA0pmoY0ujI1Pg8Nkz83QUg8lqd Y8J3+wBFTr7eK88ptrZwIKm1Nv0N8TdqldiXFfIfq5LsX3hoISyZ75xi4AKq8zSXMqg6 hF8w== X-Gm-Message-State: AOAM532XzSceOY7Cso2rYXrQ5TgrOdwpFzsSwSmlcRl/HgcFnZXbsM5Y U7rl/7JPP+kTzmU9FfNXnKfGPacgPWY= X-Received: by 2002:a17:902:6a87:b029:e6:6a3d:29e8 with SMTP id n7-20020a1709026a87b02900e66a3d29e8mr83679plk.10.1615577304498; Fri, 12 Mar 2021 11:28:24 -0800 (PST) Received: from google.com ([2620:15c:211:201:8d76:9272:43a9:a6d0]) by smtp.gmail.com with ESMTPSA id b9sm6020236pgn.42.2021.03.12.11.28.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 12 Mar 2021 11:28:23 -0800 (PST) Sender: Minchan Kim Date: Fri, 12 Mar 2021 11:28:21 -0800 From: Minchan Kim To: Luis Chamberlain Cc: gregkh@linuxfoundation.org, ngupta@vflare.org, sergey.senozhatsky.work@gmail.com, axboe@kernel.dk, mbenes@suse.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate Message-ID: References: <20210306022035.11266-1-mcgrof@kernel.org> <20210306022035.11266-2-mcgrof@kernel.org> <20210310212128.GR4332@42.do-not-panic.com> <20210312183238.GW4332@42.do-not-panic.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210312183238.GW4332@42.do-not-panic.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Mar 12, 2021 at 06:32:38PM +0000, Luis Chamberlain wrote: > On Thu, Mar 11, 2021 at 06:14:40PM -0800, Minchan Kim wrote: > > On Wed, Mar 10, 2021 at 09:21:28PM +0000, Luis Chamberlain wrote: > > > On Mon, Mar 08, 2021 at 06:55:30PM -0800, Minchan Kim wrote: > > > > If I understand correctly, bugs you found were related to module > > > > unloading race while the zram are still working. > > > > > > No, that is a simplifcation of the issue. The issue consists of > > > two separate issues: > > > > > > a) race against module unloading in light of incorrect racty use of > > > cpu hotplug multistate support > > > > > > Could you add some pusedo code sequence to show the race cleary? > > Let us deal with each issue one at time. First, let's address > understanding the kernel warning can be reproduced easily by > triggering zram02.sh from LTP twice: > > kernel: ------------[ cut here ]------------ > kernel: Error: Removing state 63 which has instances left. > kernel: WARNING: CPU: 7 PID: 70457 at kernel/cpu.c:2069 __cpuhp_remove_state_cpuslocked+0xf9/0x100 > kernel: Modules linked in: zram(E-) zsmalloc(E) > > The first patch prevents this race. This race is possible because on > module init we associate callbacks for CPU hotplug add / remove: > > static int __init zram_init(void) > { > ... > ret = cpuhp_setup_state_multi(CPUHP_ZCOMP_PREPARE, "block/zram:prepare", > zcomp_cpu_up_prepare, zcomp_cpu_dead); > ... > } > > The zcomp_cpu_dead() accesses the zcom->comp, and if zcomp->comp is > removed and this function is called, clearly we'll be accessing some > random data here and can easily crash afterwards: I am trying to understand this crash. You mentioned the warning above but here mention crash(I understand sysfs race but this is different topic). What's the relation with above warning and crash here? You are talking the warning could change to the crash sometimes? > > int zcomp_cpu_dead(unsigned int cpu, struct hlist_node *node) > { > struct zcomp *comp = hlist_entry(node, struct zcomp, node); > struct zcomp_strm *zstrm; > > zstrm = per_cpu_ptr(comp->stream, cpu); > zcomp_strm_free(zstrm); > return 0; > } > > And zram's syfs reset_store() lets userspace call zram_reset_device() > which calls zcomp_destroy(): > > void zcomp_destroy(struct zcomp *comp) > { > cpuhp_state_remove_instance(CPUHP_ZCOMP_PREPARE, &comp->node); > free_percpu(comp->stream); > kfree(comp); > } Do you mean the race between module unloading and zram_reset_device? If I understood correctly, the approach I put in the first reply would prevent this problem. > > > It would be great if it goes in the description, too since it's > > more clear to show the problme. > > Does the above do it? Rather than that, let's have this kinds of explanation, which is more clear(it's a just example, not describing exact race you saw. You could be better to descibe it). CPU A CPU B moudle_unload destroy_devices zram_remove_cb reset_store zcomp_cpu_dead zram_reset_device zcomp_destroy cpuhp_state_remove_instance zcomp_strm_free > > > > > b) module unload race with sysfs attribute race on *any* driver which > > > has sysfs attributes which also shares the same lock as used during > > > module unload > > > > Yub, that part I missed. Maybe, we need some wrapper to zram sysfs > > to get try_module_get in the warapper funnction and then call sub > > rountine only if it got the refcount. > > > > zram_sysfs_wrapper(func, A, B) > > if (!try_module_get(THIS_MODULE) > > return -ENODEV; > > ret = func(A,B); > > module_put(THIS_MODULE); > > return ret; > > I'd much prefer this be resolved in kernfs later, if you look at the kernel > there are already some drivers which may have realized this requirement > the hard way. Open coding this I think makes the race / intent clearer. > > Right now we have no semantics possible for a generic solution, but I > can work on one later. Yub, no problem. My point is let's have some zram sysfs wrapper to manage module_get/put automatically as I mentioned above so no need to deal with module refcount in worker functions.