Received: by 2002:a05:6a10:1a4d:0:0:0:0 with SMTP id nk13csp3053787pxb; Thu, 10 Feb 2022 11:08:23 -0800 (PST) X-Google-Smtp-Source: ABdhPJwRC3g2tosX3J9F314klAqP48Vh4JjRHLhzvB9h7/jEDrksRsDIlhwthj78KNtNTZ8vtYj/ X-Received: by 2002:a17:902:ea01:: with SMTP id s1mr2409577plg.133.1644520103664; Thu, 10 Feb 2022 11:08:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1644520103; cv=none; d=google.com; s=arc-20160816; b=TlY+c52iPYaYoHp6tEKjKSiuEFii3GfGaMiFJP8fo1Mb0MpwAhkYInbBwDvNjnrF1P AYQQ1dITeRd3gmUqrKbeyYj4d8Jhh/6lr3UBbjgQGm0rFVoxgPv0ZlESZ04vicHbvUo9 i811M4E+TblpUaVKJEN0d6IV33gdsNJgIMXxd5bEhPt2zM/Sp5Y7v4E45ux6hVD8y/iI nTrVUkCjflMzTCs6Occ2CVYRxCrphI9zqACit6O4PZuZmWtOgZUhkJm7nUOOBnpvzejX 3OLNhKtKUqTfgaCaSS0GWuzcdTfZbSc32h7z2EFWOae0FXIFH2NyvK3YmNs588Z2yIMh egTw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=JGRdrr1G+d7hefqfddrrQW5S3m6cjrI+wEU2MLhyvxg=; b=OvKId3SYl9QF9nFtiLa/HOa99mxY/P5qGwLCg1ayR8EEpCvUTM7knaGqGJFUzYQaIh if4GXETWfmngUpvsg2V550JMQn103in4W2HUiQ7PvhY2A13cPAC4s6Z+9zq4cBWvmkxg uwIY9HOvV7fsWV8oPJMNr7Js+KbpworE+28SgkHBEHKOFtcsk3BX9rvURCUehCwtD4Nb hc3UjQSx4TN4p/s+VJm0MMb2ElTADnJ2MRPRzt4+K+iXSVkxJwEQ8mOwIMk6lFKHUL2B GVoPIIg6YSjPBJAxGtpE/iCjXt6v5Hox2jB81aiJkx8hI8lfXDDXbQE5vhIv69bY2nO9 dDKQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=mit.edu Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gq4si2091761pjb.84.2022.02.10.11.08.11; Thu, 10 Feb 2022 11:08:23 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=mit.edu Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243840AbiBJPzH (ORCPT + 99 others); Thu, 10 Feb 2022 10:55:07 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:45304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238388AbiBJPzG (ORCPT ); Thu, 10 Feb 2022 10:55:06 -0500 Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A9626C2D; Thu, 10 Feb 2022 07:55:06 -0800 (PST) Received: from cwcc.thunk.org (pool-108-7-220-252.bstnma.fios.verizon.net [108.7.220.252]) (authenticated bits=0) (User authenticated as tytso@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 21AFsWxL026635 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 10 Feb 2022 10:54:33 -0500 Received: by cwcc.thunk.org (Postfix, from userid 15806) id 97AF715C0040; Thu, 10 Feb 2022 10:54:32 -0500 (EST) Date: Thu, 10 Feb 2022 10:54:32 -0500 From: "Theodore Ts'o" To: "Paul E. McKenney" Cc: Jan Kara , Qian Cai , Jan Kara , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , linux-ext4@vger.kernel.org, rcu@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [RFC PATCH] jbd2: avoid __GFP_ZERO with SLAB_TYPESAFE_BY_RCU Message-ID: References: <20220209165742.5659-1-quic_qiancai@quicinc.com> <20220209181010.gfn66rvip56i54df@quack3.lan> <20220209201137.GY4285@paulmck-ThinkPad-P17-Gen-1> <20220210054357.GH4285@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220210054357.GH4285@paulmck-ThinkPad-P17-Gen-1> X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Feb 09, 2022 at 09:43:57PM -0800, Paul E. McKenney wrote: > So the typical RCU approach (not involving SLAB_TYPESAFE_BY_RCU) > is to take the grace period at the time of the free. This can be > done synchronously using synchronize_rcu(), but is often instead done > asynchronously using call_rcu() or kfree_rcu(). So in this case, > you don't need synchronize_rcu() on allocation because the required > grace period already happened at *free() time. > > But there are a few situations where it makes sense to free blocks that > readers might still be referencing. Readers must then add validity > checks to detect this case, and also prevent freeing, for example, > using a per-block spinlock for synchronization. For example, a reader > might acquire a spinlock in the block to prevent changes, recheck the > lookup key, and if the key does not match, release the lock and pretend > not to have found the block. If the key does match, anything attempting > to delete and free the block will be spinning on that same spinlock. > > And so if you specify SLAB_TYPESAFE_BY_RCU, the slab allocator is > guaranteeing type safety to RCU readers instead of the usual existence > guarantee. A memory block might be freed out from under an RCU reader, > but its type will remain the same. This means that the grace period > happens internally to the slab allocator when a slab is returned to > the system. > > So either the validation checks are quite novel, the kmem_cache_zalloc() > calls should be replaced by kmem_cache_alloc() plus validation checks, > or the SLAB_TYPESAFE_BY_RCU should be removed. > > Just out of curiosity, what is your mental model of SLAB_TYPESAFE_BY_RCU? Hmm, so the code in question the flag was called SLAB_DESTROY_BY_RCU in June 205 by commit de92c8caf16c ("jbd2: speedup jbd2_journal_get_[write|undo]_access()"), and it was written by Jan. I don't see anything to make sure the jh doesn't get freed until after the grace period, and so that looks like a problem unless I'm missing something. Jan, what do you think? - Ted