Date: Thu, 25 Jan 2024 05:20:44 -0800
From: "Paul E. McKenney"
To: Marco Elver
Cc: Alexander Potapenko, quic_charante@quicinc.com, akpm@linux-foundation.org,
	aneesh.kumar@linux.ibm.com, dan.j.williams@intel.com, david@redhat.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, mgorman@techsingularity.net,
	osalvador@suse.de, vbabka@suse.cz, Dmitry Vyukov, kasan-dev@googlegroups.com,
	Ilya Leoshkevich, Nicholas Miehlbradt, rcu@vger.kernel.org
Subject: Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage
Message-ID: <9d94958c-7ab3-4f0d-a718-1f72c1467925@paulmck-laptop>
Reply-To: paulmck@kernel.org
References: <1697202267-23600-1-git-send-email-quic_charante@quicinc.com>
	<20240115184430.2710652-1-glider@google.com>

On Thu, Jan 18, 2024 at 10:43:06AM +0100, Marco Elver wrote:
> On Thu, Jan 18, 2024 at 10:01AM +0100, Alexander Potapenko wrote:
> > >
> > > Hrm, rcu_read_unlock_sched_notrace() can still call
> > > __preempt_schedule_notrace(), which is again instrumented by KMSAN.
> > >
> > > This patch gets me a working kernel:
> >
> > [...]
> >
> > > Disabling interrupts is a little heavy handed - it also assumes the
> > > current RCU implementation. There is
> > > preempt_enable_no_resched_notrace(), but that might be worse because it
> > > breaks scheduling guarantees.
> > >
> > > That being said, whatever we do here should be wrapped in some
> > > rcu_read_lock/unlock_() helper.
> >
> > We could as well redefine rcu_read_lock/unlock in mm/kmsan/shadow.c
> > (or the x86-specific KMSAN header, depending on whether people are
> > seeing the problem on s390 and Power) with some header magic.
> > But that's probably more fragile than adding a helper.
> >
> > > Is there an existing helper we can use? If not, we need a variant that
> > > can be used from extremely constrained contexts that can't even call
> > > into the scheduler. And if we want pfn_valid() to switch to it, it also
> > > should be fast.
>
> The below patch also gets me a working kernel. For pfn_valid(), using
> rcu_read_lock_sched() should be reasonable, given its critical section
> is very small and also enables it to be called from more constrained
> contexts again (like KMSAN).
>
> Within KMSAN we also have to suppress reschedules. This is again not
> ideal, but since it's limited to KMSAN should be tolerable.
>
> WDYT?

I like this one better from a purely selfish RCU perspective.  ;-)

							Thanx, Paul

> ------ >8 ------
>
> diff --git a/arch/x86/include/asm/kmsan.h b/arch/x86/include/asm/kmsan.h
> index 8fa6ac0e2d76..bbb1ba102129 100644
> --- a/arch/x86/include/asm/kmsan.h
> +++ b/arch/x86/include/asm/kmsan.h
> @@ -64,6 +64,7 @@ static inline bool kmsan_virt_addr_valid(void *addr)
>  {
>  	unsigned long x = (unsigned long)addr;
>  	unsigned long y = x - __START_KERNEL_map;
> +	bool ret;
>
>  	/* use the carry flag to determine if x was < __START_KERNEL_map */
>  	if (unlikely(x > y)) {
> @@ -79,7 +80,21 @@ static inline bool kmsan_virt_addr_valid(void *addr)
>  		return false;
>  	}
>
> -	return pfn_valid(x >> PAGE_SHIFT);
> +	/*
> +	 * pfn_valid() relies on RCU, and may call into the scheduler on exiting
> +	 * the critical section. However, this would result in recursion with
> +	 * KMSAN. Therefore, disable preemption here, and re-enable preemption
> +	 * below while suppressing reschedules to avoid recursion.
> +	 *
> +	 * Note, this sacrifices occasionally breaking scheduling guarantees.
> +	 * Although, a kernel compiled with KMSAN has already given up on any
> +	 * performance guarantees due to being heavily instrumented.
> +	 */
> +	preempt_disable();
> +	ret = pfn_valid(x >> PAGE_SHIFT);
> +	preempt_enable_no_resched();
> +
> +	return ret;
>  }
>
>  #endif /* !MODULE */
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 4ed33b127821..a497f189d988 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -2013,9 +2013,9 @@ static inline int pfn_valid(unsigned long pfn)
>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>  		return 0;
>  	ms = __pfn_to_section(pfn);
> -	rcu_read_lock();
> +	rcu_read_lock_sched();
>  	if (!valid_section(ms)) {
> -		rcu_read_unlock();
> +		rcu_read_unlock_sched();
>  		return 0;
>  	}
>  	/*
> @@ -2023,7 +2023,7 @@ static inline int pfn_valid(unsigned long pfn)
>  	 * the entire section-sized span.
>  	 */
>  	ret = early_section(ms) || pfn_section_valid(ms, pfn);
> -	rcu_read_unlock();
> +	rcu_read_unlock_sched();
>
>  	return ret;
>  }
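
For reference, the "wrapped in some rcu_read_lock/unlock_() helper" idea discussed
above could look roughly like the sketch below. The helper names
(rcu_read_lock_sched_noresched(), rcu_read_unlock_sched_noresched(),
pfn_valid_noresched()) are placeholders for illustration only and are not part of
the posted patch, which instead open-codes the same
preempt_disable()/preempt_enable_no_resched() pattern directly in
kmsan_virt_addr_valid(). Only existing primitives are used inside.

/*
 * Sketch only: hypothetical helpers, not part of the posted patch.
 */
static inline void rcu_read_lock_sched_noresched(void)
{
	/* Begin an RCU-sched read-side critical section. */
	preempt_disable();
}

static inline void rcu_read_unlock_sched_noresched(void)
{
	/*
	 * End the critical section without calling into the scheduler, so
	 * instrumented preemption paths (e.g. under KMSAN) cannot recurse
	 * back into the caller.
	 */
	preempt_enable_no_resched();
}

static inline bool pfn_valid_noresched(unsigned long pfn)
{
	bool ret;

	rcu_read_lock_sched_noresched();
	/*
	 * pfn_valid() takes rcu_read_lock_sched() internally; with preemption
	 * already disabled here, its rcu_read_unlock_sched() cannot reschedule,
	 * and neither does the matching unlock below.
	 */
	ret = pfn_valid(pfn);
	rcu_read_unlock_sched_noresched();

	return ret;
}

Whether such helpers would belong in RCU proper or stay local to KMSAN is exactly
the open question raised earlier in the thread.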