X-Mailing-List: linux-kernel@vger.kernel.org
References: <1697202267-23600-1-git-send-email-quic_charante@quicinc.com> <20240115184430.2710652-1-glider@google.com>
In-Reply-To: <20240115184430.2710652-1-glider@google.com>
From: Marco Elver
Date: Mon, 15 Jan 2024 21:34:35 +0100
Message-ID:
Subject: Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage
To: Alexander Potapenko
Cc: quic_charante@quicinc.com, akpm@linux-foundation.org,
 aneesh.kumar@linux.ibm.com, dan.j.williams@intel.com, david@redhat.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 mgorman@techsingularity.net, osalvador@suse.de, vbabka@suse.cz,
 "Paul E. McKenney", Dmitry Vyukov, kasan-dev@googlegroups.com,
 Ilya Leoshkevich, Nicholas Miehlbradt
Content-Type: text/plain; charset="UTF-8"

On Mon, 15 Jan 2024 at 19:44, Alexander Potapenko wrote:
>
> Cc: "Paul E. McKenney"
> Cc: Marco Elver
> Cc: Dmitry Vyukov
> Cc: kasan-dev@googlegroups.com
> Cc: Ilya Leoshkevich
> Cc: Nicholas Miehlbradt
>
> Hi folks,
>
> (adding KMSAN reviewers and IBM people who are currently porting KMSAN to other
> architectures, plus Paul for his opinion on refactoring RCU)
>
> this patch broke x86 KMSAN in a subtle way.
>
> For every memory access in the code instrumented by KMSAN we call
> kmsan_get_metadata() to obtain the metadata for the memory being accessed. For
> virtual memory the metadata pointers are stored in the corresponding `struct
> page`, therefore we need to call virt_to_page() to get them.
>
> According to the comment in arch/x86/include/asm/page.h, virt_to_page(kaddr)
> returns a valid pointer iff virt_addr_valid(kaddr) is true, so KMSAN needs to
> call virt_addr_valid() as well.
>
> To avoid recursion, kmsan_get_metadata() must not call instrumented code,
> therefore ./arch/x86/include/asm/kmsan.h forks parts of arch/x86/mm/physaddr.c
> to check whether a virtual address is valid or not.
>
> But the introduction of rcu_read_lock() to pfn_valid() added instrumented RCU
> API calls to virt_to_page_or_null(), which is called by kmsan_get_metadata(),
> so there is an infinite recursion now. I do not think it is correct to stop that
> recursion by doing kmsan_enter_runtime()/kmsan_exit_runtime() in
> kmsan_get_metadata(): that would prevent instrumented functions called from
> within the runtime from tracking the shadow values, which might introduce false
> positives.
>
> I am currently looking into inlining __rcu_read_lock()/__rcu_read_unlock(), into
> KMSAN code to prevent it from being instrumented, but that might require factoring
> out parts of kernel/rcu/tree_plugin.h into a non-private header. Do you think this
> is feasible?

__rcu_read_lock/unlock() is only outlined in PREEMPT_RCU. Not sure that helps.

Otherwise, there is rcu_read_lock_sched_notrace() which does the bare
minimum and is static inline.

Does that help?

> Another option is to cut some edges in the code calling virt_to_page(). First,
> my observation is that virt_addr_valid() is quite rare in the kernel code, i.e.
> not all cases of calling virt_to_page() are covered with it. Second, every
> memory access to KMSAN metadata residing in virt_to_page(kaddr)->shadow always
> accompanies an access to `kaddr` itself, so if there is a race on a PFN then
> the access to `kaddr` will probably also trigger a fault. Third, KMSAN metadata
> accesses are inherently non-atomic, and even if we ensure pfn_valid() is
> returning a consistent value for a single memory access, calling it twice may
> already return different results.
>
> Considering the above, how bad would it be to drop synchronization for KMSAN's
> version of pfn_valid() called from kmsan_virt_addr_valid()?
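
To make the recursion discussed above concrete, here is a minimal
user-space sketch in C. It is a toy model and not kernel code: every
name in it (get_metadata, pfn_valid_model, rcu_lock_instrumented,
rcu_lock_notrace, max_nesting, MAX_DEPTH) is a hypothetical stand-in,
and the depth cap exists only so that the model terminates instead of
overflowing the stack. It assumes the simplest possible picture: each
instrumented access calls the metadata hook, and the hook calls a
pfn_valid()-like check that takes a read-side lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cap: lets the model stop where the real recursion would not. */
#define MAX_DEPTH 8

static int depth;     /* current nesting of get_metadata() */
static int max_seen;  /* deepest nesting observed so far */

static bool get_metadata(bool use_notrace);

/* Models an instrumented memory access: it always asks for metadata first. */
static void instrumented_access(bool use_notrace)
{
	(void)get_metadata(use_notrace);
}

/* Models an out-of-line, instrumented lock: its own accesses are hooked. */
static void rcu_lock_instrumented(bool use_notrace)
{
	instrumented_access(use_notrace);
}

/* Models a static inline, uninstrumented lock: no hook is ever called. */
static inline void rcu_lock_notrace(void)
{
}

/* Models a pfn_valid()-like check that takes a read-side lock. */
static bool pfn_valid_model(bool use_notrace)
{
	if (use_notrace)
		rcu_lock_notrace();
	else
		rcu_lock_instrumented(use_notrace);
	return true;
}

/* Models the metadata hook, which needs the pfn_valid()-like check. */
static bool get_metadata(bool use_notrace)
{
	bool ok;

	if (depth >= MAX_DEPTH)
		return false; /* model-only bailout, see MAX_DEPTH */
	depth++;
	if (depth > max_seen)
		max_seen = depth;
	ok = pfn_valid_model(use_notrace);
	depth--;
	return ok;
}

/* Returns the deepest get_metadata() nesting reached by one access. */
int max_nesting(bool use_notrace)
{
	depth = 0;
	max_seen = 0;
	instrumented_access(use_notrace);
	assert(depth == 0); /* every entry into the hook was unwound */
	return max_seen;
}
```

In this model max_nesting(false) runs into the cap, while
max_nesting(true) enters the hook exactly once. That is only meant to
show why an uninstrumented, inlined lock primitive breaks the cycle; it
says nothing about whether the resulting synchronization is sufficient.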