Received: by 2002:a05:6a10:6d10:0:0:0:0 with SMTP id gq16csp866340pxb; Fri, 22 Apr 2022 13:00:40 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzku77g+fU3YuD/+n1dQDwXxQI9K/eWLRmqxA6FtRbGk7u2Ygp1aelvNEl9Xu/1iW1ZA8eV X-Received: by 2002:aa7:92c8:0:b0:4fd:c1aa:3e11 with SMTP id k8-20020aa792c8000000b004fdc1aa3e11mr6701265pfa.29.1650657640469; Fri, 22 Apr 2022 13:00:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1650657640; cv=none; d=google.com; s=arc-20160816; b=qmmltKi3EdWwV2PVpy2Oq7HHSI0EmGq+EgHaWFmHKvsmcfvo65RVmJ+b8s5E76KZeI DSCN0+AaqRXMuxOal+Logx1ixFnX96PfUTNDlFv4QHyITGxMLnTzDJiGnoqVILbDAF4B 6MtA7F9JqdfXjoxMi/Mja0bGrjSjQ4eF7vFekgD83fDTVBHkVAFFZYumaujMsMNj3ixK TwZo4yEiLjduOaNX3wDC0c3XFoqCS9ZF0lfjFXnX6RWq2/Q6MX6BChkkXZ/u6wTM/ykt 5ueyo/LIL8qgDM/D06sCxi037v3joViShZsr2VpsqVXg3hLC5Wn6bLpFK+zl5AKUnNiq ybWQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=4vf9qzw6EwWuGtFM8KdW4lKIGlFd+PJWzRjTD+b7BiI=; b=dvvJdzphRJA5Xx3wp8H59QOw+rW/+1JbcjVDRKHDz+bWJSXA2utu0bBqbATcYh0tIo qfZnSDCAnFf3iFW3osLeE5/ieQRXX94JYP+P7FEr01QdvFnXsmm1b2h//8NEYglxgdK0 UVQpL4U56EWwB2gXSqjU4NSr9Iaunyq2Z+hTkyt7kFjqbLqWuNMnqVJHAbGDr1HYbDbM jPT/JoDwP2KI4mhuKdSrlLo9vaNTy6K2a2eLUkkJTu0ojHHBnnaGWu2LMYN1ZofHaCdt drwA5DDxaDSk3z68KR6L+fcU7u1Tq33sYxZeqoWZrmfjjZ8scDvAxhKbQZfHjPRyY6zU hPJA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ZodC3dqX; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id v25-20020a63ac19000000b003aaed4add01si606975pge.422.2022.04.22.13.00.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Apr 2022 13:00:40 -0700 (PDT) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ZodC3dqX; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 2AF5C230D29; Fri, 22 Apr 2022 11:58:56 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355711AbiDSRLU (ORCPT + 99 others); Tue, 19 Apr 2022 13:11:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60110 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1343836AbiDSRKA (ORCPT ); Tue, 19 Apr 2022 13:10:00 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F50C1164 for ; Tue, 19 Apr 2022 10:07:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1650388037; x=1681924037; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HzuyO7S2Ekg+g7XHoS8vhO/ifVxDTWKb+avzh6c7FFM=; b=ZodC3dqXByJS8k+TtY/lFmPAvcDm5mikucM1js787jV/Bwa+4Awuf1Bu h6etn0JQrmtXqhumXNFyLrn0y47IjMcSvSvLohOXyXHsy/mFAOLgeHAEm oHH04r9je2qB96zN49qfldrTLyuF5Ob5tnzb827SqAPjLz3rdd7zP3Wok odA+bA10NDAJyi7qh4zEJFKDfbLhNxe8eE/Hx3tICsmBj8cP29/MGMm4U P65GTXIZcXSOj722jW2g9cRqebxlruA50WF8IpDKGRatf0jCZQzfAuFLk KLpSn1tUCFXepLkSEYPXpd7/o28QfjbRtP31mNGw0fozEMQTGePaQEOwF Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10322"; a="324261496" X-IronPort-AV: E=Sophos;i="5.90,273,1643702400"; d="scan'208";a="324261496" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Apr 2022 10:07:17 -0700 X-IronPort-AV: E=Sophos;i="5.90,273,1643702400"; d="scan'208";a="530498895" Received: from ajacosta-mobl1.amr.corp.intel.com (HELO localhost) ([10.212.11.4]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Apr 2022 10:07:16 -0700 From: ira.weiny@intel.com To: Dave Hansen , "H. Peter Anvin" , Dan Williams Cc: Ira Weiny , Fenghua Yu , Rick Edgecombe , "Shankar, Ravi V" , linux-kernel@vger.kernel.org Subject: [PATCH V10 31/44] memremap_pages: Define pgmap_set_{readwrite|noaccess}() calls Date: Tue, 19 Apr 2022 10:06:36 -0700 Message-Id: <20220419170649.1022246-32-ira.weiny@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220419170649.1022246-1-ira.weiny@intel.com> References: <20220419170649.1022246-1-ira.weiny@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Ira Weiny A thread that wants to access memory protected by PGMAP protections must first enable access, and then disable access when it is done. Introduce pgmap_set_{readwrite|noaccess}() for this purpose. The two calls are destined to be used by the kmap API and take a struct page for convenience. They determine if the page is protected and, if so, perform the requested operation. Toggling between Read/Write and No Access was chosen as it fits well with the accessibility of a kmap'ed page. Discussions did occur regarding making a finer grained mapping for Read Only but that is something which can be added at a later date. In addition, two lower level functions are exported. They take the dev_pagemap object directly for internal consumers who have knowledge of the of the dev_pagemap. All changes in the protections must be through the above calls. They abstract the protection implementation (currently the PKS API) from upper layer consumers. The calls are made nestable by the use of a per task reference count. This ensures that the first call to re-enable protection does not 'break' the last access of the device memory. Expansion of the task struct is unavoidable due to the desire to maintain kmap_local_page() as non-atomic and migratable. The only other idea to track a reference count was in a per-cpu variable. However, doing so would make kmap_local_page() equivalent to kmap_atomic() which is undesirable. Access to device memory during exceptions (#PF) is expected only from user faults. Therefore there is no need to maintain the reference count during exceptions. NOTE: It is not anticipated that any code path will directly nest these calls. For this reason multiple reviewers, including Dan and Thomas, asked why this reference counting was needed at this level rather than in a higher level call such as kmap_local_page(). The reason is that pgmap_set_readwrite() can nest with kmap_{atomic,local_page}(). Therefore this reference counting is pushed to the lower level to ensure that any combination of calls is nestable. Signed-off-by: Ira Weiny --- Changes for V10 Move code from mm.h to memremap.h Upstream separated memremap.h functionality from mm.h dc90f0846df4 ("mm: don't include in ") Changes for V9 From Dan Williams Update the commit message with details on why the thread struct needs to be expanded. Following on Dave Hansens suggestion for pks_mk s/pgmap_mk_*/pgmap_set_*/ Changes for V8 Split these functions into their own patch. This helps to clarify the commit message and usage. --- include/linux/memremap.h | 35 +++++++++++++++++++++++++++++++++++ include/linux/sched.h | 7 +++++++ init/init_task.c | 3 +++ mm/memremap.c | 14 ++++++++++++++ 4 files changed, 59 insertions(+) diff --git a/include/linux/memremap.h b/include/linux/memremap.h index 6325f00096ec..1012c6c4c664 100644 --- a/include/linux/memremap.h +++ b/include/linux/memremap.h @@ -240,8 +240,43 @@ static inline bool devmap_protected(struct page *page) return false; } +void __pgmap_set_readwrite(struct dev_pagemap *pgmap); +void __pgmap_set_noaccess(struct dev_pagemap *pgmap); + +static inline bool pgmap_check_pgmap_prot(struct page *page) +{ + if (!devmap_protected(page)) + return false; + + /* + * There is no known use case to change permissions in an irq for pgmap + * pages + */ + lockdep_assert_in_irq(); + return true; +} + +static inline void pgmap_set_readwrite(struct page *page) +{ + if (!pgmap_check_pgmap_prot(page)) + return; + __pgmap_set_readwrite(page->pgmap); +} + +static inline void pgmap_set_noaccess(struct page *page) +{ + if (!pgmap_check_pgmap_prot(page)) + return; + __pgmap_set_noaccess(page->pgmap); +} + #else +static inline void __pgmap_set_readwrite(struct dev_pagemap *pgmap) { } +static inline void __pgmap_set_noaccess(struct dev_pagemap *pgmap) { } +static inline void pgmap_set_readwrite(struct page *page) { } +static inline void pgmap_set_noaccess(struct page *page) { } + static inline bool pgmap_protection_available(void) { return false; diff --git a/include/linux/sched.h b/include/linux/sched.h index d5e3c00b74e1..7da0d2a0ac74 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1498,6 +1498,13 @@ struct task_struct { struct callback_head l1d_flush_kill; #endif +#ifdef CONFIG_DEVMAP_ACCESS_PROTECTION + /* + * NOTE: pgmap_prot_count is modified within a single thread of + * execution. So it does not need to be atomic_t. + */ + u32 pgmap_prot_count; +#endif /* * New fields for task_struct should be added above here, so that * they are included in the randomized portion of task_struct. diff --git a/init/init_task.c b/init/init_task.c index 73cc8f03511a..948b32cf8139 100644 --- a/init/init_task.c +++ b/init/init_task.c @@ -209,6 +209,9 @@ struct task_struct init_task #ifdef CONFIG_SECCOMP_FILTER .seccomp = { .filter_count = ATOMIC_INIT(0) }, #endif +#ifdef CONFIG_DEVMAP_ACCESS_PROTECTION + .pgmap_prot_count = 0, +#endif }; EXPORT_SYMBOL(init_task); diff --git a/mm/memremap.c b/mm/memremap.c index 215ab9c51917..491bb49255ae 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -94,6 +94,20 @@ static void devmap_protection_disable(void) static_branch_dec(&dev_pgmap_protection_static_key); } +void __pgmap_set_readwrite(struct dev_pagemap *pgmap) +{ + if (!current->pgmap_prot_count++) + pks_set_readwrite(PKS_KEY_PGMAP_PROTECTION); +} +EXPORT_SYMBOL_GPL(__pgmap_set_readwrite); + +void __pgmap_set_noaccess(struct dev_pagemap *pgmap) +{ + if (!--current->pgmap_prot_count) + pks_set_noaccess(PKS_KEY_PGMAP_PROTECTION); +} +EXPORT_SYMBOL_GPL(__pgmap_set_noaccess); + #else /* !CONFIG_DEVMAP_ACCESS_PROTECTION */ static void devmap_protection_enable(void) { } -- 2.35.1