From: ira.weiny@intel.com
To: Dave Hansen, "H. Peter Anvin", Dan Williams
Cc: Ira Weiny, Fenghua Yu, Rick Edgecombe, "Shankar, Ravi V", linux-kernel@vger.kernel.org
Subject: [PATCH V9 40/45] memremap_pages: Define pgmap_set_{readwrite|noaccess}() calls
Date: Thu, 10 Mar 2022 09:20:14 -0800
Message-Id: <20220310172019.850939-41-ira.weiny@intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220310172019.850939-1-ira.weiny@intel.com>
References: <20220310172019.850939-1-ira.weiny@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

From: Ira Weiny

A thread that wants to access memory protected by PGMAP protections must
first enable access, and then disable access when it is done.  Introduce
pgmap_set_{readwrite|noaccess}() for this purpose.

The two calls are destined to be used by the kmap API and take a struct
page for convenience.  They determine if the page is protected and, if
so, perform the requested operation.

Toggling between Read/Write and No Access was chosen as it fits well
with the accessibility of a kmap'ed page.  Discussions did occur
regarding making a finer grained mapping for Read Only but that is
something which can be added at a later date.

In addition, two lower level functions are exported.  They take the
dev_pagemap object directly for internal consumers who have knowledge of
the dev_pagemap.

All changes in the protections must be through the above calls.  They
abstract the protection implementation (currently the PKS API) from
upper layer consumers.

The calls are made nestable by the use of a per task reference count.
This ensures that the first call to re-enable protection does not
'break' the last access of the device memory.

Expansion of the task struct is unavoidable due to the desire to
maintain kmap_local_page() as non-atomic and migratable.  The only other
idea to track a reference count was in a per-cpu variable.  However,
doing so would make kmap_local_page() equivalent to kmap_atomic() which
is undesirable.

Access to device memory during exceptions (#PF) is expected only from
user faults.  Therefore there is no need to maintain the reference count
during exceptions.

NOTE: It is not anticipated that any code path will directly nest these
calls.  For this reason multiple reviewers, including Dan and Thomas,
asked why this reference counting was needed at this level rather than
in a higher level call such as kmap_local_page().  The reason is that
pgmap_set_readwrite() can nest with kmap_{atomic,local_page}().
Therefore this reference counting is pushed to the lower level to ensure
that any combination of calls is nestable.

Signed-off-by: Ira Weiny

---
Changes for V9
	From Dan Williams
		Update the commit message with details on why the thread
		struct needs to be expanded.
	Following on Dave Hansen's suggestion for pks_mk
		s/pgmap_mk_*/pgmap_set_*/

Changes for V8
	Split these functions into their own patch.
		This helps to clarify the commit message and usage.
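
Illustration only, not part of the patch: the nesting behaviour described
in the commit message can be modelled in plain userspace C.  This is a
minimal sketch under stated assumptions.  struct model_task, enum
model_access and every model_*() function are hypothetical stand-ins
(for task_struct, the PKS key state, and the pgmap_set_*() calls
respectively); none of them are kernel APIs.  The underflow assert is an
addition of the sketch and is not present in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the PKS key state; not a kernel API. */
enum model_access { MODEL_NOACCESS, MODEL_READWRITE };

/* Hypothetical stand-in for the per-task state the patch adds. */
struct model_task {
	uint32_t pgmap_prot_count;	/* nesting depth for this task */
	enum model_access access;	/* what pks_set_*() would have set */
};

/* Models __pgmap_set_readwrite(): only the 0 -> 1 transition opens access. */
static void model_set_readwrite(struct model_task *t)
{
	if (!t->pgmap_prot_count++)
		t->access = MODEL_READWRITE;
}

/* Models __pgmap_set_noaccess(): only the 1 -> 0 transition closes access. */
static void model_set_noaccess(struct model_task *t)
{
	/* Sketch-only check: an unbalanced call would wrap the counter. */
	assert(t->pgmap_prot_count > 0);
	if (!--t->pgmap_prot_count)
		t->access = MODEL_NOACCESS;
}

/* Models the struct page wrappers: unprotected pages are left untouched. */
static void model_page_set_readwrite(struct model_task *t, bool page_protected)
{
	if (!page_protected)
		return;
	model_set_readwrite(t);
}

static void model_page_set_noaccess(struct model_task *t, bool page_protected)
{
	if (!page_protected)
		return;
	model_set_noaccess(t);
}
```

In this model, two nested enable calls followed by two disable calls keep
access open until the second disable, which is the property that lets
pgmap_set_readwrite() nest with kmap_{atomic,local_page}().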
---
 include/linux/mm.h    | 35 +++++++++++++++++++++++++++++++++++
 include/linux/sched.h |  7 +++++++
 init/init_task.c      |  3 +++
 mm/memremap.c         | 14 ++++++++++++++
 4 files changed, 59 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4ca24329848a..c85189b24eca 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1168,8 +1168,43 @@ static inline bool devmap_protected(struct page *page)
 	return false;
 }
 
+void __pgmap_set_readwrite(struct dev_pagemap *pgmap);
+void __pgmap_set_noaccess(struct dev_pagemap *pgmap);
+
+static inline bool pgmap_check_pgmap_prot(struct page *page)
+{
+	if (!devmap_protected(page))
+		return false;
+
+	/*
+	 * There is no known use case to change permissions in an irq for pgmap
+	 * pages
+	 */
+	lockdep_assert_in_irq();
+	return true;
+}
+
+static inline void pgmap_set_readwrite(struct page *page)
+{
+	if (!pgmap_check_pgmap_prot(page))
+		return;
+	__pgmap_set_readwrite(page->pgmap);
+}
+
+static inline void pgmap_set_noaccess(struct page *page)
+{
+	if (!pgmap_check_pgmap_prot(page))
+		return;
+	__pgmap_set_noaccess(page->pgmap);
+}
+
 #else
 
+static inline void __pgmap_set_readwrite(struct dev_pagemap *pgmap) { }
+static inline void __pgmap_set_noaccess(struct dev_pagemap *pgmap) { }
+static inline void pgmap_set_readwrite(struct page *page) { }
+static inline void pgmap_set_noaccess(struct page *page) { }
+
 static inline bool pgmap_protection_available(void)
 {
 	return false;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 75ba8aa60248..a79f2090e291 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1492,6 +1492,13 @@ struct task_struct {
 	struct callback_head l1d_flush_kill;
 #endif
 
+#ifdef CONFIG_DEVMAP_ACCESS_PROTECTION
+	/*
+	 * NOTE: pgmap_prot_count is modified within a single thread of
+	 * execution.  So it does not need to be atomic_t.
+	 */
+	u32 pgmap_prot_count;
+#endif
 	/*
 	 * New fields for task_struct should be added above here, so that
 	 * they are included in the randomized portion of task_struct.
diff --git a/init/init_task.c b/init/init_task.c
index 73cc8f03511a..948b32cf8139 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -209,6 +209,9 @@ struct task_struct init_task
 #ifdef CONFIG_SECCOMP_FILTER
 	.seccomp = { .filter_count = ATOMIC_INIT(0) },
 #endif
+#ifdef CONFIG_DEVMAP_ACCESS_PROTECTION
+	.pgmap_prot_count = 0,
+#endif
 };
 EXPORT_SYMBOL(init_task);
 
diff --git a/mm/memremap.c b/mm/memremap.c
index cefdf541bcc1..6fa259748a0b 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -95,6 +95,20 @@ static void devmap_protection_disable(void)
 	static_branch_dec(&dev_pgmap_protection_static_key);
 }
 
+void __pgmap_set_readwrite(struct dev_pagemap *pgmap)
+{
+	if (!current->pgmap_prot_count++)
+		pks_set_readwrite(PKS_KEY_PGMAP_PROTECTION);
+}
+EXPORT_SYMBOL_GPL(__pgmap_set_readwrite);
+
+void __pgmap_set_noaccess(struct dev_pagemap *pgmap)
+{
+	if (!--current->pgmap_prot_count)
+		pks_set_noaccess(PKS_KEY_PGMAP_PROTECTION);
+}
+EXPORT_SYMBOL_GPL(__pgmap_set_noaccess);
+
 #else /* !CONFIG_DEVMAP_ACCESS_PROTECTION */
 
 static void devmap_protection_enable(void) { }
-- 
2.35.1