From: Rick Edgecombe <rick.p.edgecombe@intel.com>
To: x86@kernel.org, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org,
    Arnd Bergmann, Andy Lutomirski, Balbir Singh, Borislav Petkov,
    Cyrill Gorcunov, Dave Hansen, Eugene Syromiatnikov, Florian Weimer,
    "H. J. Lu", Jann Horn, Jonathan Corbet, Kees Cook, Mike Kravetz,
    Nadav Amit, Oleg Nesterov, Pavel Machek, Peter Zijlstra, Randy Dunlap,
    "Ravi V. Shankar", Weijiang Yang,
Shutemov" , joao.moreira@intel.com, John Allen , kcc@google.com, eranian@google.com, rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com Cc: rick.p.edgecombe@intel.com, Yu-cheng Yu Subject: [PATCH v2 12/39] x86/mm: Update ptep_set_wrprotect() and pmdp_set_wrprotect() for transition from _PAGE_DIRTY to _PAGE_COW Date: Thu, 29 Sep 2022 15:29:09 -0700 Message-Id: <20220929222936.14584-13-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220929222936.14584-1-rick.p.edgecombe@intel.com> References: <20220929222936.14584-1-rick.p.edgecombe@intel.com> X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Yu-cheng Yu When Shadow Stack is in use, Write=0,Dirty=1 PTE are reserved for shadow stack. Copy-on-write PTes then have Write=0,Cow=1. When a PTE goes from Write=1,Dirty=1 to Write=0,Cow=1, it could become a transient shadow stack PTE in two cases: The first case is that some processors can start a write but end up seeing a Write=0 PTE by the time they get to the Dirty bit, creating a transient shadow stack PTE. However, this will not occur on processors supporting Shadow Stack, and a TLB flush is not necessary. The second case is that when _PAGE_DIRTY is replaced with _PAGE_COW non- atomically, a transient shadow stack PTE can be created as a result. Thus, prevent that with cmpxchg. Dave Hansen, Jann Horn, Andy Lutomirski, and Peter Zijlstra provided many insights to the issue. Jann Horn provided the cmpxchg solution. Signed-off-by: Yu-cheng Yu Co-developed-by: Rick Edgecombe Signed-off-by: Rick Edgecombe --- v2: - Compile out some code due to clang build error - Clarify commit log (dhansen) - Normalize PTE bit descriptions between patches (dhansen) - Update comment with text from (dhansen) Yu-cheng v30: - Replace (pmdval_t) cast with CONFIG_PGTABLE_LEVELES > 2 (Borislav Petkov). arch/x86/include/asm/pgtable.h | 36 ++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 2f2963429f48..58c7bf9d7392 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1287,6 +1287,23 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { +#ifdef CONFIG_X86_SHADOW_STACK + /* + * Avoid accidentally creating shadow stack PTEs + * (Write=0,Dirty=1). Use cmpxchg() to prevent races with + * the hardware setting Dirty=1. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + pte_t old_pte, new_pte; + + old_pte = READ_ONCE(*ptep); + do { + new_pte = pte_wrprotect(old_pte); + } while (!try_cmpxchg(&ptep->pte, &old_pte.pte, new_pte.pte)); + + return; + } +#endif clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte); } @@ -1339,6 +1356,25 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { +#ifdef CONFIG_X86_SHADOW_STACK + /* + * If Shadow Stack is enabled, pmd_wrprotect() moves _PAGE_DIRTY + * to _PAGE_COW (see comments at pmd_wrprotect()). 
 arch/x86/include/asm/pgtable.h | 36 ++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2f2963429f48..58c7bf9d7392 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1287,6 +1287,23 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pte_t *ptep)
 {
+#ifdef CONFIG_X86_SHADOW_STACK
+	/*
+	 * Avoid accidentally creating shadow stack PTEs
+	 * (Write=0,Dirty=1). Use cmpxchg() to prevent races with
+	 * the hardware setting Dirty=1.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_SHSTK)) {
+		pte_t old_pte, new_pte;
+
+		old_pte = READ_ONCE(*ptep);
+		do {
+			new_pte = pte_wrprotect(old_pte);
+		} while (!try_cmpxchg(&ptep->pte, &old_pte.pte, new_pte.pte));
+
+		return;
+	}
+#endif
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
 }
 
@@ -1339,6 +1356,25 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pmd_t *pmdp)
 {
+#ifdef CONFIG_X86_SHADOW_STACK
+	/*
+	 * If Shadow Stack is enabled, pmd_wrprotect() moves _PAGE_DIRTY
+	 * to _PAGE_COW (see comments at pmd_wrprotect()).
+	 * When a thread reads a RW=1, Dirty=0 PMD and before changing it
+	 * to RW=0, Dirty=0, another thread could have written to the page
+	 * and the PMD is RW=1, Dirty=1 now.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_SHSTK)) {
+		pmd_t old_pmd, new_pmd;
+
+		old_pmd = READ_ONCE(*pmdp);
+		do {
+			new_pmd = pmd_wrprotect(old_pmd);
+		} while (!try_cmpxchg(&pmdp->pmd, &old_pmd.pmd, new_pmd.pmd));
+
+		return;
+	}
+#endif
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp);
 }
 
-- 
2.17.1