Date: Thu, 10 Feb 2022 11:27:34 -0800
From: Dave Hansen
To: Rick Edgecombe, x86@kernel.org, "H . Peter Anvin", Thomas Gleixner,
 Ingo Molnar, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org,
 Arnd Bergmann, Andy Lutomirski, Balbir Singh, Borislav Petkov,
 Cyrill Gorcunov, Dave Hansen, Eugene Syromiatnikov, Florian Weimer,
 "H . J . Lu", Jann Horn, Jonathan Corbet, Kees Cook, Mike Kravetz,
 Nadav Amit, Oleg Nesterov, Pavel Machek, Peter Zijlstra, Randy Dunlap,
 "Ravi V . Shankar", Dave Martin, Weijiang Yang, "Kirill A . Shutemov",
 joao.moreira@intel.com, John Allen, kcc@google.com, eranian@google.com
Cc: Yu-cheng Yu
Subject: Re: [PATCH 21/35] mm/mprotect: Exclude shadow stack from preserve_write
In-Reply-To: <20220130211838.8382-22-rick.p.edgecombe@intel.com>
References: <20220130211838.8382-1-rick.p.edgecombe@intel.com>
 <20220130211838.8382-22-rick.p.edgecombe@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/30/22 13:18, Rick Edgecombe wrote:
> In change_pte_range(), when a PTE is changed for prot_numa, _PAGE_RW is
> preserved to avoid the additional write fault after the NUMA hinting fault.
> However, pte_write() now includes both normal writable and shadow stack
> (RW=0, Dirty=1) PTEs, but the latter does not have _PAGE_RW and has no need
> to preserve it.

This series creates an interesting situation: it causes a logical
disconnection between things that were tightly coupled before.

For instance, before this series, _PAGE_RW=1 and "writable" really were
synonyms. They meant the same thing. One of the complexities in this
series is differentiating the two. For instance, a shadow stack page
can be written to, even though it has _PAGE_RW=0.

This particular patch seems to be hacking around the problem that a
p*_mkwrite() doesn't work on shadow stack PTE/PMDs.

First, that makes me wonder what *actually* happens if we do a plain
pte_mkwrite() on a shadow stack PTE. I *think* it will take the
[Write=0,Dirty=1] PTE and

	pte = pte_set_flags(pte, _PAGE_RW);

so we'll end up with [Write=1,Dirty=1], which is bad.

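To make the [Write=0,Dirty=1] -> [Write=1,Dirty=1] transition concrete, here is a minimal user-space sketch of the bit manipulation described above. It is a model under stated assumptions, not kernel code: every model_* / MODEL_* name is hypothetical, the PTE is reduced to a bare 64-bit word, and only the RW (bit 1) and Dirty (bit 6) positions are taken from x86.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; bit positions follow x86 (RW=bit 1, Dirty=bit 6). */
#define MODEL_PAGE_RW    (UINT64_C(1) << 1)
#define MODEL_PAGE_DIRTY (UINT64_C(1) << 6)

typedef uint64_t model_pte_t;

/* Shadow stack encoding as described in the thread: RW=0, Dirty=1. */
static bool model_pte_shstk(model_pte_t pte)
{
	return (pte & (MODEL_PAGE_RW | MODEL_PAGE_DIRTY)) == MODEL_PAGE_DIRTY;
}

/* "Writable" in the post-series sense: normal RW=1 or shadow stack. */
static bool model_pte_write(model_pte_t pte)
{
	return (pte & MODEL_PAGE_RW) || model_pte_shstk(pte);
}

/* A plain mkwrite that only sets the RW bit. */
static model_pte_t model_pte_mkwrite(model_pte_t pte)
{
	return pte | MODEL_PAGE_RW;
}

/* Walk through the case in question. */
static void model_demo(void)
{
	model_pte_t shstk = MODEL_PAGE_DIRTY;          /* [Write=0,Dirty=1] */
	model_pte_t after = model_pte_mkwrite(shstk);  /* [Write=1,Dirty=1] */

	assert(model_pte_write(shstk));   /* already reported as writable */
	assert(!model_pte_shstk(after));  /* no longer a shadow stack PTE */
	assert(after == (MODEL_PAGE_RW | MODEL_PAGE_DIRTY));
}
```

In this model the mkwrite silently turns the shadow stack encoding into an ordinary writable, dirty PTE, which is the outcome being worried about.
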
Let's say pte_mkwrite() can't be fixed. We should probably make it
VM_BUG_ON() if it's ever asked to muck with a shadow stack PTE.

It's also weird because we have this pte_write()==1 PTE in a !VM_WRITE
VMA. Then, we're trying to pte_mkwrite() under this !VM_WRITE VMA.

	pte_write()   <-- returns true for a shadow stack PTE!
	pte_mkwrite() <-- illegal on a shadow stack PTE

I need to think about this a little more. I don't have a solution.
But, as-is, it seems untenable. The rules are just too counterintuitive
to live.
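
For illustration only, the "refuse to muck with a shadow stack PTE" idea could be sketched in user space roughly like this. All model_* / MODEL_* names are hypothetical, the PTE is modeled as a bare 64-bit word, and the guard returns false where a kernel version would presumably hit VM_BUG_ON(); this is not the series' actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; bit positions follow x86 (RW=bit 1, Dirty=bit 6). */
#define MODEL_PAGE_RW    (UINT64_C(1) << 1)
#define MODEL_PAGE_DIRTY (UINT64_C(1) << 6)

typedef uint64_t model_pte_t;

/* Shadow stack encoding as described in the thread: RW=0, Dirty=1. */
static bool model_pte_shstk(model_pte_t pte)
{
	return (pte & (MODEL_PAGE_RW | MODEL_PAGE_DIRTY)) == MODEL_PAGE_DIRTY;
}

/*
 * Guarded mkwrite: rejects shadow stack PTEs instead of rewriting them.
 * Returning false stands in for VM_BUG_ON() so the model stays testable.
 * On rejection *out is left untouched.
 */
static bool model_checked_pte_mkwrite(model_pte_t pte, model_pte_t *out)
{
	if (model_pte_shstk(pte))
		return false;
	*out = pte | MODEL_PAGE_RW;
	return true;
}

/* Exercise both the rejected and the accepted case. */
static void model_guard_demo(void)
{
	model_pte_t out = 0;

	assert(!model_checked_pte_mkwrite(MODEL_PAGE_DIRTY, &out));
	assert(out == 0);
	assert(model_checked_pte_mkwrite(0, &out));
	assert(out == MODEL_PAGE_RW);
}
```

The guard only catches the misuse; it does not resolve the underlying mismatch between pte_write() and pte_mkwrite() on shadow stack PTEs.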