Received: by 2002:a05:6a10:1d13:0:0:0:0 with SMTP id pp19csp2742708pxb; Tue, 24 Aug 2021 06:37:55 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzUmPxeHqqtNNpBIAJSubBq0EKfxJ9fBqhVPmBYQlqIVn2tOsf7OXenYtn2aFQpfXWHzvGY X-Received: by 2002:a05:6e02:1054:: with SMTP id p20mr27139864ilj.159.1629812275307; Tue, 24 Aug 2021 06:37:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1629812275; cv=none; d=google.com; s=arc-20160816; b=wGMx8OucNrTLIZK1tTxXVDrEY+r5SyBJ7LFBl97J7LXzgWbIZCtdWluRS1p1/1Te++ EKarLetuPIKUOaVhRNuuQY7YgZxoRjgAD+MVdoAb/SQCWAMG0LidAU3m0hy8v2w66R4G H/IdqbxqLkxj+HyrO4xT2JeY6jcu/9WUOOO7YzsiDBNWB5wBveIz18EAlrIkr0eVb07e hNPQdEECCqOX6jgZJqlDhTxKKks/lpVLfIqd1vXNP1NWiaZAqYj0m57jZS93QfJrSHN3 L7iNMx/fufXKwHuq9vqSpeUabt2u6y2Zl480fExhxdFQKETPelfSV8aYI5bXzggsgGox GuTw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=vju9YOwFKg9MGNbigM9dTLB0OaX857ielEIA80CPyVw=; b=N+/Q/qQfJWSAVSHyvOJthFdhuxAvDqJbIAS+8/tkUkC/uu1b5rOrgepiNCTa59fitT BAmw4wbN22rjcL2l+NKa/1MsUjat4WcWDfgu0vZrY5oraBqy3dIlIgokCVU5zIoOVuva 0mQ0+3MLaIWLZrKsWxbIAS8VaHPOM7zmUrmw4mBLgA61b0/yssZP4s9r9U2EsITagQok k/rrrHOd9LSlDFJA9HE+SDELRcLbXq4hGmsnCiijFYHfkJ/mWJ6kSN830uz269a8EN5M P/iL3WGlryTQyX2trMGz2mpuBClRkQiORTInx+m3yQGNpNGzuJo8oXEVPFHzyQpGdjuf Gm+Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=rrT0k65q; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id r5si17197071jan.104.2021.08.24.06.37.43; Tue, 24 Aug 2021 06:37:55 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=rrT0k65q; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237595AbhHXNhG (ORCPT + 99 others); Tue, 24 Aug 2021 09:37:06 -0400 Received: from mail.kernel.org ([198.145.29.99]:49592 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237571AbhHXNhE (ORCPT ); Tue, 24 Aug 2021 09:37:04 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 9911B6113B; Tue, 24 Aug 2021 13:36:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1629812180; bh=WjUMkQNz8BXdmmzGgZfIvQXaLnwVHiI+UpdqeqPwc4o=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=rrT0k65qzo5JlmgqGUXvSzZpfqN/HVq6vntPw10xDmDpa1CxSXEFviP/xnYfLHV+f mWeJzoVjcsxwlXalrN73X/M75Nh8rqwVGGNLqG+nbVmFK/YKvCrVZZ+F/uu4dog+Mc 75QIB2Zpd4X013k9FzyhBzyM7AY8HZJTrS/7H76hYQC1cpx3j8cIHV1diaw3yqbhbq iiVQ+GqYcYTfPkGXqwL0+EVWyuM+g/LbWbE4lXEjozryNLCtWqjfzySTum1nPi8QEi bvqZFzkke44PLYiAYbeqdD0aRW5DRW4MU74pizzsg3O7c+rLP0dJeZOWtGM3TzqTDz FNwwb9kdAlOAg== Date: Tue, 24 Aug 2021 16:36:13 +0300 From: Mike Rapoport To: Nadav Amit Cc: Linux-MM , Andrew Morton , Andy Lutomirski , Dave Hansen , Ira Weiny , Kees Cook , Mike Rapoport , Peter Zijlstra , Rick Edgecombe , Vlastimil Babka , x86@kernel.org, linux-kernel@vger.kernel.org Subject: Re: [RFC PATCH 4/4] x86/mm: write protect (most) page tables Message-ID: References: <20210823132513.15836-1-rppt@kernel.org> <20210823132513.15836-5-rppt@kernel.org> <05242256-4B5F-4AD6-B7DA-46A583335E5C@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <05242256-4B5F-4AD6-B7DA-46A583335E5C@gmail.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Aug 23, 2021 at 10:34:42PM -0700, Nadav Amit wrote: > On Aug 23, 2021, at 10:32 PM, Nadav Amit wrote: > > > > On Aug 23, 2021, at 6:25 AM, Mike Rapoport wrote: > > > > From: Mike Rapoport > > > > Allocate page table using __GFP_PTE_MAPPED so that they will have 4K PTEs > > in the direct map. This allows to switch _PAGE_RW bit each time a page > > table page needs to be made writable or read-only. > > > > The writability of the page tables is toggled only in the lowest level page > > table modifiction functions and immediately switched off. > > > > The page tables created early in the boot (including the direct map page > > table) are not write protected. > > > > > > [ snip ] > > > +static void pgtable_write_set(void *pg_table, bool set) > > +{ > > + int level = 0; > > + pte_t *pte; > > + > > + /* > > + * Skip the page tables allocated from pgt_buf break area and from > > + * memblock > > + */ > > + if (!after_bootmem) > > + return; > > + if (!PageTable(virt_to_page(pg_table))) > > + return; > > + > > + pte = lookup_address((unsigned long)pg_table, &level); > > + if (!pte || level != PG_LEVEL_4K) > > + return; > > + > > + if (set) { > > + if (pte_write(*pte)) > > + return; > > + > > + WRITE_ONCE(*pte, pte_mkwrite(*pte)); > > I think that the pte_write() test (and the following one) might hide > latent bugs. Either you know whether the PTE is write-protected or you > need to protect against nested/concurrent calls to pgtable_write_set() > by disabling preemption/IRQs. > > Otherwise, you risk in having someone else write-protecting the PTE > after it is write-unprotected and before it is written - causing a crash, > or write-unprotecting it after it is protected - which circumvents the > protection. > > Therefore, I would think that instead you should have: > > VM_BUG_ON(pte_write(*pte)); // (or WARN_ON_ONCE()) > > In addition, if there are assumptions on the preemptability of the code, > it would be nice to have some assertions. I think that the code assumes > that all calls to pgtable_write_set() are done while holding the > page-table lock. If that is the case, perhaps adding some lockdep > assertion would also help to confirm the correctness. > > [ I put aside the lack of TLB flushes, which make the whole matter of > delivered protection questionable. I presume that once PKS is used, > this is not an issue. ] As I said in another reply, the actual page table protection is merely to exercise the allocator. I'll consider to actually use PKS for the next versions (unless Rick beats me to it). -- Sincerely yours, Mike.