Subject: Re: [PATCH v6 03/11] mm, x86: Add support for eXclusive Page Frame Ownership (XPFO)
From: Dave Hansen
To: Tycho Andersen, Yisheng Xie
Cc: Juerg Haefliger, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com, Marco Benatto, x86@kernel.org
Date: Wed, 20 Sep 2017 16:46:41 -0700
Message-ID: <91923595-7f02-3be0-9c59-9c1fd20c82a8@intel.com>
In-Reply-To: <20170912181303.aqjj5ri3mhscw63t@docker>
References: <20170907173609.22696-1-tycho@docker.com>
 <20170907173609.22696-4-tycho@docker.com>
 <302be94d-7e44-001d-286c-2b0cd6098f7b@huawei.com>
 <20170911145020.fat456njvyagcomu@docker>
 <57e95ad2-81d8-bf83-3e78-1313daa1bb80@canonical.com>
 <431e2567-7600-3186-1489-93b855c395bd@huawei.com>
 <20170912143636.avc3ponnervs43kj@docker>
 <20170912181303.aqjj5ri3mhscw63t@docker>
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/12/2017 11:13 AM, Tycho Andersen wrote:
> -void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp)
> +void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp, bool will_map)
>  {
>      int i, flush_tlb = 0;
>      struct xpfo *xpfo;
> @@ -116,8 +116,14 @@ void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp)
>               * Tag the page as a user page and flush the TLB if it
>               * was previously allocated to the kernel.
>               */
> -            if (!test_and_set_bit(XPFO_PAGE_USER, &xpfo->flags))
> +            bool was_user = !test_and_set_bit(XPFO_PAGE_USER,
> +                                              &xpfo->flags);
> +
> +            if (was_user || !will_map) {
> +                set_kpte(page_address(page + i), page + i,
> +                         __pgprot(0));
>                  flush_tlb = 1;
> +            }

Shouldn't the "was_user" be "was_kernel"?

Also, the way this now works, let's say we have a nice, 2MB pmd_t (page
table entry) mapping a nice, 2MB page in the allocator.  Then it gets
allocated to userspace.  We do

    for (i = 0; i < (1 << order); i++) {
        ...
        set_kpte(page_address(page + i), page+i, __pgprot(0));
    }

The set_kpte() will take the nice, 2MB mapping and break it down into
512 4k mappings, all pointing to a non-present PTE, in a newly-allocated
PTE page.  So, you get the same result and waste 4k of memory in the
process, *AND* make it slower because we added a level to the page tables.

I think you actually want to make a single set_kpte() call at the end of
the function.  That's faster and preserves the large page in the direct
mapping.
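
For anyone who wants to check the two points above, here is a rough
userspace model.  It is only a sketch under stated assumptions, not kernel
code: the size constants assume x86-64 with 4k base pages and 8-byte page
table entries, and model_test_and_set_bit(), MODEL_XPFO_PAGE_USER and
tag_user_was_kernel() are hypothetical stand-ins invented for illustration
(the real test_and_set_bit() is atomic; this one is not).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed x86-64 sizes; illustrative constants, not taken from kernel headers. */
#define SMALL_PAGE_SIZE (4UL * 1024)        /* 4k base page */
#define LARGE_PAGE_SIZE (2UL * 1024 * 1024) /* 2MB page mapped by one pmd_t */
#define PTE_ENTRY_SIZE  8UL                 /* bytes per page table entry */

/* Made-up bit number for this model only. */
#define MODEL_XPFO_PAGE_USER 0

/* Number of 4k PTEs needed to replace one 2MB mapping. */
static unsigned long ptes_per_large_page(void)
{
	return LARGE_PAGE_SIZE / SMALL_PAGE_SIZE;
}

/* Bytes of page table memory taken by the newly allocated PTE page. */
static unsigned long pte_page_bytes(void)
{
	return ptes_per_large_page() * PTE_ENTRY_SIZE;
}

/*
 * Hypothetical, non-atomic stand-in for test_and_set_bit():
 * sets the bit and returns the value it had before.
 */
static bool model_test_and_set_bit(unsigned int nr, unsigned long *flags)
{
	unsigned long mask;
	bool old;

	assert(nr < 8 * sizeof(*flags));
	mask = 1UL << nr;
	old = (*flags & mask) != 0;
	*flags |= mask;
	return old;
}

/*
 * Tags the page as a user page.  Returns true when the bit was NOT
 * already set, i.e. the page was previously a kernel page.
 */
static bool tag_user_was_kernel(unsigned long *flags)
{
	return !model_test_and_set_bit(MODEL_XPFO_PAGE_USER, flags);
}
```

Under those assumptions, splitting one 2MB mapping needs 512 entries, which
fill exactly one 4k PTE page.  The negated return value is true only on the
first call for a page whose flag was clear, which is why "was_kernel" reads
as the more accurate name.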