Subject: Re: [PATCH 2/4] arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP
From: Ard Biesheuvel <ard.biesheuvel@arm.com>
To: Anshuman Khandual, linux-arm-kernel@lists.infradead.org
Cc: mark.rutland@arm.com, marc.zyngier@arm.com, Will Deacon,
 linux-kernel@vger.kernel.org, Peter Zijlstra, Nadav Amit,
 Masami Hiramatsu, James Morse, Andrew Morton, Rick Edgecombe
References: <20190523102256.29168-1-ard.biesheuvel@arm.com>
 <20190523102256.29168-3-ard.biesheuvel@arm.com>
 <7ab5edf6-e538-f052-471a-8e56f62bcfef@arm.com>
Message-ID: <9be05b0a-6ef1-7f60-e900-08f45114ecf7@arm.com>
Date: Tue, 28 May 2019 10:58:21 +0200
On 5/28/19 10:41 AM, Anshuman Khandual wrote:
>
>
> On 05/28/2019 01:50 PM, Ard Biesheuvel wrote:
>> On 5/28/19 10:10 AM, Anshuman Khandual wrote:
>>>
>>>
>>> On 05/23/2019 03:52 PM, Ard Biesheuvel wrote:
>>>> Wire up the special helper functions to manipulate aliases of vmalloc
>>>> regions in the linear map.
>>>
>>> IMHO the commit message here could be a bit more descriptive, given the
>>> amount of changes this patch brings in.
>>>
>>>>
>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
>>>> ---
>>>>   arch/arm64/Kconfig                  |  1 +
>>>>   arch/arm64/include/asm/cacheflush.h |  3 ++
>>>>   arch/arm64/mm/pageattr.c            | 48 ++++++++++++++++----
>>>>   mm/vmalloc.c                        | 11 -----
>>>>   4 files changed, 44 insertions(+), 19 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>> index ca9c175fb949..4ab32180eabd 100644
>>>> --- a/arch/arm64/Kconfig
>>>> +++ b/arch/arm64/Kconfig
>>>> @@ -26,6 +26,7 @@ config ARM64
>>>>       select ARCH_HAS_MEMBARRIER_SYNC_CORE
>>>>       select ARCH_HAS_PTE_SPECIAL
>>>>       select ARCH_HAS_SETUP_DMA_OPS
>>>> +    select ARCH_HAS_SET_DIRECT_MAP
>>>>       select ARCH_HAS_SET_MEMORY
>>>>       select ARCH_HAS_STRICT_KERNEL_RWX
>>>>       select ARCH_HAS_STRICT_MODULE_RWX
>>>> diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
>>>> index 19844211a4e6..b9ee5510067f 100644
>>>> --- a/arch/arm64/include/asm/cacheflush.h
>>>> +++ b/arch/arm64/include/asm/cacheflush.h
>>>> @@ -187,4 +187,7 @@ static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
>>>>
>>>>   int set_memory_valid(unsigned long addr, int numpages, int enable);
>>>>
>>>> +int set_direct_map_invalid_noflush(struct page *page);
>>>> +int set_direct_map_default_noflush(struct page *page);
>>>> +
>>>>   #endif
>>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>>> index 6cd645edcf35..9c6b9039ec8f 100644
>>>> --- a/arch/arm64/mm/pageattr.c
>>>> +++ b/arch/arm64/mm/pageattr.c
>>>> @@ -159,17 +159,48 @@ int set_memory_valid(unsigned long addr, int numpages, int enable)
>>>>                       __pgprot(PTE_VALID));
>>>>   }
>>>>
>>>> -#ifdef CONFIG_DEBUG_PAGEALLOC
>>>> +int set_direct_map_invalid_noflush(struct page *page)
>>>> +{
>>>> +    struct page_change_data data = {
>>>> +        .set_mask = __pgprot(0),
>>>> +        .clear_mask = __pgprot(PTE_VALID),
>>>> +    };
>>>> +
>>>> +    if (!rodata_full)
>>>> +        return 0;
>>>
>>> Why does rodata_full need to be probed here? Should we not still require
>>> the following transition even if rodata_full is not enabled? Just
>>> wondering whether we can use the VM_FLUSH_RESET_PERMS feature without
>>> these required transitions.
>>>
>>>          /*
>>>           * Set direct map to something invalid so that it won't be cached if
>>>           * there are any accesses after the TLB flush, then flush the TLB and
>>>           * reset the direct map permissions to the default.
>>>           */
>>>          set_area_direct_map(area, set_direct_map_invalid_noflush);
>>>          _vm_unmap_aliases(start, end, 1);
>>>          set_area_direct_map(area, set_direct_map_default_noflush);
>>>
>>>> +
>>
>> How would that work? With rodata_full disabled, the linear region is not
>> mapped down to pages, and so there is no way we can manipulate linear
>> aliases at page granularity.
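For reference, both set_direct_map_*_noflush() helpers in the hunks above
share one shape: a rodata_full guard followed by an apply_to_page_range()
call. A minimal sketch of that shared pattern, assuming the quoted struct
page_change_data and change_page_range() from arch/arm64/mm/pageattr.c; the
helper name set_direct_map_pte_attrs() is hypothetical:

    /*
     * Sketch only: the common shape of the two helpers above. The name
     * set_direct_map_pte_attrs() is hypothetical.
     */
    static int set_direct_map_pte_attrs(struct page *page, pgprot_t set_mask,
                                        pgprot_t clear_mask)
    {
            struct page_change_data data = {
                    .set_mask   = set_mask,
                    .clear_mask = clear_mask,
            };

            /*
             * Only rodata=full guarantees that the linear region is mapped
             * at page granularity; with block or contiguous mappings there
             * is no individual PTE to modify, so do nothing.
             */
            if (!rodata_full)
                    return 0;

            return apply_to_page_range(&init_mm,
                                       (unsigned long)page_address(page),
                                       PAGE_SIZE, change_page_range, &data);
    }

set_direct_map_invalid_noflush() then corresponds to passing __pgprot(0) /
__pgprot(PTE_VALID), and set_direct_map_default_noflush() to passing
__pgprot(PTE_VALID | PTE_WRITE) / __pgprot(PTE_RDONLY), exactly as in the
hunks quoted in this thread.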
>
> Then should we not check debug_pagealloc_enabled() as well, as in the case
> of __kernel_map_pages() later in the patch?
>

Yes, what we could do technically is to permit the direct_map_xx() calls to
proceed if debug_pagealloc_enabled() returns true, since we are guaranteed
to have a page-granular mapping of the entire linear region. However, that
means we will be manipulating the linear aliases of vmalloc mappings even
if the user has requested us not to, i.e., by passing rodata=on rather than
rodata=full, so I don't think we should be doing that.

>>
>>>> +    return apply_to_page_range(&init_mm,
>>>> +                   (unsigned long)page_address(page),
>>>> +                   PAGE_SIZE, change_page_range, &data);
>>>> +}
>>>> +
>>>> +int set_direct_map_default_noflush(struct page *page)
>>>> +{
>>>> +    struct page_change_data data = {
>>>> +        .set_mask = __pgprot(PTE_VALID | PTE_WRITE),
>>>> +        .clear_mask = __pgprot(PTE_RDONLY),
>>>
>>> Replace __pgprot(PTE_VALID | PTE_WRITE) with PAGE_KERNEL instead!
>>>
>>
>> This is a delta mask, so no need to pull in PTE_MAYBE_NG or other
>> attributes that we know we haven't changed.
>>
>>>> +    };
>>>> +
>>>> +    if (!rodata_full)
>>>> +        return 0;
>>>> +
>>>> +    return apply_to_page_range(&init_mm,
>>>> +                   (unsigned long)page_address(page),
>>>> +                   PAGE_SIZE, change_page_range, &data);
>>>> +}
>>>> +
>>>
>>> IIUC set_direct_map_invalid_noflush() and set_direct_map_default_noflush()
>>> should set *appropriate* permissions as seen fit from the platform
>>> perspective to implement this transition.
>>>
>>> In here, set_direct_map_invalid_noflush() makes the entry invalid,
>>> preventing further MMU walks (hence new TLB entries).
>>> set_direct_map_default_noflush() makes it a valid write entry, though it
>>> looks similar to PAGE_KERNEL, which is the default permission for the
>>> linear mapping on arm64 via __map_memblock(). Should not PAGE_KERNEL be
>>> used explicitly as suggested above?
>>>
>>
>> No. We should restore the attributes that we cleared when manipulating
>> the linear aliases. There isn't a lot of code that does that, so I don't
>> see the need to use PAGE_KERNEL here.
>
> Fair enough.
>
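For context on the delta-mask point: the masks are applied by the existing
change_page_range() callback in arch/arm64/mm/pageattr.c, which only clears
and sets the requested bits on top of the current PTE value, leaving every
other attribute (PTE_MAYBE_NG included) untouched. A simplified sketch, with
the callback's exact parameter list abridged:

    /* Simplified from arch/arm64/mm/pageattr.c; parameter list abridged. */
    static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
    {
            struct page_change_data *cdata = data;
            pte_t pte = READ_ONCE(*ptep);

            /* Apply the delta: clear the named bits, then set the named bits. */
            pte = clear_pte_bit(pte, cdata->clear_mask);
            pte = set_pte_bit(pte, cdata->set_mask);

            set_pte(ptep, pte);
            return 0;
    }

This is why the __pgprot(PTE_VALID | PTE_WRITE) / __pgprot(PTE_RDONLY) delta
suffices: it undoes exactly what the invalid/read-only transitions changed,
rather than rewriting the full attribute set as PAGE_KERNEL would.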
>>
>>>>   void __kernel_map_pages(struct page *page, int numpages, int enable)
>>>>   {
>>>> +    if (!debug_pagealloc_enabled() && !rodata_full)
>>>> +        return;
>>>> +
>>>
>>> I guess this is not related to CONFIG_ARCH_HAS_SET_DIRECT_MAP here and
>>> should be a fix or an enhancement to the CONFIG_DEBUG_PAGEALLOC
>>> implementation. Just curious, is the !rodata_full check here to ensure
>>> that the linear mapping does not have block or contig mappings and is
>>> backed by regular pages only?
>>>
>>
>> It is related. CONFIG_ARCH_HAS_SET_DIRECT_MAP introduces references to
>> __kernel_map_pages() in generic code, so enabling
>> CONFIG_ARCH_HAS_SET_DIRECT_MAP should make __kernel_map_pages() available
>> unconditionally as well.
>
> Ahh, got it.
>
>>
>>>>       set_memory_valid((unsigned long)page_address(page), numpages, enable);
>>>>   }
>>>> -#ifdef CONFIG_HIBERNATION
>>>> +
>>>>   /*
>>>> - * When built with CONFIG_DEBUG_PAGEALLOC and CONFIG_HIBERNATION, this function
>>>> - * is used to determine if a linear map page has been marked as not-valid by
>>>> - * CONFIG_DEBUG_PAGEALLOC. Walk the page table and check the PTE_VALID bit.
>>>> - * This is based on kern_addr_valid(), which almost does what we need.
>>>> + * This function is used to determine if a linear map page has been marked as
>>>> + * not-valid. Walk the page table and check the PTE_VALID bit. This is based
>>>> + * on kern_addr_valid(), which almost does what we need.
>>>>    *
>>>>    * Because this is only called on the kernel linear map,  p?d_sect() implies
>>>>    * p?d_present(). When debug_pagealloc is enabled, sections mappings are
>>>> @@ -183,6 +214,9 @@ bool kernel_page_present(struct page *page)
>>>>       pte_t *ptep;
>>>>       unsigned long addr = (unsigned long)page_address(page);
>>>>
>>>> +    if (!debug_pagealloc_enabled() && !rodata_full)
>>>> +        return true;
>>>> +
>>>
>>> Ditto, not related to CONFIG_ARCH_HAS_SET_DIRECT_MAP.
>>>
>>
>> It is related.
>
> Never mind.
>
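For completeness, the consumer side of these hooks: a caller that tags its
vmalloc area with VM_FLUSH_RESET_PERMS (added in the same series) gets the
invalidate/flush/reset sequence quoted earlier applied at vfree() time. A
hypothetical caller might look like the sketch below; alloc_insn_memory()
is an illustrative name, and __vmalloc() is shown with its three-argument
signature as of this series:

    #include <linux/vmalloc.h>

    /*
     * Illustrative sketch only: alloc_insn_memory() is a hypothetical
     * caller. Setting VM_FLUSH_RESET_PERMS makes vfree() invalidate the
     * direct-map aliases, flush the TLB, and restore the default
     * permissions via the set_direct_map_*_noflush() hooks wired up here.
     */
    static void *alloc_insn_memory(unsigned long size)
    {
            void *p = __vmalloc(size, GFP_KERNEL, PAGE_KERNEL_EXEC);

            if (p) {
                    struct vm_struct *area = find_vm_area(p);

                    /* Have vfree() reset the direct-map permissions. */
                    area->flags |= VM_FLUSH_RESET_PERMS;
            }
            return p;
    }

On arm64 with rodata=full (or debug_pagealloc), the hooks do the
page-granular work; with rodata=on they return without touching the linear
aliases, which is exactly the behaviour defended in this thread.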