Subject: Re: [PATCH v9 13/21] mm: pagewalk: Add test_p?d callbacks
To: Anshuman Khandual, linux-mm@kvack.org
Cc: Mark Rutland, x86@kernel.org, Arnd Bergmann, Ard Biesheuvel,
 Peter Zijlstra, Catalin Marinas, Dave Hansen, linux-kernel@vger.kernel.org,
 Jérôme Glisse, Ingo Molnar, Borislav Petkov, Andy Lutomirski,
 "H. Peter Anvin", James Morse, Thomas Gleixner, Will Deacon, Andrew Morton,
 linux-arm-kernel@lists.infradead.org, "Liang, Kan"
References: <20190722154210.42799-1-steven.price@arm.com>
 <20190722154210.42799-14-steven.price@arm.com>
From: Steven Price
Date: Mon, 29 Jul 2019 13:34:48 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

On 28/07/2019 14:41, Anshuman Khandual wrote:
>
>
> On 07/22/2019 09:12 PM, Steven Price wrote:
>> It is useful to be able to skip parts of the page table tree even when
>> walking without VMAs. Add test_p?d callbacks similar to test_walk but
>> which are called just before a table at that level is walked. If the
>> callback returns non-zero then the entire table is skipped.
>>
>> Signed-off-by: Steven Price
>> ---
>>  include/linux/mm.h | 11 +++++++++++
>>  mm/pagewalk.c      | 24 ++++++++++++++++++++++++
>>  2 files changed, 35 insertions(+)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index b22799129128..325a1ca6f820 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1447,6 +1447,11 @@ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
>>   *             value means "do page table walk over the current vma,"
>>   *             and a negative one means "abort current page table walk
>>   *             right now." 1 means "skip the current vma."
>> + * @test_pmd:  similar to test_walk(), but called for every pmd.
>> + * @test_pud:  similar to test_walk(), but called for every pud.
>> + * @test_p4d:  similar to test_walk(), but called for every p4d.
>> + *             Returning 0 means walk this part of the page tables,
>> + *             returning 1 means to skip this range.
>>   * @mm:        mm_struct representing the target process of page table walk
>>   * @vma:       vma currently walked (NULL if walking outside vmas)
>>   * @private:   private data for callbacks' usage
>> @@ -1471,6 +1476,12 @@ struct mm_walk {
>>                           struct mm_walk *walk);
>>          int (*test_walk)(unsigned long addr, unsigned long next,
>>                          struct mm_walk *walk);
>> +        int (*test_pmd)(unsigned long addr, unsigned long next,
>> +                        pmd_t *pmd_start, struct mm_walk *walk);
>> +        int (*test_pud)(unsigned long addr, unsigned long next,
>> +                        pud_t *pud_start, struct mm_walk *walk);
>> +        int (*test_p4d)(unsigned long addr, unsigned long next,
>> +                        p4d_t *p4d_start, struct mm_walk *walk);
>>          struct mm_struct *mm;
>>          struct vm_area_struct *vma;
>>          void *private;
>> diff --git a/mm/pagewalk.c b/mm/pagewalk.c
>> index 1cbef99e9258..6bea79b95be3 100644
>> --- a/mm/pagewalk.c
>> +++ b/mm/pagewalk.c
>> @@ -32,6 +32,14 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
>>          unsigned long next;
>>          int err = 0;
>>
>> +        if (walk->test_pmd) {
>> +                err = walk->test_pmd(addr, end, pmd_offset(pud, 0UL), walk);
>> +                if (err < 0)
>> +                        return err;
>> +                if (err > 0)
>> +                        return 0;
>> +        }
>
> Though this attempts to match semantics with test_walk() and be comprehensive
> just wondering what are the real world situations when page walking need to be
> aborted based on error condition at a given page table level.

I'm not aware of a situation yet where aborting early is necessary - but
as you say this matches the semantics of test_walk() and was easy to
implement.

Steve
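
To illustrate the return-value convention discussed above, here is a
minimal userspace sketch in C. It is not kernel code: toy_walk,
toy_walk_range and the sample callbacks are hypothetical names made up
for this example, and only the err < 0 / err > 0 handling mirrors the
quoted hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct mm_walk; not the kernel type. */
struct toy_walk {
	/* Return 0 to walk the range, > 0 to skip it, < 0 to abort. */
	int (*test_range)(unsigned long addr, unsigned long end,
			  struct toy_walk *walk);
	/* Called once per entry of a range that is not skipped. */
	int (*entry)(unsigned long addr, struct toy_walk *walk);
	unsigned long visited;	/* number of entries visited so far */
};

/*
 * Walk [addr, end) in increments of step. The test callback, if set,
 * is consulted once before any entry in the range is visited.
 */
static int toy_walk_range(unsigned long addr, unsigned long end,
			  unsigned long step, struct toy_walk *walk)
{
	int err = 0;

	assert(addr <= end && step > 0);

	if (walk->test_range) {
		err = walk->test_range(addr, end, walk);
		if (err < 0)
			return err;	/* abort: propagate the error */
		if (err > 0)
			return 0;	/* skip: not an error for the caller */
	}

	for (; addr < end; addr += step) {
		if (!walk->entry)
			continue;
		err = walk->entry(addr, walk);
		if (err)
			break;
	}
	return err;
}

/* Sample callbacks, assumptions for the demonstration only. */
static int count_entry(unsigned long addr, struct toy_walk *walk)
{
	(void)addr;
	walk->visited++;
	return 0;
}

static int skip_all(unsigned long addr, unsigned long end,
		    struct toy_walk *walk)
{
	(void)addr; (void)end; (void)walk;
	return 1;	/* positive: skip the whole range */
}

static int abort_walk(unsigned long addr, unsigned long end,
		      struct toy_walk *walk)
{
	(void)addr; (void)end; (void)walk;
	return -1;	/* negative: abort the walk */
}
```

With count_entry as the entry callback, toy_walk_range(0x0, 0x4000,
0x1000, &w) visits four entries when test_range is NULL, visits none
and returns 0 when it is skip_all, and returns the negative value
unchanged when it is abort_walk.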