Received: by 2002:a05:7412:6592:b0:d7:7d3a:4fe2 with SMTP id m18csp1005148rdg; Fri, 11 Aug 2023 07:03:13 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFL9Y27YF+aYlYXdfvSj7RLEhyT1yxxorJ6l89KkowtlGkyFTLZnt7WMi2qoLaAHUBUjSNs X-Received: by 2002:a05:6a00:ad3:b0:687:20d6:fade with SMTP id c19-20020a056a000ad300b0068720d6fademr2679732pfl.20.1691762593448; Fri, 11 Aug 2023 07:03:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691762593; cv=none; d=google.com; s=arc-20160816; b=slYfcVJY5hgZVb+TPkxbjJNb+W8ltappbWUC9yrztrHMYfLKRza2GaJJWrjXnWnSLs Zu8+8xiKSZ0aQeSOg1z1EYgBaMOaUo67UNAWlXlXtKspomy4chYol+565FxF3spQt/gw THWBkyms2G5xJBrkNaNwVpXs1bZVbvZoj6Nhf+pBoDhBsIs2QLjTyRAI3722V0X7/xao +jdIBWdhxsQN/ITMfzugO5dHYnXQgsd8jEE6pAg4bryLFAFRF704jokNjCwfXDRBcPml ci8+CZ9mB1iDi0Sn84ghvT+ByhU32JDcW/qiD2XA2qXlNDM8ac7LKnIKQ9jiOprct30+ t+bg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=ypEKTkLpCeJxqqHeAxtQ8Lkcxka0IkUbUa7Fv39L/lc=; fh=+IxPxAKfuyOQCtx4t8OBRPDeWP0JGhkhVR9LPsk1gnE=; b=jPEq0VPUZEF9t3nTd5at3VH8o3Yg/9RSMLtL8sWVI23ORVA8L/dQaYEr37Rdb/afNv WIpDKRM+e5mEGUkJyq+Gpt7MHY0cdf/eWQ1Fec66PtxyKbJXYaKBkxHi5Rw13J15qz/h 5fjl84exFpAPXTbyqQCjk4+Z5+miQwxG/JhXNZrVeN/0f0n8nHAJPxgnN73SqTei1STb S9MtFS1nXb67Gj5rmN7LvOBxdrns51W2pPCy2pRGaHh2hm4fJtc/xRJlMhC+rTmibzgR rMcKkEeXS1IO/cpTe+N64Ylpmxe0FU2acOpHrwQtraVh6b1qJLoCpYkT4NSXRO81fZC/ CHvg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@rere.qmqm.pl header.s=1 header.b=LrLplw1V; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=rere.qmqm.pl Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id fi17-20020a056a00399100b0068709992abfsi3436728pfb.324.2023.08.11.07.02.58; Fri, 11 Aug 2023 07:03:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@rere.qmqm.pl header.s=1 header.b=LrLplw1V; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=rere.qmqm.pl Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235681AbjHKNcj (ORCPT + 99 others); Fri, 11 Aug 2023 09:32:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56426 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229523AbjHKNcj (ORCPT ); Fri, 11 Aug 2023 09:32:39 -0400 Received: from rere.qmqm.pl (rere.qmqm.pl [91.227.64.183]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 20D0F2D52; Fri, 11 Aug 2023 06:32:38 -0700 (PDT) Received: from remote.user (localhost [127.0.0.1]) by rere.qmqm.pl (Postfix) with ESMTPSA id 4RMl8c53QwzH1; Fri, 11 Aug 2023 15:32:32 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=rere.qmqm.pl; s=1; t=1691760756; bh=NfrgfFGHCga9/EC5NbBTXDZZMQ+au5SsG2T+NouDd98=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=LrLplw1VKNapfSiTZM0ge43vU7/4YDPbdoM8upUMUpGUcvwmbVp2S1UWV1X9oowhK 2LW9XDZUrqRegyaDh1es58iNHcJhmITsYCWxHUS6avG+8V9flhoJREZaFx4Hz5Tr9e xHbAy866iRFY572fHAogvK8ZCPWs9vKxuxZF9i3C601NDP84vpWwJx2KEflpIQzAf3 2i4DDZ31Y2P5RnRZBmU5DwHDLg/COv2TbaCKYZsSI2hhLe5kHHMz2VDJ7xVuVfPthy E5ZR16V6YjH02Tl15rTTUWcotslLVJHCDb7ND4kC+kKsQyxri7RBlVIEb1EogZ/pBW P6IYc0Bt7r64w== X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.8 at mail Date: Fri, 11 Aug 2023 15:32:31 +0200 From: =?iso-8859-2?Q?Micha=B3_Miros=B3aw?= To: Muhammad Usama Anjum Cc: =?iso-8859-2?Q?Micha=B3_Miros=B3aw?= , Peter Xu , David Hildenbrand , Andrew Morton , Andrei Vagin , Danylo Mocherniuk , Paul Gofman , Cyrill Gorcunov , Mike Rapoport , Nadav Amit , Alexander Viro , Shuah Khan , Christian Brauner , Yang Shi , Vlastimil Babka , "Liam R . Howlett" , Yun Zhou , Suren Baghdasaryan , Alex Sierra , Matthew Wilcox , Pasha Tatashin , Axel Rasmussen , "Gustavo A . R . Silva" , Dan Williams , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Greg KH , kernel@collabora.com Subject: Re: [PATCH v28 2/6] fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs Message-ID: References: <20230809061603.1969154-1-usama.anjum@collabora.com> <20230809061603.1969154-3-usama.anjum@collabora.com> <97de19a3-bba2-9260-7741-cd5b6f4581e9@collabora.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <97de19a3-bba2-9260-7741-cd5b6f4581e9@collabora.com> X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,SPF_HELO_PASS,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Aug 11, 2023 at 05:02:44PM +0500, Muhammad Usama Anjum wrote: > On 8/10/23 10:32 PM, Michał Mirosław wrote: > > On Wed, 9 Aug 2023 at 08:16, Muhammad Usama Anjum > > wrote: [...] > >> --- a/fs/proc/task_mmu.c > >> +++ b/fs/proc/task_mmu.c > > [...] > >> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > >> +static unsigned long pagemap_thp_category(pmd_t pmd) > >> +{ > >> + unsigned long categories = PAGE_IS_HUGE; > >> + > >> + if (pmd_present(pmd)) { > >> + categories |= PAGE_IS_PRESENT; > >> + if (!pmd_uffd_wp(pmd)) > >> + categories |= PAGE_IS_WRITTEN; > >> + if (is_zero_pfn(pmd_pfn(pmd))) > >> + categories |= PAGE_IS_PFNZERO; > >> + } else if (is_swap_pmd(pmd)) { > >> + categories |= PAGE_IS_SWAPPED; > >> + if (!pmd_swp_uffd_wp(pmd)) > >> + categories |= PAGE_IS_WRITTEN; > >> + } > >> + > >> + return categories; > >> +} > > I guess THPs can't be file-backed currently, but can we somehow mark > > this assumption so it can be easily found if the capability arrives? > Yeah, THPs cannot be file backed. Lets not care for some feature which may > not arrive in several years or eternity. Yes, it might not arrive. But please add at least a comment, so that it is clearly visible that lack if PAGE_IS_FILE here is intentional. > >> +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ > >> + > >> +#ifdef CONFIG_HUGETLB_PAGE > >> +static unsigned long pagemap_hugetlb_category(pte_t pte) > >> +{ > >> + unsigned long categories = PAGE_IS_HUGE; > >> + > >> + if (pte_present(pte)) { > >> + categories |= PAGE_IS_PRESENT; > >> + if (!huge_pte_uffd_wp(pte)) > >> + categories |= PAGE_IS_WRITTEN; > >> + if (!PageAnon(pte_page(pte))) > >> + categories |= PAGE_IS_FILE; > >> + if (is_zero_pfn(pte_pfn(pte))) > >> + categories |= PAGE_IS_PFNZERO; > >> + } else if (is_swap_pte(pte)) { > >> + categories |= PAGE_IS_SWAPPED; > >> + if (!pte_swp_uffd_wp_any(pte)) > >> + categories |= PAGE_IS_WRITTEN; > >> + } > > > > BTW, can a HugeTLB page be file-backed and swapped out? > Accourding to pagemap_hugetlb_range(), file-backed HugeTLB page cannot be > swapped. Here too a comment that leaving out this case is intentional would be useful. > > [...] > >> + walk_start = p.arg.start; > >> + for (; walk_start < p.arg.end; walk_start = p.arg.walk_end) { [...[ > >> + ret = mmap_read_lock_killable(mm); > >> + if (ret) > >> + break; > >> + ret = walk_page_range(mm, walk_start, p.arg.end, > >> + &pagemap_scan_ops, &p); > >> + mmap_read_unlock(mm); [...] > >> + if (ret != -ENOSPC || p.arg.vec_len - 1 == 0 || > >> + p.found_pages == p.arg.max_pages) > >> + break; > > > > The second condition is equivalent to `p.arg.vec_len == 1`, but why is > > that an ending condition? Isn't the last entry enough to gather one > > more range? (The walk could have returned -ENOSPC due to buffer full > > and after flushing it could continue with the last free entry.) > Now we are walking the entire range walk_page_range(). We don't break loop > when we get -ENOSPC as this error may only mean that the temporary buffer > is full. So we need check if max pages have been found or output buffer is > full or ret is 0 or any other error. When p.arg.vec_len = 1 is end > condition as the last entry is in cur. As we have walked over the entire > range, cur must be full after which the walk returned. > > So current condition is necessary. I've double checked it. I'll change it > to `p.arg.vec_len == 1`. If we have walked the whole range, then the loop will end anyway due to `walk_start < walk_end` not held in the `for()`'s condition. [...] > >> +/* > >> + * struct pm_scan_arg - Pagemap ioctl argument > >> + * @size: Size of the structure > >> + * @flags: Flags for the IOCTL > >> + * @start: Starting address of the region > >> + * @end: Ending address of the region > >> + * @walk_end Address where the scan stopped (written by kernel). > >> + * walk_end == end informs that the scan completed on entire range. > > > > Can we ensure this holds also for the tagged pointers? > No, we cannot. So this need explanation in the comment here. (Though I'd still like to know how the address tags are supposed to be used from someone that knows them.) Best Regards Michał Mirosław