Message-ID: <344449fe-56f1-ed2a-b13f-d66abb57a1fe@collabora.com>
Date: Tue, 20 Jun 2023 16:19:26 +0500
From: Muhammad Usama Anjum
To: Andrei Vagin
Cc: Muhammad Usama Anjum, Peter Xu, David Hildenbrand, Andrew Morton,
 Michał Mirosław, Danylo Mocherniuk, Paul Gofman, Cyrill Gorcunov,
 Mike Rapoport, Nadav Amit, Alexander Viro, Shuah Khan, Christian Brauner,
 Yang Shi, Vlastimil Babka, "Liam R . Howlett", Yun Zhou,
 Suren Baghdasaryan, Alex Sierra, Matthew Wilcox, Pasha Tatashin,
 Axel Rasmussen, "Gustavo A . R . Silva", Dan Williams,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Greg KH,
 kernel@collabora.com
Subject: Re: [PATCH v19 2/5] fs/proc/task_mmu: Implement IOCTL to get and
 optionally clear info about PTEs
References: <20230615141144.665148-1-usama.anjum@collabora.com>
 <20230615141144.665148-3-usama.anjum@collabora.com>
 <212e331f-35b0-5ae7-6371-26caa577d637@collabora.com>
In-Reply-To: <212e331f-35b0-5ae7-6371-26caa577d637@collabora.com>

On 6/19/23 11:06 AM, Muhammad Usama Anjum wrote:
> On 6/17/23 11:39 AM, Andrei Vagin wrote:
>> On Thu, Jun 15, 2023
>> at 07:11:41PM +0500, Muhammad Usama Anjum wrote:
>>> +static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
>>> +				  unsigned long end, struct mm_walk *walk)
>>> +{
>>> +	bool is_written, flush = false, is_interesting = true;
>>> +	struct pagemap_scan_private *p = walk->private;
>>> +	struct vm_area_struct *vma = walk->vma;
>>> +	unsigned long bitmap, addr = end;
>>> +	pte_t *pte, *orig_pte, ptent;
>>> +	spinlock_t *ptl;
>>> +	int ret = 0;
>>> +
>>> +	arch_enter_lazy_mmu_mode();
>>> +
>>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>> +	ptl = pmd_trans_huge_lock(pmd, vma);
>>> +	if (ptl) {
>>> +		unsigned long n_pages = (end - start)/PAGE_SIZE;
>>> +
>>> +		if (p->max_pages && n_pages > p->max_pages - p->found_pages)
>>> +			n_pages = p->max_pages - p->found_pages;
>>> +
>>> +		is_written = !is_pmd_uffd_wp(*pmd);
>>> +
>>> +		/*
>>> +		 * Break huge page into small pages if the WP operation need to
>>> +		 * be performed is on a portion of the huge page.
>>> +		 */
>>> +		if (is_written && IS_PM_SCAN_WP(p->flags) &&
>>> +		    n_pages < HPAGE_SIZE/PAGE_SIZE) {
>>> +			spin_unlock(ptl);
>>> +
>>> +			split_huge_pmd(vma, pmd, start);
>>> +			goto process_smaller_pages;
>>> +		}
>>> +
>>> +		bitmap = PM_SCAN_FLAGS(is_written, (bool)vma->vm_file,
>>> +				       pmd_present(*pmd), is_swap_pmd(*pmd));
>>> +
>>> +		if (IS_PM_SCAN_GET(p->flags)) {
>>> +			is_interesting = pagemap_scan_is_interesting_page(bitmap, p);
>>> +			if (is_interesting)
>>> +				ret = pagemap_scan_output(bitmap, p, start, n_pages);
>>> +		}
>>> +
>>> +		if (IS_PM_SCAN_WP(p->flags) && is_written && is_interesting &&
>>> +		    ret >= 0) {
>>> +			make_uffd_wp_pmd(vma, start, pmd);
>>> +			flush_tlb_range(vma, start, end);
>>> +		}
>>> +
>>> +		spin_unlock(ptl);
>>> +
>>> +		arch_leave_lazy_mmu_mode();
>>> +		return ret;
>>> +	}
>>> +
>>> +process_smaller_pages:
>>> +#endif
>>> +
>>> +	orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, start, &ptl);
>>> +	if (!pte) {
>>
>> Do we need to unlock ptl here?
>>
>> spin_unlock(ptl);
> No, please look at these recently merged patches:
> https://lore.kernel.org/all/c1c9a74a-bc5b-15ea-e5d2-8ec34bc921d@google.com
>
>>
>>> +		walk->action = ACTION_AGAIN;
>>> +		return 0;
>>> +	}
>>> +
>>> +	for (addr = start; addr < end && !ret; pte++, addr += PAGE_SIZE) {
>>> +		ptent = ptep_get(pte);
>>> +		is_written = !is_pte_uffd_wp(ptent);
>>> +
>>> +		bitmap = PM_SCAN_FLAGS(is_written, (bool)vma->vm_file,
>>> +				       pte_present(ptent), is_swap_pte(ptent));
>>
>> The vma->vm_file check isn't correct in this case. You can look at when
>> pte_to_pagemap_entry sets PM_FILE. This flag is used to detect which
>> pages have a file backing store and which pages are anonymous.
> I'll update.
>
>>
>> I was trying to integrate this new interface into CRIU and I found
>> one more thing that is required. We need to detect zero pages.

Can we skip adding this zero page flag for now, since we are already at v20?
If you have time to review and test the patches, then something can be done.

> Should we name it ZERO_PFN_PRESENT_PAGE to be exact or what?
>
>>
>> It should look something like this:
>>
>> #define PM_SCAN_FLAGS(wt, file, present, swap, zero) \
>>	((wt) | ((file) << 1) | ((present) << 2) | ((swap) << 3) | ((zero) << 4))
>>
>>
>> bitmap = PM_SCAN_FLAGS(is_written, page && !PageAnon(page),
>>			  pte_present(ptent), is_swap_pte(ptent),
>>			  pte_present(ptent) && is_zero_pfn(pte_pfn(ptent)));
> Okay. Can you please confirm my assumptions:
> - A THP cannot be file backed. (PM_FILE isn't being set for THP case)
> - A hole is also not file backed.
>
> A hole isn't present in memory. So its pfn would be zero. But as it isn't
> present, it shouldn't report zero page. Right? For a hole:
>
> PM_SCAN_FLAGS(false, false, false, false, false)

Please let me know the results of the testing you have been doing.

>
>
>>
>> Thanks,
>> Andrei
>

-- 
BR,
Muhammad Usama Anjum
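For reference, here is a minimal userspace sketch of how the five-argument
PM_SCAN_FLAGS suggested in the quoted review would encode the two cases
discussed above. The macro is copied from the suggestion; the EX_* bit names
and the example_* helpers are hypothetical and exist only for illustration,
they are not part of the patch. It also assumes that the hole case really is
all-false (still awaiting confirmation) and that the zero page in question
has not been written.

```c
#include <assert.h>
#include <stdbool.h>

/* Macro exactly as suggested in the review comment quoted above. */
#define PM_SCAN_FLAGS(wt, file, present, swap, zero) \
	((wt) | ((file) << 1) | ((present) << 2) | ((swap) << 3) | ((zero) << 4))

/* Hypothetical bit names, for illustration only; the patch does not define them. */
enum {
	EX_WRITTEN = 1 << 0,
	EX_FILE    = 1 << 1,
	EX_PRESENT = 1 << 2,
	EX_SWAP    = 1 << 3,
	EX_ZERO    = 1 << 4,
};

/* Assumed hole encoding: nothing is mapped, so every bit stays clear. */
static int example_hole_flags(void)
{
	return PM_SCAN_FLAGS(false, false, false, false, false);
}

/* Assumed case: a present, not-yet-written mapping of the shared zero page. */
static int example_zero_page_flags(void)
{
	return PM_SCAN_FLAGS(false, false, true, false, true);
}

/* Sanity checks for the two cases discussed in this thread. */
static void example_selfcheck(void)
{
	assert(example_hole_flags() == 0);
	assert(example_zero_page_flags() == (EX_PRESENT | EX_ZERO));
}
```

With this layout a hole and a present zero page stay distinguishable in the
bitmap, which seems to be what CRIU needs, provided the zero bit is only set
when pte_present() is true as in the suggested call.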