Date: Tue, 17 May 2022 17:05:02 -0700
From: Andrew Morton
To: Yang Shi
Cc: willy@infradead.org, songmuchun@bytedance.com, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: Re: [v2 PATCH] mm: pvmw: check possible huge PMD map by transhuge_vma_suitable()
Message-Id: <20220517170502.cf4433f5cd1e69f27305cf19@linux-foundation.org>
In-Reply-To: <20220513191705.457775-1-shy828301@gmail.com>
References: <20220513191705.457775-1-shy828301@gmail.com>

On Fri, 13 May 2022 12:17:05 -0700 Yang Shi wrote:

> IIUC PVMW checks if the vma is possibly huge PMD mapped by
> transparent_hugepage_active() and "pvmw->nr_pages >= HPAGE_PMD_NR".
>
> Actually pvmw->nr_pages is returned by compound_nr() or
> folio_nr_pages(), so the page should be THP as long as "pvmw->nr_pages
> >= HPAGE_PMD_NR". And it is guaranteed THP is allocated for valid VMA
> in the first place. But it may be not PMD mapped if the VMA is file
> VMA and it is not properly aligned. The transhuge_vma_suitable()
> is used to do such check, so replace transparent_hugepage_active() to
> it, which is too heavy and overkilling.

I messed with the changelog a bit.  The function is called
page_vma_mapped_walk(), so let's call it that.
This patch has been in the trees since May 12, which isn't terribly long.
Does anyone feel up to a reviewed-by?

Thanks.


From: Yang Shi
Subject: mm/page_vma_mapped.c: check possible huge PMD map with transhuge_vma_suitable()
Date: Fri, 13 May 2022 12:17:05 -0700

IIUC page_vma_mapped_walk() checks if the vma is possibly huge PMD mapped
with transparent_hugepage_active() and "pvmw->nr_pages >= HPAGE_PMD_NR".

Actually pvmw->nr_pages is returned by compound_nr() or folio_nr_pages(),
so the page should be THP as long as "pvmw->nr_pages >= HPAGE_PMD_NR".
And it is guaranteed that a THP is allocated for a valid VMA in the first
place.  But it may not be PMD mapped if the VMA is a file VMA and it is
not properly aligned.  transhuge_vma_suitable() does exactly that check,
so use it instead of transparent_hugepage_active(), which is too heavy
and overkill here.

Link: https://lkml.kernel.org/r/20220513191705.457775-1-shy828301@gmail.com
Signed-off-by: Yang Shi
Cc: Matthew Wilcox (Oracle)
Cc: Muchun Song
Signed-off-by: Andrew Morton
---

 include/linux/huge_mm.h |    8 ++++++--
 mm/page_vma_mapped.c    |    2 +-
 2 files changed, 7 insertions(+), 3 deletions(-)

--- a/include/linux/huge_mm.h~mm-pvmw-check-possible-huge-pmd-map-by-transhuge_vma_suitable
+++ a/include/linux/huge_mm.h
@@ -117,8 +117,10 @@ extern struct kobj_attribute shmem_enabl
 extern unsigned long transparent_hugepage_flags;
 
 static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
-		unsigned long haddr)
+		unsigned long addr)
 {
+	unsigned long haddr;
+
 	/* Don't have to check pgoff for anonymous vma */
 	if (!vma_is_anonymous(vma)) {
 		if (!IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
@@ -126,6 +128,8 @@ static inline bool transhuge_vma_suitabl
 			return false;
 	}
 
+	haddr = addr & HPAGE_PMD_MASK;
+
 	if (haddr < vma->vm_start || haddr + HPAGE_PMD_SIZE > vma->vm_end)
 		return false;
 	return true;
@@ -342,7 +346,7 @@ static inline bool transparent_hugepage_
 }
 
 static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
-		unsigned long haddr)
+		unsigned long addr)
 {
 	return false;
 }
--- a/mm/page_vma_mapped.c~mm-pvmw-check-possible-huge-pmd-map-by-transhuge_vma_suitable
+++ a/mm/page_vma_mapped.c
@@ -243,7 +243,7 @@ restart:
 		 * cleared *pmd but not decremented compound_mapcount().
 		 */
 		if ((pvmw->flags & PVMW_SYNC) &&
-		    transparent_hugepage_active(vma) &&
+		    transhuge_vma_suitable(vma, pvmw->address) &&
 		    (pvmw->nr_pages >= HPAGE_PMD_NR)) {
 			spinlock_t *ptl = pmd_lock(mm, pvmw->pmd);
_