Received: by 2002:a05:6a10:6006:0:0:0:0 with SMTP id w6csp220757pxa; Wed, 26 Aug 2020 08:51:14 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxixrHXjqEK/KzUyNkMp82MDBHpkMjnROy3L74zEvkP1NrfMFsQBrRky9cNlTcGWgXTFSXI X-Received: by 2002:a17:906:52d9:: with SMTP id w25mr15973286ejn.491.1598457074231; Wed, 26 Aug 2020 08:51:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1598457074; cv=none; d=google.com; s=arc-20160816; b=sDzVACEZ9/bkX0yrXv3dr9GjizDC2fOhzraDnMyBiz77LWA9YuptlOxJEimYiDrNLo lXSeFAq06lrDFv4sErqr2ykEqRsL5dYTr+2RFlCYrJxHNhRJwZxRN++MpQeMrQbdPwBk CAH9GsGMZRJzNb1X2PPK2KL2akpCLo4w2BDw0aejqhqNwvLBVgoMTFp/g7/JfnLLN5lY zkMfqASSkZVwDbcnlSRY4S2HEN0ASVpWroxQtaY3pakE71z+Rrg4m/0CAvzZhBuhKHHn ifyC/AUmlM7IC7jZCuQTxYaS1hSsTMz6FgQL+7xb0gaGmOIzcucpijUpG336WOztxijV McPg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=I18u0dNCe4slxMxufMguyF6f9uk6YgWYDMxiYXb8r/M=; b=iM29a6V+EGjswm1d1TPwx/15YA469uo1zAn9MiprygVK7ad0KJa6SlX2VRgMT/hRuD v374B75aF5ZWqt0apIrRaeATiCqi9J7bmxmWx2dAiiOBqdsN+zCyK6u1GN8mbWy58O7d 9mhC+U4inZ3/8LTfovtqs9ypFYDNkYzP1O11rxCu4SIPLp88ClRRXHvbIAkksCGoMh9g IiuSi8pXE5bGf1fWnCESa6GI7msdiwtlGZ4K1mpmehplHQE3E5MaPNiKSpJ7pXggdJqA Pcly97XZMr/xNf0MReWEb8dwD4TOD/8H3SidOpWtOBpWDE4z2ey+mJWcZwrjMwGw6AWV CTaw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=casper.20170209 header.b=UkAgccpf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id n11si1677594eds.209.2020.08.26.08.50.49; Wed, 26 Aug 2020 08:51:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=casper.20170209 header.b=UkAgccpf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727115AbgHZPtk (ORCPT + 99 others); Wed, 26 Aug 2020 11:49:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726739AbgHZPt3 (ORCPT ); Wed, 26 Aug 2020 11:49:29 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1C558C061574; Wed, 26 Aug 2020 08:49:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=I18u0dNCe4slxMxufMguyF6f9uk6YgWYDMxiYXb8r/M=; b=UkAgccpf8C57KDxqkodEH9olO/ 1uRw550MvDayPZHtymBnFfxwmNtcgF2M6M96967ti7Q0kRqhAJD7EntbGZL40Lyy/dSXmziTiz7Hk lbnctZsn662cXzfd6hEb2utiYDSGSBQ9MMuBEMkU4epwpDT9aRW6rBD0kAE2/WhSDIlgYgN0uvYHv c5TtMuUjgUaaySu/tXn9XT3Fdu34TLsipV80GR1LCDvPxHlwYwEnRuupZUec5g7a9rqmgnBVv2OwM AfLYitWZTrvbKthc6NKOJtYTa0Bz0jsofeKe/qhWz/aTsyo6BYZPDxDJfSEiDV625F6g9O9aiVfKH 0a4QY2Ng==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kAxfX-0005m4-3j; Wed, 26 Aug 2020 15:48:59 +0000 Date: Wed, 26 Aug 2020 16:48:59 +0100 From: Matthew Wilcox To: Johannes Weiner Cc: linux-mm@kvack.org, Andrew Morton , Hugh Dickins , William Kucharski , Jani Nikula , Alexey Dobriyan , Chris Wilson , Matthew Auld , Huang Ying , intel-gfx@lists.freedesktop.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 6/8] mm: Convert find_get_entry to return the head page Message-ID: <20200826154859.GT17456@casper.infradead.org> References: <20200819184850.24779-1-willy@infradead.org> <20200819184850.24779-7-willy@infradead.org> <20200826150925.GE988805@cmpxchg.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200826150925.GE988805@cmpxchg.org> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Aug 26, 2020 at 11:09:25AM -0400, Johannes Weiner wrote: > On Wed, Aug 19, 2020 at 07:48:48PM +0100, Matthew Wilcox (Oracle) wrote: > > There are only three callers remaining of find_get_entry(). > > find_get_swap_page() is happy to get the head page instead of the subpage. > > Add find_subpage() calls to find_lock_entry() and pagecache_get_page() > > to avoid auditing all their callers. > > I believe this would cause a subtle bug in memcg charge moving for pte > mapped huge pages. We currently skip over tail pages in the range > (they don't have page->mem_cgroup set) and account for the huge page > once from the headpage. After this change, we would see the headpage > and account for it 512 times (or whatever the number is on non-x86). Hmm ... so if you have the last 511 pages of a huge page mapped, you actually don't charge for it at all today? I think you're right that I'd introduce this bug, and so that needs to be fixed. > But that aside, I don't quite understand the intent. > > Before, all these functions simply return the base page at @index, > whether it's a regular page or a tail page. > > Afterwards, find_lock_entry(), find_get_page() et al still do, but > find_get_entry() returns headpage at @index & HPAGE_CACHE_INDEX_MASK. > > Shouldn't we be consistent about how we handle huge pages when > somebody queries the tree for a given base page index? > > [ Wouldn't that mean that e.g. find_get_swap_page() would return tail > pages for regular files and head pages for shmem files? ] What I'd _like_ to do is convert all the callers to cope with tail pages never being returned from all the find_* functions. That seems like a lot of disruption. My intent in this series is to get all the find_*_entr{y,ies} functions to the point where they don't return tail pages. Also find_get_pages_tag() because tags are only set on head pages. This is generally what the callers want anyway. There's even a hack in find_get_entries() in current to terminate early on finding a THP (see commit 71725ed10c40696dc6bdccf8e225815dcef24dba). If I want to remove that, I need to do _something_ to not put all the subpages of a THP into the pagevec. So the new rule will be that find_*_entry() don't return tail pages but find_*_page() do. With the full THP patchset in place, THPs become quite common, so bugs in this area will surface quickly instead of lingering for years and only popping out in rare circumstances.