Date: Mon, 18 Oct 2021 16:45:59 -0400
From: Johannes Weiner
To: Kent Overstreet
Cc: Matthew Wilcox, Linus Torvalds, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    Andrew Morton, "Darrick J. Wong", Christoph Hellwig, David Howells
Subject: Re: Folios for 5.15 request - Was: re: Folio discussion recap -
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 18, 2021 at 02:12:32PM -0400, Kent Overstreet wrote:
> On Mon, Oct 18, 2021 at 12:47:37PM -0400, Johannes Weiner wrote:
> > I find this line of argument highly disingenuous.
> >
> > No new type is necessary to remove these calls inside MM code. Migrate
> > them into the callsites and remove the 99.9% very obviously bogus
> > ones. The process is the same whether you switch to a new type or not.
>
> Conversely, I don't see "leave all LRU code as struct page, and ignore anonymous
> pages" to be a serious counterargument. I got that you really don't want
> anonymous pages to be folios from the call Friday, but I haven't been getting
> anything that looks like a serious counterproposal from you.
>
> Think about what our goal is: we want to get to a world where our types describe
> unambiguously how our data is used. That means working towards
>  - getting rid of type punning
>  - struct fields that are only used for a single purpose

How is a common type inheritance model with a generic page type and
subclasses not a counter proposal?

And one which actually accomplishes those two things you're saying, as
opposed to a shared folio where even 'struct address_space *mapping'
is a total lie type-wise?

Plus, really, what's the *alternative* to doing that anyway? How are
we going to implement code that operates on folios and other subtypes
of the page alike? And deal with attributes and properties that are
shared among them all?

Willy's original answer to that was that folio is just *going* to be
all these things - file, anon, slab, network, rando driver stuff. But
since that wasn't very popular, would not get rid of type punning and
overloaded members, would get rid of efficiently allocating descriptor
memory etc. - what *is* the alternative now to common properties
between split out subtypes?

I'm not *against* what you and Willy are saying. I have *genuinely
zero idea what* you are saying.

> Leaving all the LRU code as struct page means leaving a shit ton of type punning
> in place, and you aren't outlining any alternate ways of dealing with that. As
> long as all the LRU code is using struct page, that halts efforts towards
> separately allocating these types and making struct page smaller (which was one
> of your stated goals as well!), and it would leave a big mess in place for god
> knows how long.

I don't follow either of these claims.

Converting to a shared anon/file folio makes almost no dent into the
existing type punning we have, because head/tail page disambiguation
is a tiny part of the type inference we do on struct page.

And leaving the LRU linkage in the struct page doesn't get in the way
of allocating separate subtype descriptors. All these types need a
list_head anyway, from anon to file to slab to the buddy allocator.

Maybe anon, file, slab don't need it at the 4k granularity all the
time, but the buddy allocator does anyway as long as it's 4k based and
I'm sure you don't want to be allocating a new buddy descriptor every
time we're splitting a larger page block into a smaller one?

I really have no idea how that would even work.

> It's been a massive effort for Willy to get this far, who knows when
> someone else with the requisite skillset would be summoning up the
> energy to deal with that - I don't see you or I doing it.
>
> Meanwhile: we've got people working on using folios for anonymous pages to solve
> some major problems
>
>  - it cleans up all of the if (normalpage) else if (hugepage) mess

No it doesn't.

>  - it'll _majorly_ help with our memory fragmentation problems, as I recently
>    outlined. As long as we've got a very bimodal distribution in our allocation
>    sizes where the peaks are at order 0 and HUGEPAGE_ORDER, we're going to have
>    problems allocating hugepages. If anonymous + file memory can be arbitrary
>    sized compound pages, we'll end up with more of a Poisson distribution in our
>    allocation sizes, and a _great deal_ of our difficulties with memory
>    fragmentation are going to be alleviated.
>
>  - and on architectures that support merging of TLB entries, folios for
>    anonymous memory are going to get us some major performance improvements due
>    to reduced TLB pressure, same as hugepages but without nearly as much memory
>    fragmentation pain

It doesn't do those, either. It's a new name for headpages, that's it.

Converting to arbitrary-order huge pages needs to rework assumptions
around what THP pages mean in various places of the code. Mainly the
page table code. Presumably. We don't have anything even resembling a
proposal on what this is all going to look like implementation-wise.

How does changing the name help with this?

How does not having the new name get in the way of it?

> And on top of all that, file and anonymous pages are just more alike than they
> are different.

I don't know what you're basing this on, and you can't just keep
making this claim without showing code to actually unify them.

They have some stuff in common, and some stuff is deeply different.
Everything about this screams class & subclass. Meanwhile you and
Willy just keep coming up with hacks on how we can somehow work around
this fact and contort the types to work out anyway.

You yourself said that folio including slab and other random stuff is
a bonkers idea. But that means we need to deal with properties that
are going to be shared between subtypes, and I'm the only one that has
come up with a remotely coherent proposal on how to do that.

> > (I'll send more patches like the PageSlab() ones to that effect. It's
> > easy. The only reason nobody has bothered removing those until now is
> > that nobody reported regressions when they were added.)
>
> I was also pretty frustrated by your response to Willy's struct slab patches.
>
> You claim to be all in favour of introducing more type safety and splitting
> struct page up into multiple types, but on the basis of one objection - that his
> patches start marking tail slab pages as PageSlab (and I agree with your
> objection, FWIW) - instead of just asking for that to be changed, or posting a
> patch that made that change to his series, you said in effect that we shouldn't
> be doing any of the struct slab stuff by posting your own much more limited
> refactoring, that was only targeted at the compound_head() issue, which we all
> agree is a distraction and not the real issue. Why are you letting yourself get
> distracted by that?

Kent, you can't be serious. I actually did exactly what you suggested
I should have done.

The struct slab patches are the right thing to do. I had one minor
concern (which you seem to share) and suggested a small cleanup. Willy
worried about this cleanup adding a needless compound_head() call, so
*I sent patches to eliminate this call and allow this cleanup and the
struct slab patches to go ahead.*

My patches are to unblock Willy's. He then moved the goal posts and
started talking about prefetching, but that isn't my fault. I was
collaborating and putting my own time and effort where my mouth is.

Can you please debug your own approach to reading these conversations?
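
Illustration (not part of the original message): the "generic page type
with subclasses" model argued for above might look roughly like the
userspace sketch below. Every name here (struct gpage, enum page_kind,
struct file_page, struct anon_page, to_file_page(), to_anon_page(),
gpage_list_add()) is hypothetical and is not taken from any posted
patch; struct list_head and container_of() are simplified stand-ins for
the kernel's versions. The base type holds only what all subtypes share
(here, the list linkage), and subtype-specific fields are reached
through a checked downcast rather than by punning fields of one struct.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Simplified stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical subtype tags. */
enum page_kind { PAGE_KIND_ANON, PAGE_KIND_FILE, PAGE_KIND_SLAB };

/* Hypothetical generic base type: only what every subtype shares. */
struct gpage {
	enum page_kind kind;
	struct list_head lru;	/* linkage needed by all subtypes */
};

/* Hypothetical subtypes embed the base and add their own fields. */
struct file_page {
	struct gpage base;
	void *mapping;		/* would be struct address_space * in the kernel */
	unsigned long index;
};

struct anon_page {
	struct gpage base;
	void *anon_vma;		/* would be struct anon_vma * in the kernel */
};

/* Checked downcasts: NULL instead of silently punning the wrong type. */
static inline struct file_page *to_file_page(struct gpage *p)
{
	return p->kind == PAGE_KIND_FILE ?
		container_of(p, struct file_page, base) : NULL;
}

static inline struct anon_page *to_anon_page(struct gpage *p)
{
	return p->kind == PAGE_KIND_ANON ?
		container_of(p, struct anon_page, base) : NULL;
}

/* Generic code sees only the base type: insert right after the head. */
static inline void gpage_list_add(struct gpage *p, struct list_head *head)
{
	assert(p && head);
	p->lru.next = head->next;
	p->lru.prev = head;
	head->next->prev = &p->lru;
	head->next = &p->lru;
}
```

In this sketch, list-handling code takes a struct gpage * and never
touches mapping or anon_vma; code that needs those fields calls
to_file_page() or to_anon_page() first and handles a NULL result.
Whether such descriptors would be allocated separately from struct page
is a question the thread leaves open, and the sketch does not address it.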