Date: Wed, 15 Feb 2023 03:17:30 +0800
From: Chih-En Lin
To: Pasha Tatashin
Cc: David Hildenbrand, Andrew Morton, Qi Zheng,
    "Matthew Wilcox (Oracle)", Christophe Leroy, John Hubbard,
    Nadav Amit, Barry Song, Steven Rostedt, Masami Hiramatsu,
    Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
    Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
    Yang Shi, Peter Xu, Vlastimil Babka, Zach O'Keefe, Yun Zhou,
    Hugh Dickins, Suren Baghdasaryan, Yu Zhao, Juergen Gross,
    Tong Tiangen, Liu Shixin, Anshuman Khandual, Li kunyu,
    Minchan Kim, Miaohe Lin, Gautam Menghani, Catalin Marinas,
    Mark Brown, Will Deacon, Vincenzo Frascino, Thomas Gleixner,
    "Eric W. Biederman", Andy Lutomirski, Sebastian Andrzej Siewior,
    "Liam R. Howlett", Fenghua Yu, Andrei Vagin, Barret Rhoden,
    Michal Hocko, "Jason A. Donenfeld", Alexey Gladkov,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
    linux-perf-users@vger.kernel.org, Dinglan Peng, Pedro Fonseca,
    Jim Huang, Huichun Feng
Subject: Re: [PATCH v4 00/14] Introduce Copy-On-Write to Page Table
References: <20230207035139.272707-1-shiyn.lin@gmail.com>
    <62c44d12-933d-ee66-ef50-467cd8d30a58@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 14, 2023 at 01:52:16PM -0500, Pasha Tatashin wrote:
> On Tue, Feb 14, 2023 at 1:42 PM Chih-En Lin wrote:
> >
> > On Tue, Feb 14, 2023 at 11:30:26AM -0500, Pasha Tatashin wrote:
> > > > > The thing with THP is, that during fork(), we always allocate a backup PTE
> > > > > table, to be able to PTE-map the THP whenever we have to. Otherwise we'd
> > > > > have to eventually fail some operations we don't want to fail -- similar to
> > > > > the case where break_cow_pte() could fail now due to -ENOMEM although we
> > > > > really don't want to fail (e.g., change_pte_range() ).
> > > > >
> > > > > I always considered that wasteful, because in many scenarios, we'll never
> > > > > ever split a THP and possibly waste memory.
> > > > >
> > > > > Optimizing that for THP (e.g., don't always allocate backup THP, have some
> > > > > global allocation backup pool for splits + refill when close-to-empty) might
> > > > > provide similar fork() improvements, both in speed and memory consumption
> > > > > when it comes to anonymous memory.
> > > >
> > > > When collapsing huge pages, do/can they reuse those PTEs for backup?
> > > > So, we don't have to allocate the PTE or maintain the pool.
> > >
> > > It might not work for all pages, as collapsing pages might have had
> > > holes in the user page table, and there were no PTE tables.
> >
> > So if there have holes in the user page table, after we doing the
> > collapsing and then splitting. Do those holes be filled? Assume it is,
> > then, I think it's the reason why it's not work for all the pages.
> >
> > But, after those operations, Will the user get the additional and
> > unexpected memory (which is from the huge page filling)?
>
> Yes, more memory is going to be allocated for a process in such THP
> collapse case. This is similar to madvise huge pages, and touching the
> first byte may allocate 2M.

Thanks for the explanation.

Yeah, it seems like the reuse case can't work for all the pages.

Thanks,
Chih-En Lin
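
As a rough illustration of the hole-filling point above, the extra memory
can be shown with a toy model. This is only a sketch, not kernel code: it
assumes 4 KiB base pages and 512 PTEs per table (the usual x86-64 layout),
and resident_bytes() and collapse_then_split() are hypothetical names made
up for this example.

```python
PAGE_SIZE = 4096          # assumed base page size in bytes
PTES_PER_TABLE = 512      # assumed entries per PTE table (covers 2 MiB)
HUGE_PAGE_SIZE = PAGE_SIZE * PTES_PER_TABLE


def resident_bytes(pte_table):
    """Memory backing a PTE-mapped 2 MiB range; None marks a hole."""
    return sum(PAGE_SIZE for pte in pte_table if pte is not None)


def collapse_then_split(pte_table):
    """Toy model: collapse the range into one huge page, then split it.

    The huge page backs the whole 2 MiB range, so after the split every
    PTE points at a sub-page of it, including entries that were holes.
    """
    if len(pte_table) != PTES_PER_TABLE:
        raise ValueError("expected exactly one PTE table")
    return [("thp_subpage", i) for i in range(PTES_PER_TABLE)]


# A process that touched only 3 of the 512 pages in the range.
before = [None] * PTES_PER_TABLE
for index in (0, 7, 300):
    before[index] = ("small_page", index)

after = collapse_then_split(before)
extra = resident_bytes(after) - resident_bytes(before)
print(f"before: {resident_bytes(before)} bytes")
print(f"after:  {resident_bytes(after)} bytes")
print(f"extra:  {extra} bytes")
```

In this model the range grows from 3 resident pages to a full 2 MiB, which
is the "more memory is going to be allocated" case described in the reply.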