From: James Houghton
Date: Thu, 26 Jan 2023 08:58:51 -0800
Subject: Re: [PATCH 21/46] hugetlb: use struct hugetlb_pte for walk_hugetlb_range
To: Mike Kravetz
Cc: Peter Xu, David Hildenbrand, Muchun Song, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
List-ID: linux-kernel@vger.kernel.org

On Thu, Jan 19, 2023 at 11:42 AM James Houghton wrote:
>
> On Thu, Jan 19, 2023 at 9:32 AM Mike Kravetz wrote:
> >
> > On 01/19/23 08:57, James Houghton wrote:
> > > FWIW, what makes the most sense to me right now is to implement the
> > > THP-like scheme and mark HGM as mutually exclusive with the vmemmap
> > > optimization. We can later come up with a scheme that lets us retain
> > > compatibility. (Is that what you mean by "this can be done somewhat
> > > independently", Mike?)
> >
> > Sort of, I was only saying that getting the ref/map counting right seems
> > like a task that can be independently worked.
> > Using the THP-like scheme
> > is good.
>
> Ok! So if you're ok with the intermediate mapping sizes, it sounds
> like I should go ahead and implement the THP-like scheme.

It turns out that the THP-like scheme significantly slows down
MADV_COLLAPSE: decrementing the mapcounts for the 4K subpages becomes
the vast majority of the time spent in MADV_COLLAPSE when collapsing
1G mappings. It is doing 262k atomic decrements, so this makes sense.

This is only really a problem because this is done between
mmu_notifier_invalidate_range_start() and
mmu_notifier_invalidate_range_end(), so KVM won't allow vCPUs to
access any of the 1G page while we're doing this (and it can take
~1 second for each 1G, at least on the x86 server I was testing on).

- James