Date: Thu, 13 Jul 2023 10:34:07 -0700
From: Andrew Morton
To: Mike Kravetz
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Jiaqi Yan, Naoya Horiguchi, Muchun Song, Miaohe Lin, Axel Rasmussen, James Houghton, Michal Hocko, Greg Kroah-Hartman
Subject: Re: [PATCH 0/2] Fix hugetlb free path race with memory errors
Message-Id: <20230713103407.902e24dc90e85a9779ba885c@linux-foundation.org>
In-Reply-To: <20230711220942.43706-1-mike.kravetz@oracle.com>
References: <20230711220942.43706-1-mike.kravetz@oracle.com>

On Tue, 11 Jul 2023 15:09:40 -0700 Mike Kravetz wrote:

> In the discussion of Jiaqi Yan's series "Improve hugetlbfs read on
> HWPOISON hugepages" the race window was discovered.
> https://lore.kernel.org/linux-mm/20230616233447.GB7371@monkey/
>
> Freeing a hugetlb page back to low level memory allocators is performed
> in two steps.
> 1) Under hugetlb lock, remove page from hugetlb lists and clear destructor
> 2) Outside lock, allocate vmemmap if necessary and call low level free
> Between these two steps, the hugetlb page will appear as a normal
> compound page.  However, vmemmap for tail pages could be missing.
> If a memory error occurs at this time, we could try to update page
> flags in non-existent page structs.
>
> A much more detailed description is in the first patch.
>
> The first patch addresses the race window.  However, it adds a
> hugetlb_lock lock/unlock cycle to every vmemmap optimized hugetlb
> page free operation.
> This could lead to slowdowns if one is freeing
> a large number of hugetlb pages.
>
> The second patch optimizes the update_and_free_pages_bulk routine
> to only take the lock once in bulk operations.
>
> The second patch is technically not a bug fix, but includes a Fixes
> tag and Cc stable to avoid a performance regression.  It can be
> combined with the first, but was done separately to make reviewing easier.
>

I feel that backporting performance improvements into -stable is not a
usual thing to do.  Perhaps the fact that it's a regression fix changes
this, but why?

Much hinges on the magnitude of the performance change.  Are you able
to quantify this at all?
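
For illustration only, the locking trade-off described in the quoted cover
letter can be modelled in userspace. This is a Python sketch, not kernel
code: CountingLock, make_pages, free_pages_individually and free_pages_bulk
are hypothetical names, and the work done under the lock is a placeholder,
since this message does not say what the added lock cycle protects. It
counts lock acquisitions only and says nothing about the actual magnitude
of the slowdown.

```python
import threading


class CountingLock:
    """Mutex wrapper that counts acquisitions (hypothetical stand-in for hugetlb_lock)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def make_pages(count):
    """Build hypothetical page records; 'pending' marks work still owed under the lock."""
    return [{"id": i, "pending": True} for i in range(count)]


def free_pages_individually(pages, lock):
    """Model of the first patch alone: one lock/unlock cycle per page."""
    for page in pages:
        with lock:
            page["pending"] = False  # placeholder for the per-page update


def free_pages_bulk(pages, lock):
    """Model of the second patch: take the lock once for the whole batch."""
    with lock:
        for page in pages:
            page["pending"] = False  # same placeholder, one acquisition total


if __name__ == "__main__":
    n = 1000
    per_page, bulk = CountingLock(), CountingLock()
    free_pages_individually(make_pages(n), per_page)
    free_pages_bulk(make_pages(n), bulk)
    print(per_page.acquisitions, bulk.acquisitions)
```

In this model, freeing n pages costs n acquisitions one page at a time and
a single acquisition in bulk, at the price of holding the lock for longer
in the bulk case.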