Date: Tue, 26 Sep 2023 14:15:30 -0700
From: Andrew Morton
To: riel@surriel.com
Cc:
 linux-kernel@vger.kernel.org, kernel-team@meta.com, linux-mm@kvack.org,
 muchun.song@linux.dev, mike.kravetz@oracle.com, leit@meta.com,
 willy@infradead.org
Subject: Re: [PATCH 2/3] hugetlbfs: close race between MADV_DONTNEED and page fault
Message-Id: <20230926141530.26bc8550f2f2411945b566f1@linux-foundation.org>
In-Reply-To: <20230926031245.795759-3-riel@surriel.com>
References: <20230926031245.795759-1-riel@surriel.com> <20230926031245.795759-3-riel@surriel.com>

On Mon, 25 Sep 2023 23:10:51 -0400 riel@surriel.com wrote:

> From: Rik van Riel
>
> Malloc libraries, like jemalloc and tcmalloc, take decisions on when
> to call madvise independently from the code in the main application.
>
> This sometimes results in the application page faulting on an address,
> right after the malloc library has shot down the backing memory with
> MADV_DONTNEED.
>
> Usually this is harmless, because we always have some 4kB pages
> sitting around to satisfy a page fault. However, with hugetlbfs,
> systems often allocate only the exact number of huge pages that
> the application wants.
>
> Due to TLB batching, hugetlbfs MADV_DONTNEED will free pages outside of
> any lock taken on the page fault path, which can open up the following
> race condition:
>
>        CPU 1                            CPU 2
>
>        MADV_DONTNEED
>        unmap page
>        shoot down TLB entry
>                                         page fault
>                                         fail to allocate a huge page
>                                         killed with SIGBUS
>        free page
>
> Fix that race by pulling the locking from __unmap_hugepage_final_range
> into helper functions called from zap_page_range_single. This ensures
> page faults stay locked out of the MADV_DONTNEED VMA until the
> huge pages have actually been freed.
>

Was a -stable backport considered?
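
The interleaving in the quoted diagram can be modelled in user space. The following is a minimal Python sketch, not kernel code: `HugePagePool`, `simulate`, `vma_lock` and the two worker functions are hypothetical names chosen for illustration, and the events only exist to force the ordering shown in the diagram. It assumes a pool holding exactly one huge page, which the application has already mapped.

```python
import threading


class HugePagePool:
    """Toy pool sized to exactly what the application needs."""

    def __init__(self, total):
        self.free = total

    def alloc(self):
        if self.free == 0:
            return False  # in the kernel, the faulting task would get SIGBUS
        self.free -= 1
        return True

    def release(self):
        self.free += 1


def simulate(hold_lock_until_free):
    """Return True if the page fault manages to allocate a huge page."""
    pool = HugePagePool(total=1)
    assert pool.alloc()              # the only huge page is already mapped
    vma_lock = threading.Lock()      # stand-in for the lock taken on the fault path
    unmapped = threading.Event()     # "unmap page + shoot down TLB entry" done
    fault_done = threading.Event()
    result = {}

    def madvise_dontneed():          # plays the role of CPU 1
        vma_lock.acquire()
        unmapped.set()               # page is unmapped, but not yet freed
        if not hold_lock_until_free:
            vma_lock.release()       # old behaviour: lock dropped before the free
            fault_done.wait()        # force the fault into the window
        pool.release()               # free page
        if hold_lock_until_free:
            vma_lock.release()       # fixed behaviour: lock dropped after the free

    def page_fault():                # plays the role of CPU 2
        unmapped.wait()
        with vma_lock:
            result["ok"] = pool.alloc()
        fault_done.set()

    threads = [threading.Thread(target=madvise_dontneed),
               threading.Thread(target=page_fault)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["ok"]


if __name__ == "__main__":
    print("lock dropped before free:", simulate(False))
    print("lock held until free:   ", simulate(True))
```

In this model the fault fails only when the lock is dropped before the page goes back to the pool; holding the lock across the free makes the fault wait and then succeed, which is the property the quoted patch description asks for.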