From: Ray Fucillo
To: Mike Kravetz
CC: Ray Fucillo, linux-kernel@vger.kernel.org, linux-mm
Subject: Re: scalability regressions related to hugetlb_fault() changes
Date: Fri, 25 Mar 2022 00:02:08 +0000
Message-ID: <8E9438A4-56BF-4DBF-9424-2161A488352B@intersystems.com>
References: <43faf292-245b-5db5-cce9-369d8fb6bd21@infradead.org>

> On Mar 24, 2022, at 6:41 PM, Mike Kravetz wrote:
>
> I also seem to remember thinking about the possibility of
> avoiding the synchronization if pmd sharing was not possible. That may be
> a relatively easy way to speed things up. Not sure if pmd sharing comes
> into play in your customer environments, my guess would be yes (shared
> mapping ranges more than 1GB in size and aligned to 1GB).

Hi Mike,

This is one very large shared memory segment allocated at database
startup. It's common for it to be hundreds of GB. We allocate it with
shmget() passing SHM_HUGETLB (when huge pages have been reserved for
us). Not sure if that answers...

> Also, do you have any specifics about the regressions your customers are
> seeing? Specifically, what paths are holding i_mmap_rwsem in write mode
> for long periods of time? I would expect something related to unmap.
> Truncation can have long hold times especially if there are many shared
> mappings. Always worth checking specifics, but more likely this is a
> general issue.

We've seen the write lock originate from calling shmat(), shmdt() and
process exit. We've also seen it from a fork() off of one of the
processes that are attached to the shared memory segment. Some evidence
suggests that fork() is a more costly case. However, while there are
some important places where we'd use fork(), it's more unusual because
most process creation will vfork() and execv() a new database process
(which then attaches with shmat()).
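
For anyone wanting to reproduce the access pattern described above, here is a
minimal C sketch of it: create a segment with shmget() and SHM_HUGETLB, then
attach, touch and detach it. It is an illustration under stated assumptions,
not our actual code. The function names, the fallback to normal pages when the
huge page request fails, and the 0600 mode are all hypothetical, and the huge
page size is passed in by the caller rather than queried from the system.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000 /* Linux value, in case the headers do not expose it */
#endif

/* Hypothetical helper: round size up to a multiple of the huge page size. */
size_t round_up_to_huge_page(size_t size, size_t huge_page_size)
{
    assert(huge_page_size > 0);
    size_t rem = size % huge_page_size;
    return rem == 0 ? size : size + (huge_page_size - rem);
}

/*
 * Hypothetical helper: create a private segment, preferring huge pages.
 * Assumption: if the SHM_HUGETLB request fails (for example, no huge pages
 * reserved), fall back to normal pages. Returns the shmid, or -1 on failure;
 * *used_huge reports which path succeeded.
 */
int create_segment(size_t size, size_t huge_page_size, int *used_huge)
{
    size_t huge_size = round_up_to_huge_page(size, huge_page_size);
    int id = shmget(IPC_PRIVATE, huge_size, IPC_CREAT | SHM_HUGETLB | 0600);
    if (id >= 0) {
        *used_huge = 1;
        return id;
    }
    *used_huge = 0;
    return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/*
 * Attach, write one byte so a page is faulted in, then detach.
 * Returns 0 on success, -1 on failure.
 */
int attach_touch_detach(int shmid)
{
    char *p = shmat(shmid, NULL, 0);
    if (p == (void *)-1)
        return -1;
    p[0] = 1;
    return shmdt(p);
}

/* Mark the segment for removal once the last process has detached. */
int destroy_segment(int shmid)
{
    return shmctl(shmid, IPC_RMID, NULL);
}
```

A caller would run create_segment() once at startup, have each worker process
call attach_touch_detach() (or hold the attachment for its lifetime), and call
destroy_segment() at shutdown. Running many such workers concurrently against
one large segment is the kind of load where we see the contention.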