Date: Wed, 14 Mar 2018 09:48:44 +0100
From: Peter Zijlstra
To: Laurent Dufour
Cc: paulmck@linux.vnet.ibm.com, akpm@linux-foundation.org, kirill@shutemov.name, ak@linux.intel.com, mhocko@kernel.org, dave@stgolabs.net, jack@suse.cz, Matthew Wilcox, benh@kernel.crashing.org, mpe@ellerman.id.au, paulus@samba.org, Thomas Gleixner, Ingo Molnar, hpa@zytor.com, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli, Alexei Starovoitov, kemi.wang@intel.com, sergey.senozhatsky.work@gmail.com, Daniel Jordan, linux-kernel@vger.kernel.org, linux-mm@kvack.org, haren@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com, npiggin@gmail.com, bsingharora@gmail.com, Tim Chen, linuxppc-dev@lists.ozlabs.org, x86@kernel.org
Subject: Re: [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock
Message-ID: <20180314084844.GP4043@hirez.programming.kicks-ass.net>
References: <1520963994-28477-1-git-send-email-ldufour@linux.vnet.ibm.com> <1520963994-28477-18-git-send-email-ldufour@linux.vnet.ibm.com>
In-Reply-To: <1520963994-28477-18-git-send-email-ldufour@linux.vnet.ibm.com>

On Tue, Mar 13, 2018 at 06:59:47PM +0100, Laurent Dufour wrote:
> This change is inspired by Peter's proposal patch [1], which was
> protecting the VMA using SRCU. Unfortunately, SRCU does not scale well in
> that particular case, and it introduces major performance degradation
> due to excessive scheduling operations.

Do you happen to have a little more detail on that?
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 34fde7111e88..28c763ea1036 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -335,6 +335,7 @@ struct vm_area_struct {
> 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
> #ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> 	seqcount_t vm_sequence;
> +	atomic_t vm_ref_count;		/* see vma_get(), vma_put() */
> #endif
> } __randomize_layout;
>
> @@ -353,6 +354,9 @@ struct kioctx_table;
> struct mm_struct {
> 	struct vm_area_struct *mmap;		/* list of VMAs */
> 	struct rb_root mm_rb;
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	rwlock_t mm_rb_lock;
> +#endif
> 	u32 vmacache_seqnum;			/* per-thread vmacache */
> #ifdef CONFIG_MMU
> 	unsigned long (*get_unmapped_area) (struct file *filp,

When I tried this, it simply traded contention on mmap_sem for contention
on these two cachelines.

This was for the concurrent fault benchmark, where mmap_sem is only ever
acquired for reading (so no blocking ever happens) and the bottleneck was
really pure cacheline access.

Only by using RCU can you avoid that thrashing.

Also note that if your database allocates the one giant mapping, it'll be
_one_ VMA and that vm_ref_count gets _very_ hot indeed.