Date: Fri, 10 Jul 2020 11:35:56 +0200
From: Peter Zijlstra
To: Nicholas Piggin
Cc: linux-arch@vger.kernel.org, x86@kernel.org, Mathieu Desnoyers,
	Arnd Bergmann, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, Anton Blanchard
Subject: Re: [RFC PATCH 7/7] lazy tlb: shoot lazies, a non-refcounting lazy tlb option
Message-ID: <20200710093556.GY4800@hirez.programming.kicks-ass.net>
References: <20200710015646.2020871-1-npiggin@gmail.com>
	<20200710015646.2020871-8-npiggin@gmail.com>
In-Reply-To: <20200710015646.2020871-8-npiggin@gmail.com>
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 10, 2020 at 11:56:46AM +1000, Nicholas Piggin wrote:
> On big systems, the mm refcount can become highly contended when doing
> a lot of context switching with threaded applications (particularly
> switching between the idle thread and an application thread).
>
> Abandoning lazy tlb slows switching down quite a bit in the important
> user->idle->user cases, so instead implement a non-refcounted scheme
> that causes __mmdrop() to IPI all CPUs in the mm_cpumask and shoot down
> any remaining lazy ones.
>
> On a 16-socket 192-core POWER8 system, a context switching benchmark
> with as many software threads as CPUs (so each switch will go in and
> out of idle), upstream can achieve a rate of about 1 million context
> switches per second.
> After this patch it goes up to 118 million.

That's mighty impressive, however:

> +static void shoot_lazy_tlbs(struct mm_struct *mm)
> +{
> +	if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_SHOOTDOWN)) {
> +		smp_call_function_many(mm_cpumask(mm), do_shoot_lazy_tlb, (void *)mm, 1);
> +		do_shoot_lazy_tlb(mm);
> +	}
> +}

IIRC you (power) never clear a CPU from that mask, so for other
workloads I can see this resulting in massive amounts of IPIs.

For instance, take as many processes as you have CPUs. For each,
manually walk the task across all CPUs and exit. Again.

Clearly, that's an extreme, but still...