In-Reply-To: <20180716190337.26133-5-riel@surriel.com>
References: <20180716190337.26133-1-riel@surriel.com> <20180716190337.26133-5-riel@surriel.com>
From: Andy Lutomirski
Date: Tue, 17 Jul 2018 13:04:07 -0700
Subject: Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier
To: Rik van Riel
Cc: LKML, X86 ML, Andrew Lutomirski, Mike Galbraith, kernel-team, Ingo Molnar, Dave Hansen
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 16, 2018 at 12:03 PM, Rik van Riel wrote:
> Lazy TLB mode can result in an idle CPU being woken up by a TLB flush,
> when all it really needs to do is reload %CR3 at the next context switch,
> assuming no page table pages got freed.
>
> Memory ordering is used to prevent race conditions between switch_mm_irqs_off,
> which checks whether .tlb_gen changed, and the TLB invalidation code, which
> increments .tlb_gen whenever page table entries get invalidated.
>
> The atomic increment in inc_mm_tlb_gen is its own barrier; the context
> switch code adds an explicit barrier between reading tlbstate.is_lazy and
> next->context.tlb_gen.
>
> Unlike the 2016 version of this patch, CPUs with cpu_tlbstate.is_lazy set
> are not removed from the mm_cpumask(mm), since that would prevent the TLB
> flush IPIs at page table free time from being sent to all the CPUs
> that need them.
>
> This patch reduces total CPU use in the system by about 1-2% for a
> memcache workload on two socket systems, and by about 1% for a heavily
> multi-process netperf between two systems.
>

I'm not 100% certain I'm replying to the right email, and I haven't
gotten the tip-bot notification at all, but:

I think you've introduced a minor-ish performance regression due to
changing the old (admittedly terribly documented) control flow a bit.
Before, if real_prev == next, we would skip:

    load_mm_cr4(next);
    switch_ldt(real_prev, next);

Now we don't any more.  I think you should reinstate that
optimization.  It's probably as simple as wrapping them in an
if (real_prev != next) with a comment like:

    /*
     * Remote changes that would require a cr4 or ldt reload will
     * unconditionally send an IPI even to lazy CPUs.  So, if we aren't
     * changing our mm, we don't need to refresh cr4 or the ldt.
     */

Hmm.  load_mm_cr4() should bypass itself when mm == &init_mm.  Want to
fix that part or should I?

--Andy
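
For illustration only, the guard suggested above could look roughly like
the following userspace sketch.  It is not the actual kernel code: the
struct, the two stubs, the counters and the switch_mm_tail() wrapper are
all hypothetical stand-ins, and only the names load_mm_cr4, switch_ldt,
real_prev and next come from the mail.

```c
/* Hypothetical stand-in for the kernel's struct mm_struct. */
struct mm_struct {
    const char *name;
};

/* Counters (illustrative only) so the effect of the guard is observable. */
static int cr4_reloads;
static int ldt_switches;

/* Stub: the real load_mm_cr4() refreshes CR4 state for the new mm. */
static void load_mm_cr4(struct mm_struct *next)
{
    (void)next;
    cr4_reloads++;
}

/* Stub: the real switch_ldt() switches to next's LDT. */
static void switch_ldt(struct mm_struct *prev, struct mm_struct *next)
{
    (void)prev;
    (void)next;
    ldt_switches++;
}

/*
 * Hypothetical, heavily simplified tail of a context switch.  The real
 * switch_mm_irqs_off() does much more; only the guard is modelled here.
 */
static void switch_mm_tail(struct mm_struct *real_prev, struct mm_struct *next)
{
    /*
     * Remote changes that would require a cr4 or ldt reload will
     * unconditionally send an IPI even to lazy CPUs.  So, if we aren't
     * changing our mm, we don't need to refresh cr4 or the ldt.
     */
    if (real_prev != next) {
        load_mm_cr4(next);
        switch_ldt(real_prev, next);
    }
}
```

With two distinct mm objects a and b, calling switch_mm_tail(&a, &a)
leaves both counters untouched, while switch_mm_tail(&a, &b) runs each
stub once.  The mm == &init_mm bypass mentioned above is deliberately
left out of this sketch.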