Subject: Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier
From: Rik van Riel
Date: Wed, 18 Jul 2018 16:58:28 -0400
To: Andy Lutomirski
Cc: LKML, X86 ML, Mike Galbraith, kernel-team, Ingo Molnar, Dave Hansen
References: <20180716190337.26133-1-riel@surriel.com> <20180716190337.26133-5-riel@surriel.com>
List: linux-kernel@vger.kernel.org

> On Jul 17, 2018, at 4:04 PM, Andy Lutomirski wrote:
>
> I think you've introduced a minor-ish performance regression due to
> changing the old (admittedly terribly documented) control flow a bit.
> Before, if real_prev == next, we would skip:
>
>     load_mm_cr4(next);
>     switch_ldt(real_prev, next);
>
> Now we don't any more. I think you should reinstate that
> optimization. It's probably as simple as wrapping them in an
> if (real_prev != next) with a comment like /* Remote changes that
> would require a cr4 or ldt reload will unconditionally send an IPI
> even to lazy CPUs. So, if we aren't changing our mm, we don't need
> to refresh cr4 or the ldt */
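For concreteness, the wrapper being suggested would presumably look
something like this, inside switch_mm_irqs_off() where real_prev and
next live (a sketch only, not a tested patch):

    /*
     * Remote changes that would require a cr4 or LDT reload
     * send an IPI even to lazy CPUs, so a CPU that is not
     * actually changing its mm never needs these reloads.
     */
    if (real_prev != next) {
            load_mm_cr4(next);
            switch_ldt(real_prev, next);
    }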
Looks like switch_ldt() only reloads the LDT when either prev or next
actually has one, so the common no-LDT case is already skipped,
whether or not prev equals next:

    if (unlikely((unsigned long)prev->context.ldt |
                 (unsigned long)next->context.ldt))
            load_mm_ldt(next);

The cr4 bits have a similar optimization: cr4_set_bits() only writes
cr4 when the mask would actually change it:

    static inline void cr4_set_bits(unsigned long mask)
    {
            unsigned long cr4, flags;

            local_irq_save(flags);
            cr4 = this_cpu_read(cpu_tlbstate.cr4);
            if ((cr4 | mask) != cr4)
                    __cr4_set(cr4 | mask);
            local_irq_restore(flags);
    }

> Hmm. load_mm_cr4() should bypass itself when mm == &init_mm. Want to
> fix that part or should I?

Looks like there might not be anything to do here, after all.

On to the lazy TLB mm_struct refcounting stuff :)
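P.S. For reference, load_mm_cr4() as I remember it (treat the exact
body as an assumption, not a quote from the tree) only toggles
X86_CR4_PCE through cr4_set_bits()/cr4_clear_bits(), and those bail
out when the bit already has the right value, so even repeated calls
with init_mm cost no more than a percpu read:

    static inline void load_mm_cr4(struct mm_struct *mm)
    {
            /* PCE lets user space read perf counters via rdpmc. */
            if (static_branch_unlikely(&rdpmc_always_available) ||
                atomic_read(&mm->context.perf_rdpmc_allowed))
                    cr4_set_bits(X86_CR4_PCE);
            else
                    cr4_clear_bits(X86_CR4_PCE);
    }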