Date: Fri, 1 Nov 2019 17:28:51 +0000
From: Will Deacon
To: "qi.fuli@fujitsu.com"
Cc: Jonathan Corbet, Catalin Marinas, Will Deacon, Itaru Kitayama,
 "peterz@infradead.org", Jon Masters, "linux-doc@vger.kernel.org",
 "linux-kernel@vger.kernel.org", "linux-arm-kernel@lists.infradead.org",
 "indou.takao@fujitsu.com", "maeda.naoaki@fujitsu.com",
 "misono.tomohiro@fujitsu.com", "tokamoto@jp.fujitsu.com"
Subject: Re: [PATCH 0/2] arm64: Introduce boot parameter to disable TLB flush instruction within the same inner shareable domain
Message-ID: <20191101172851.GC3983@willie-the-truck>
References: <20190617143255.10462-1-indou.takao@jp.fujitsu.com> <93009dbd-b31c-7364-86d2-21f0fac36676@jp.fujitsu.com>
In-Reply-To: <93009dbd-b31c-7364-86d2-21f0fac36676@jp.fujitsu.com>

Hi,

[please note that my email address has changed and the old one doesn't
work any more]

On Fri, Nov 01, 2019 at 09:56:05AM +0000, qi.fuli@fujitsu.com wrote:
> First of all thanks for the comments for the patch.
>
> I'm still struggling with this problem to find out the solution.
> As a result of an investigation on this problem, after all, I think it
> is necessary to improve TLB flush mechanism of the kernel to fix this
> problem completely.
>
> So, I'd like to restart a discussion. At first, I summarize this problem
> to recall what was the problem and then I want to discuss how to fix it.
>
> Summary of the problem:
> A few months ago I proposed patches to solve a performance problem due
> to TLB flush.[1]
>
> A problem is that TLB flush on a core affects all other cores even if
> all other cores do not need actual flush, and it causes performance
> degradation.
>
> In this thread, I explained that:
> * I found a performance problem which is caused by TLBI-is instruction.
> * The problem occurs like this:
>  1) On a core, OS tries to flush TLB using TLBI-is instruction
>  2) TLBI-is instruction causes a broadcast to all other cores, and
>     each core received hard-wired signal
>  3) Each core check if there are TLB entries which have the specified
>     ASID/VA

For those following along at home, my understanding is that this "check"
effectively stalls the pipeline as though it is being performed in
software.

Some questions:

Does this mean a malicious virtual machine can effectively DoS the
system? What about a malicious application calling mprotect()?

Do all broadcast TLBI instructions cause this expensive check, or are
some significantly slower than others?

>  4) This check causes performance degradation
> * We ran FWQ[2] and detected OS jitter due to this problem, this noise
>   is serious for HPC usage.
>
> The noise means here a difference between maximum time and minimum time
> which the same work takes.
>
> How to fix:
> I think the cause is TLB flush by TLBI-is because the instruction
> affects cores that are not related to its flush.

Does broadcast I-cache maintenance cause the same problem?

> So the previous patch I posted is
> * Use mm_cpumask in mm_struct to find appropriate CPUs for TLB flush
> * Exec TLBI instead of TLBI-is only to CPUs specified by mm_cpumask
>   (This is the same behavior as arm32 and x86)
>
> And after the discussion about this patch, I got the following comments.
> 1) This patch switches the behavior (original flush by TLBI-is and new
>    flush by TLBI) by boot parameter, this implementation is not acceptable
>    due to bad maintainability.
> 2) Even if this patch fixes this problem, it may cause another
>    performance problem.
>
> I'd like to start over the implementation by considering these points.
> For the second comment above, I will run a benchmark test to analyze the
> impact on performance.
> Please let me know if there are other points I should take into
> consideration.
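
For anyone trying to picture the difference between the two strategies
described above, here is a rough userspace model. It is not kernel code
and not the posted patch. Everything in it is hypothetical: the core
count of 8, the `struct mm` whose `cpumask` bitmask stands in for
mm_cpumask, and the `stalls` counters standing in for the stall each
receiving core takes. It only counts which cores get disturbed and says
nothing about the cost of the IPIs a targeted flush would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 8 /* hypothetical core count for this model */

/* Hypothetical stand-in for mm_struct: one bit per CPU that ran this mm. */
struct mm {
    uint32_t cpumask;
};

/* Per-core count of invalidation requests received (models the stall). */
unsigned int stalls[NR_CPUS];

/* Models TLBI-is: every core in the shareable domain sees the request. */
void flush_broadcast(const struct mm *mm)
{
    assert(mm != NULL); /* the mask is deliberately ignored here */
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        stalls[cpu]++;
}

/* Models a local TLBI executed only on the cores set in the mask. */
void flush_targeted(const struct mm *mm)
{
    assert(mm != NULL);
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mm->cpumask & (UINT32_C(1) << cpu))
            stalls[cpu]++;
}

/* Number of cores that were disturbed at least once. */
int disturbed_cores(void)
{
    int n = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (stalls[cpu] > 0)
            n++;
    return n;
}

/* Clear the counters between experiments. */
void reset_stalls(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        stalls[cpu] = 0;
}
```

With a mask of 0x3 (the mm ran on CPUs 0 and 1 only), flush_broadcast()
disturbs all 8 modelled cores while flush_targeted() disturbs 2. Whether
that trade is a win on real hardware depends on the IPI cost, which this
model leaves out.
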

I think it's worth bearing in mind that I have little sympathy for the
problem that you are seeing. As far as I can tell, you've done the
following:

  1. You designed a CPU micro-architecture that stalls whenever it
     receives a TLB invalidation request.

  2. You integrated said CPU design into a system where broadcast TLB
     invalidation is not filtered and therefore stalls every CPU every
     time that /any/ TLB invalidation is broadcast.

  3. You deployed a mixture of Linux and jitter-sensitive software on
     this system, and now you're failing to meet your performance
     requirements.

Have I got that right?

If so, given that your CPU design isn't widely available, nobody else
appears to have made this mistake and jitter hasn't been reported as an
issue for any other systems, it's very unlikely that we're going to make
invasive upstream kernel changes to support you. I'm sorry, but all I
can suggest is that you check that your micro-architecture and
performance requirements are aligned with the design of Linux *before*
building another machine like this in future.

I hate to be blunt, but I also don't want to waste your time.

Thanks,

Will