Date: Thu, 20 Jun 2019 23:30:57 +0200
From: Peter Zijlstra
To: Eugeniy Paltsev
Cc: Vineet.Gupta1@synopsys.com, jbaron@akamai.com, linux-snps-arc@lists.infradead.org, Alexey Brodkin, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, pbonzini@redhat.com, ard.biesheuvel@linaro.org
Subject: Re: [PATCH] ARC: ARCv2: jump label: implement jump label patching
Message-ID: <20190620213057.GD3436@hirez.programming.kicks-ass.net>
References: <20190614164049.31626-1-Eugeniy.Paltsev@synopsys.com> <20190619081227.GL3419@hirez.programming.kicks-ass.net> <20190620070120.GU3402@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 20, 2019 at 06:34:55PM +0000, Eugeniy Paltsev wrote:
> On Thu, 2019-06-20 at 09:01 +0200, Peter Zijlstra wrote:
> > In particular we do not need the alignment.
> >
> > So what the x86 code does is:
> >
> >  - overwrite the first byte of the instruction with a single byte trap
> >    instruction
> >
> >  - machine wide IPI which synchronizes I$
> >
> > At this point, any CPU that encounters this instruction will trap; and
> > the trap handler will emulate the 'new' instruction -- typically a jump.
> >
> >  - overwrite the tail of the instruction (if there is a tail)
> >
> >  - machine wide IPI which synchronizes I$
> >
> > At this point, nobody will execute the tail, because we'll still trap on
> > that first single byte instruction, but if they were to read the
> > instruction stream, the tail must be there.
> >
> >  - overwrite the first byte of the instruction to now have a complete
> >    instruction.
> >
> >  - machine wide IPI which synchronizes I$
> >
> > At this point, any CPU will encounter the new instruction as a whole,
> > irrespective of alignment.
> >
> >
> > So the benefit of this scheme is that it works irrespective of the
> > instruction fetch window size and doesn't need the 'funny' alignment
> > stuff.
> >
>
> Thanks for the explanation. Now I understand how this x86 magic works.
>
> However, it looks even more complex than the ARM implementation.
> As I understand it, on ARM they do something like this:
> ---------------------------->8-------------------------
> on_each_cpu {
> 	write_instruction
> 	flush_data_cache_region
> 	invalidate_instruction_cache_region
> }
> ---------------------------->8-------------------------
>
> https://elixir.bootlin.com/linux/v5.1/source/arch/arm/kernel/patch.c#L121
>
> Yep, there is some overhead, as we don't need to do the write and D$ flush on each CPU,
> but that keeps the code simple and avoids additional checks.
>
> And I don't understand in which cases the x86 approach with a trap is better.
> In this ARM implementation we do one machine wide IPI instead of three in the x86 trap approach.
>
> Probably there are some x86 specifics I don't get?

It's about variable instruction length; ARM (RISC in general) doesn't
have that, ARC does.

Your current proposal works by keeping the instruction inside of the
i-fetch window, but that then results in instruction padding (extra
NOPs). And that is fine, it really should work.

The x86 approach however allows you to get rid of that padding and
should work for unaligned variable length instructions (we have 1-15
byte instructions).

I just wanted to make sure you were aware of the possibilities such that
you made an informed decision, I'm not trying to force complexity on you
:-)
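
For illustration, the three-step x86 sequence quoted above can be sketched
in user-space C. This is only a minimal sketch under stated assumptions,
not the kernel implementation: poke_with_trap(), sync_all_cpus() and
sync_count are hypothetical names, the "text" is an ordinary byte buffer,
and the machine wide IPI is merely counted rather than sent. It also
assumes the second IPI can be skipped when the instruction has no tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRAP_OPCODE 0xCC /* x86 INT3: a single byte trap instruction */

/* Hypothetical stand-in for the machine wide IPI that synchronizes I$.
 * Here it only counts how many synchronization rounds were requested. */
int sync_count;

void sync_all_cpus(void)
{
	sync_count++;
}

/* Replace the len-byte instruction at text with new_insn.
 * At every intermediate state the first byte is either the trap or the
 * first byte of a complete instruction, so alignment does not matter. */
void poke_with_trap(uint8_t *text, const uint8_t *new_insn, size_t len)
{
	assert(text && new_insn && len > 0);

	/* Step 1: any CPU reaching this address now traps. */
	text[0] = TRAP_OPCODE;
	sync_all_cpus();

	/* Step 2: write the tail while the trap still guards the site. */
	if (len > 1) {
		memcpy(text + 1, new_insn + 1, len - 1);
		sync_all_cpus();
	}

	/* Step 3: replace the trap with the real first byte. */
	text[0] = new_insn[0];
	sync_all_cpus();
}
```

Calling poke_with_trap() on a 5-byte buffer costs three synchronization
rounds, against the single on_each_cpu() round in the ARM-style scheme;
that is the overhead being traded for dropping the padding NOPs. The trap
handler that emulates the new instruction between steps 1 and 3 is left
out of this sketch.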