Date: Tue, 26 Oct 2021 11:36:55 +0100
From: Mark Rutland
To: Ard Biesheuvel
Cc: Peter Zijlstra, Frederic Weisbecker, LKML, James Morse, David Laight,
    Quentin Perret, Catalin Marinas, Will Deacon
Subject: Re: [PATCH 2/4] arm64: implement support for static call trampolines
Message-ID: <20211026103655.GB30152@C02TD0UTHF1T.local>
References: <20211025122102.46089-1-frederic@kernel.org>
 <20211025122102.46089-3-frederic@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 25, 2021 at 05:10:24PM +0200, Ard Biesheuvel wrote:
> On Mon, 25 Oct 2021 at 17:05, Peter Zijlstra wrote:
> >
> > On Mon, Oct 25, 2021 at 04:55:17PM +0200, Ard Biesheuvel wrote:
> > > On Mon, 25 Oct 2021 at 16:47, Peter Zijlstra wrote:
> >
> > > > Perhaps a little something like so.. Shaves 2 instructions off each
> > > > trampoline.
> > > >
> > > > --- a/arch/arm64/include/asm/static_call.h
> > > > +++ b/arch/arm64/include/asm/static_call.h
> > > > @@ -11,9 +11,7 @@
> > > >  	"	hint	34	/* BTI C */	\n" \
> > > >  	insn "				\n" \
> > > >  	"	ldr	x16, 0b		\n" \
> > > > -	"	cbz	x16, 1f		\n" \
> > > >  	"	br	x16		\n" \
> > > > -	"1:	ret			\n" \
> > > >  	"	.popsection		\n")
> > > >
> > > >  #define ARCH_DEFINE_STATIC_CALL_TRAMP(name, func) \
> > > > --- a/arch/arm64/kernel/patching.c
> > > > +++ b/arch/arm64/kernel/patching.c
> > > > @@ -90,6 +90,11 @@ int __kprobes aarch64_insn_write(void *a
> > > >  	return __aarch64_insn_write(addr, &i, AARCH64_INSN_SIZE);
> > > >  }
> > > >
> > > > +asm("__static_call_ret:	\n"
> > > > +    "	ret			\n")
> > > > +
> > >
> > > This breaks BTI as it lacks the landing pad, and it will be called indirectly.
> >
> > Argh!
> >
> > > > +extern void __static_call_ret(void);
> > > > +
> > >
> > > Better to have an ordinary C function here (with consistent linkage),
> > > but we need to take the address in a way that works with Clang CFI.
> >
> > There is that.
> >
> > > As the two additional instructions are on an ice cold path anyway, I'm
> > > not sure this is an obvious improvement tbh.
> >
> > For me it's both simpler -- by virtue of being more consistent, and
> > smaller. So double win :-)
> >
> > That is; you're already relying on the literal being unconditionally
> > updated for the normal B foo -> NOP path, and having the RET -> NOP path
> > be handled differently is just confusing.
> >
> > At least, that's how I'm seeing it today...
>
> Fair enough. I don't have a strong opinion either way, so I'll let
> some other arm64 folks chime in as well.

My preference overall is to keep the trampoline self-contained, and I'd prefer
to keep the RET inline in the trampoline rather than trying to factor it out,
so that all the control-flow is clearly in one place.

So I'd prefer that we have the sequence as-is:

| 0:	.quad 0x0
| 	bti	c
| 	< insn >
| 	ldr	x16, 0b
| 	cbz	x16, 1f
| 	br	x16
| 1:	ret

If we knew these were only called with IRQs enabled (and so we can take an IPI
to generate a context synchronization event), we could patch < insn > to a RET
and point the literal back at the BTI, e.g.

| 0:	.quad 0x0
| 	bti	c
| 	< insn >
| 	ldr	x16, 0b
| 	br	x16

... but I'm pretty sure there are CPUs that will never re-fetch < insn > in
that case, and will get stuck in an infinite loop.

Thanks,
Mark.
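
For readers less used to arm64 assembly, the tail of the as-is sequence (ldr / cbz / br / ret) can be modelled roughly in plain C. This is only an illustrative sketch and not kernel code: the names tramp_model, tramp_model_invoke, demo_target and demo_calls are made up here, the patchable < insn > slot is left out entirely, and the real BR is a tail branch rather than a call that returns.

```c
#include <stddef.h>

/* Hypothetical model only; none of these names exist in the kernel. */
typedef void (*static_call_fn)(void);

struct tramp_model {
	static_call_fn literal;	/* stands in for the ".quad" literal at label 0 */
};

/* Counter bumped by the demo target so a caller can observe the branch. */
static int demo_calls;

static void demo_target(void)
{
	demo_calls++;
}

/*
 * Models the literal-driven tail of the trampoline.
 * Returns 0 when the literal is NULL (the inline "1: ret" path) and
 * 1 when the target was reached through the literal (the "br x16" path).
 */
static int tramp_model_invoke(const struct tramp_model *t)
{
	static_call_fn target = t->literal;	/* ldr x16, 0b */

	if (target == NULL)			/* cbz x16, 1f */
		return 0;			/* 1: ret */

	target();				/* br x16 (a tail branch in the real code) */
	return 1;
}
```

With a NULL literal the model returns straight away, which corresponds to keeping the RET inline in the trampoline; once the literal points at demo_target the same code path reaches it without any further change to the model.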