Received: by 2002:a05:6a10:1d13:0:0:0:0 with SMTP id pp19csp4363902pxb; Tue, 31 Aug 2021 03:24:40 -0700 (PDT) X-Google-Smtp-Source: ABdhPJySF2YKVQ/yCMbdEOrDwBTsLZcg1RGEtcYn3UmzCZr0AJG12BBqT4LU0oMOQ2uhS0Id9N2C X-Received: by 2002:a5d:9284:: with SMTP id s4mr22064380iom.131.1630405480089; Tue, 31 Aug 2021 03:24:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1630405480; cv=none; d=google.com; s=arc-20160816; b=GH8zNJePuqAyfM79GvLCzZ3Pz8x8FCsGA/jKa35cXFndgeqGCvgVkNIMVvnZOpVOEi zXMVV96YLRnQxlzIPB3w/JPDLiFxlgt/Tvbm7qFv45MRNqWWnHw+PRcn0LOLST5deIEG 925avwZiyGUHQm6g6KjtPGctVDF+VA0S0fw1T/c+XgmV2l/NakMaBj+/x6iNPqznCvMR Y3TIPUtUNwVDtsbixpHoH68wjO3lqLs+burFZgcy7ZxMo0ePsxlHOHu8C8CiSRtkcIYF QbTUvKWpXS2LcAoWjA+oXDzPlPdFQdz3VNXsLaM98OoWApkajlGRmND/fS3CYcnTHzAq Hwkw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=gJ6gqL2w1mUhhOcYp14RDIzoFrJ1R+dfXlBHBfCh28Q=; b=0TGEjthMA1+AJDJ0bNuoe8WVkQvMnO3fxgdzz6W94swqyWXGDYAJy6SkjmqqyDQmqr rb+bIOr2imKqBziK6Rvro1iroA9JRWF1NS223nPlGzud2saWx95Ifsbguzd5IbYEMURK Hm4J/b1QUdszsi9mc2YIQqX/vrmzrvR/aa49HialL/UhTGM+2c7EmjZ02BCK+DtPFFWA ihMo2hKS0VLOqgj++9ZmG/N2mfqzwZPbBXM7q2sK/+zKWZHh2rcorjmCTQ+kDNhnVpb/ UzybhB/TUClnfAK3vfbwB9PY9FJfAhSvK4PEKh4eljR8vJ9q64e+zfy8NoPSF4VpolK/ 5pjg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=GRQbXhOs; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b16si18157976ilf.112.2021.08.31.03.24.29; Tue, 31 Aug 2021 03:24:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=GRQbXhOs; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240988AbhHaKXZ (ORCPT + 99 others); Tue, 31 Aug 2021 06:23:25 -0400 Received: from mail.kernel.org ([198.145.29.99]:59910 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240908AbhHaKXY (ORCPT ); Tue, 31 Aug 2021 06:23:24 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id EE57D60FF2 for ; Tue, 31 Aug 2021 10:22:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1630405350; bh=CopK0+RPrdW0NrpmXSqnJg7a/FM6saM8cfHvnciJNVw=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=GRQbXhOspGBkkIdQarG+TUlyTgZ1ZIu+lRu3XB3bAPtMaH3eeOK7uUszP3UV0esUN y+omubUOvniXV0SqWGkdDSfBAtuuQumR4kSh6W5bi0l+6wO3MXGuyVRv68ET1Tbadm SYPfB/GK3EE/z7vpkowN6sqqsJ9V6JtkW/ZzOF42L7ocM4APl7HcV0qsY6mKRnyEGm UZ8NJ7hvZrRIx+QS4okWIKDi7lK0d0aNsP1XF2qxaPP0JXIEVVM67k6khIXNvnAu6v DscgIH027byMmXT8p7jQyR18P72N7Bx5hNzbxAzvid0I4NCz363bRF/ckN37pe3f+g cySVbxwB8TE+w== Received: by mail-oi1-f170.google.com with SMTP id bi4so19942827oib.9 for ; Tue, 31 Aug 2021 03:22:29 -0700 (PDT) X-Gm-Message-State: AOAM530moMHMaWiKeQ4gqANUl6SPbXPSinLdGe/lktlLKCmP9acFIfcK Au02F3WLEtyBxPTtjO3CbwgfE6DIB6LXl6hYz2A= X-Received: by 2002:aca:ea54:: with SMTP id i81mr2567189oih.174.1630405349269; Tue, 31 Aug 2021 03:22:29 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Ard Biesheuvel Date: Tue, 31 Aug 2021 12:22:18 +0200 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH] powerpc/32: Add support for out-of-line static calls To: Peter Zijlstra Cc: Christophe Leroy , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Josh Poimboeuf , Jason Baron , Steven Rostedt Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 31 Aug 2021 at 10:53, Peter Zijlstra wrote: > > On Tue, Aug 31, 2021 at 08:05:21AM +0000, Christophe Leroy wrote: > > > +#define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) \ > > + asm(".pushsection .text, \"ax\" \n" \ > > + ".align 4 \n" \ > > + ".globl " STATIC_CALL_TRAMP_STR(name) " \n" \ > > + STATIC_CALL_TRAMP_STR(name) ": \n" \ > > + " blr \n" \ > > + " nop \n" \ > > + " nop \n" \ > > + " nop \n" \ > > + ".type " STATIC_CALL_TRAMP_STR(name) ", @function \n" \ > > + ".size " STATIC_CALL_TRAMP_STR(name) ", . - " STATIC_CALL_TRAMP_STR(name) " \n" \ > > + ".popsection \n") > > > +static int patch_trampoline_32(u32 *addr, unsigned long target) > > +{ > > + int err; > > + > > + err = patch_instruction(addr++, ppc_inst(PPC_RAW_LIS(_R12, PPC_HA(target)))); > > + err |= patch_instruction(addr++, ppc_inst(PPC_RAW_ADDI(_R12, _R12, PPC_LO(target)))); > > + err |= patch_instruction(addr++, ppc_inst(PPC_RAW_MTCTR(_R12))); > > + err |= patch_instruction(addr, ppc_inst(PPC_RAW_BCTR())); > > + > > + return err; > > +} > > There can be concurrent execution and modification; the above doesn't > look safe in that regard. What happens if you've say, done the first > two, but not the latter two and execution happens (on a different > CPU or through IRQ context, etc..)? > > > +void arch_static_call_transform(void *site, void *tramp, void *func, bool tail) > > +{ > > + int err; > > + unsigned long target = (long)func; > > + > > + if (!tramp) > > + return; > > + > > + mutex_lock(&text_mutex); > > + > > + if (!func) > > + err = patch_instruction(tramp, ppc_inst(PPC_RAW_BLR())); > > + else if (is_offset_in_branch_range((long)target - (long)tramp)) > > + err = patch_branch(tramp, target, 0); > > These two are single instruction modifications and I'm assuming the > hardware is sane enough that execution sees either the old or the new > instruction. So this should work. > > > + else if (IS_ENABLED(CONFIG_PPC32)) > > + err = patch_trampoline_32(tramp, target); > > + else > > + BUILD_BUG(); > > + > > + mutex_unlock(&text_mutex); > > + > > + if (err) > > + panic("%s: patching failed %pS at %pS\n", __func__, func, tramp); > > +} > > +EXPORT_SYMBOL_GPL(arch_static_call_transform); > > One possible solution that we explored on ARM64, was having the > trampoline be in 2 slots: > > > b 1f > > 1: blr > nop > nop > nop > > 2: blr > nop > nop > nop > > Where initially the first slot is active (per "b 1f"), then you write > the second slot, and as a final act, re-write the initial branch to > point to slot 2. > > Then you execute synchronize_rcu_tasks() under your text mutex > (careful!) to ensure all users of your slot1 are gone and the next > modification repeats the whole thing, except for using slot1 etc.. > > Eventually I think Ard came up with the latest ARM64 proposal which puts > a literal in a RO section (could be the text section I suppose) and > loads and branches to that. > Yes. The main reason is simply that anything else is premature optimization: we have a clear use case (CFI) where out-of-line static calls are faster than compiler generated indirect calls, even if the static call sequence is based on a literal load and an indirect branch, but CFI is not upstream [yet]. Once other use cases emerge, we will revisit this. > Anyway, the thing is, you can really only modify a single instruction at > the time and need to ensure concurrent execution is correct.