References: <20181005081333.15018-1-ard.biesheuvel@linaro.org> <20181005081333.15018-2-ard.biesheuvel@linaro.org> <20181005141433.GS19272@hirez.programming.kicks-ass.net> <9E0E08C8-0DFC-4E50-A4FA-73208835EF9E@amacapital.net>
From: Andy Lutomirski
Date: Fri, 5 Oct 2018 10:28:55 -0700
Subject: Re: [RFC PATCH 1/9] kernel: add support for patchable function pointers
To: Ard Biesheuvel
Cc: Andrew Lutomirski, Peter Zijlstra, LKML, "Jason A. Donenfeld", Eric Biggers, Samuel Neves, Arnd Bergmann, Herbert Xu, "David S. Miller", Catalin Marinas, Will Deacon, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Thomas Gleixner, Ingo Molnar, Kees Cook, "Martin K. Petersen", Greg KH, Andrew Morton, Richard Weinberger, Linux Crypto Mailing List, linux-arm-kernel, linuxppc-dev
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 5, 2018 at 10:23 AM Ard Biesheuvel wrote:
>
> On 5 October 2018 at 19:20, Andy Lutomirski wrote:
> > On Fri, Oct 5, 2018 at 10:11 AM Ard Biesheuvel wrote:
> >>
> >> On 5 October 2018 at 18:58, Andy Lutomirski wrote:
> >> > On Fri, Oct 5, 2018 at 8:24 AM Ard Biesheuvel wrote:
> >> >>
> >> >> On 5 October 2018 at 17:08, Andy Lutomirski wrote:
> >> >> >
> >> >> >
> >> >> >> On Oct 5, 2018, at 7:14 AM, Peter Zijlstra wrote:
> >> >> >>
> >> >> >>> On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Biesheuvel wrote:
> >> >> >>> diff --git a/include/linux/ffp.h b/include/linux/ffp.h
> >> >> >>> new file mode 100644
> >> >> >>> index 000000000000..8fc3b4c9b38f
> >> >> >>> --- /dev/null
> >> >> >>> +++ b/include/linux/ffp.h
> >> >> >>> @@ -0,0 +1,43 @@
> >> >> >>> +/* SPDX-License-Identifier: GPL-2.0 */
> >> >> >>> +
> >> >> >>> +#ifndef __LINUX_FFP_H
> >> >> >>> +#define __LINUX_FFP_H
> >> >> >>> +
> >> >> >>> +#include
> >> >> >>> +#include
> >> >> >>> +
> >> >> >>> +#ifdef CONFIG_HAVE_ARCH_FFP
> >> >> >>> +#include
> >> >> >>> +#else
> >> >> >>> +
> >> >> >>> +struct ffp {
> >> >> >>> +	void (**fn)(void);
> >> >> >>> +	void (*default_fn)(void);
> >> >> >>> +};
> >> >> >>> +
> >> >> >>> +#define DECLARE_FFP(_fn, _def) \
> >> >> >>> +	extern typeof(_def) *_fn; \
> >> >> >>> +	extern struct ffp const __ffp_ ## _fn
> >> >> >>> +
> >> >> >>> +#define DEFINE_FFP(_fn, _def) \
> >> >> >>> +	typeof(_def) *_fn = &_def; \
> >> >> >>> +	struct ffp const __ffp_ ## _fn \
> >> >> >>> +		= { (void(**)(void))&_fn, (void(*)(void))&_def }; \
> >> >> >>> +	EXPORT_SYMBOL(__ffp_ ## _fn)
> >> >> >>> +
> >> >> >>> +static inline void ffp_set_target(const struct ffp *m, void *new_fn)
> >> >> >>> +{
> >> >> >>> +	WRITE_ONCE(*m->fn, new_fn);
> >> >> >>> +}
> >> >> >>> +
> >> >> >>> +static inline void ffp_reset_target(const struct ffp *m)
> >> >> >>> +{
> >> >> >>> +	WRITE_ONCE(*m->fn, m->default_fn);
> >> >> >>> +}
> >> >> >>> +
> >> >> >>> +#endif
> >> >> >>> +
> >> >> >>> +#define SET_FFP(_fn, _new) ffp_set_target(&__ffp_ ## _fn, _new)
> >> >> >>> +#define RESET_FFP(_fn) ffp_reset_target(&__ffp_ ## _fn)
> >> >> >>> +
> >> >> >>> +#endif
> >> >> >>
> >> >> >> I don't understand this interface. There is no wrapper for the call
> >> >> >> site, so how are we going to patch all call-sites when you update the
> >> >> >> target?
> >> >> >
> >> >> > I’m also confused.
> >> >> >
> >> >> > Anyway, we have patchable functions on x86. They’re called PVOPs, and they’re way overcomplicated.
> >> >> >
> >> >> > I’ve proposed a better way that should generate better code, be more portable, and be more maintainable. It goes like this.
> >> >> >
> >> >> > To call the function, you literally just call the default implementation. It *might* be necessary to call a nonexistent wrapper to avoid annoying optimizations.
> >> >> > At build time, the kernel is built with relocations, so the object files contain relocation entries for the call. We collect these entries into a table. If we’re using the “nonexistent wrapper” approach, we can link in a .S or linker script to alias them to the default implementation.
> >> >> >
> >> >> > To patch them, we just patch them. It can’t necessarily be done concurrently because nothing forces the right alignment. But we can do it at boot time and module load time. (Maybe we can patch at runtime on architectures with appropriate instruction alignment. Or we ask gcc for an extension to align calls to a function.)
> >> >> >
> >> >> > Most of the machinery already exists: this is roughly how the module loader resolves calls outside of a module.
> >> >>
> >> >> Yeah nothing is ever simple on x86 :-(
> >> >>
> >> >> So are you saying the approach i use in patch #2 (which would
> >> >> translate to emitting a jmpq instruction pointing to the default
> >> >> implementation, and patching it at runtime to point elsewhere) would
> >> >> not fly on x86?
> >> >
> >> > After getting some more sleep, I'm obviously wrong. The
> >> > text_poke_bp() mechanism will work. It's just really slow.
> >> >
> >>
> >> OK
> >>
> >> > Let me try to summarize some of the issues. First, when emitting
> >> > jumps and calls from inline asm on x86, there are a few considerations
> >> > that are annoying:
> >> >
> >> > 1. Following the x86_64 ABI calling conventions is basically
> >> > impossible. x86_64 requires a 128-byte redzone and 16-byte stack
> >> > alignment. After much discussion a while back, we decided that it was
> >> > flat-out impossible on current gcc to get the stack pointer aligned in
> >> > a known manner in an inline asm statement. Instead, if we actually
> >> > need alignment, we need to align manually.
> >> > Fortunately, the kernel is
> >> > built with an override that forces only 8-byte alignment (on *most*
> >> > GCC versions). But for crypto in particular, it sucks extra, since
> >> > the crypto code is basically the only thing in the kernel that
> >> > actually wants 16-byte alignment. I don't think this is a huge
> >> > problem in practice, but it's annoying. And the kernel is built
> >> > without a redzone.
> >> >
> >> > 2. On x86_64, depending on config, we either need frame pointers or
> >> > ORC. ORC is no big deal -- it Just Works (tm). Frame pointers need
> >> > extra asm hackery. It's doable, but it's still annoying.
> >> >
> >> > 3. Actually getting the asm constraints right to do what a C
> >> > programmer expects is distinctly nontrivial. I just fixed an
> >> > extremely longstanding bug in the vDSO code in which the asm
> >> > constraints for the syscall fallback were wrong in such a way that GCC
> >> > didn't notice that the fallback wrote to its output parameter.
> >> > Whoops.
> >> >
> >>
> >> OK, so the thing I am missing is why this all matters.
> >>
> >> Note that the compiler should take care of all of this. It emits a
> >> call a function with external linkage having prototype X, and all the
> >> inline asm does is emit a jmp to some function having that same
> >> prototype, either the default one or the one we patched in.
> >>
> >> Apologies if I am missing something obvious here: as you know, x86 is
> >> not my focus in general.
> >
> > The big issue that bothers me isn't the x86-ism so much as the nasty
> > interactions with the optimizer. On x86, we have all of this working.
> > It's in arch/x86/include/asm/paravirt_types.h, and it looks roughly
> > like:
> >
> >	asm volatile(pre \
> >		     paravirt_alt(PARAVIRT_CALL) \
> >		     post \
> >		     : call_clbr, ASM_CALL_CONSTRAINT \
> >		     : paravirt_type(op), \
> >		       paravirt_clobber(clbr), \
> >		       ##__VA_ARGS__ \
> >		     : "memory", "cc" extra_clbr); \
> >
> > With some extra magic for the constraints. And I don't even think
> > this is strictly correct -- from very recent experience, telling the
> > compiler that "memory" is clobbered and that a bunch of arguments are
> > used as numeric inputs may not actually imply that the asm modifies
> > the target of pointer arguments. Checks this lovely bug out:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/vdso-tglx&id=715bd9d12f84d8f5cc8ad21d888f9bc304a8eb0b
> >
> > As far as I can tell, the whole PVOP infrastructure has the same bug.
> > And I don't see how to avoid it generically on x86 or any other
> > architecture. (PeterZ, am I wrong? Are we really just getting lucky
> > that x86 pvop calls actually work? Or do we not have enough of them
> > that take pointers as arguments for this to matter?)
> >
> > Plus, asm volatile ( ..., "memory" ) is a barrier and makes code
> > generation suck.
> >
> > Whereas, if we use my suggestion the semantics are precisely those of
> > any other C function call because, as far as GCC is concerned, it *is*
> > a C function call. So the generated code *is* a function call.
> >
>
> But it is the *compiler* that actually emits the call.
>
> Only, the target of that call is a jmpq to another location where some
> version of the routine lives, and all have the same prototype.
>
> How is that any different from PLTs in shared libraries?

Ah, I see, I misunderstood your code. See other email.
I think that, if you rework your series a bit to have a generic version that works on all architectures, then do it like you did on ARM, and make sure you leave the door open for the inline patching approach, then it looks pretty good. (None of this is to say that I disagree with Jason, though -- I'm not entirely convinced that this makes sense for Zinc. But maybe it can be done in a way that makes everyone happy.)
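
For reference, the "generic version that works on all architectures" is roughly what the #else branch of the quoted ffp.h already does: call sites go through an ordinary function pointer, and updating the target is a single store. Below is a minimal userspace sketch of that idea, not code from the series. It rests on assumptions: a volatile store stands in for the kernel's WRITE_ONCE(), the pointer is typed instead of being cast through void (**)(void), and checksum(), checksum_generic(), checksum_fast(), fast_calls and ffp_demo() are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prototype shared by every implementation. */
typedef unsigned int (*checksum_fn)(const unsigned char *buf, size_t len);

/* Typed variant of the quoted struct ffp (an assumption of this sketch). */
struct ffp {
	checksum_fn volatile *fn;	/* pointer that call sites go through */
	checksum_fn default_fn;		/* implementation restored on reset */
};

static int fast_calls;	/* counts calls that reached the alternative */

/* Default implementation. */
static unsigned int checksum_generic(const unsigned char *buf, size_t len)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/* Stand-in for an arch-specific routine: same prototype, same result. */
static unsigned int checksum_fast(const unsigned char *buf, size_t len)
{
	unsigned int sum = 0;

	fast_calls++;
	while (len--)
		sum += *buf++;
	return sum;
}

/* Rough equivalent of DEFINE_FFP(checksum, checksum_generic). */
static checksum_fn volatile checksum = checksum_generic;
static const struct ffp ffp_checksum = { &checksum, checksum_generic };

/* The volatile store approximates WRITE_ONCE(*m->fn, new_fn). */
static void ffp_set_target(const struct ffp *m, checksum_fn new_fn)
{
	*m->fn = new_fn;
}

static void ffp_reset_target(const struct ffp *m)
{
	*m->fn = m->default_fn;
}

/* Exercises default -> alternative -> default; returns fast_calls. */
int ffp_demo(void)
{
	const unsigned char buf[] = { 1, 2, 3 };

	assert(checksum(buf, sizeof(buf)) == 6 && fast_calls == 0);
	ffp_set_target(&ffp_checksum, checksum_fast);
	assert(checksum(buf, sizeof(buf)) == 6 && fast_calls == 1);
	ffp_reset_target(&ffp_checksum);
	assert(checksum(buf, sizeof(buf)) == 6 && fast_calls == 1);
	return fast_calls;
}
```

Every call in this sketch is a plain indirect C call, so the compiler sees normal function-call semantics and no inline asm constraints are involved. The price is the indirect branch at each call site, which is the part an arch-specific patching scheme would presumably try to remove.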