References: <20181005081333.15018-1-ard.biesheuvel@linaro.org>
 <20181005081333.15018-2-ard.biesheuvel@linaro.org>
 <20181005141433.GS19272@hirez.programming.kicks-ass.net>
 <9E0E08C8-0DFC-4E50-A4FA-73208835EF9E@amacapital.net>
From: Ard Biesheuvel
Date: Fri, 5 Oct 2018 19:11:49 +0200
Subject: Re: [RFC PATCH 1/9] kernel: add support for patchable function pointers
To: Andy Lutomirski
Cc: Peter Zijlstra, LKML, "Jason A. Donenfeld", Eric Biggers, Samuel Neves,
 Arnd Bergmann, Herbert Xu, "David S. Miller", Catalin Marinas, Will Deacon,
 Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Thomas Gleixner,
 Ingo Molnar, Kees Cook, "Martin K. Petersen", Greg KH, Andrew Morton,
 Richard Weinberger, Linux Crypto Mailing List, linux-arm-kernel, linuxppc-dev
X-Mailing-List: linux-kernel@vger.kernel.org

On 5 October 2018 at 18:58, Andy Lutomirski wrote:
> On Fri, Oct 5, 2018 at 8:24 AM Ard Biesheuvel wrote:
>>
>> On 5 October 2018 at 17:08, Andy Lutomirski wrote:
>> >
>> >
>> >> On Oct 5, 2018, at 7:14 AM, Peter Zijlstra wrote:
>> >>
>> >>> On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Biesheuvel wrote:
>> >>> diff --git a/include/linux/ffp.h b/include/linux/ffp.h
>> >>> new file mode 100644
>> >>> index 000000000000..8fc3b4c9b38f
>> >>> --- /dev/null
>> >>> +++ b/include/linux/ffp.h
>> >>> @@ -0,0 +1,43 @@
>> >>> +/* SPDX-License-Identifier: GPL-2.0 */
>> >>> +
>> >>> +#ifndef __LINUX_FFP_H
>> >>> +#define __LINUX_FFP_H
>> >>> +
>> >>> +#include
>> >>> +#include
>> >>> +
>> >>> +#ifdef CONFIG_HAVE_ARCH_FFP
>> >>> +#include
>> >>> +#else
>> >>> +
>> >>> +struct ffp {
>> >>> +	void (**fn)(void);
>> >>> +	void (*default_fn)(void);
>> >>> +};
>> >>> +
>> >>> +#define DECLARE_FFP(_fn, _def)				\
>> >>> +	extern typeof(_def) *_fn;			\
>> >>> +	extern struct ffp const __ffp_ ## _fn
>> >>> +
>> >>> +#define DEFINE_FFP(_fn, _def)				\
>> >>> +	typeof(_def) *_fn = &_def;			\
>> >>> +	struct ffp const __ffp_ ## _fn			\
>> >>> +		= { (void(**)(void))&_fn, (void(*)(void))&_def };	\
>> >>> +	EXPORT_SYMBOL(__ffp_ ## _fn)
>> >>> +
>> >>> +static inline void ffp_set_target(const struct ffp *m, void *new_fn)
>> >>> +{
>> >>> +	WRITE_ONCE(*m->fn, new_fn);
>> >>> +}
>> >>> +
>> >>> +static inline void ffp_reset_target(const struct ffp *m)
>> >>> +{
>> >>> +	WRITE_ONCE(*m->fn, m->default_fn);
>> >>> +}
>> >>> +
>> >>> +#endif
>> >>> +
>> >>> +#define SET_FFP(_fn, _new)	ffp_set_target(&__ffp_ ## _fn, _new)
>> >>> +#define RESET_FFP(_fn)		ffp_reset_target(&__ffp_ ## _fn)
>> >>> +
>> >>> +#endif
>> >>
>> >> I don't understand this interface. There is no wrapper for the call
>> >> site, so how are we going to patch all call-sites when you update the
>> >> target?
>> >
>> > I'm also confused.
>> >
>> > Anyway, we have patchable functions on x86. They're called PVOPs, and
>> > they're way overcomplicated.
>> >
>> > I've proposed a better way that should generate better code, be more
>> > portable, and be more maintainable. It goes like this.
>> >
>> > To call the function, you literally just call the default
>> > implementation. It *might* be necessary to call a nonexistent wrapper
>> > to avoid annoying optimizations. At build time, the kernel is built
>> > with relocations, so the object files contain relocation entries for
>> > the call. We collect these entries into a table. If we're using the
>> > "nonexistent wrapper" approach, we can link in a .S or linker script
>> > to alias them to the default implementation.
>> >
>> > To patch them, we just patch them. It can't necessarily be done
>> > concurrently because nothing forces the right alignment. But we can
>> > do it at boot time and module load time. (Maybe we can patch at
>> > runtime on architectures with appropriate instruction alignment. Or
>> > we ask gcc for an extension to align calls to a function.)
>> >
>> > Most of the machinery already exists: this is roughly how the module
>> > loader resolves calls outside of a module.
>>
>> Yeah, nothing is ever simple on x86 :-(
>>
>> So are you saying the approach I use in patch #2 (which would
>> translate to emitting a jmpq instruction pointing to the default
>> implementation, and patching it at runtime to point elsewhere) would
>> not fly on x86?
>
> After getting some more sleep, I'm obviously wrong. The
> text_poke_bp() mechanism will work. It's just really slow.

OK

> Let me try to summarize some of the issues.
> First, when emitting
> jumps and calls from inline asm on x86, there are a few considerations
> that are annoying:
>
> 1. Following the x86_64 ABI calling conventions is basically
> impossible. x86_64 requires a 128-byte redzone and 16-byte stack
> alignment. After much discussion a while back, we decided that it was
> flat-out impossible on current gcc to get the stack pointer aligned in
> a known manner in an inline asm statement. Instead, if we actually
> need alignment, we need to align manually. Fortunately, the kernel is
> built with an override that forces only 8-byte alignment (on *most*
> GCC versions). But for crypto in particular, it sucks extra, since
> the crypto code is basically the only thing in the kernel that
> actually wants 16-byte alignment. I don't think this is a huge
> problem in practice, but it's annoying. And the kernel is built
> without a redzone.
>
> 2. On x86_64, depending on config, we either need frame pointers or
> ORC. ORC is no big deal -- it Just Works (tm). Frame pointers need
> extra asm hackery. It's doable, but it's still annoying.
>
> 3. Actually getting the asm constraints right to do what a C
> programmer expects is distinctly nontrivial. I just fixed an
> extremely longstanding bug in the vDSO code in which the asm
> constraints for the syscall fallback were wrong in such a way that GCC
> didn't notice that the fallback wrote to its output parameter.
> Whoops.

OK, so the thing I am missing is why this all matters.

Note that the compiler should take care of all of this. It emits a
call to a function with external linkage having prototype X, and all
the inline asm does is emit a jmp to some function having that same
prototype, either the default one or the one we patched in.

Apologies if I am missing something obvious here: as you know, x86 is
not my focus in general.
So in the arm64 case, we have

	"	.globl	" #_fn "	\n"		\
	#_fn " :			\n"		\
	"	b	" #_def "	\n"		\
	"	nop			\n"		\
	"	nop			\n"		\
	"	nop			\n"		\
	"	nop			\n"		\

and all the patching does is replace the target of that branch (the
NOPs are there for jumps that are more than 128 MB away, which require
an indirect jump on arm64)

> And having all this asm hackery per architecture is ugly and annoying.

True. But note that only the architectures that cannot tolerate the
default instantiation using function pointers will require a special
implementation.

> So my suggestion is to do it like a regular relocation. Call a
> function the normal way (make it literally be a C function and call
> it), and rig up the noinline and noclone attributes and whatever else
> is needed to make sure that it's a *relocatable* call. Then the
> toolchain emits ELF relocations saying exactly what part of the text
> needs patching, and we can patch it at runtime. On x86, this is a bit
> extra annoying because we can't fully reliably parse backwards to find
> the beginning of the instruction, but objtool could do it.
>
> And then we get something that is mostly arch-neutral! Because surely
> ARM can also use a relocation-based mechanism.

Not really. We don't have any build time tooling for this, and
CONFIG_RELOCATABLE only produces R_AARCH64_RELATIVE relocations for
absolute quantities. So it would mean we'd have to start building
vmlinux with --emit-relocs, add tooling to parse all of that, etc etc.

> I will generally object to x86 containing more than one
> inline-asm-hackery-based patchable call mechanism, which your series
> will add. I would *love* to see a non-inline-asm one, and then we
> could move most of the x86 paravirt crap over to use it for a big win
> in readability and maintainability.

Fair enough.