From: Josh Poimboeuf
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel, Andy Lutomirski,
    Steven Rostedt, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
    Linus Torvalds, Masami Hiramatsu, Jason Baron, Jiri Kosina,
    David Laight, Borislav Petkov, Julia Cartwright, Jessica Yu,
    "H. Peter Anvin", Nadav Amit, Rasmus Villemoes, Edward Cree,
    Daniel Bristot de Oliveira
Subject: [PATCH v3 0/6] Static calls
Date: Wed, 9 Jan 2019 16:59:35 -0600
X-Mailing-List: linux-kernel@vger.kernel.org

With this version, I stopped trying to use text_poke_bp(), and instead
went with a different approach: if the call site destination doesn't
cross a cacheline boundary, just do an atomic write.  Otherwise, keep
using the trampoline indefinitely.

NOTE: At least experimentally, the call destination writes seem to be
atomic with respect to instruction fetching.  On Nehalem I can easily
trigger crashes when writing a call destination across cachelines while
reading the instruction on another CPU; but I get no such crashes when
respecting cacheline boundaries.

BUT, the SDM doesn't document this approach, so it would be great if any
CPU people can confirm that it's safe!

v3:
- Split up the patches a bit more so that out-of-line static calls can
  be separately mergeable.  Inline is more controversial, and other
  approaches or improvements might be considered.  For example, Nadav is
  looking at implementing it with the help of a GCC plugin to ensure the
  call sites don't cross cacheline boundaries.
- Get rid of the use of text_poke_bp(), in favor of atomic writes.
  Out-of-line calls will be promoted to inline only if the call sites
  don't cross cacheline boundaries. [Linus/Andy]
- Converge the inline and out-of-line trampolines into a single
  implementation, which uses a direct jump.  This was made possible by
  making static_call_update() safe to be called during early boot.
- Remove trampoline poisoning for now, since trampolines may still be
  needed for call sites which cross cacheline boundaries.
- Rename CONFIG_HAVE_STATIC_CALL_OUTLINE -> CONFIG_HAVE_STATIC_CALL [Steven]
- Add missing __static_call_update() call to static_call_update() [Edward]
- Add missing key->func update in __static_call_update() [Edward]
- Put trampoline in a separate section to prevent 2-byte tail calls [Linus]

v2:
- fix STATIC_CALL_TRAMP() macro by using __PASTE() [Ard]
- rename optimized/unoptimized -> inline/out-of-line [Ard]
- tweak arch interfaces for PLT and add key->tramp field [Ard]
- rename 'poison' to 'defuse' and do it after all sites have been
  patched [Ard]
- fix .init handling [Ard, Steven]
- add CONFIG_HAVE_STATIC_CALL [Steven]
- make interfaces more consistent across configs to allow tracepoints
  to use them [Steven]
- move __ADDRESSABLE() to static_call() macro [Steven]
- prevent 2-byte jumps [Steven]
- add offset to asm-offsets.c instead of hard coding key->func offset
- add kernel_text_address() sanity check
- make __ADDRESSABLE() symbols truly unique

Static calls are a replacement for global function pointers.  They use
code patching to allow direct calls to be used instead of indirect
calls.  They give the flexibility of function pointers, but with
improved performance.  This is especially important for cases where
retpolines would otherwise be used, as retpolines can significantly
impact performance.

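To make the calling convention concrete, here is a minimal userspace
sketch of the simplest case, where a static call is just a wrapped
function pointer.  It is a model, not the kernel code: static_call(),
static_call_update() and key->func are names taken from the changelog
above, while DEFINE_STATIC_CALL(), the fixed binop_fn signature, and the
add()/sub()/compute() helpers are assumptions made up for illustration.
No code patching happens here; the point is only that callers look the
same whichever implementation sits underneath.

```c
/* Userspace model only; not the kernel's definitions. */

/* Assumption: one fixed signature, to keep the sketch short. */
typedef int (*binop_fn)(int, int);

struct static_call_key {
	binop_fn func;		/* current call destination (key->func) */
};

/* Hypothetical macros modeled on the names used in this series. */
#define DEFINE_STATIC_CALL(key, fn) \
	struct static_call_key key = { .func = (fn) }

/* Fallback behavior: an ordinary indirect call through the key. */
#define static_call(key, ...)		((key).func(__VA_ARGS__))

/* Fallback behavior: retargeting is a plain pointer store. */
#define static_call_update(key, fn)	((key).func = (fn))

/* Two illustrative targets. */
static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }

/* The key initially points at add(). */
DEFINE_STATIC_CALL(arith, add);

/* A call site: it never names the target function directly. */
static int compute(int a, int b)
{
	return static_call(arith, a, b);
}
```

In this model compute(5, 3) goes through add() until
static_call_update(arith, sub) is called, and through sub() afterwards.
The patched variants are meant to keep that caller-visible behavior
while replacing the indirect call with a direct one.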
The concept and code are an extension of previous work done by Ard
Biesheuvel and Steven Rostedt:

  https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheuvel@linaro.org
  https://lkml.kernel.org/r/20181006015110.653946300@goodmis.org

There are three implementations, depending on arch support:

  1) basic function pointers
  2) out-of-line: patched trampolines (CONFIG_HAVE_STATIC_CALL)
  3) inline: patched call sites (CONFIG_HAVE_STATIC_CALL_INLINE)


Josh Poimboeuf (6):
  compiler.h: Make __ADDRESSABLE() symbol truly unique
  static_call: Add basic static call infrastructure
  x86/static_call: Add out-of-line static call implementation
  static_call: Add inline static call infrastructure
  x86/alternative: Use a single access in text_poke() where possible
  x86/static_call: Add inline static call implementation for x86-64

 arch/Kconfig                                  |   7 +
 arch/x86/Kconfig                              |   4 +-
 arch/x86/include/asm/static_call.h            |  33 ++
 arch/x86/kernel/Makefile                      |   1 +
 arch/x86/kernel/alternative.c                 |  31 +-
 arch/x86/kernel/static_call.c                 |  57 ++++
 arch/x86/kernel/vmlinux.lds.S                 |   1 +
 include/asm-generic/vmlinux.lds.h             |  15 +
 include/linux/compiler.h                      |   2 +-
 include/linux/module.h                        |  10 +
 include/linux/static_call.h                   | 196 +++++++++++
 include/linux/static_call_types.h             |  22 ++
 kernel/Makefile                               |   1 +
 kernel/module.c                               |   5 +
 kernel/static_call.c                          | 316 ++++++++++++++++++
 scripts/Makefile.build                        |   3 +
 tools/objtool/Makefile                        |   3 +-
 tools/objtool/builtin-check.c                 |   3 +-
 tools/objtool/builtin.h                       |   2 +-
 tools/objtool/check.c                         | 131 +++++++-
 tools/objtool/check.h                         |   2 +
 tools/objtool/elf.h                           |   1 +
 .../objtool/include/linux/static_call_types.h |  22 ++
 tools/objtool/sync-check.sh                   |   1 +
 24 files changed, 860 insertions(+), 9 deletions(-)
 create mode 100644 arch/x86/include/asm/static_call.h
 create mode 100644 arch/x86/kernel/static_call.c
 create mode 100644 include/linux/static_call.h
 create mode 100644 include/linux/static_call_types.h
 create mode 100644 kernel/static_call.c
 create mode 100644 tools/objtool/include/linux/static_call_types.h

-- 
2.17.2
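
For readers who want the promotion rule from the v3 notes spelled out:
a call site is patched inline only if its call destination does not
cross a cacheline boundary.  The sketch below is one way to express
that test in plain C.  It assumes a 5-byte x86 "call rel32" (one opcode
byte followed by a 4-byte destination operand) and a 64-byte cacheline.
The helper names and constants are hypothetical and are not taken from
the patches, which may well implement the check differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch, not values taken from the series. */
#define CACHELINE_SIZE	64u	/* typical x86 cacheline size */
#define CALL_OPCODE_LEN	1u	/* the 0xe8 opcode byte */
#define CALL_REL32_LEN	4u	/* the rel32 destination operand */

/* True if the bytes [addr, addr + len) touch more than one cacheline. */
static bool crosses_cacheline(uintptr_t addr, size_t len)
{
	uintptr_t first = addr / CACHELINE_SIZE;
	uintptr_t last = (addr + len - 1) / CACHELINE_SIZE;

	return first != last;
}

/*
 * insn is the address of a call rel32 instruction.  Only the 4-byte
 * destination operand is examined, following the cover letter's
 * wording ("call site destination").
 */
static bool can_patch_inline(uintptr_t insn)
{
	return !crosses_cacheline(insn + CALL_OPCODE_LEN, CALL_REL32_LEN);
}
```

Under these assumptions a call at 0x1000 qualifies, while one at 0x103c
does not, because its operand occupies 0x103d through 0x1040 and so
straddles the boundary at 0x1040.  A call at 0x103f also qualifies:
the instruction itself straddles the boundary, but its operand starts
exactly at 0x1040 and stays within one line.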