From: Dmitry Safonov
To: linux-kernel@vger.kernel.org
Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Andrei Vagin, Dmitry Safonov,
    Adrian Reber, Andrei Vagin, Andy Lutomirski, Arnd Bergmann,
    Christian Brauner, Cyrill Gorcunov, "Eric W. Biederman",
    "H. Peter Anvin", Ingo Molnar, Jann Horn, Jeff Dike, Oleg Nesterov,
    Pavel Emelyanov, Shuah Khan, Thomas Gleixner, Vincenzo Frascino,
    containers@lists.linux-foundation.org, criu@openvz.org,
    linux-api@vger.kernel.org, x86@kernel.org
Subject: [PATCHv5 28/37] x86/vdso: Enable static branches for the timens vdso
Date: Mon, 29 Jul 2019 22:57:10 +0100
Message-Id: <20190729215758.28405-29-dima@arista.com>
X-Mailer: git-send-email 2.22.0
In-Reply-To: <20190729215758.28405-1-dima@arista.com>
References: <20190729215758.28405-1-dima@arista.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Andrei Vagin

As discussed on the timens RFC, adding a new conditional branch
`if (inside_time_ns)` to the VDSO for all processes is undesirable.

To address this, there are two versions of the VDSO's .so: one for host
tasks (without any penalty) and one for processes inside a time
namespace, with clk_to_ns() that subtracts offsets from the host's time.

The timens code in the vdso looks like this:

	if (timens_static_branch_unlikely()) {
		clk_to_ns(clk, ts);
	}

This static branch is disabled by default, and the generated code
consists of a single atomic 'no-op' instruction in the straight-line
code path.

Enable static branches in the timens vdso: the 'no-op' instruction gets
replaced with a 'jump' instruction to the out-of-line true branch.

Signed-off-by: Andrei Vagin
Co-developed-by: Dmitry Safonov
Signed-off-by: Dmitry Safonov
---
 arch/x86/entry/vdso/vma.c    | 30 ++++++++++++++++++++++++------
 arch/x86/kernel/jump_label.c | 14 ++++++++++++++
 include/linux/jump_label.h   |  8 ++++++++
 init/Kconfig                 |  1 +
 4 files changed, 47 insertions(+), 6 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 91cf5a5c8c9e..1a3eb4656eb6 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -31,20 +32,37 @@
 unsigned int __read_mostly vdso64_enabled = 1;
 #endif
 
-void __init init_vdso_image(struct vdso_image *image)
+#ifdef CONFIG_TIME_NS
+static __init void init_timens(struct vdso_image *image)
 {
-	BUG_ON(image->size % PAGE_SIZE != 0);
+	struct vdso_jump_entry *entries;
+	unsigned long entries_nr;
+
+	if (WARN_ON(image->jump_table == -1UL))
+		return;
 
-	apply_alternatives((struct alt_instr *)(image->text + image->alt),
-			   (struct alt_instr *)(image->text + image->alt +
-						image->alt_len));
-#ifdef CONFIG_TIME_NS
 	image->text_timens = vmalloc_32(image->size);
 	if (WARN_ON(image->text_timens == NULL))
 		return;
 
 	memcpy(image->text_timens, image->text, image->size);
+
+	entries = image->text_timens + image->jump_table;
+	entries_nr = image->jump_table_len / sizeof(struct vdso_jump_entry);
+	apply_vdso_jump_labels(entries, entries_nr);
+}
+#else
+static inline void init_timens(struct vdso_image *image) {}
 #endif
+
+void __init init_vdso_image(struct vdso_image *image)
+{
+	BUG_ON(image->size % PAGE_SIZE != 0);
+
+	apply_alternatives((struct alt_instr *)(image->text + image->alt),
+			   (struct alt_instr *)(image->text + image->alt +
+						image->alt_len));
+	init_timens(image);
 }
 
 struct linux_binprm;
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index 044053235302..7820ac61b688 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -24,6 +24,20 @@ union jump_code_union {
 	} __attribute__((packed));
 };
 
+__init void apply_vdso_jump_labels(struct vdso_jump_entry *ent, unsigned long nr)
+{
+	while (nr--) {
+		void *code_addr = (void *)ent + ent->code;
+		union jump_code_union jmp;
+
+		jmp.jump = 0xe9; /* JMP rel32 */
+		jmp.offset = ent->target - ent->code - JUMP_LABEL_NOP_SIZE;
+		memcpy(code_addr, &jmp, JUMP_LABEL_NOP_SIZE);
+
+		ent++;
+	}
+}
+
 static void bug_at(unsigned char *ip, int line)
 {
 	/*
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 3526c0aee954..bb9d828ee49a 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -125,6 +125,11 @@ struct jump_entry {
 	long key;	// key may be far away from the core kernel under KASLR
 };
 
+struct vdso_jump_entry {
+	u16 code;
+	u16 target;
+};
+
 static inline unsigned long jump_entry_code(const struct jump_entry *entry)
 {
 	return (unsigned long)&entry->code + entry->code;
@@ -229,6 +234,9 @@ extern void static_key_enable(struct static_key *key);
 extern void static_key_disable(struct static_key *key);
 extern void static_key_enable_cpuslocked(struct static_key *key);
 extern void static_key_disable_cpuslocked(struct static_key *key);
+extern void apply_vdso_jump_labels(struct vdso_jump_entry *ent,
+				   unsigned long nr);
+
 
 /*
  * We should be using ATOMIC_INIT() for initializing .enabled, but
diff --git a/init/Kconfig b/init/Kconfig
index 9e40c07da4e1..be8bd41774f6 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1072,6 +1072,7 @@ config UTS_NS
 config TIME_NS
 	bool "TIME namespace"
 	depends on ARCH_HAS_VDSO_TIME_NS
+	depends on JUMP_LABEL
 	default y
 	help
 	  In this namespace boottime and monotonic clocks can be set.
-- 
2.22.0
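
Illustration only, not part of the patch: a minimal userspace sketch of
the NOP-to-JMP rewrite that apply_vdso_jump_labels() performs on the
timens copy of the vdso text. It assumes, as a simplification, that both
offsets in an entry are measured from the start of a byte buffer (the
patch measures them from the entry itself). patch_jump() and the buffer
layout are hypothetical names for this sketch; only struct
vdso_jump_entry, JUMP_LABEL_NOP_SIZE and the 0xe9 opcode come from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JUMP_LABEL_NOP_SIZE 5	/* size of the 5-byte NOP / JMP rel32 on x86 */

/* Userspace mirror of the patch's struct vdso_jump_entry (u16 -> uint16_t). */
struct vdso_jump_entry {
	uint16_t code;		/* assumed: offset of the NOP from buffer start */
	uint16_t target;	/* assumed: offset of the out-of-line branch */
};

/*
 * Hypothetical helper: overwrite the 5-byte NOP at ent->code with
 * "JMP rel32" to ent->target. rel32 is relative to the end of the
 * jump instruction, hence the "- JUMP_LABEL_NOP_SIZE" term.
 */
static void patch_jump(unsigned char *image, size_t image_size,
		       const struct vdso_jump_entry *ent)
{
	int32_t rel = (int32_t)ent->target - (int32_t)ent->code -
		      JUMP_LABEL_NOP_SIZE;
	uint32_t bits = (uint32_t)rel;
	unsigned char *insn = image + ent->code;

	/* The whole instruction must fit inside the buffer. */
	assert((size_t)ent->code + JUMP_LABEL_NOP_SIZE <= image_size);

	insn[0] = 0xe9;			/* JMP rel32 opcode */
	insn[1] = bits & 0xff;		/* displacement, little-endian */
	insn[2] = (bits >> 8) & 0xff;
	insn[3] = (bits >> 16) & 0xff;
	insn[4] = (bits >> 24) & 0xff;
}
```

For example, with the NOP at offset 2 and the target at offset 12, the
displacement is 12 - 2 - 5 = 5, so the bytes written are e9 05 00 00 00.
Writing the displacement byte by byte keeps the sketch independent of
host endianness; the kernel code can memcpy() the packed union directly
because it only runs on x86.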