Date: Wed, 27 Mar 2019 11:00:00 -0700
From: Andrei Vagin
To: Thomas Gleixner
Cc: Rasmus Villemoes, Dmitry Safonov, LKML, Adrian Reber, Andrei Vagin,
    Andy Lutomirski, Andy Tucker, Arnd Bergmann, Christian Brauner,
    Cyrill Gorcunov, Dmitry Safonov <0x7f454c46@gmail.com>,
    "Eric W. Biederman", "H. Peter Anvin", Ingo Molnar, Jeff Dike,
    Oleg Nesterov, Pavel Emelyanov, Shuah Khan,
    containers@lists.linux-foundation.org, criu@openvz.org,
    linux-api@vger.kernel.org, x86@kernel.org, Vincenzo Frascino,
    Will Deacon
Subject: Re: [PATCH 16/32] x86/vdso: Generate vdso{,32}-timens.lds
Message-ID: <20190327175957.GA9309@gmail.com>
References: <20190206001107.16488-1-dima@arista.com>
    <20190206001107.16488-17-dima@arista.com>
X-Mailing-List: linux-kernel@vger.kernel.org

While the generic vdso patchset is in development, we decided to think
about other ways of generating the two vdso libraries. In this patchset,
we use a linker script, but it looks too complicated, so we decided to
look at other options.

Another obvious approach is the code patching technique. The main idea
was to reduce the amount of arch-dependent code, and Dmitry came up with
the idea of three labels. Let’s look at this pseudo-code:

int vdso_clock_gettime(clockid_t clk, struct timespec *ts)
{
        ...
l_call:
        clk_to_ns(clk, ts);
l_return:
        return 0;
        annotate_reachable();
l_out:
        nop();
        return 0;
}

Here we can see three labels. Without patching this code, the function
will apply vdso offsets. But if we copy the code between the last two
labels (l_return and l_out) over the code at the first label (l_call),
we will get a version which skips vdso offsets.

The patch which implements this idea will be sent in reply to this
email. It was tested on x86_64 with gcc as the compiler, but we suspect
that there might be some issues on other architectures or with other
compilers. So we would like to ask the community for help in
understanding what we have to do to be sure that this code always works
correctly.

The second patch implements static_branch for the vdso code.
Here are only a few lines of arch-dependent code:

+static __always_inline bool timens_static_branch(void)
+{
+        asm_volatile_goto("1:\n\t"
+                          ".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
+                          ".pushsection __retcall_table, \"aw\"\n\t"
+                          "2: .word 1b - 2b, %l[l_yes] - 2b\n\t"
+                          ".popsection\n\t"
+                          : : : : l_yes);
+
+        return false;
+l_yes:
+        return true;
+}

This is a slightly modified version of the arch_static_branch() function.

The timens code in vdso looks like this:

        if (timens_static_branch()) {
                clk_to_ns(clk, ts);
        }

The version of the vdso which is compiled from sources will never
execute clk_to_ns(). We can then patch the 'no-op' in the straight-line
codepath with a 'jump' instruction to the out-of-line true branch and
get the timens version of the vdso library.

Now we can compare these three versions. Our opinion is that the version
with three labels looks cleaner, and if it works with all compilers on
all architectures, we should probably choose it. Otherwise, we would
prefer the version with static_branches, because it is simpler than the
version with the linker script.

Thanks,
Andrei

On Fri, Feb 08, 2019 at 10:57:57AM +0100, Thomas Gleixner wrote:
> On Thu, 7 Feb 2019, Rasmus Villemoes wrote:
>
> Cc: + Vincenzo, Will
>
> > On 06/02/2019 01.10, Dmitry Safonov wrote:
> > > As it has been discussed on timens RFC, adding a new conditional branch
> > > `if (inside_time_ns)` on VDSO for all processes is undesirable.
> > > It will add a penalty for everybody as branch predictor may mispredict
> > > the jump. Also there are instruction cache lines wasted on cmp/jmp.
> > >
> > > Those effects of introducing time namespace are very much unwanted
> > > having in mind how much work have been spent on micro-optimisation
> > > vdso code.
> > >
> > > Addressing those problems, there are two versions of VDSO's .so:
> > > for host tasks (without any penalty) and for processes inside of time
> > > namespace with clk_to_ns() that subtracts offsets from host's time.
> > >
> > > Unfortunately, to allow changing VDSO VMA on a running process,
> > > the entry points to VDSO should have the same offsets (addresses).
> > > That's needed as i.e. application that calls setns() may have already
> > > resolved VDSO symbols in GOT/PLT.
> >
> > These (14-19, if I'm reading them right) seems to add quite a lot of
> > complexity and fragility to the build, and other architectures would
> > probably have to add something similar to their vdso builds.
>
> Yes and we really want to avoid that. The VDSO implementations are
> pointlessly different accross the architectures and there is effort on the
> way to consolidate them:
>
>    https://lkml.kernel.org/r/20190115135539.24762-1-vincenzo.frascino@arm.com
>
> I talked to Vincenzo earlier this week and he's working on a new version of
> that. The timens stuff wants to go on top of the consolidation otherwise we
> end up with another set of pointlessly different and differently broken
> VDSO variants.
>
> Thanks,
>
>         tglx
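
For readers who have not followed the series: the clk_to_ns() step that
the quoted commit message describes ("subtracts offsets from host's
time") can be pictured with the user-space sketch below. It is only an
illustration, not the code from the patchset. The struct
timens_offsets_sketch, the helper ts_sub_offset() and the function
clk_to_ns_sketch() are hypothetical names, and the choice to adjust only
CLOCK_MONOTONIC is an assumption made to keep the example short.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clockid_t and CLOCK_MONOTONIC */
#endif

#include <assert.h>
#include <stddef.h>
#include <time.h>

#define NSEC_PER_SEC_SKETCH 1000000000L

/* Hypothetical container for per-namespace offsets (illustration only). */
struct timens_offsets_sketch {
        struct timespec monotonic;
};

/*
 * Subtract a namespace offset from a host timestamp.
 * Both inputs are assumed to be normalized (0 <= tv_nsec < 1e9), so a
 * single borrow is enough to renormalize the result.
 */
void ts_sub_offset(struct timespec *ts, const struct timespec *off)
{
        assert(ts != NULL && off != NULL);

        ts->tv_sec -= off->tv_sec;
        ts->tv_nsec -= off->tv_nsec;
        if (ts->tv_nsec < 0) {
                ts->tv_nsec += NSEC_PER_SEC_SKETCH;
                ts->tv_sec -= 1;
        }
}

/*
 * Sketch of the clk_to_ns() idea: clocks that the namespace virtualizes
 * get the offset applied, every other clock is passed through untouched.
 * Adjusting only CLOCK_MONOTONIC here is an assumption of this example.
 */
void clk_to_ns_sketch(clockid_t clk, struct timespec *ts,
                      const struct timens_offsets_sketch *off)
{
        switch (clk) {
        case CLOCK_MONOTONIC:
                ts_sub_offset(ts, &off->monotonic);
                break;
        default:
                /* Not namespaced in this sketch: host time is returned. */
                break;
        }
}
```

In the host variant of the vdso this call is never reached; in the
timens variant it runs after the host time has been read into *ts. Both
the three-label and the static_branch approaches only differ in how the
call gets enabled or skipped, not in what it computes.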