Date: Thu, 10 Jan 2019 14:57:36 -0600
From: Josh Poimboeuf
To: Nadav Amit
Cc: X86 ML, LKML, Ard Biesheuvel, Andy Lutomirski, Steven Rostedt,
    Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Linus Torvalds,
    Masami Hiramatsu, Jason Baron, Jiri Kosina, David Laight,
    Borislav Petkov, Julia Cartwright, Jessica Yu, "H. Peter Anvin",
    Rasmus Villemoes, Edward Cree, Daniel Bristot de Oliveira
Subject: Re: [PATCH v3 0/6] Static calls
Message-ID: <20190110205736.pv3bt5chkgpep4kq@treble>

On Thu, Jan 10, 2019 at 08:48:31PM +0000, Nadav Amit wrote:
> > On Jan 10, 2019, at 12:32 PM, Josh Poimboeuf wrote:
> >
> > On Thu, Jan 10, 2019 at 07:45:26PM +0000, Nadav Amit wrote:
> >>>> I’m not GCC expert either and writing this code was not making me full of
> >>>> joy, etc.. I’ll be happy that my code would be reviewed, but it does work. I
> >>>> don’t think an early pass is needed, as long as hardware registers were not
> >>>> allocated.
> >>>>
> >>>>> Would it work with more than 5 arguments, where args get passed on the
> >>>>> stack?
> >>>>
> >>>> It does.
> >>>>
> >>>>> At the very least, it would (at least partially) defeat the point of the
> >>>>> callee-saved paravirt ops.
> >>>>
> >>>> Actually, I think you can even deal with callee-saved functions and remove
> >>>> all the (terrible) macros. You would need to tell the extension not to
> >>>> clobber the registers through a new attribute.
> >>>
> >>> Ok, it does sound interesting then. I assume you'll be sharing the
> >>> code?
> >>
> >> Of course. If this what is going to convince, I’ll make a small version for
> >> PV callee-saved first.
> >
> > It wasn't *only* the PV callee-saved part which interested me, so if you
> > already have something which implements the other parts, I'd still like
> > to see it.
>
> Did you have a look at https://lore.kernel.org/lkml/20181231072112.21051-4-namit@vmware.com/ ?
>
> See the changes to x86_call_markup_plugin.c .
>
> The missing part (that I just finished but need to clean up) is attributes
> that allow you to provide key and dynamically enable the patching.

Aha, so it's basically the same plugin you had for optpolines. I
missed that. I'll need to stare at the code for a little bit.

> >>>>> What if we just used a plugin in a simpler fashion -- to do call site
> >>>>> alignment, if necessary, to ensure the instruction doesn't cross
> >>>>> cacheline boundaries. This could be done in a later pass, with no side
> >>>>> effects other than code layout. And it would allow us to avoid
> >>>>> breakpoints altogether -- again, assuming somebody can verify that
> >>>>> intra-cacheline call destination writes are atomic with respect to
> >>>>> instruction decoder reads.
> >>>>
> >>>> The plugin should not be able to do so. Layout of the bytecode is done by
> >>>> the assembler, so I don’t think a plugin would help you with this one.
> >>>
> >>> Actually I think we could use .bundle_align_mode for this purpose:
> >>>
> >>> https://sourceware.org/binutils/docs-2.31/as/Bundle-directives.html
> >>
> >> Hm… I don’t understand what you have in mind (i.e., when these
> >> assembly directives would be emitted).
> >
> > For example, it could replace
> >
> >   callq ____static_call_tramp_my_key
> >
> > with
> >
> >   .bundle_align_mode 6
> >   callq ____static_call_tramp_my_key
> >   .bundle_align_mode 0
> >
> > which ensures the instruction is within a cache line, aligning it with
> > NOPs if necessary. That would allow my current implementation to
> > upgrade out-of-line calls to inline calls 100% of the time, instead of
> > 95% of the time.
>
> Heh. I almost wrote based on the feature description that this will add
> unnecessary padding no matter what, but actually (experimentally) it works
> well…

Yeah, based on the poorly worded docs, I made the same assumption, until
I tried it.

--
Josh
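
For anyone following along, here is a rough sketch of the arithmetic behind
the .bundle_align_mode 6 example quoted above. It is only a model, not
anything from the patch set: the helper names are made up for illustration,
it assumes a 5-byte callq (one opcode byte plus a 32-bit displacement) and
64-byte bundles (2**6), and it assumes the assembler pads up to the next
bundle boundary only when the instruction would otherwise straddle one.

```python
# Illustrative model only; names are hypothetical, not from the patches.
CALL_LEN = 5       # callq rel32: 1 opcode byte + 4-byte displacement
BUNDLE_SHIFT = 6   # ".bundle_align_mode 6" -> 2**6 = 64-byte bundles


def crosses_bundle(addr, length=CALL_LEN, shift=BUNDLE_SHIFT):
    """Return True if the bytes [addr, addr + length) span two bundles."""
    size = 1 << shift
    first_byte_bundle = addr // size
    last_byte_bundle = (addr + length - 1) // size
    return first_byte_bundle != last_byte_bundle


def padding_needed(addr, length=CALL_LEN, shift=BUNDLE_SHIFT):
    """NOP bytes to insert so the instruction starts in the next bundle.

    Returns 0 when the instruction already fits inside one bundle, which
    models the "no unnecessary padding" behaviour discussed above.
    """
    if not crosses_bundle(addr, length, shift):
        return 0
    size = 1 << shift
    return size - (addr % size)


if __name__ == "__main__":
    for offset in (0, 59, 60, 63, 64):
        print(offset, crosses_bundle(offset), padding_needed(offset))
```

In this model a call at offset 59 occupies bytes 59..63 and needs no padding,
while a call at offset 60 would spill into byte 64 and so gets 4 bytes of
NOPs to start at offset 64.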