From: David Laight
To: 'Thomas Gleixner', LKML
CC: "x86@kernel.org", Linus Torvalds, Tim Chen, Josh Poimboeuf,
    Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn,
    Jann Horn, H.J. Lu, Joao Moreira, Joseph Nuzman, Steven Rostedt,
    Juergen Gross, Peter Zijlstra (Intel), Masami Hiramatsu,
    Alexei Starovoitov, Daniel Borkmann
Subject: RE: [patch 00/38] x86/retbleed: Call depth tracking mitigation
Date: Sun, 17 Jul 2022 09:45:17 +0000
In-Reply-To: <20220716230344.239749011@linutronix.de>

From: Thomas Gleixner
> Sent: 17 July 2022 00:17
> Folks!
>
> Back in the good old spectre v2 days (2018) we decided to not use
> IBRS. In hindsight this might have been the wrong decision because it did
> not force people to come up with alternative approaches.
>
> It was already discussed back then to try software based call depth
> accounting and RSB stuffing on underflow for Intel SKL[-X] systems to avoid
> the insane overhead of IBRS.
>
> This has been tried in 2018 and was rejected due to the massive overhead
> and other shortcomings of the approach to put the accounting into each
> function prologue:
>
>   1) Text size increase which is inflicted on everyone. While CPUs are
>      good in ignoring NOPs they still pollute the I-cache.
>
>   2) That results in tail call over-accounting which can be exploited.
>
> Disabling tail calls is not an option either and adding a 10 byte padding
> in front of every direct call is even worse in terms of text size and
> I-cache impact. We also could patch calls past the accounting in the
> function prologue but that becomes a nightmare vs. ENDBR.
>
> As IBRS is a performance horror show, Peter Zijlstra and me revisited the
> call depth tracking approach and implemented it in a way which is hopefully
> more palatable and avoids the downsides of the original attempt.
>
> We both unsurprisingly hate the result with a passion...
>
> The way we approached this is:
>
>   1) objtool creates a list of function entry points and a list of direct
>      call sites into new sections which can be discarded after init.
>
>   2) On affected machines, use the new sections, allocate module memory
>      and create a call thunk per function (16 bytes without
>      debug/statistics). Then patch all direct calls to invoke the thunk,
>      which does the call accounting and then jumps to the original call
>      site.
>
>   3) Utilize the retbleed return thunk mechanism by making the jump
>      target run-time configurable. Add the accounting counterpart and
>      stuff RSB on underflow in that alternate implementation.

What happens to indirect calls?
The above would imply that they miss the function entry thunk, but
get the return one.
Won't this lead to mis-counting of the RSB?

I also thought that retpolines would trash the return stack?
Using a single retpoline thunk would pretty much ensure that
they are never correctly predicted from the BTB, but it only
gives a single BTB entry that needs 'setting up' to get
mis-prediction.

I'm also sure I managed to infer from a document of instruction
timings and architectures that some x86 cpu actually used the BTB
for normal conditional jumps?
Possibly to avoid passing the full %ip value all down the cpu
pipeline.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
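
To make the indirect-call question concrete, here is a toy model in Python.
It is only a sketch of the concern raised above, not the accounting the
patch series implements. Everything in it is an assumption made for
illustration: the names ToyCpu, call, ret and run are hypothetical, the
RSB is modelled as a bare depth with an assumed capacity of 16, and
indirect calls are assumed to bypass the call thunk while still returning
through the accounting return thunk.

```python
from dataclasses import dataclass

RSB_SIZE = 16  # assumed capacity, for illustration only


@dataclass
class ToyCpu:
    rsb_depth: int = 0  # entries actually held in the modelled RSB
    tracked: int = 0    # software call depth counter
    stuffs: int = 0     # number of times the RSB was stuffed

    def call(self, accounted: bool) -> None:
        # Every call pushes a return address into the RSB.
        self.rsb_depth = min(self.rsb_depth + 1, RSB_SIZE)
        # Only calls that go through a call thunk are counted.
        # Assumption: indirect calls are passed as accounted=False.
        if accounted:
            self.tracked += 1

    def ret(self) -> None:
        # Return thunk: stuff the RSB when the counter says it is empty.
        if self.tracked <= 0:
            self.rsb_depth = RSB_SIZE
            self.tracked = RSB_SIZE
            self.stuffs += 1
        # The return itself consumes one entry and one count.
        self.rsb_depth = max(self.rsb_depth - 1, 0)
        self.tracked -= 1


def run(pattern: list[bool]) -> ToyCpu:
    """Make one nested call per entry in pattern, then unwind them all."""
    cpu = ToyCpu()
    for accounted in pattern:
        cpu.call(accounted)
    for _ in pattern:
        cpu.ret()
    return cpu
```

In this model run([True] * 4) never stuffs, while
run([True, False, True, False]) stuffs once even though the modelled RSB
still holds valid entries at that point. Under these assumptions the
counter under-counts, so the cost would be extra stuffing rather than a
missed underflow. Whether the real thunks behave this way is exactly the
question.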