Subject: Re: [RFC][PATCH 1/2] x86: Allow breakpoints to emulate call functions
From: Andy Lutomirski
Date: Tue, 7 May 2019 07:48:45 -0700
To: Linus Torvalds
Cc: Steven Rostedt, Peter Zijlstra, Linux List Kernel Mailing, Ingo Molnar,
 Andrew Morton, Andy Lutomirski, Nicolai Stange, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov, "H. Peter Anvin",
 the arch/x86 maintainers, Josh Poimboeuf, Jiri Kosina, Miroslav Benes,
 Petr Mladek, Joe Lawrence, Shuah Khan, Konrad Rzeszutek Wilk, Tim Chen,
 Sebastian Andrzej Siewior, Mimi Zohar, Juergen Gross, Nick Desaulniers,
 Nayna Jain, Masahiro Yamada, Joerg Roedel,
 "open list:KERNEL SELFTEST FRAMEWORK", stable, Masami Hiramatsu
Message-Id: <48BDF7B6-252B-4D29-9116-844363010BC0@amacapital.net>
References: <20190502181811.GY2623@hirez.programming.kicks-ass.net>
 <20190503092247.20cc1ff0@gandalf.local.home>
 <2045370D-38D8-406C-9E94-C1D483E232C9@amacapital.net>
 <20190506081951.GJ2606@hirez.programming.kicks-ass.net>
 <20190506095631.6f71ad7c@gandalf.local.home>
 <20190506130643.62c35eeb@gandalf.local.home>
 <20190506145745.17c59596@gandalf.local.home>
 <20190506162915.380993f9@gandalf.local.home>
 <20190506174511.2f8b696b@gandalf.local.home>
 <20190506210416.2489a659@oasis.local.home>
 <20190506215353.14a8ef78@oasis.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

>> On May 6, 2019, at 7:22 PM, Linus Torvalds wrote:
>>
>> On Mon, May 6, 2019 at 6:53 PM Steven Rostedt wrote:
>>
>> Also, I figured just calling ftrace_regs_caller() was simpler then
>> having that int3 handler do the hash look ups to determine what handler
>> it needs to call.
>
> So what got me looking at this - and the races (that didn't turn out
> to be races) - and why I really hate it, is because of the whole
> notion of "atomic state".
>
> Running an "int3" handler (when the int3 is in the kernel) is in some
> sense "atomic". There is the state in the caller, of course, and
> there's the state that the int3 handler has, but you can *mostly*
> think of the two as independent.
>
> In particular, if the int3 handler doesn't ever enable interrupts, and
> if it doesn't need to do any stack switches, the int3 handler part
> looks "obvious". It can be interrupted by NMI, but it can't be
> interrupted (for example) by the cross-cpu IPI.
>
> That was at least the mental model I was going for.
>
> Similarly, the "caller" side mostly looks obvious too. If we don't
> take an int3, none of this matter, and if we *do* take an int3, if we
> can at least make it "atomic" wrt the rewriter (before or after
> state), it should be easy to think about.
>
> One of the things I was thinking of doing, for example, was to simply
> make the call emulation do a "load four bytes from the instruction
> stream, and just use that as the emulation target offset".
>
> Why would that matter?
>
> We do *not* have very strict guarantees for D$-vs-I$ coherency on x86,
> but we *do* have very strict guarantees for D$-vs-D$ coherency. And so
> we could use the D$ coherency to give us atomicity guarantees for
> loading and storing the instruction offset for instruction emulation,
> in ways we can *not* use the D$-to-I$ guarantees and just executing it
> directly.
>
> So while we still need those nasty IPI's to guarantee the D$-vs-I$
> coherency in the "big picture" model and to get the serialization with
> the actual 'int3' exception right, we *could* just do all the other
> parts of the instruction emulation using the D$ coherency.
>
> So we could do the actual "call offset" write with a single atomic
> 4-byte locked cycle (just use "xchg" to write - it's always locked).
> And similarly we could do the call offset *read* with a single locked
> cycle (cmpxchg with a 0 value, for example). It would be atomic even
> if it crosses a cacheline boundary.

I don't quite get how this could work. Suppose we start with a
five-byte NOP (0F 1F ...). Then we change the first byte to INT3 (CC).
Now we can atomically change the other four bytes, but the INT3 could
happen first. I suppose that we could treat 1F 00 00 00 or similar as a
known-bogus call target, but that seems dangerous.

IOW I think your trick only works if both the old and new states are
CALL, but we don't know that until we've looked up the record, at which
point we can just use the result of the lookup.

Am I missing something clever? IMO it's a bummer that there isn't a way
to turn NOP into CALL by changing only one byte.