Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp4356701ybi; Tue, 11 Jun 2019 05:26:55 -0700 (PDT) X-Google-Smtp-Source: APXvYqwSldm92FUH38p91SQEhkksA8JtturJwFF9qJ68ncwcyZlwXX83e3Gqy9pw4kl5KKqFxWpV X-Received: by 2002:a62:3287:: with SMTP id y129mr23584377pfy.251.1560256015586; Tue, 11 Jun 2019 05:26:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560256015; cv=none; d=google.com; s=arc-20160816; b=J+l1WgcSK3uZjo27s0dY6WnUF3qt7niZ6y+esULQfuHr4m1sgiTsb71m0u6u55UVfm eQYzq7+/czaqFs+Q/8uT1PEdUoYW1Mde8I70hW9FQVZn5/mLSZmN0sGsbQeWmVcH33ik nrMCjrcROOBT4Z2S8WAuRMRUQf2moE5kZ3s+OakzSxYp/xg3W8iZwY8a8KkaIMB8ZWL+ d5wdl9kLwrTpCe/kCJzHTD9VxcgqL/pWpw8YXzuOdj+u2m2yvMuT/zMwrwVHmvi89fw3 D7gVWfatlbBBc1RAFEsNo+B80JnzsCqIEiK1gCGqnIIv21Rli9xUCzxSZAslnylLvweo upYg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=iADvzNHxrj3aEEZQS4BDjuzQUDhxOJwFwSW7vltHCTI=; b=NNr8ikyVA0K5MwWdW8QlS94bi8Ag+yYu6nMf5D+DiXXNeyLBQpnkZm7D9y4uftju3O o7tXL5RUmyjMdi5LdYum9SSjkZo6T9IVLtdKktCA8nnkeHx6x2PgIQGraQtQ0Q5SsPD2 emFdAqBldCUC+1F9jRXnvUjrweOnKQupNpQy6b16bKoHDeWTd2pn/r3c3KWNlSk8XQ7q 0pOUxAW59fHNYGqospReytxQ+NuZxYaB12XudR/kR/RdX74KBqMLXFCP+MwvJf6rcJ7S e+/dFnsmgIf/6Wfxe86YGvw94rZcZuhlyjx2Zr2XzIL2em4HwtJl5v/keSQ5ennga7Jb lJFQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=merlin.20170209 header.b=K753bTah; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z14si2377520pju.64.2019.06.11.05.26.39; Tue, 11 Jun 2019 05:26:55 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=merlin.20170209 header.b=K753bTah; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404518AbfFKMJZ (ORCPT + 99 others); Tue, 11 Jun 2019 08:09:25 -0400 Received: from merlin.infradead.org ([205.233.59.134]:53788 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387538AbfFKMJZ (ORCPT ); Tue, 11 Jun 2019 08:09:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=merlin.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=iADvzNHxrj3aEEZQS4BDjuzQUDhxOJwFwSW7vltHCTI=; b=K753bTahcHF+Rs2Eljx/ON3D5 qtvxcR8shCT1oenJwJac7IcRvDseDZxbjKGUqKFXIamb/iwBJ8Rkiio1UHxUCuJwUACaAcQijpxhA oly/8SNCjSzJAuIk7dKxFJvMqmjvbzum7sWcvAfT0xXoabpPvkooZKmVWGtofOlfZAR7xE4S+fBaW 6FqyRpdqlC/ik6x7z5oI+icDWqtyZrrdCv3Hm6fMxLZY7pmUcU9WM4PFAvtOKIIx8gYkjVhPawB6s jcK6Vfbl9bJO5oYrDC3rISd/wNNaU1tRI2quNbbVxcn7z2mU26AH6hwVne/G/8FspPHyESmf1X2Xo wKiegod2w==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=hirez.programming.kicks-ass.net) by merlin.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1hafZu-0004Le-2U; Tue, 11 Jun 2019 12:08:38 +0000 Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id 061F32022711C; Tue, 11 Jun 2019 14:08:35 +0200 (CEST) Date: Tue, 11 Jun 2019 14:08:35 +0200 From: Peter Zijlstra To: Andy Lutomirski Cc: Masami Hiramatsu , x86@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel , Andy Lutomirski , Steven Rostedt , Ingo Molnar , Thomas Gleixner , Linus Torvalds , Jason Baron , Jiri Kosina , David Laight , Borislav Petkov , Julia Cartwright , Jessica Yu , "H. Peter Anvin" , Nadav Amit , Rasmus Villemoes , Edward Cree , Daniel Bristot de Oliveira Subject: Re: [PATCH 08/15] x86/alternatives: Teach text_poke_bp() to emulate instructions Message-ID: <20190611120834.GG3463@hirez.programming.kicks-ass.net> References: <20190605130753.327195108@infradead.org> <20190605131945.005681046@infradead.org> <20190608004708.7646b287151cf613838ce05f@kernel.org> <20190607173427.GK3436@hirez.programming.kicks-ass.net> <3DA961AB-950B-4886-9656-C0D268D521F1@amacapital.net> <20190611080307.GN3436@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190611080307.GN3436@hirez.programming.kicks-ass.net> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jun 11, 2019 at 10:03:07AM +0200, Peter Zijlstra wrote: > On Fri, Jun 07, 2019 at 11:10:19AM -0700, Andy Lutomirski wrote: > > I am surely missing some kprobe context, but is it really safe to use > > this mechanism to replace more than one instruction? > > I'm not entirely up-to-scratch here, so Masami, please correct me if I'm > wrong. > > So what happens is that arch_prepare_optimized_kprobe() <- > copy_optimized_instructions() copies however much of the instruction > stream is required such that we can overwrite the instruction at @addr > with a 5 byte jump. > > arch_optimize_kprobe() then does the text_poke_bp() that replaces the > instruction @addr with int3, copies the rel jump address and overwrites > the int3 with jmp. > > And I'm thinking the problem is with something like: > > @addr: nop nop nop nop nop > > We copy out the nops into the trampoline, overwrite the first nop with > an INT3, overwrite the remaining nops with the rel addr, but oops, > another CPU can still be executing one of those NOPs, right? > > I'm thinking we could fix this by first writing INT3 into all relevant > instructions, which is going to be messy, given the current code base. Maybe not that bad; how's something like this? (completely untested) --- arch/x86/kernel/alternative.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 0d57015114e7..8f643dabea72 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -24,6 +24,7 @@ #include #include #include +#include int __read_mostly alternatives_patched; @@ -849,6 +850,7 @@ static void do_sync_core(void *info) static bool bp_patching_in_progress; static void *bp_int3_handler, *bp_int3_addr; +static unsigned int bp_int3_length; int poke_int3_handler(struct pt_regs *regs) { @@ -867,7 +869,11 @@ int poke_int3_handler(struct pt_regs *regs) if (likely(!bp_patching_in_progress)) return 0; - if (user_mode(regs) || regs->ip != (unsigned long)bp_int3_addr) + if (user_mode(regs)) + return 0; + + if (regs->ip < (unsigned long)bp_int3_addr || + regs->ip >= (unsigned long)bp_int3_addr + bp_int3_length) return 0; /* set up the specified breakpoint handler */ @@ -900,9 +906,12 @@ NOKPROBE_SYMBOL(poke_int3_handler); void text_poke_bp(void *addr, const void *opcode, size_t len, void *handler) { unsigned char int3 = 0xcc; + void *kaddr = addr; + struct insn insn; bp_int3_handler = handler; bp_int3_addr = (u8 *)addr + sizeof(int3); + bp_int3_length = len - sizeof(int3); bp_patching_in_progress = true; lockdep_assert_held(&text_mutex); @@ -913,7 +922,14 @@ void text_poke_bp(void *addr, const void *opcode, size_t len, void *handler) */ smp_wmb(); - text_poke(addr, &int3, sizeof(int3)); + do { + kernel_insn_init(&insn, kaddr, MAX_INSN_SIZE); + insn_get_length(&insn); + + text_poke(kaddr, &int3, sizeof(int3)); + + kaddr += insn.length; + } while (kaddr < addr + len); on_each_cpu(do_sync_core, NULL, 1);