Date: Tue, 30 Apr 2019 13:03:59 -0400
From: Steven Rostedt
To: Andy Lutomirski
Cc: Linus Torvalds, Peter Zijlstra, Nicolai Stange, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov,
Peter Anvin" , "the arch/x86 maintainers" , Josh Poimboeuf , Jiri Kosina , Miroslav Benes , Petr Mladek , Joe Lawrence , Shuah Khan , Konrad Rzeszutek Wilk , Tim Chen , Sebastian Andrzej Siewior , Mimi Zohar , Juergen Gross , Nick Desaulniers , Nayna Jain , Masahiro Yamada , Joerg Roedel , Linux List Kernel Mailing , live-patching@vger.kernel.org, "open list:KERNEL SELFTEST FRAMEWORK" Subject: Re: [PATCH 3/4] x86/ftrace: make ftrace_int3_handler() not to skip fops invocation Message-ID: <20190430130359.330e895b@gandalf.local.home> In-Reply-To: References: <20190428133826.3e142cfd@oasis.local.home> <20190430135602.GD2589@hirez.programming.kicks-ass.net> X-Mailer: Claws Mail 3.17.3 (GTK+ 2.24.32; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 30 Apr 2019 09:33:51 -0700 Andy Lutomirski wrote: > Linus, can I ask you to reconsider your opposition to Josh's other > approach of just shifting the stack on int3 entry? I agree that it's > ugly, but the ugliness is easily manageable and fairly self-contained. > We add a little bit of complication to the entry asm (but it's not > like it's unprecedented -- the entry asm does all kinds of stack > rearrangement due to IST and PTI crap already), and we add an > int3_emulate_call(struct pt_regs *regs, unsigned long target) helper > that has appropriate assertions that the stack is okay and emulates > the call. And that's it. I also prefer Josh's stack shift solution, as I personally believe that's a cleaner solution. But I went ahead and implemented Linus's version to get it working for ftrace. Here's the code, and it survived some preliminary tests. There's three places that use the update code. One is the start of every function call (yes, I counted that as one, and that case is determined by: ftrace_location(ip)). The other is the trampoline itself has an update. That could also be converted to a text poke, but for now its here as it was written before text poke existed. The third place is actually a jump (to the function graph code). But that can be safely skipped if we are converting it, as it only goes from jump to nop, or nop to jump. The trampolines reflect this. Also, as NMI code is traced by ftrace, I had to duplicate the trampolines for the nmi case (but only for the interrupts disabled case as NMIs don't have interrupts enabled). 
-- Steve

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index ef49517f6bb2..bf320bf791dd 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -17,6 +17,7 @@
 #include <linux/uaccess.h>
 #include <linux/ftrace.h>
 #include <linux/percpu.h>
+#include <linux/frame.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/init.h>
@@ -232,6 +233,9 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 
 static unsigned long ftrace_update_func;
 
+/* Used within inline asm below */
+unsigned long ftrace_update_func_call;
+
 static int update_ftrace_func(unsigned long ip, void *new)
 {
 	unsigned char old[MCOUNT_INSN_SIZE];
@@ -259,6 +263,8 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 	unsigned char *new;
 	int ret;
 
+	ftrace_update_func_call = (unsigned long)func;
+
 	new = ftrace_call_replace(ip, (unsigned long)func);
 
 	ret = update_ftrace_func(ip, new);
@@ -280,6 +286,70 @@ static nokprobe_inline int is_ftrace_caller(unsigned long ip)
 	return 0;
 }
 
+extern asmlinkage void ftrace_emulate_call_irqon(void);
+extern asmlinkage void ftrace_emulate_call_irqoff(void);
+extern asmlinkage void ftrace_emulate_call_nmi(void);
+extern asmlinkage void ftrace_emulate_call_update_irqoff(void);
+extern asmlinkage void ftrace_emulate_call_update_irqon(void);
+extern asmlinkage void ftrace_emulate_call_update_nmi(void);
+
+static DEFINE_PER_CPU(void *, ftrace_bp_call_return);
+static DEFINE_PER_CPU(void *, ftrace_bp_call_nmi_return);
+
+asm(
+	".text\n"
+	".global ftrace_emulate_call_irqoff\n"
+	".type ftrace_emulate_call_irqoff, @function\n"
+	"ftrace_emulate_call_irqoff:\n\t"
+	"push %gs:ftrace_bp_call_return\n\t"
+	"sti\n\t"
+	"jmp ftrace_caller\n"
+	".size ftrace_emulate_call_irqoff, .-ftrace_emulate_call_irqoff\n"
+
+	".global ftrace_emulate_call_irqon\n"
+	".type ftrace_emulate_call_irqon, @function\n"
+	"ftrace_emulate_call_irqon:\n\t"
+	"push %gs:ftrace_bp_call_return\n\t"
+	"jmp ftrace_caller\n"
+	".size ftrace_emulate_call_irqon, .-ftrace_emulate_call_irqon\n"
+
+	".global ftrace_emulate_call_nmi\n"
+	".type ftrace_emulate_call_nmi, @function\n"
+	"ftrace_emulate_call_nmi:\n\t"
+	"push %gs:ftrace_bp_call_nmi_return\n\t"
+	"jmp ftrace_caller\n"
+	".size ftrace_emulate_call_nmi, .-ftrace_emulate_call_nmi\n"
+
+	".global ftrace_emulate_call_update_irqoff\n"
+	".type ftrace_emulate_call_update_irqoff, @function\n"
+	"ftrace_emulate_call_update_irqoff:\n\t"
+	"push %gs:ftrace_bp_call_return\n\t"
+	"sti\n\t"
+	"jmp *ftrace_update_func_call\n"
+	".size ftrace_emulate_call_update_irqoff, .-ftrace_emulate_call_update_irqoff\n"
+
+	".global ftrace_emulate_call_update_irqon\n"
+	".type ftrace_emulate_call_update_irqon, @function\n"
+	"ftrace_emulate_call_update_irqon:\n\t"
+	"push %gs:ftrace_bp_call_return\n\t"
+	"jmp *ftrace_update_func_call\n"
+	".size ftrace_emulate_call_update_irqon, .-ftrace_emulate_call_update_irqon\n"
+
+	".global ftrace_emulate_call_update_nmi\n"
+	".type ftrace_emulate_call_update_nmi, @function\n"
+	"ftrace_emulate_call_update_nmi:\n\t"
+	"push %gs:ftrace_bp_call_nmi_return\n\t"
+	"jmp *ftrace_update_func_call\n"
+	".size ftrace_emulate_call_update_nmi, .-ftrace_emulate_call_update_nmi\n"
+	".previous\n");
+
+STACK_FRAME_NON_STANDARD(ftrace_emulate_call_irqoff);
+STACK_FRAME_NON_STANDARD(ftrace_emulate_call_irqon);
+STACK_FRAME_NON_STANDARD(ftrace_emulate_call_nmi);
+STACK_FRAME_NON_STANDARD(ftrace_emulate_call_update_irqoff);
+STACK_FRAME_NON_STANDARD(ftrace_emulate_call_update_irqon);
+STACK_FRAME_NON_STANDARD(ftrace_emulate_call_update_nmi);
+
 /*
  * A breakpoint was added to the code address we are about to
  * modify, and this is the handle that will just skip over it.
@@ -295,10 +365,40 @@ int ftrace_int3_handler(struct pt_regs *regs)
 		return 0;
 
 	ip = regs->ip - 1;
-	if (!ftrace_location(ip) && !is_ftrace_caller(ip))
+	if (ftrace_location(ip)) {
+		if (in_nmi()) {
+			this_cpu_write(ftrace_bp_call_nmi_return, (void *)ip + MCOUNT_INSN_SIZE);
+			regs->ip = (unsigned long) ftrace_emulate_call_nmi;
+			return 1;
+		}
+		this_cpu_write(ftrace_bp_call_return, (void *)ip + MCOUNT_INSN_SIZE);
+		if (regs->flags & X86_EFLAGS_IF) {
+			regs->flags &= ~X86_EFLAGS_IF;
+			regs->ip = (unsigned long) ftrace_emulate_call_irqoff;
+		} else {
+			regs->ip = (unsigned long) ftrace_emulate_call_irqon;
+		}
+	} else if (is_ftrace_caller(ip)) {
+		/* If it's a jump, just need to skip it */
+		if (!ftrace_update_func_call) {
+			regs->ip += MCOUNT_INSN_SIZE -1;
+			return 1;
+		}
+		if (in_nmi()) {
+			this_cpu_write(ftrace_bp_call_nmi_return, (void *)ip + MCOUNT_INSN_SIZE);
+			regs->ip = (unsigned long) ftrace_emulate_call_update_nmi;
+			return 1;
+		}
+		this_cpu_write(ftrace_bp_call_return, (void *)ip + MCOUNT_INSN_SIZE);
+		if (regs->flags & X86_EFLAGS_IF) {
+			regs->flags &= ~X86_EFLAGS_IF;
+			regs->ip = (unsigned long) ftrace_emulate_call_update_irqoff;
+		} else {
+			regs->ip = (unsigned long) ftrace_emulate_call_update_irqon;
+		}
+	} else {
 		return 0;
-
-	regs->ip += MCOUNT_INSN_SIZE - 1;
+	}
 
 	return 1;
 }
@@ -859,6 +959,8 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 
 	func = ftrace_ops_get_func(ops);
 
+	ftrace_update_func_call = (unsigned long)func;
+
 	/* Do a safe modify in case the trampoline is executing */
 	new = ftrace_call_replace(ip, (unsigned long)func);
 	ret = update_ftrace_func(ip, new);
@@ -960,6 +1062,7 @@ static int ftrace_mod_jmp(unsigned long ip, void *func)
 {
 	unsigned char *new;
 
+	ftrace_update_func_call = 0;
 	new = ftrace_jmp_replace(ip, (unsigned long)func);
 
 	return update_ftrace_func(ip, new);
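
[Illustrative sketch, not from this thread: the alternative Andy describes
above is an int3_emulate_call() helper used after the int3 entry code has
shifted the saved registers to leave a gap below regs->sp, so the handler
can safely "push" onto the interrupted stack from C. The helper bodies and
their placement below are guesses at that idea, not code from this patch.]

#include <asm/ptrace.h>		/* struct pt_regs */
#include <asm/ftrace.h>		/* MCOUNT_INSN_SIZE */

/*
 * Sketch only: assumes the int3 entry asm already made room below the
 * interrupted context's stack pointer, so writing at the new regs->sp
 * cannot clobber anything live.
 */
static void int3_emulate_push(struct pt_regs *regs, unsigned long val)
{
	regs->sp -= sizeof(unsigned long);
	*(unsigned long *)regs->sp = val;
}

static void int3_emulate_call(struct pt_regs *regs, unsigned long target)
{
	/*
	 * regs->ip points just past the int3 byte; the patched call site
	 * starts at regs->ip - 1 and is MCOUNT_INSN_SIZE bytes long, so a
	 * real call would have pushed the address right after it.
	 */
	int3_emulate_push(regs, regs->ip - 1 + MCOUNT_INSN_SIZE);
	regs->ip = target;	/* resume at the call target */
}

With a helper like that, ftrace_int3_handler() could emulate the call in
place instead of pointing regs->ip at the per-CPU trampolines used in the
patch above.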