Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp2858204yba; Sun, 28 Apr 2019 10:39:38 -0700 (PDT) X-Google-Smtp-Source: APXvYqxwt9koR5wtnvJ8HTcx0CG6o0bLuq9nrGJ63KoIvrVu6TW0I9uhoX7jCA8Zr49UnicflpOF X-Received: by 2002:a17:902:bf4a:: with SMTP id u10mr25357902pls.63.1556473178678; Sun, 28 Apr 2019 10:39:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1556473178; cv=none; d=google.com; s=arc-20160816; b=CA3zIzZlhZqcfE9s3ISahJSTbNVJtutJ+0Oc/mziMVcHp1VgUafHrHJD9Yo1+Xmdvd tVJHsSgGMbyhfrMHxzNlJd8DXEjxW80fYaApVnj2qGBIfiYgkp0zXDaeMVby3MO27jfD Su/LbiqcG696bVz1pDo3aqurXjqw1XJ2Xii2X7Cy9sk/pi7beV/MbpiMXhAd20kJvYXM 83CQgU3q+XolsvyjRHq4KL6ZOLdgpDDgNuZpa0L7vPT45vqIEbKyw9gFTI8WLhvEr/a5 S3gIC4imVmyci8gD4pabX9H3UXyBRewIwc/qTiH+LNi0/gBv0sdsFnlyuvK94cwYMbxX K9tg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=DNYLEfmDy26GAAJJAnEAeioy70ZrAhFJ7lsG1Jd2k5k=; b=MxJgcLK7rfKOKpzYzqqotTgA3QaCmDlhZqklZAXDM7CjTSkW7Tz8z438St3f0b7IJY lpkpvOQzNaiII8xDccfi0ueV1j0V+F+j5f/ysvjYI8PHVBnmvbW5AQPo4XYdcpnb66qb gEqvQdYaXcc8NaaOWEVM1B/H4/SkfGKgaYKkhOXJjcbrigGdYcLgZm34+8PZrJEwHaCX 02RxLVG366kLnae/XQ9zrH/18+cgz/u4FQRnjHnKAypgmmBHUwjqOiP2JfikZz6iMjiO rlZzQWEqOWj+paEqLFVIQUuhGQJLcUJoS+sG51uOdjYQqRJlwUnH7gobKCCN/iQ57Giv fOzw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q8si29140395pgp.480.2019.04.28.10.39.23; Sun, 28 Apr 2019 10:39:38 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727033AbfD1Rid (ORCPT + 99 others); Sun, 28 Apr 2019 13:38:33 -0400 Received: from mail.kernel.org ([198.145.29.99]:46638 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726299AbfD1Ric (ORCPT ); Sun, 28 Apr 2019 13:38:32 -0400 Received: from oasis.local.home (cpe-66-24-58-225.stny.res.rr.com [66.24.58.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 1265520675; Sun, 28 Apr 2019 17:38:28 +0000 (UTC) Date: Sun, 28 Apr 2019 13:38:26 -0400 From: Steven Rostedt To: Peter Zijlstra Cc: Nicolai Stange , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , x86@kernel.org, Josh Poimboeuf , Jiri Kosina , Miroslav Benes , Petr Mladek , Joe Lawrence , Shuah Khan , Konrad Rzeszutek Wilk , Tim Chen , Sebastian Andrzej Siewior , Mimi Zohar , Juergen Gross , Nick Desaulniers , Nayna Jain , Masahiro Yamada , Andy Lutomirski , Joerg Roedel , linux-kernel@vger.kernel.org, live-patching@vger.kernel.org, linux-kselftest@vger.kernel.org, Linus Torvalds Subject: Re: [PATCH 3/4] x86/ftrace: make ftrace_int3_handler() not to skip fops invocation Message-ID: <20190428133826.3e142cfd@oasis.local.home> In-Reply-To: <20190427102657.GF2623@hirez.programming.kicks-ass.net> References: <20190427100639.15074-1-nstange@suse.de> <20190427100639.15074-4-nstange@suse.de> <20190427102657.GF2623@hirez.programming.kicks-ass.net> X-Mailer: Claws Mail 3.17.3 (GTK+ 2.24.32; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org [ Added Linus ] On Sat, 27 Apr 2019 12:26:57 +0200 Peter Zijlstra wrote: > On Sat, Apr 27, 2019 at 12:06:38PM +0200, Nicolai Stange wrote: > > ftrace_int3_handler()'s context is different from the interrupted call > > instruction's one, obviously. In order to be able to emulate the call > > within the original context, make ftrace_int3_handler() set its iret > > frame's ->ip to some helper stub. Upon return from the trap, this stub will > > then mimic the call by pushing the the return address onto the stack and > > issuing a jmp to the target address. As describe above, the jmp target > > will be either of ftrace_ops_list_func() or ftrace_regs_caller(). Provide > > one such stub implementation for each of the two cases. > > Yuck; I'd much rather we get that static_call() stuff sorted such that > text_poke() and poke_int3_handler() can do CALL emulation. > > Given all the back and forth, I think the solution where we shift > pt_regs a bit to allow the emulated PUSH is a viable solution; eg. I > think we collectively hated it least. Note, this use case is slightly different than the static calls but has the same issue. That issue is that we need a way to simulate a "call" function from int3, and there's no good way to handle that case right now. The static call needs to modify the call sight but make the call still work from int3 context. This use case is similar, but the issue is with a "bug" in the function tracing implementation, that has become an issue with live kernel patching which is built on top of it. The issue is this: For optimization reasons, if there's only a single user of a function it gets its own trampoline that sets up the call to its callback and calls that callback directly: call custom trampoline save regs to call C code call custom_callback restore regs ret If more than one callback is attached to that function, then we need to call the iterator that loops through all registered callbacks of the function tracer and checks their hashes to see if they want to trace this function or not. call iterator_trampoline save regs to call C code call iterator restore regs ret What happens in that transition? We add an "int3" at the function being traced, and change: call custom_trampoline to call iterator_trampoline But if the function being traced is executed during this transition, the int3 just makes it act like a nop and returns pass the call. For tracing this is a bug because we just missed a function that should be traced. For live kernel patching, this could be more devastating because the original "buggy" patched function is called, and this could cause subtle problems. If we can add a slot to the int3 handler, that we can use to emulate a call, (perhaps add another slot to pt_regs structure, call it link register) It would make it easier to solve both this and the static call problems. -- Steve