Received: by 2002:a25:31c3:0:0:0:0:0 with SMTP id x186csp6340334ybx; Mon, 11 Nov 2019 07:40:39 -0800 (PST) X-Google-Smtp-Source: APXvYqwTe0WeCVvqgUK8Cv9UIWK6vaX7AGwdXjcJozWUSqaRZrd0f6TcW9Cdp5fwyclAlWixuITK X-Received: by 2002:a17:906:70ca:: with SMTP id g10mr20732029ejk.141.1573486839217; Mon, 11 Nov 2019 07:40:39 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1573486839; cv=none; d=google.com; s=arc-20160816; b=h0sf9nCq75oV1ik2cuI7ugg3Lfo4G7XKANDBZTkgoI02gXFFKaaeT4GJIFeMO/zqer iVQ9ac41Pe2UdLOwvdsPnny3BXG/iAHEsH6wRSpwLBRd8aP/y1mK3RaQBd43QNHXsbdU qFT/9FZNJRpuxQBIzVshDTFAiNFopWhDtdVIJgFFDvklOJP6gM42FlE2CK/3gojvBEwy xz+BYuK/bXaG40D1pIjMPJLh3Exxvlw37p1S+V2wk43TYlX1AF2sni1mqf8masnTCHW3 I867bbNfl7wIqZwHd4YKzLcVmm0huyjx1LXNwpDJBiqCngWUf0kJN4X34uS7zIXL6RFj IDFg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=ZJUBTvEqyA/0iWyeImK7JXM2FeMOPa4/XqLG34HSVU0=; b=HRdZZbsMIvHDtSX4SYV3fYTI9cRhqAvSMBOUKLv3VprdQYSH/UFeYS0z4miSXSYJyi INxwcP1QMFUXYEiIyzHNV672+jfMnXy7ZYc4ao7m0Fic3oQCoizJPJVeQyd2CDRkErR4 pZpV8k6ZJyeaHyScz3/ls4+IdMa2kZFLgkamRKWodWcwwVR454N4zfVz+GRXeieJozRu Pb7+1OoIlPhME5EDaI6YDfJBcu1rpqPdovz2rde8VNN+UBToNu63iIfa4pczMm0jZSYB F82WA1SHbKI64StxMmhmPtcI/zNJEaWCniN/j7k/yH86U4N8F7Xm43/j+6DxMqaLRS0l C8ZA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=qjrwe5DJ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id g24si10232219edv.239.2019.11.11.07.40.14; Mon, 11 Nov 2019 07:40:39 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=qjrwe5DJ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726932AbfKKPjd (ORCPT + 99 others); Mon, 11 Nov 2019 10:39:33 -0500 Received: from mail.kernel.org ([198.145.29.99]:33280 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726889AbfKKPjc (ORCPT ); Mon, 11 Nov 2019 10:39:32 -0500 Received: from willie-the-truck (236.31.169.217.in-addr.arpa [217.169.31.236]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 32A1E2084F; Mon, 11 Nov 2019 15:39:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573486771; bh=o4qJouEy1HgCEmH6O0gWZxRya36rLkMLwmDDxwFd4s4=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=qjrwe5DJXe7U7bKOIW26eK/hF1Li/TwctNy62/R3SXwK7uAz75n+FRf5fHpjAEydf 8dkQmsAKtOvy3b8yBgg6r8h6xcswTtesgILbIbhqvJr2PH6arFgk0u7PS3/F2bZi8k JT+KuombXKnis5fNIBII8zSsV5qrdFFfRXvfDdxg= Date: Mon, 11 Nov 2019 15:39:25 +0000 From: Will Deacon To: Peter Zijlstra Cc: Leo Yan , Mark Rutland , Mike Leach , Adrian Hunter , Ingo Molnar , Borislav Petkov , "H . Peter Anvin" , x86@kernel.org, Alexander Shishkin , Mathieu Poirier , Arnaldo Carvalho de Melo , Jiri Olsa , linux-kernel@vger.kernel.org Subject: Re: [PATCH RFC 1/6] perf/x86: Add perf text poke event Message-ID: <20191111153925.GC10488@willie-the-truck> References: <20191025130000.13032-2-adrian.hunter@intel.com> <20191030104747.GA21153@leoy-ThinkPad-X240s> <20191030124659.GQ4114@hirez.programming.kicks-ass.net> <20191030141950.GB21153@leoy-ThinkPad-X240s> <20191030162325.GT4114@hirez.programming.kicks-ass.net> <20191031073136.GC21153@leoy-ThinkPad-X240s> <20191101100440.GU4131@hirez.programming.kicks-ass.net> <20191104022346.GC26019@leoy-ThinkPad-X240s> <20191108150530.GA7721@leoy-ThinkPad-X240s> <20191111144642.GM4114@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20191111144642.GM4114@hirez.programming.kicks-ass.net> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org As a disclaimer: I'm having a hard time understanding what this thread is about. On Mon, Nov 11, 2019 at 03:46:42PM +0100, Peter Zijlstra wrote: > On Fri, Nov 08, 2019 at 11:05:30PM +0800, Leo Yan wrote: > > > I will update some status for prototype (the code has been uploaded into > > git server [1]) and found some issues for text poke perf event on arm64. > > These issues are mainly related with arch64 architecture. > > > > - The first issue is for the nosync instruction emulation. On arm64, > > some instructions need to be single stepped and some instructions > > is emulated. Especially, after I read the code for kprobe > > implementation on Arm64. So the main idea for prototyping is to use > > the almos same method with kprobe for nosync instruction. > > This makes no sense to me what so ever. What actual instructions are > patched with _nosync() ? ftrace/jump_label only use 'NOP/JMP/CALL' '_nosync()' can be used to patch the following instructions: B (JMP) BL (CALL) BRK (Breakpoint) SVC, HVC, SMC (System calls) NOP ISB (Pipe flush) Our kprobes implementation prefers to single-step the instruction in an XOL buffer (we should do this by placing a BRK at the end of the buffer, but we currently use hardware step which is overkill imo). For instructions that are performing a PC-relative operation (e.g. an immediate branch), we can't run from the XOL buffer because we'd go to the wrong place; these instructions are therefore emulated in software... > For NOP you can advance regs->ip, for JMP you can adjust regs->ip, for > CALL you adjust regs->ip and regs->r14 (IIUC you do something like: > regs->r14 = regs->ip+4; regs->ip = func;) ... in a manner similar to what you describe here. See simulate_b_bl(), although it's thankfully simpler than what x86 seems to have to do. This approach means we can avoid emulating the vast majority of the instruction set. > (FWIW emulating CALL on x86 is fun because we get to PUSH something on > the stack, let me know if you want to see the patches that were required > to make that happen :-) No thanks ;) > > - The second issue is race condition between the CPU alters > > instructions and other CPUs hit the altered instructions (and > > breakpointed). > > > > Peter's suggestion uses global variables 'nosync_insn' and > > 'nosync_addr' to record the altered instruction. But from the > > testing I found for single static key, usually it will change for > > multiple address at once. > > > > So this might cause the issue is: CPU(a) will loop to alter > > instructions for different address (sometimes the opcode also is > > different for different address), at the meantime, CPU(b) hits an > > altered instruction and handle exception for the breakpoint, if > > CPU(a) is continuing to alter instruction for the next address, thne > > CPU(a) might wrongly to use the value from 'nosync_insn' and > > 'nosync_addr'. > > > > Simply to say, we cannot only support single nosync instruction but > > need to support multiple nosync instructions in the loop. > > On x86 all actual text poking is serialized by text_mutex. On arm64, we patch a *lot* and I think stop_machine() is the only safe choice for things like the alternatives, where we patch all sorts of things (cache maintenance, I/O accessors, atomics (until recently), user accessors. Even then, we have to provide our own synchronisation in __apply_alternatives_multi_stop() and use special cache-flushing (clean_dcache_range_nopatch()) to avoid the possibility of running the code that we're modifying. Outside of the alternatives, I'd still be wary about recursive debug exceptions if we were to patch jump labels with breakpoints. Backing up though, I think I'm missing details about what this thread is trying to achieve. You're adding perf events so that coresight trace can take into account modifications of the kernel text, right? If so: * Does this need to take into account more than just jump_label()? * What consistency guarantees are needed by userspace? It's not clear to me how it correlates the new event with execution on other CPUs. Is this using timestamps or something else? * What about module loading? Finally, the whole point of the '_nosync()' stuff is that we don't have to synchronise. Therefore, there is a window where you really can't tell whether the instruction has been updated from the perspective of another CPU. Will