From: Chen Guokai
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au
Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai
Subject: [PATCH v5 0/9] Add OPTPROBES feature on RISCV
Date: Sat, 24 Dec 2022 19:43:06 +0800
Message-Id: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn>
X-Mailer: git-send-email 2.34.1

Add jump optimization support for RISC-V.

Replace the ebreak instructions used by normal kprobes with an auipc+jalr
instruction pair, with the aim of reducing the probe-hit overhead.

All known optprobe-capable RISC architectures use a single jump or branch
instruction, while this series chooses not to. RISC-V has a quite limited
jump range (4KB or 2MB) for its branch and jump instructions, which
prevents the optimization from supporting probes spread all over the
kernel.

An auipc+jalr instruction pair is introduced with a much wider jump range
(4GB), where auipc loads the upper 20 bits into a free register and jalr
adds the lower 12 bits to form a 32-bit offset. Note that returning from
the probe handler requires another free register. As kprobes can appear
almost anywhere inside the kernel, the free register should be found in a
generic way, not depending on the calling convention or any other
regulations.

The algorithm for finding the free register is inspired by register
renaming in modern processors. From the perspective of register renaming,
a register can be represented as two different registers if two
neighbouring instructions both write to it and nothing reads it in
between. Extending this fact, a register is considered free if there is no
read before its next write in the execution flow. We are free to change
its value without interfering with normal execution.

Static analysis shows that 51% of the instructions in the kernel (default
config) can be replaced, i.e. one free register can be found at both the
start and the end of the replaced instruction pair, and the replaced
instructions can be directly executed.
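
To make the auipc+jalr split concrete, here is a minimal sketch in Python (not code from this series). In the RISC-V base ISA, auipc carries a 20-bit upper immediate and jalr a 12-bit sign-extended immediate, so the upper part has to be rounded to compensate for the sign of the lower part. The helper names `split_offset`, `encode_auipc` and `encode_jalr` are hypothetical and exist only for illustration.

```python
from typing import Tuple

OPCODE_AUIPC = 0x17  # U-type opcode for auipc
OPCODE_JALR = 0x67   # I-type opcode for jalr (funct3 = 0)


def split_offset(offset: int) -> Tuple[int, int]:
    """Split a PC-relative offset into (hi20, lo12) for auipc+jalr.

    jalr sign-extends its 12-bit immediate, so lo12 lies in
    [-2048, 2047] and hi20 absorbs the rounding.
    """
    lo12 = ((offset + 0x800) & 0xFFF) - 0x800
    hi20 = (offset - lo12) >> 12
    if not -(1 << 19) <= hi20 < (1 << 19):
        raise ValueError("offset out of auipc+jalr range")
    return hi20, lo12


def encode_auipc(rd: int, hi20: int) -> int:
    """Encode 'auipc rd, hi20' as a 32-bit instruction word."""
    return ((hi20 & 0xFFFFF) << 12) | (rd << 7) | OPCODE_AUIPC


def encode_jalr(rd: int, rs1: int, lo12: int) -> int:
    """Encode 'jalr rd, lo12(rs1)' as a 32-bit instruction word."""
    return ((lo12 & 0xFFF) << 20) | (rs1 << 15) | (rd << 7) | OPCODE_JALR


if __name__ == "__main__":
    # Hypothetical offset from a probe point to its detour buffer.
    hi, lo = split_offset(0x12345FFF)
    print(hex(hi), lo)                # 0x12346 -1
    print(hex(encode_auipc(5, hi)))   # auipc t0, 0x12346
    print(hex(encode_jalr(1, 5, lo))) # jalr ra, -1(t0)
```

The register numbers in the usage lines (t0 = x5, ra = x1) are arbitrary; in the series the registers are whichever ones the free-register search finds.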
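
The "no read before the next write" rule for free registers can be sketched as follows. This is only an illustration for straight-line code, under the assumption that each instruction has already been decoded into the sets of registers it reads and writes; the `Insn` type and `free_registers` function are hypothetical, and the search implemented in arch/riscv/kernel/probes/opt.c may well differ (for example in how it treats branches).

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Insn(NamedTuple):
    """Hypothetical decoded instruction: registers read and written."""
    reads: FrozenSet[str]
    writes: FrozenSet[str]


def free_registers(insns: List[Insn], candidates: Iterable[str]) -> Set[str]:
    """Return candidates whose next access in 'insns' is a pure write.

    A register that is read first (including read-modify-write, such as
    'addi t1, t1, 1') is live. A register that is never touched is
    conservatively treated as not free, since its next access is unknown.
    """
    free = set()
    for reg in candidates:
        for insn in insns:
            if reg in insn.reads:
                break          # read before write: the old value is needed
            if reg in insn.writes:
                free.add(reg)  # overwritten before any read: value is dead
                break
    return free


if __name__ == "__main__":
    # Instructions following a hypothetical probe point.
    code = [
        Insn(frozenset({"a0"}), frozenset({"t0"})),        # t0 = f(a0)
        Insn(frozenset({"t0", "t1"}), frozenset({"t1"})),  # t1 = t0 + t1
        Insn(frozenset(), frozenset({"t2"})),              # t2 = constant
    ]
    print(sorted(free_registers(code, ["a0", "t0", "t1", "t2", "t3"])))
    # ['t0', 't2']
```

Treating untouched registers as not free keeps the sketch conservative: a false "not free" only costs an optimization opportunity, while a false "free" would corrupt kernel state.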

We also ran an efficiency test on gem5 RISC-V, which shows a more than 5x
speedup over the breakpoint-based implementation.

Contribution:

Chen Guokai invented the algorithm for searching for free registers,
evaluated the ratio of optimization, and developed the basic functionality
supporting RVI kernel binaries.

Liao Chang added support for hybrid RVI and RVC kernel binaries, fixed
some bugs with different kernel configurations, and refactored the entire
feature into individual patches.

v5:
  1. Correct known nits
  2. Enable the usage of unused caller-saved registers
  3. Append an efficiency test result on gem5

v4:
  Correct the sequence of Signed-off-by and Co-developed-by.

v3:
  1. Support hybrid RVI and RVC kernel binaries.
  2. Refactor the entire feature into individual patches.

v2:
  1. Adjust comments
  2. Remove improper copyright
  3. Clean up format issues that are not common practice
  4. Extract common definition of instruction decoder
  5. Fix race issue on SMP platforms.

v1:
  Chen Guokai contributed the basic functionality code.

Chen Guokai (1):
  riscv/kprobe: Search free registers from unused caller-saved ones

Liao Chang (8):
  riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES feature
  riscv/kprobe: Allocate detour buffer from module area
  riscv/kprobe: Prepare the skeleton to prepare optimized kprobe
  riscv/kprobe: Add common RVI and RVC instruction decoder code
  riscv/kprobe: Search free register(s) to clobber for 'AUIPC/JALR'
  riscv/kprobe: Add code to check if kprobe can be optimized
  riscv/kprobe: Prepare detour buffer for optimized kprobe
  riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe

 arch/riscv/Kconfig                        |   1 +
 arch/riscv/include/asm/bug.h              |   5 +-
 arch/riscv/include/asm/kprobes.h          |  49 ++
 arch/riscv/include/asm/patch.h            |   1 +
 arch/riscv/kernel/patch.c                 |  23 +-
 arch/riscv/kernel/probes/Makefile         |   1 +
 arch/riscv/kernel/probes/decode-insn.h    | 153 +++++
 arch/riscv/kernel/probes/kprobes.c        |  24 +
 arch/riscv/kernel/probes/opt.c            | 693 ++++++++++++++++++++++
 arch/riscv/kernel/probes/opt_trampoline.S | 137 +++++
 arch/riscv/kernel/probes/simulate-insn.h  |  41 ++
 11 files changed, 1123 insertions(+), 5 deletions(-)
 create mode 100644 arch/riscv/kernel/probes/opt.c
 create mode 100644 arch/riscv/kernel/probes/opt_trampoline.S

-- 
2.34.1