X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240306165904.108141-1-puranjay12@gmail.com> <87ttlhdeqb.fsf@all.your.base.are.belong.to.us>
In-Reply-To: <87ttlhdeqb.fsf@all.your.base.are.belong.to.us>
From: Puranjay Mohan
Date: Thu, 7 Mar 2024 20:51:44 +0100
Subject: Re: [RFC PATCH] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
To: Björn Töpel
Cc: Paul Walmsley,
 Palmer Dabbelt, Albert Ou, Steven Rostedt, Masami Hiramatsu, Mark Rutland,
 Sami Tolvanen, Guo Ren, Ley Foon Tan, Deepak Gupta, Sia Jee Heng,
 Björn Töpel, Song Shuai, Clément Léger, Al Viro, Jisheng Zhang,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org

Hi Björn,

On Thu, Mar 7, 2024 at 8:27 PM Björn Töpel wrote:
>
> Puranjay!
>
> Puranjay Mohan writes:
>
> > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > This allows each ftrace callsite to provide an ftrace_ops to the common
> > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > functions without the need to fall back to list processing or to
> > allocate custom trampolines for each callsite. This significantly speeds
> > up cases where multiple distinct trace functions are used and callsites
> > are mostly traced by a single tracer.
> >
> > The idea and most of the implementation are taken from the ARM64
> > implementation of the same feature. The idea is to place a pointer to
> > the ftrace_ops as a literal at a fixed offset from the function entry
> > point, which can be recovered by the common ftrace trampoline.
>
> Not really a review, but some more background: another rationale (on top
> of the improved per-call performance!) for CALL_OPS was to use it to
> build ftrace direct call support (which BPF uses a lot!). Mark, please
> correct me if I'm lying here!
>
> On Arm64, CALL_OPS makes it possible to implement direct calls, while
> only patching one BL instruction -- nice!
>
> On RISC-V we cannot use the same ideas as Arm64 straight off,
> because the range of jal (compared to BL) is simply too short (+/-1M).
> So, on RISC-V we need to use a full auipc/jalr pair (the text patching
> story is another chapter, but let's leave that aside for now).
> Since we have to patch multiple instructions, the cmodx situation doesn't
> really improve with CALL_OPS.
>
> Let's say that we continue building on your patch and implement direct
> calls on CALL_OPS for RISC-V as well.
>
> From Florent's commit message for direct calls:
>
> | There are a few cases to distinguish:
> | - If a direct call ops is the only one tracing a function:
> |   - If the direct called trampoline is within the reach of a BL
> |     instruction
> |      -> the ftrace patchsite jumps to the trampoline
> |   - Else
> |      -> the ftrace patchsite jumps to the ftrace_caller trampoline which
> |         reads the ops pointer in the patchsite and jumps to the direct
> |         call address stored in the ops
> | - Else
> |      -> the ftrace patchsite jumps to the ftrace_caller trampoline and its
> |         ops literal points to ftrace_list_ops so it iterates over all
> |         registered ftrace ops, including the direct call ops and calls its
> |         call_direct_funcs handler which stores the direct called
> |         trampoline's address in the ftrace_regs and the ftrace_caller
> |         trampoline will return to that address instead of returning to the
> |         traced function
>
> On RISC-V, where auipc/jalr is used, the direct called trampoline would
> always be reachable, and then the first Else-clause would never be entered.
> This means the performance for direct calls would be the same as the
> one we have today (i.e. no regression!).
>
> RISC-V does like x86 does (-ish) -- patch multiple instructions, long
> reach.
>
> Arm64 uses CALL_OPS and patches a single BL instruction.
>
> Now, with this background in mind, compared to what we have today,
> CALL_OPS would give us (again assuming we're using it for direct calls):
>
> * Better performance for tracer per-call (faster ops lookup) GOOD

^ this was the only motivation for me to implement this patch.
I don't think implementing direct calls over call ops is fruitful for
RISC-V because once the auipc/jalr pair can be patched atomically, the
direct call trampoline is always reachable.
Solving the atomic text patching problem would be fun!! I am eager to
see how it will be solved.

> * Larger text size (function alignment + extra nops) BAD
> * Same direct call performance NEUTRAL
> * Same complicated text patching required NEUTRAL
>
> It would be interesting to see how the per-call performance would
> improve on x86 with CALL_OPS! ;-)

If I remember correctly from Steven's talk, x86 uses dynamically allocated
trampolines for per-callsite tracers. Would CALL_OPS provide better
performance than that?

> I'm trying to wrap my head around whether it makes sense to have it on
> RISC-V, given that we're a bit different from Arm64. Does the scale tip
> to the GOOD side?
>
> Oh, and we really need to see performance numbers on real HW! I have a
> VF2 that I could try this series on.

It would be great if you could do it :D.

Thanks,
Puranjay
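
For reference, the reach difference discussed in this thread (a single jal
versus an auipc/jalr pair) can be sketched in C. This is only an
illustration of the immediate encodings, not code from the patch or from
the kernel: the helper names jal_reachable() and auipc_jalr_split() are
hypothetical, and the sketch assumes RV64 and a pc-relative byte offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * jal carries a 21-bit signed immediate whose bit 0 is implicitly zero,
 * so it reaches even offsets in [-1 MiB, +1 MiB).
 */
static bool jal_reachable(int64_t offset)
{
	return offset >= -(1LL << 20) && offset < (1LL << 20) &&
	       (offset & 1) == 0;
}

/*
 * auipc adds (hi20 << 12) to pc, then jalr adds a sign-extended 12-bit
 * immediate. Because lo12 is sign-extended, hi20 has to absorb the carry
 * whenever bit 11 of the offset is set.
 * Returns false if the offset does not fit in the pair (roughly +/-2 GiB).
 */
static bool auipc_jalr_split(int64_t offset, int32_t *hi20, int32_t *lo12)
{
	assert(hi20 && lo12);

	if (offset < -(1LL << 31) - (1LL << 11) ||
	    offset >= (1LL << 31) - (1LL << 11))
		return false;

	/* Sign-extend the low 12 bits: result is in [-2048, 2047]. */
	int64_t lo = ((offset & 0xfff) ^ 0x800) - 0x800;

	*lo12 = (int32_t)lo;
	*hi20 = (int32_t)((offset - lo) / 4096); /* exact multiple of 4096 */
	return true;
}
```

With these helpers, an offset of 2 MiB is rejected by jal_reachable() but
splits cleanly into an auipc/jalr pair, which is why a direct call
trampoline is always within reach once both instructions are patched.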