Received: by 2002:ab2:6d45:0:b0:1fb:d597:ff75 with SMTP id d5csp90158lqr; Tue, 4 Jun 2024 22:52:47 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCUiD3wqLIX7HDLZ7dQL0jE9gvobNfANFZee3KDNYKIm50qQH/QPADfKcau1tSRlA/ksM4i9N3Lnc7M2iqhxGTe77qBLGT39XOkhY5hHqw== X-Google-Smtp-Source: AGHT+IHBlJmdL8Ed66pE8GKWiL2Y1QDiw73mPw+1K2qqf6DkMN8sfAAlt5wJqJF9WILqwd8wE62c X-Received: by 2002:a05:6e02:1a8c:b0:374:a14e:1485 with SMTP id e9e14a558f8ab-374b1ef588fmr16483745ab.11.1717566767620; Tue, 04 Jun 2024 22:52:47 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1717566767; cv=pass; d=google.com; s=arc-20160816; b=cJGoNFVYIMn3Ma/ZXxtLrRK3pqZ4ZoNeVvrgKK4Cw/ecZBUdnUfBLMiMshT+MsnCZf 94SGkn+2NGj8KW2EaHF3LKJV4lnclZV4kaWhJ6EHLqvx4eWS1RjhUzvPygqAFnlXezhU 9L2STXonf1PSb3a6yw9kcGnzsfJxi0l3QT8Nqu3heJOJx0ZZV6+qzBOh5fa68dDYxxIw R1FYL/jKvsRqwHbuQ6OZGaEn4Ho6B1Y8v+9TiYBhuWwwoD44igfJGjZm/nJH1qdpa5uw RwfxJpxVQYNjquV+1FHb/PbprYHRr+8jPfR/BZCYN3SWS83Nsoqom4SaExgkYKOSIzs0 2dbQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:list-unsubscribe:list-subscribe :list-id:precedence:dkim-signature; bh=UbSM2KJixF6vlyzthkWODvb3oPvDE2HKUO902+p4SWM=; fh=SPQ5vE+gFRxKLQ+0W+HL5QbewCPAk3LSnorReZs4WPk=; b=cdaClGXNsWRsnQ0jIi3kyiBOLXtoB+whVV1QPQw+45iTA3e46dEhUPAYPIwepoEyG6 tczUseFuz7NqvytiaUQ+ktHCFHUxxDM2w1VwSRr+HjeNsO0NiZnLMJgvsPantd9xkrhM dlJx7s4Y6CjKeLogpEzmu33JcmAAUEHezed/LvPJM8MN2TTVR/350Rad7OijUGXzu4cP sDIZsgxA+ufsVgLkqOROp2bXSvoAKhEDWhd5bV1plK37PoQHyOPx4DUpnaMIxDhmevH3 pco1GnmCKeXKH9oQqERjoeJheCImgqd+IGguS8BBrcTJ+ERDnMaOgXAJ5gjTUqXwKwtL rVNA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=s0iaIVvi; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-201787-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-201787-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [2604:1380:45e3:2400::1]) by mx.google.com with ESMTPS id 41be03b00d2f7-6d47ba292aesi1331840a12.690.2024.06.04.22.52.47 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Jun 2024 22:52:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-201787-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) client-ip=2604:1380:45e3:2400::1; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=s0iaIVvi; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-201787-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-201787-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id 35F70282AA1 for ; Wed, 5 Jun 2024 05:52:47 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 809EF21C18A; Wed, 5 Jun 2024 05:52:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="s0iaIVvi" Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56754946C for ; Wed, 5 Jun 2024 05:52:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717566760; cv=none; b=HNr8kpQe820ltFlZ78UWB7Q+BimN4pQPR4Ytc9shpkXSmc4Dt0GNBVZjvZ2ptd5vbivIP1XEbWpEtIth7eJwDFmj9xvMfs3ZOtoJOx+YQdSFYaqAxSeYu72ZRM8hSw+j+aZ+P84lzmhV4X49QyEbnoNtI/yQ9C0KOYYUXTw0Gn4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717566760; c=relaxed/simple; bh=oSk+pH9veA9pWxWo+NEX87cLSbrl1nCs2BfhZ30S5yw=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=g/6E4kbnCwx5wMqL6aMkO7WINvgJaUdJLYeAK0zIxfOMhoT48vvS0O7ZsvuuedoZXOsqN9KoHAQ/G7rvs/iEae7lnd6gpbnQHy+O7rJKIAQoUKUnteZ5kiWX7NgGwgWmRdlr3sstE7JPEXF9b6cNo9/4W5jy86gvV3mmXsisj1k= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=s0iaIVvi; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id BF624C4AF09 for ; Wed, 5 Jun 2024 05:52:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1717566759; bh=oSk+pH9veA9pWxWo+NEX87cLSbrl1nCs2BfhZ30S5yw=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=s0iaIVvie0EYs8faXUJeotlVL9CPRnSyarRJdPUsZW+V7Yh7thdO5vQ+hlIEsZQvs hzt7NWzo/3iRKGzZlC/RSO1Hvhflf9kmMUi4eZNsSlZkZJG/qd7A/yMyxxvqvkKc9E /LGcsu+g8TrIxAwWtEXhb2KBeCoSm8eurgI1SUsp8zWtu1KTdtt2LIuItbnp+TEfa3 07jeTBqK8ywpWxJcgny6sX3t89WyxVr9g8Ye40rm+EoBPHu7j3FmAdtfHNh2g886mc NpKV7hHuz5LFL3TNtz4hUSLBxjZSaqFoTS6O0geI4eaa1NOj3nM3yOWLhyd6N8zP3p C4KR/nNpi/5EA== Received: by mail-ed1-f46.google.com with SMTP id 4fb4d7f45d1cf-57a20c600a7so7145361a12.3 for ; Tue, 04 Jun 2024 22:52:39 -0700 (PDT) X-Forwarded-Encrypted: i=1; AJvYcCXuFFx7shgmI6a67CFJfIFViKfMtUwSNQ5ePBm4DRZq8kSdpPjFO1BcrXHxTySXlgetIy95C2raUMfiMP5WK4ll/XjgFk6lUgb2Wscu X-Gm-Message-State: AOJu0YwGRTl5r7L2nEhbnpW83CqmxtdxCwMWh2qVPF0+0tEH/HgvinPT AzuytHrFvgm7wTdEEYQ3qAOzbnn/j7Gm1fksOHKUbIyvYc0e9i7olUNFDYCq12JspOwAKSpIUbS hNbCYbufWQu2E1INunOMdxhXHUAU= X-Received: by 2002:a50:f61d:0:b0:57a:9405:786a with SMTP id 4fb4d7f45d1cf-57a950ab70fmr489801a12.5.1717566758279; Tue, 04 Jun 2024 22:52:38 -0700 (PDT) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20231225040018.1660554-1-antonb@tenstorrent.com> In-Reply-To: From: Guo Ren Date: Wed, 5 Jun 2024 13:52:25 +0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [CAUTION - External Sender] Re: [PATCH] riscv: Improve exception and system call latency To: Cyril Bur Cc: Anton Blanchard , paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Jun 4, 2024 at 4:16=E2=80=AFPM Cyril Bur = wrote: > > On Mon, Jun 3, 2024 at 4:39=E2=80=AFPM Guo Ren wrote: > > > > On Mon, Jun 3, 2024 at 12:38=E2=80=AFPM Cyril Bur wrote: > > > > > > [ apologies, I think my mailer is going to mess up the formatting ] > > > > > > On 26 Dec 2023, at 2:56=E2=80=AFPM, Guo Ren wrote= : > > > > > > On Sun, Dec 24, 2023 at 08:00:18PM -0800, Anton Blanchard wrote: > > > > > > Many CPUs implement return address branch prediction as a stack. The > > > RISCV architecture refers to this as a return address stack (RAS). If > > > this gets corrupted then the CPU will mispredict at least one but > > > potentally many function returns. > > > > > > There are two issues with the current RISCV exception code: > > > > > > - We are using the alternate link stack (x5/t0) for the indirect bran= ch > > > which makes the hardware think this is a function return. This will > > > corrupt the RAS. > > > > > > - We modify the return address of handle_exception to point to > > > ret_from_exception. This will also corrupt the RAS. > > > > > > Testing the null system call latency before and after the patch: > > > > > > Visionfive2 (StarFive JH7110 / U74) > > > baseline: 189.87 ns > > > patched: 176.76 ns > > > > > > Lichee pi 4a (T-Head TH1520 / C910) > > > baseline: 666.58 ns > > > patched: 636.90 ns > > > > > > Just over 7% on the U74 and just over 4% on the C910. > > > > > > > > > Yes, the wrong "jalr zero, t0/ra" would pop RAS and destroy the RAS > > > layout of the hardware for the userspace. How about giving a fake pus= h > > > for the RAS to connect "jalr zero, ra" of sub-function call return? I= 'm > > > curious if you could measure the difference with only one RAS > > > misprediction. > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > index 54ca4564a926..94c7d2be35d0 100644 > > > --- a/arch/riscv/kernel/entry.S > > > +++ b/arch/riscv/kernel/entry.S > > > @@ -93,7 +93,8 @@ SYM_CODE_START(handle_exception) > > > bge s4, zero, 1f > > > > > > /* Handle interrupts */ > > > - tail do_irq > > > + auipc t0, do_irq > > > + jalr t0, t0 > > > 1: > > > /* Handle other exceptions */ > > > slli t0, s4, RISCV_LGPTR > > > @@ -103,9 +104,10 @@ SYM_CODE_START(handle_exception) > > > /* Check if exception code lies within bounds */ > > > bgeu t0, t2, 1f > > > REG_L t0, 0(t0) > > > - jr t0 > > > + jalr t0, t0 > > > 1: > > > - tail do_trap_unknown > > > + auipc t0, do_trap_unknown > > > + jalr t0, t0 > > > SYM_CODE_END(handle_exception) > > > > > > You could prepare a deeper userspace stack calling if you want better > > > measurement results. > > > > > > > > > Signed-off-by: Anton Blanchard > > > Reviewed-by: Jisheng Zhang > > > --- > > > > > > This introduces some complexity in the stackframe walk code. PowerPC > > > resolves the multiple exception exit paths issue by placing a value i= nto > > > the exception stack frame (basically the word "REGS") that the stack = frame > > > code can look for. Perhaps something to look at. > > > > > > arch/riscv/kernel/entry.S | 21 ++++++++++++++------- > > > arch/riscv/kernel/stacktrace.c | 14 +++++++++++++- > > > 2 files changed, 27 insertions(+), 8 deletions(-) > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > index 54ca4564a926..89af35edbf6c 100644 > > > --- a/arch/riscv/kernel/entry.S > > > +++ b/arch/riscv/kernel/entry.S > > > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > > > scs_load_current_if_task_changed s5 > > > > > > move a0, sp /* pt_regs */ > > > - la ra, ret_from_exception > > > > > > /* > > > * MSB of cause differentiates between > > > @@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception) > > > bge s4, zero, 1f > > > > > > /* Handle interrupts */ > > > - tail do_irq > > > + call do_irq > > > +.globl ret_from_irq_exception > > > +ret_from_irq_exception: > > > + j ret_from_exception > > > 1: > > > /* Handle other exceptions */ > > > slli t0, s4, RISCV_LGPTR > > > @@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception) > > > la t2, excp_vect_table_end > > > add t0, t1, t0 > > > /* Check if exception code lies within bounds */ > > > - bgeu t0, t2, 1f > > > - REG_L t0, 0(t0) > > > - jr t0 > > > -1: > > > - tail do_trap_unknown > > > + bgeu t0, t2, 3f > > > + REG_L t1, 0(t0) > > > +2: jalr ra,t1 > > > +.globl ret_from_other_exception > > > +ret_from_other_exception: > > > + j ret_from_exception > > > +3: > > > + > > > + la t1, do_trap_unknown > > > + j 2b > > > SYM_CODE_END(handle_exception) > > > > > > /* > > > diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stack= trace.c > > > index 64a9c093aef9..b9cd131bbc4c 100644 > > > --- a/arch/riscv/kernel/stacktrace.c > > > +++ b/arch/riscv/kernel/stacktrace.c > > > @@ -17,6 +17,18 @@ > > > #ifdef CONFIG_FRAME_POINTER > > > > > > extern asmlinkage void ret_from_exception(void); > > > +extern asmlinkage void ret_from_irq_exception(void); > > > +extern asmlinkage void ret_from_other_exception(void); > > > + > > > +static inline bool is_exception_frame(unsigned long pc) > > > +{ > > > + if ((pc =3D=3D (unsigned long)ret_from_exception) || > > > + (pc =3D=3D (unsigned long)ret_from_irq_exception) || > > > + (pc =3D=3D (unsigned long)ret_from_other_exception)) > > > + return true; > > > + > > > + return false; > > > +} > > > > > > We needn't put too many .globl in the entry.S, and just check that pc= is > > > in SYM_CODE_START/END(handle_exception), then entry.S would be cleane= r: > > > > > > Hi Guo, > > > > > > I've taken this patch over from Anton, mostly just to tidy it up. I'd > > > like to incorporate > > > what you mention here but I'm not sure how to achieve it. Have I > > > missed something > > > obvious? As things currently stand there doesn't seem to be a way to = get the end > > > (or size) of handle_exception in C code. > > "just check that pc is in SYM_CODE_START/END(handle_exception)." > > Sorry, I think my previous description is wrong. > > > > Instead, "We needn't modify anything in stacktrace.c because we keep > > ra =3D ret_from_exception." > > > > I want only cleaner and smaller modifications to the entry.S to > > satisfy RAS prediction performance requirements. > > > > I completely agree with keeping entry.S as clean as possible. > > I'm trying to understand what you mean. > Isn't the point of the patch to remove ra =3D ret_from_exception? > > I'm not sure but maybe we can leave entry.S as it is and the check in > stacktrace.c can become a check for pc =3D=3D handle_exception? Yes, we needn't modify stacktrace.c anymore. Because we still keep "ra =3D handle_exception". > > > > > > > > > Your advice is greatly appreciated, > > > > > > Thanks, > > > > > > Cyril > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > index 54ca4564a926..d452d5f12b1b 100644 > > > --- a/arch/riscv/kernel/entry.S > > > +++ b/arch/riscv/kernel/entry.S > > > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > > > scs_load_current_if_task_changed s5 > > > > > > move a0, sp /* pt_regs */ > > > > > > /* > > > * MSB of cause differentiates between > > > @@ -93,7 +92,8 @@ SYM_CODE_START(handle_exception) > > > bge s4, zero, 1f > > > > > > /* Handle interrupts */ > > > call do_irq > > > j ret_from_exception > > > 1: > > > /* Handle other exceptions */ > > > slli t0, s4, RISCV_LGPTR > > > @@ -102,10 +102,12 @@ SYM_CODE_START(handle_exception) > > > add t0, t1, t0 > > > /* Check if exception code lies within bounds */ > > > bgeu t0, t2, 1f > > > REG_L ra, 0(t0) > > > jalr ra, ra > > > j ret_from_exception > > > 1: > > > call do_trap_unknown > > > j ret_from_exception > > > SYM_CODE_END(handle_exception) > > > > > > > > > > > > void notrace walk_stackframe(struct task_struct *task, struct pt_regs= *regs, > > > bool (*fn)(void *, unsigned long), void *arg) > > > @@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct > > > *task, struct pt_regs *regs, > > > fp =3D frame->fp; > > > pc =3D ftrace_graph_ret_addr(current, NULL, frame->ra, > > > &frame->ra); > > > - if (pc =3D=3D (unsigned long)ret_from_exception) { > > > + if (is_exception_frame(pc)) { > > > if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) > > > break; > > > > > > -- > > > 2.25.1 > > > > > > > > > _______________________________________________ > > > linux-riscv mailing list > > > linux-riscv@lists.infradead.org > > > http://lists.infradead.org/mailman/listinfo/linux-riscv > > > > > > > > -- > > Best Regards > > Guo Ren --=20 Best Regards Guo Ren