Date: Thu, 21 Sep 2023 13:07:29 +0100
Subject: Re: [PATCH] ARM: vfp: Add vudot opcode to
 VFP undef hook
To: Mark-PK Tsai
Cc: amit.kachhap@arm.com, angelogioacchino.delregno@collabora.com,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mediatek@lists.infradead.org, linux@armlinux.org.uk,
 matthias.bgg@gmail.com, yj.chiang@mediatek.com
References: <170e8577-42c9-b72f-60c7-80141f379ec4@arm.com>
 <20230921021350.28283-1-mark-pk.tsai@mediatek.com>
From: Robin Murphy
In-Reply-To: <20230921021350.28283-1-mark-pk.tsai@mediatek.com>

On 21/09/2023 3:13 am, Mark-PK Tsai wrote:
>> On 2023-09-20 09:39, Mark-PK Tsai wrote:
>>> Add vudot opcode to the VFP undef hook to fix the
>>> potentially undefined instruction error when the
>>> user space executes vudot instruction.
>>
>> Did the kernel expose a hwcap to say that the dot product extension is
>> supported? I'm pretty sure it didn't, so why would userspace expect this
>> to work? ;)
>
> The hwcap for dotprod has been exported since commit:
>
> 62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd
>
>>
>> IIRC Amit was looking at defining the hwcaps to align with arm64 compat,
>> but I believe that series faltered since most of them weren't actually
>> needed (and I think at that point it was still missing the VFP support
>> code parts). It would be nice if someone could pick up and combine both
>
> Were the mentioned series related to this commit?
>
> 62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd

Oh, that did get merged? My apologies, I grepped for the hwcaps in
arch/arm but somehow failed to spot that some definitions did exist, so
assumed it hadn't been; not sure what went wrong there :(

In that case, we definitely want this tagged as a fix, and to make sure
we double-check for any equivalent fixes still needed for the other
features too. Sorry again for the confusion.

>> efforts and get this done properly; fill in *all* the hwcaps and
>> relevant handling for extensions which Cortex-A55 supports (since
>> there's definitely more than just VUDOT), and then hopefully we're done
>> for good.
>
> Agree.
>
>>
>>> Before this commit, kernel didn't handle the undef exception
>>> caused by vudot and didn't enable VFP in lazy VFP context
>>> switch code like other NEON instructions.
>>> This led to the occurrence of the undefined instruction
>>> error as following:
>>>
>>> [ 250.741238 ] 0904 (26902): undefined instruction: pc=004014ec
>>> ...
>>> [ 250.741287 ] PC is at 0x4014ec
>>> [ 250.741298 ] LR is at 0xb677874f
>>> [ 250.741303 ] pc : [<004014ec>] lr : [] psr: 80070010
>>> [ 250.741309 ] sp : beffedb0 ip : b67d7864 fp : beffee58
>>> [ 250.741314 ] r10: 00000000 r9 : 00000000 r8 : 00000000
>>> [ 250.741319 ] r7 : 00000001 r6 : 00000001 r5 : beffee90 r4 : 00401470
>>> [ 250.741324 ] r3 : beffee20 r2 : beffee30 r1 : beffee40 r0 : 004003a8
>>> [ 250.741331 ] Flags: Nzcv IRQs on FIQs on Mode USER_32 ISA ARM Segment user
>>> [ 250.741339 ] Control: 10c5383d Table: 32d0406a DAC: 00000055
>>> [ 250.741348 ] Code: f4434aef f4610aef f4622aef f4634aef (fc620df4)
>>>
>>> Below is the assembly of the user program:
>>>
>>> 0x4014dc <+108>: vst1.64 {d20, d21}, [r3:128]
>>> 0x4014e0 <+112>: vld1.64 {d16, d17}, [r1:128]
>>> 0x4014e4 <+116>: vld1.64 {d18, d19}, [r2:128]
>>> 0x4014e8 <+120>: vld1.64 {d20, d21}, [r3:128] --> switch out
>>> 0x4014ec <+124>: vudot.u8 q8, q9, q10 <-- switch in, and FPEXC.EN = 0
>>> SIGILL(illegal instruction)
>>>
>>> Link: https://services.arm.com/support/s/case/5004L00000XsOjP
>>
>> Linking to your private support case is not useful to upstream. Even I
>> can't open that link.
>
> I thought that maybe someone in arm need this.
> But it seems a bit noisy so I will remove the link from v2.

Yeah, even within Arm most of us don't have permission to access the
support system.

Cheers,
Robin.

>>
>>> Signed-off-by: Mark-PK Tsai
>>> ---
>>>  arch/arm/vfp/vfpmodule.c | 12 ++++++++++++
>>>  1 file changed, 12 insertions(+)
>>>
>>> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
>>> index 7e8773a2d99d..7eab8d1019d2 100644
>>> --- a/arch/arm/vfp/vfpmodule.c
>>> +++ b/arch/arm/vfp/vfpmodule.c
>>> @@ -788,6 +788,12 @@ static struct undef_hook neon_support_hook[] = {{
>>>   .cpsr_mask = PSR_T_BIT,
>>>   .cpsr_val = 0,
>>>   .fn = vfp_support_entry,
>>> +}, {
>>> + .instr_mask = 0xffb00000,
>>> + .instr_val = 0xfc200000,
>>> + .cpsr_mask = PSR_T_BIT,
>>> + .cpsr_val = 0,
>>> + .fn = vfp_support_entry,
>>>  }, {
>>>   .instr_mask = 0xef000000,
>>>   .instr_val = 0xef000000,
>>> @@ -800,6 +806,12 @@ static struct undef_hook neon_support_hook[] = {{
>>>   .cpsr_mask = PSR_T_BIT,
>>>   .cpsr_val = PSR_T_BIT,
>>>   .fn = vfp_support_entry,
>>> +}, {
>>> + .instr_mask = 0xffb00000,
>>> + .instr_val = 0xfc200000,
>>> + .cpsr_mask = PSR_T_BIT,
>>> + .cpsr_val = PSR_T_BIT,
>>> + .fn = vfp_support_entry,
>>
>> Why have two entries conditional on each possible value of one bit for
>> otherwise identical encodings? Surely it suffices to set both cpsr_mask
>> and cpsr_val to 0?
>
> You're right.
> I will set both cpsr_mask and cpsr_val to 0 and use single entry,
> as you suggested, in the v2 patch.
>
> Thanks.
>
>>
>> Thanks,
>> Robin.
>>
>>> }};
>>>
>>> static struct undef_hook vfp_support_hook = {
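
For anyone following the cpsr_mask/cpsr_val point, here is a minimal
userspace sketch of the match test, assuming the kernel compares
(instr & instr_mask) against instr_val and (cpsr & cpsr_mask) against
cpsr_val when it walks its undef hooks. The names hook_pattern,
hook_matches, vudot_any_state, vudot_arm_only and check_examples are
made up for illustration and are not kernel symbols; the mask and value
come from the patch, and the instruction word and psr come from the
report quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSR_T_BIT 0x00000020u /* Thumb state bit in the CPSR */

/* Hypothetical, simplified stand-in for struct undef_hook (no list node, no fn). */
struct hook_pattern {
	uint32_t instr_mask;
	uint32_t instr_val;
	uint32_t cpsr_mask;
	uint32_t cpsr_val;
};

/* Assumed match rule: both masked comparisons must hold. */
static bool hook_matches(const struct hook_pattern *h, uint32_t instr,
			 uint32_t cpsr)
{
	return (instr & h->instr_mask) == h->instr_val &&
	       (cpsr & h->cpsr_mask) == h->cpsr_val;
}

/* Single entry as suggested in review: the CPSR is not inspected at all. */
static const struct hook_pattern vudot_any_state = {
	.instr_mask = 0xffb00000u,
	.instr_val  = 0xfc200000u,
	.cpsr_mask  = 0,
	.cpsr_val   = 0,
};

/* ARM-state-only entry, as in the first hunk of the v1 patch. */
static const struct hook_pattern vudot_arm_only = {
	.instr_mask = 0xffb00000u,
	.instr_val  = 0xfc200000u,
	.cpsr_mask  = PSR_T_BIT,
	.cpsr_val   = 0,
};

static void check_examples(void)
{
	const uint32_t instr = 0xfc620df4u;    /* faulting word from the "Code:" line */
	const uint32_t cpsr_arm = 0x80070010u; /* psr from the report, T bit clear */
	const uint32_t cpsr_thumb = cpsr_arm | PSR_T_BIT; /* hypothetical Thumb-state psr */

	/* With mask and value both 0, the CPSR comparison is always true. */
	assert(hook_matches(&vudot_any_state, instr, cpsr_arm));
	assert(hook_matches(&vudot_any_state, instr, cpsr_thumb));

	/* The v1-style entry only matches when the T bit is clear. */
	assert(hook_matches(&vudot_arm_only, instr, cpsr_arm));
	assert(!hook_matches(&vudot_arm_only, instr, cpsr_thumb));
}
```

Under that assumption, one entry with cpsr_mask = cpsr_val = 0 covers
the same cases as the pair of entries keyed on PSR_T_BIT, because
(cpsr & 0) == 0 holds for every CPSR value.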