Message-ID: <170e8577-42c9-b72f-60c7-80141f379ec4@arm.com>
Date: Wed, 20 Sep 2023 19:49:19 +0100
Subject: Re: [PATCH] ARM: vfp: Add vudot opcode to VFP undef hook
From: Robin Murphy
To: Mark-PK Tsai, Russell King, Matthias Brugger, AngeloGioacchino Del Regno
Cc: yj.chiang@mediatek.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mediatek@lists.infradead.org, Amit Daniel Kachhap
References: <20230920083907.30479-1-mark-pk.tsai@mediatek.com>
In-Reply-To: <20230920083907.30479-1-mark-pk.tsai@mediatek.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023-09-20 09:39, Mark-PK Tsai wrote:
> Add vudot opcode to the VFP undef hook to fix the
> potentially undefined instruction error when the
> user space executes vudot instruction.

Did the kernel expose a hwcap to say that the dot product extension is
supported? I'm pretty sure it didn't, so why would userspace expect this
to work? ;)

IIRC Amit was looking at defining the hwcaps to align with arm64 compat,
but I believe that series faltered since most of them weren't actually
needed (and I think at that point it was still missing the VFP support
code parts). It would be nice if someone could pick up and combine both
efforts and get this done properly; fill in *all* the hwcaps and relevant
handling for extensions which Cortex-A55 supports (since there's
definitely more than just VUDOT), and then hopefully we're done for good.
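
For reference, the userspace side of the hwcap point can be sketched roughly as follows: read the hwcap word from the ELF auxiliary vector and only take the dot-product code path when the feature bit is advertised. This is a minimal sketch, not anything from the patch. `getauxval()` and `AT_HWCAP` are the standard glibc/Linux interface, while `EXAMPLE_HWCAP_DOTPROD`, `hwcap_has()` and `cpu_advertises_dotprod()` are hypothetical names, and the bit value is a placeholder rather than a real kernel definition.

```c
#include <stdbool.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP (glibc on Linux) */

/*
 * HYPOTHETICAL placeholder bit, chosen only for illustration. The thread
 * suggests no dot-product hwcap was exposed for 32-bit ARM at this point,
 * so a real program would use whatever bit the kernel ends up defining.
 */
#define EXAMPLE_HWCAP_DOTPROD (1UL << 24)

/* True only if every bit in `feature` is set in the hwcap word. */
bool hwcap_has(unsigned long hwcaps, unsigned long feature)
{
    return (hwcaps & feature) == feature;
}

/* Ask the kernel, via the auxiliary vector, whether the feature is advertised. */
bool cpu_advertises_dotprod(void)
{
    return hwcap_has(getauxval(AT_HWCAP), EXAMPLE_HWCAP_DOTPROD);
}
```

A caller would branch on `cpu_advertises_dotprod()` once at startup and fall back to a plain NEON or scalar path when it returns false, so the instruction is never executed on a kernel that has not advertised it.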

> Before this commit, kernel didn't handle the undef exception
> caused by vudot and didn't enable VFP in lazy VFP context
> switch code like other NEON instructions.
> This led to the occurrence of the undefined instruction
> error as following:
>
> [ 250.741238 ] 0904 (26902): undefined instruction: pc=004014ec
> ...
> [ 250.741287 ] PC is at 0x4014ec
> [ 250.741298 ] LR is at 0xb677874f
> [ 250.741303 ] pc : [<004014ec>] lr : [] psr: 80070010
> [ 250.741309 ] sp : beffedb0 ip : b67d7864 fp : beffee58
> [ 250.741314 ] r10: 00000000 r9 : 00000000 r8 : 00000000
> [ 250.741319 ] r7 : 00000001 r6 : 00000001 r5 : beffee90 r4 : 00401470
> [ 250.741324 ] r3 : beffee20 r2 : beffee30 r1 : beffee40 r0 : 004003a8
> [ 250.741331 ] Flags: Nzcv IRQs on FIQs on Mode USER_32 ISA ARM Segment user
> [ 250.741339 ] Control: 10c5383d Table: 32d0406a DAC: 00000055
> [ 250.741348 ] Code: f4434aef f4610aef f4622aef f4634aef (fc620df4)
>
> Below is the assembly of the user program:
>
> 0x4014dc <+108>: vst1.64 {d20, d21}, [r3:128]
> 0x4014e0 <+112>: vld1.64 {d16, d17}, [r1:128]
> 0x4014e4 <+116>: vld1.64 {d18, d19}, [r2:128]
> 0x4014e8 <+120>: vld1.64 {d20, d21}, [r3:128] --> switch out
> 0x4014ec <+124>: vudot.u8 q8, q9, q10 <-- switch in, and FPEXC.EN = 0
> SIGILL(illegal instruction)
>
> Link: https://services.arm.com/support/s/case/5004L00000XsOjP

Linking to your private support case is not useful to upstream. Even I
can't open that link.

> Signed-off-by: Mark-PK Tsai
> ---
>  arch/arm/vfp/vfpmodule.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> index 7e8773a2d99d..7eab8d1019d2 100644
> --- a/arch/arm/vfp/vfpmodule.c
> +++ b/arch/arm/vfp/vfpmodule.c
> @@ -788,6 +788,12 @@ static struct undef_hook neon_support_hook[] = {{
>  	.cpsr_mask	= PSR_T_BIT,
>  	.cpsr_val	= 0,
>  	.fn		= vfp_support_entry,
> +}, {
> +	.instr_mask	= 0xffb00000,
> +	.instr_val	= 0xfc200000,
> +	.cpsr_mask	= PSR_T_BIT,
> +	.cpsr_val	= 0,
> +	.fn		= vfp_support_entry,
>  }, {
>  	.instr_mask	= 0xef000000,
>  	.instr_val	= 0xef000000,
> @@ -800,6 +806,12 @@ static struct undef_hook neon_support_hook[] = {{
>  	.cpsr_mask	= PSR_T_BIT,
>  	.cpsr_val	= PSR_T_BIT,
>  	.fn		= vfp_support_entry,
> +}, {
> +	.instr_mask	= 0xffb00000,
> +	.instr_val	= 0xfc200000,
> +	.cpsr_mask	= PSR_T_BIT,
> +	.cpsr_val	= PSR_T_BIT,
> +	.fn		= vfp_support_entry,

Why have two entries conditional on each possible value of one bit for
otherwise identical encodings? Surely it suffices to set both cpsr_mask
and cpsr_val to 0?

Thanks,
Robin.

>  }};
>
>  static struct undef_hook vfp_support_hook = {
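
To illustrate the cpsr_mask question above: an undef hook entry is taken when the opcode and the CPSR each pass a mask/value comparison, so a zero mask with a zero value matches in either instruction set state. Below is a minimal sketch of that comparison, assuming a simplified stand-in struct. `struct hook_match`, `hook_matches()` and the three `vudot_*` tables are hypothetical names (the kernel's `struct undef_hook` also carries a list node and the handler pointer). The mask and value come from the patch, and the opcode `fc620df4` and psr `80070010` used in the usage note come from the quoted log.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSR_T_BIT 0x00000020u  /* Thumb state bit in the CPSR */

/* Simplified, hypothetical stand-in for the matching fields of struct undef_hook. */
struct hook_match {
    uint32_t instr_mask;
    uint32_t instr_val;
    uint32_t cpsr_mask;
    uint32_t cpsr_val;
};

/* An entry applies when both the opcode and the CPSR pass their mask/value test. */
bool hook_matches(const struct hook_match *h, uint32_t instr, uint32_t cpsr)
{
    return (instr & h->instr_mask) == h->instr_val &&
           (cpsr & h->cpsr_mask) == h->cpsr_val;
}

/* The two entries from the patch: one for ARM state, one for Thumb state. */
const struct hook_match vudot_arm   = { 0xffb00000u, 0xfc200000u, PSR_T_BIT, 0 };
const struct hook_match vudot_thumb = { 0xffb00000u, 0xfc200000u, PSR_T_BIT, PSR_T_BIT };

/* The single entry suggested in review: the CPSR test always passes. */
const struct hook_match vudot_any   = { 0xffb00000u, 0xfc200000u, 0, 0 };
```

With the values from the log, `hook_matches(&vudot_arm, 0xfc620df4u, 0x80070010u)` and `hook_matches(&vudot_any, 0xfc620df4u, 0x80070010u)` both hold, and `vudot_any` keeps matching if the T bit is set in the second argument, which is the behaviour the two separate entries were added to obtain.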