Subject: Re: [PATCH 0/2] LoongArch: Make bounds-checking instructions useful
From: Xi Ruoyao
To: WANG Xuerui, loongarch@lists.linux.dev
Cc: WANG Xuerui, Huacai Chen, Eric Biederman, Al Viro, Arnd Bergmann,
    linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
    linux-kernel@vger.kernel.org
Date: Mon, 17 Apr 2023 17:50:40 +0800
In-Reply-To: <6ca642a9-62a6-00e5-39ac-f14ef36f6bdb@xen0n.name>
References: <20230416173326.3995295-1-kernel@xen0n.name>
    <6ca642a9-62a6-00e5-39ac-f14ef36f6bdb@xen0n.name>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2023-04-17 at 15:54 +0800, WANG Xuerui wrote:
> On 2023/4/17 14:47, Xi Ruoyao wrote:
> > On Mon, 2023-04-17 at 01:33 +0800, WANG Xuerui wrote:
> > > From: WANG Xuerui
> > >
> > > Hi,
> > >
> > > The LoongArch-64 base architecture is capable of performing
> > > bounds-checking either before memory accesses or alone, with specialized
> > > instructions generating BCEs (bounds-checking error) in case of failed
> > > assertions (ISA manual Volume 1, Sections 2.2.6.1 [1] and 2.2.10.3 [2]).
> > > This could be useful for managed runtimes, but the exception is not
> > > being handled so far, resulting in SIGSYSes in these cases, which is
> > > incorrect and warrants a fix in itself.
> > >
> > > During experimentation, it was discovered that there is already UAPI for
> > > expressing such semantics: SIGSEGV with si_code=SEGV_BNDERR. This was
> > > originally added for Intel MPX, and there is currently no user (!) after
> > > the removal of MPX support a few years ago.
> > > Although the semantics is
> > > not a 1:1 match to that of LoongArch, still it is better than
> > > alternatives such as SIGTRAP or SIGBUS of BUS_OBJERR kind, due to being
> > > able to convey both the value that failed assertion and the bound value.
> > >
> > > This patch series implements just this approach: translating BCEs into
> > > SIGSEGVs with si_code=SEGV_BNDERR, si_value set to the offending value,
> > > and si_lower and si_upper set to resemble a range with both lower and
> > > upper bound while in fact there is only one.
> > >
> > > The instructions are not currently used anywhere yet in the fledgling
> > > LoongArch ecosystem, so it's not very urgent and we could take the time
> > > to figure out the best way forward (should SEGV_BNDERR turn out not
> > > suitable).
> >
> > I don't think these instructions can be used in any systematic way
> > within a Linux userspace in 2023.  IMO they should not exist in
> > LoongArch at all because they have all the same disadvantages of Intel
> > MPX; MPX has been removed by Intel in 2019, and LoongArch is designed
> > after 2019.
>
> Well, the difference is IMO significant enough to make LoongArch
> bounds-checking more useful, at least for certain use cases. For
> example, the bounds were a separate register bank in Intel MPX, but in
> LoongArch they are just values in GPRs. This fits naturally into
> JIT-ting or other managed runtimes (e.g.
> Go) whose slice indexing ops
> already bounds-check with a temporary register per bound anyway, so it's
> just a matter of this snippet (or something like it)
>
> - calculate element address
> - if address < base: goto fail
> - load/calculate upper bound
> - if address >= upper bound: goto fail
> - access memory
>
> becoming
>
> - calculate element address
> - asrtgt address, base - 1
> - load/calculate upper bound
> - {ld,st}le address, upper bound
>
> then in SIGSEGV handler, check PC to associate the signal back with the
> exact access op;

I remember reading that using a signal handler for "usual" error
handling can be a very bad idea, but I can't remember where I read it.
Are there any managed environments doing so in practice?

If we redefine new_ldle/new_stle as "if [[likely]] the address is
in-bound, do the load/store and skip the next instruction; otherwise do
nothing", we can say:

    blt         address, base, 1f           // below the lower bound
    new_ldle.d  rd, address, upperbound     // in-bound: load, skip next
1:  b           panic_oob_access            // reached only when OOB
    xori        rd, rd, 42                  // use rd to do something

This is more versatile, and useful for building a loop as well:

    or          a0, r0, r0                  // sum = 0
0:  new_ldle.d  t1, t0, t2                  // in-bound: load, skip next
    b           1f                          // OOB: leave the loop
    add.d       a0, t1, a0                  // sum += loaded value
    addi.d      t0, t0, 8                   // advance to the next element
    b           0b
1:  bl          do_something_with_the_sum

Yes, it's "non-RISC", but at least it's more RISC than the current
ldle. If you want a trap anyway, you can say:

    blt         address, base, 1f
    new_ldle.d  rd, address, upperbound
1:  break       {a code defined for OOB}
    xori        rd, rd, 42                  // use rd

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
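
For illustration only, the "load and skip the next instruction" idea
above can be modelled in a few lines of Python. This is a rough sketch,
not LoongArch code: new_ldle is the hypothetical instruction from this
mail, memory is modelled as a Python list, "addresses" are list indices,
and the bound is assumed to be inclusive (as the "le" in the mnemonic
suggests). The helper names are made up for the sketch.

```python
# Hypothetical model of the proposed new_ldle semantics. Memory is a list
# of words and "addresses" are list indices (an assumption made only to
# keep the sketch runnable).

def new_ldle(mem, address, upper_bound):
    """Return (value, skip_next).

    In bound (address <= upper_bound): perform the load and tell the
    caller to skip the next instruction. Otherwise do nothing.
    """
    if address <= upper_bound:
        return mem[address], True
    return None, False


def sum_until_bound(mem, start, upper_bound):
    """Model of the summing loop: leave as soon as a load is not performed."""
    total = 0                    # a0 = 0
    address = start              # t0
    while True:
        value, skip_next = new_ldle(mem, address, upper_bound)
        if not skip_next:        # the "b 1f" slot, reached only when OOB
            break
        total += value           # a0 += t1
        address += 1             # t0 advances by one element
    return total                 # handed to do_something_with_the_sum
```

For example, sum_until_bound([1, 2, 3, 4], 0, 3) walks all four slots
and stops at the first out-of-bound address without any trap or signal
handler being involved, which is the point of the skip-next design.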