From: Evan Green
Date: Thu, 30 Mar 2023 11:30:29 -0700
Subject: Re: [PATCH v3 2/7] RISC-V: Add a syscall for HW probing
To: Arnd Bergmann
Cc: Palmer Dabbelt, Heiko Stübner, Conor Dooley, slewis@rivosinc.com, Vineet Gupta, Albert Ou, Andrew Bresticker, Andrew Jones, Anup Patel, Atish Patra, Bagas Sanjaya, Celeste Liu, "Conor.Dooley", guoren, Jonathan Corbet, Niklas Cassel, Palmer Dabbelt, Paul Walmsley, Randy Dunlap, Ruizhe Pan, Sunil V L, Tobias Klauser, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org
References: <20230221190858.3159617-1-evan@rivosinc.com> <20230221190858.3159617-3-evan@rivosinc.com> <605fb2fd-bda2-4922-92bf-e3e416d54398@app.fastmail.com>
In-Reply-To: <605fb2fd-bda2-4922-92bf-e3e416d54398@app.fastmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 23, 2023 at 2:06 AM Arnd Bergmann wrote:
>
> On Tue, Feb 21, 2023, at 20:08, Evan Green wrote:
> > We don't have enough space for these all in ELF_HWCAP{,2} and there's no
> > system call that quite does this, so let's just provide an arch-specific
> > one to probe for hardware capabilities. This currently just provides
> > m{arch,imp,vendor}id, but with the key-value pairs we can pass more in
> > the future.
> >
> > Co-developed-by: Palmer Dabbelt
> > Signed-off-by: Palmer Dabbelt
> > Signed-off-by: Evan Green
>
> I'm still skeptical about the need for a custom syscall interface here.
> I had not looked at the interface so far, but there are a few things
> that stick out:
>
> > +RISC-V Hardware Probing Interface
> > +---------------------------------
> > +
> > +The RISC-V hardware probing interface is based around a single syscall, which
> > +is defined in ::
> > +
> > +    struct riscv_hwprobe {
> > +        __s64 key;
> > +        __u64 value;
> > +    };
>
> The way this is defined, the kernel will always have to know
> about the specific set of features, it can't just forward
> unknown features to user space after probing them from an
> architectured hardware interface or from DT.

You're correct that this interface wasn't intended to have usermode
come in with augmented data or additional key/value pairs. This was
purely meant to provide access to the kernel's repository of
architectural and microarchitectural details. If usermode wants to
provide extra info in this same form, maybe they could wrap this
interface.

>
> If 'key' is just an enumerated value with a small number of
> possible values, I don't see anything wrong with using elf
> aux data.
> I understand it's hard to know how many keys
> might be needed in the long run, from the way you define
> the key/value pairs here, I would expect it to have a lot
> of the same limitations that the aux data has, except for
> a few bytes to be copied.

Correct, this makes allocating bits out of here cheaper by not
requiring that we actively copy them into every new process forever.
You're right that the aux vector would work as well, but the thinking
behind this series was that an interface like this might be better
for an architecture as extensible as RISC-V.

>
> > +    long sys_riscv_hwprobe(struct riscv_hwprobe *pairs, size_t pair_count,
> > +                           size_t cpu_count, cpu_set_t *cpus,
> > +                           unsigned long flags);
>
> The cpu set argument worries me more: there should never be a
> need to optimize for broken hardware that has an asymmetric set
> of features. Just let the kernel figure out the minimum set
> of features that works across all CPUs and report that like we
> do with HWCAP. If there is a SoC that is so broken that it has
> important features on a subset of cores that some user might
> actually want to rely on, then have them go through the slow
> sysfs interface for probing the CPUs individually, but don't make
> the broken case easier at the expense of normal users that
> run on working hardware.

I'm not so sure. While I agree with you for major classes of features
(e.g. one CPU has floating point support but another does not), I expect
these bits to contain more subtle details as well, which might vary
across asymmetric implementations without breaking ABI compatibility
per se. Maybe some vendor has implemented exotic video decoding
acceleration instructions that only work on the big core. Or maybe the
big cores support v3.1 of some extension (where certain things run
faster), but the little cores only have v3.0, where it's a little
slower. Certain apps would likely want to know these things so they
can allocate their work optimally across cores.
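
To make the two positions above concrete, here is a minimal user-space
sketch in C. It is not kernel code and not part of the patch: the per-CPU
feature masks, the FEAT_* bit names, and both helper functions are
hypothetical. common_features() ANDs the masks of every CPU, which is the
HWCAP-style "minimum set" view, while cpus_with_features() reports which
cores provide an optional feature, which is roughly what a cpu-set-aware
query would let an application discover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_FPU        (UINT64_C(1) << 0)
#define FEAT_VIDEO_ACC  (UINT64_C(1) << 1)  /* e.g. present on big cores only */

/*
 * HWCAP-style view: the set of features every CPU supports,
 * i.e. the bitwise AND of all per-CPU masks.
 */
static uint64_t common_features(const uint64_t *per_cpu, size_t ncpus)
{
    uint64_t common = ~UINT64_C(0);

    assert(ncpus > 0);
    for (size_t i = 0; i < ncpus; i++)
        common &= per_cpu[i];
    return common;
}

/*
 * Per-CPU view: a bitmask of CPU indices (bit i = CPU i) on which
 * all bits in 'wanted' are present. Supports up to 64 CPUs.
 */
static uint64_t cpus_with_features(const uint64_t *per_cpu, size_t ncpus,
                                   uint64_t wanted)
{
    uint64_t cpus = 0;

    assert(ncpus <= 64);
    for (size_t i = 0; i < ncpus; i++)
        if ((per_cpu[i] & wanted) == wanted)
            cpus |= UINT64_C(1) << i;
    return cpus;
}
```

With four CPUs whose masks are { FPU|VIDEO_ACC, FPU|VIDEO_ACC, FPU, FPU },
common_features() yields only FEAT_FPU, while
cpus_with_features(..., FEAT_VIDEO_ACC) yields 0x3, so an application
could choose to pin its decoding threads to CPUs 0 and 1.
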
>
> > +asmlinkage long sys_riscv_hwprobe(uintptr_t, uintptr_t, uintptr_t, uintptr_t,
> > +                                  uintptr_t, uintptr_t);
>
> Why 'uintptr_t' rather than the correct type?

Fixed.

-Evan
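
For readers following the key/value design discussed in this thread, the
sketch below shows how such an interface might be consumed. It is a
user-space mock, not the proposed syscall: mock_hwprobe(), the KEY_*
numbers, and the returned values are all hypothetical, and marking an
unknown key with -1 is an assumption of this sketch rather than something
stated in the thread. The struct mirrors the quoted struct riscv_hwprobe,
with stdint types standing in for __s64/__u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the quoted struct riscv_hwprobe (stdint types, not __s64/__u64). */
struct hwprobe_pair {
    int64_t  key;
    uint64_t value;
};

/* Hypothetical key numbers, for illustration only. */
enum { KEY_MVENDORID = 0, KEY_MARCHID = 1, KEY_MIMPID = 2 };

/*
 * User-space mock of a key/value probe: fill in 'value' for every key
 * the mock knows about. Unknown keys are rewritten to key = -1 with
 * value = 0 (an assumption of this sketch).
 * Returns 0 on success, -1 on bad arguments.
 */
static int mock_hwprobe(struct hwprobe_pair *pairs, size_t pair_count)
{
    /* Placeholder values, deliberately not real hardware IDs. */
    static const uint64_t known[] = { 0x1111, 0x2222, 0x3333 };
    const uint64_t nknown = sizeof(known) / sizeof(known[0]);

    if (pairs == NULL && pair_count != 0)
        return -1;

    for (size_t i = 0; i < pair_count; i++) {
        int64_t key = pairs[i].key;

        if (key >= 0 && (uint64_t)key < nknown) {
            pairs[i].value = known[key];
        } else {
            pairs[i].key = -1;   /* unknown key */
            pairs[i].value = 0;
        }
    }
    return 0;
}
```

A caller fills in the keys it cares about, makes one call, and then checks
each returned pair. The set of keys can grow over time without changing
the call signature, which is the extensibility argument made above.
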