From: Oded Gabbay
Date: Tue, 23 Aug 2022 23:45:24 +0300
Subject: Re: New subsystem for acceleration devices
To: Kevin Hilman
Cc: Dave Airlie, Greg Kroah-Hartman, Yuji Ishikawa, Jiho Chu,
    Alexandre Bailon, Jason Gunthorpe, Arnd Bergmann, dri-devel,
    "Linux-Kernel@Vger. Kernel. Org"
References: <7hk06ykedc.fsf@baylibre.com>
In-Reply-To: <7hk06ykedc.fsf@baylibre.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Aug 23, 2022 at 9:24 PM Kevin Hilman wrote:
>
> Hi Obed,
>
> Oded Gabbay writes:
>
> [...]
>
> > I want to update that I'm currently in discussions with Dave to figure
> > out what's the best way to move forward. We are writing it down to do
> > a proper comparison between the two paths (new accel subsystem or
> > using drm). I guess it will take a week or so.
>
> Any update on the discussions with Dave? and/or are there any plans to
> discuss this further at LPC/ksummit yet?

Hi Kevin.
We are still discussing the details, as at least the habanalabs driver
is very complex and there are multiple parts that I need to see if and
how they can be mapped to drm.
Some of us will attend LPC so we will probably take advantage of that
to talk more about this.

>
> We (BayLibre) are upstreaming support for APUs on Mediatek SoCs, and are
> using the DRM-based approach. I'll also be at LPC and happy to discuss
> in person.
>
> For some context on my/our interest: back in Sept 2020 we initially
> submitted an rpmsg based driver for kernel communication[1]. After
> review comments, we rewrote that based on DRM[2] and are now using it
> for some MTK SoCs[3] and supporting our MTK customers with it.
>
> Hopefully we will get the kernel interfaces sorted out soon, but next,
> there's the userspace side of things. To that end, we're also working
> on libAPU, a common, open userspace stack. Alex Bailon recently
> presented a proposal earlier this year at Embedded Recipes in Paris
> (video[4], slides[5].)
>
> libAPU would include abstractions of the kernel interfaces for DRM
> (using libdrm), remoteproc/rpmsg, virtio etc. but also goes farther and
> proposes an open firmware for the accelerator side using
> libMetal/OpenAMP + rpmsg for communication with (most likely closed
> source) vendor firmware. Think of this like sound open firmware (SOF[6]),
> but for accelerators.

I think your device and the habana device are very different in
nature, and it is part of what Dave and I discussed, whether these two
classes of devices can live together. I guess they can live together
in the kernel, but in the userspace, not so much imo.

The first class is the edge inference devices (usually as part of some
SoC). I think your description of the APU on MTK SoC is a classic
example of such a device.
You usually have some firmware you load, you give it a graph and
pointers for input and output, and then you just execute the graph
again and again to perform inference, replacing only the inputs.

The second class is the data-center training accelerators, of which
habana's gaudi device is an example. These devices usually have a
number of different compute engines, a fabric for scaling out,
on-device memory, internal MMUs and RAS monitoring requirements. Those
devices are usually operated via command queues, either through their
kernel driver or directly from user-space. They have multiple APIs for
memory management, RAS, scaling-out and command-submissions.

>
> We've been using this successfully for Mediatek SoCs (which have a
> Cadence VP6 APU) and have submitted/published the code, including the
> OpenAMP[7] and libmetal[8] parts in addition to the kernel parts already
> mentioned.

What's the difference between libmetal and other open-source low-level
runtime drivers, such as oneAPI level-zero?

Currently we have our own runtime driver which is tightly coupled with
our h/w. For example, the way the userspace "talks" to the data-plane
firmware is very proprietary, as it is hard-wired into the architecture
of the entire ASIC and how it performs deep-learning training.
Therefore, I don't see how this can be shared with other vendors. Not
because of secrecy but because it is simply not relevant to any other
ASIC.

>
> We're to the point where we're pretty happy with how this works for MTK
> SoCs, and wanting to collaborate with folks working on other platforms
> and to see what's needed to support other kinds of accelerators with a
> common userspace and open firmware infrastructure.
>
> Kevin
>
> [1] https://lore.kernel.org/r/20200930115350.5272-1-abailon@baylibre.com
> [2] https://lore.kernel.org/r/20210917125945.620097-1-abailon@baylibre.com
> [3] https://lore.kernel.org/r/20210819151340.741565-1-abailon@baylibre.com
> [4] https://www.youtube.com/watch?v=Uj1FZoF8MMw&t=18211s
> [5] https://embedded-recipes.org/2022/wp-content/uploads/2022/06/bailon.pdf
> [6] https://www.sofproject.org/
> [7] https://github.com/BayLibre/open-amp/tree/v2021.10-mtk
> [8] https://github.com/BayLibre/libmetal/tree/v2021.10-mtk
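
To make the contrast between the two device classes discussed in this
mail a bit more concrete, here is a rough toy model in Python. It is
only a sketch for illustration: every class and method name in it
(EdgeInferenceDevice, TrainingAccelerator, load_graph, alloc, submit,
run) is hypothetical and does not correspond to any real driver, uAPI
or library. The first class is loaded once and then only fed new
inputs; the second exposes memory management and per-engine command
queues.

```python
from collections import deque


class EdgeInferenceDevice:
    """First class: load a graph once, then run it repeatedly on new inputs."""

    def __init__(self):
        self.graph = None

    def load_graph(self, graph):
        # Hypothetical stand-in for loading firmware plus a network graph.
        self.graph = graph

    def execute(self, inputs):
        if self.graph is None:
            raise RuntimeError("no graph loaded")
        # Same graph every time; only the inputs change.
        return [self.graph(x) for x in inputs]


class TrainingAccelerator:
    """Second class: several engines, each driven through a command queue."""

    def __init__(self, engines):
        self.queues = {name: deque() for name in engines}
        self.memory = {}  # hypothetical handle -> size in bytes
        self._next_handle = 1

    def alloc(self, size):
        # Toy on-device memory management: hand out increasing handles.
        handle = self._next_handle
        self._next_handle += 1
        self.memory[handle] = size
        return handle

    def submit(self, engine, command):
        # Command submission: append to the chosen engine's queue.
        self.queues[engine].append(command)

    def run(self):
        # Drain each engine's queue in FIFO order and report what ran.
        done = []
        for engine, queue in self.queues.items():
            while queue:
                done.append((engine, queue.popleft()))
        return done


edge = EdgeInferenceDevice()
edge.load_graph(lambda x: 2 * x)
print(edge.execute([1, 2, 3]))
print(edge.execute([10]))

accel = TrainingAccelerator(["compute", "dma"])
buf = accel.alloc(4096)
accel.submit("dma", ("copy_in", buf))
accel.submit("compute", ("matmul", buf))
print(accel.run())
```

Even in this toy form the first class needs two calls, while the
second already needs separate entry points for memory and for command
submission, before RAS or scaling-out are considered at all.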