From: Oded Gabbay
Date: Tue, 3 Oct 2023 13:52:56 +0300
Subject: Re: kernel.org 6.5.4 , NPU driver, --not support (RFC)
To: Cancan Chang
Cc: Jagan Teki, linux-media, linux-kernel, Dave Airlie, Daniel Vetter
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 28, 2023 at 11:16 AM Cancan Chang wrote:
>
> “What happens if you call this again without waiting for the previous
> inference to complete?”
> --- There is a work-queue in the driver to manage inference tasks.
>     When two consecutive inference tasks occur, the second inference task will be added to
>     the "pending list". When the previous inference task ends, the second inference task will
>     switch to the "scheduled list" and be executed.
>     Each inference task has an id; "inference" and "wait until finish" are paired.
>
> thanks
Thanks for the clarification.
I'll wait for your driver's code link. It doesn't have to be a patch
series at this point. A link to a git repo is enough.
I just want to do a quick pass.

Thanks,
Oded

>
> ________________________________________
> From: Oded Gabbay
> Sent: September 28, 2023 15:40
> To: Cancan Chang
> Cc: Jagan Teki; linux-media; linux-kernel; Dave Airlie; Daniel Vetter
> Subject: Re: kernel.org 6.5.4 , NPU driver, --not support (RFC)
>
> [ EXTERNAL EMAIL ]
>
> On Thu, Sep 28, 2023 at 10:25 AM Cancan Chang wrote:
> >
> > “Could you please post a link to the driver's source code?
> > In addition, could you please elaborate which userspace libraries
> > exist that work with your driver? Are any of them open-source?”
> > --- We will prepare the adla driver link after the holiday on October 6th.
> >     It's a pity that there is no open-source userspace library.
> >     But you can probably understand it through a workflow, which can be simplified as:
> >     1. create model context
> >          ret = ioctl(context->fd, ADLAK_IOCTL_REGISTER_NETWORK, &desc);
> >     2. set inputs
> >     3. inference
> >          ret = ioctl(context->fd, ADLAK_IOCTL_INVOKE, &invoke_dec);
> What happens if you call this again without waiting for the previous
> inference to complete?
> Oded
> >     4. wait for the inference to complete
> >          ret = ioctl(context->fd, ADLAK_IOCTL_WAIT_UNTIL_FINISH, &stat_req_desc);
> >     5. destroy model context
> >          ret = ioctl(context->fd, ADLAK_IOCTL_DESTROY_NETWORK, &submit_del);
> >
> >
> > thanks
> >
> >
> > ________________________________________
> > From: Oded Gabbay
> > Sent: September 28, 2023 13:28
> > To: Cancan Chang
> > Cc: Jagan Teki; linux-media; linux-kernel; Dave Airlie; Daniel Vetter
> > Subject: Re: kernel.org 6.5.4 , NPU driver, --not support (RFC)
> >
> > [ EXTERNAL EMAIL ]
> >
> > On Wed, Sep 27, 2023 at 10:01 AM Cancan Chang wrote:
> > >
> > > “Or do you handle one cmd at a time, where the user sends a cmd buffer
> > > to the driver and the driver then submits it by writing to a couple of
> > > registers and polls on some status register until it's done, or waits
> > > for an interrupt to mark it as done?”
> > > --- Yes, the user sends a cmd buffer to the driver, and the driver triggers the hardware by writing to a register,
> > >     and then waits for an interrupt to mark it as done.
> > >
> > >     My current driver is very different from drm, so I want to know if I have to switch to drm?
> > Could you please post a link to the driver's source code?
> > In addition, could you please elaborate which userspace libraries
> > exist that work with your driver? Are any of them open-source?
> >
> > >     Maybe I can refer to drivers/accel/habanalabs.
> > That's definitely a possibility.
> >
> > Oded
> > >
> > > thanks
> > >
> > > ________________________________________
> > > From: Oded Gabbay
> > > Sent: September 26, 2023 20:54
> > > To: Cancan Chang
> > > Cc: Jagan Teki; linux-media; linux-kernel; Dave Airlie; Daniel Vetter
> > > Subject: Re: kernel.org 6.5.4 , NPU driver, --not support (RFC)
> > >
> > > [ EXTERNAL EMAIL ]
> > >
> > > On Mon, Sep 25, 2023 at 12:29 PM Cancan Chang wrote:
> > > >
> > > > Thank you for your replies, Jagan & Oded.
> > > >
> > > > It is very appropriate for my driver to be placed in drivers/accel.
> > > >
> > > > My accelerator is named ADLA (Amlogic Deep Learning Accelerator).
> > > > It is an IP in an SoC, mainly used for neural network model acceleration.
> > > > It will split and compile the neural network model into a private-format cmd buffer,
> > > > and submit this cmd buffer to the ADLA hardware. It is not a programmable device.
> > > What exactly does it mean to "submit this cmd buffer to ADLA hardware"?
> > >
> > > Does your h/w provide queues for the user/driver to put their
> > > workloads/cmd-bufs on them? And does it provide some completion queue
> > > to notify when the work is completed?
> > >
> > > Or do you handle one cmd at a time, where the user sends a cmd buffer
> > > to the driver and the driver then submits it by writing to a couple of
> > > registers and polls on some status register until it's done, or waits
> > > for an interrupt to mark it as done?
> > >
> > > >
> > > > ADLA includes four hardware engines:
> > > > RS engines         : working for the reshape operators
> > > > MAC engines        : working for the convolution operators
> > > > DW engines         : working for the planar & elementwise operators
> > > > Activation engines : working for activation operators (ReLU, tanh, ...)
> > > >
> > > > By the way, my IP is mainly used in SoCs, and the current driver is registered through platform_driver.
> > > > Is it necessary to switch to drm?
> > > This probably depends on the answer to my question above. btw, there
> > > are drivers in drm that handle IPs that are part of an SOC, so
> > > platform_driver is supported.
> > >
> > > Oded
> > >
> > > >
> > > > thanks.
> > > >
> > > > ________________________________________
> > > > From: Oded Gabbay
> > > > Sent: September 22, 2023 23:08
> > > > To: Jagan Teki
> > > > Cc: Cancan Chang; linux-media; linux-kernel; Dave Airlie; Daniel Vetter
> > > > Subject: Re: kernel.org 6.5.4 , NPU driver, --not support (RFC)
> > > >
> > > > [You don't often get email from ogabbay@kernel.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification]
> > > >
> > > > [ EXTERNAL EMAIL ]
> > > >
> > > > On Fri, Sep 22, 2023 at 12:38 PM Jagan Teki wrote:
> > > > >
> > > > > On Fri, 22 Sept 2023 at 15:04, Cancan Chang wrote:
> > > > > >
> > > > > > Dear Media Maintainers:
> > > > > >      Thanks for your attention. Before describing my problem, let me introduce what I mean by NPU.
> > > > > >      NPU is Neural Processing Unit. It is designed for deep learning acceleration. It is also called TPU, APU ..
> > > > > >
> > > > > > The real problems:
> > > > > >  When I was about to upstream my NPU driver code to linux mainline, I met two problems:
> > > > > >      1. According to my research, there is no NPU module path in linux (based on linux 6.5.4). I have searched all linux projects and found no organization or company that has submitted NPU code. Is there a path prepared for NPU drivers currently?
> > > > > >      2. If there is no NPU driver path currently, I am going to put my NPU driver code in drivers/media/platform/amlogic/, because my NPU driver belongs to amlogic, and the amlogic NPU is mainly used for AI vision applications. Is this plan suitable for you?
> > > > >
> > > > > If I'm correct about the discussion with Oded Gabbay before, I think
> > > > > drivers/accel/ is the proper place for AI accelerators, including NPUs.
> > > > >
> > > > > + Oded in case he can comment.
> > > > >
> > > > > Thanks,
> > > > > Jagan.
> > > > Thanks Jagan for adding me to this thread. Adding Dave & Daniel as well.
> > > >
> > > > Indeed, drivers/accel is the place for accelerators, mainly for
> > > > AI/Deep-Learning accelerators.
> > > > We currently have 3 drivers there already.
> > > >
> > > > The accel subsystem is part of the larger drm subsystem. Basically, to
> > > > get into accel, you need to integrate your driver with drm at the
> > > > basic level (registering a device, hooking up with the proper
> > > > callbacks). ofc the more you use code from drm, the better.
> > > > You can take a look at the drivers under accel for some examples on
> > > > how to do that.
> > > >
> > > > Could you please describe in a couple of sentences what your
> > > > accelerator does, which engines it contains, how you program it. i.e.
> > > > Is it a fixed-function device where you write to a couple of registers
> > > > to execute workloads, or is it a fully programmable device where you
> > > > load compiled code into it (GPU style)?
> > > >
> > > > For better background on the accel subsystem, please read the following:
> > > > https://docs.kernel.org/accel/introduction.html
> > > > This introduction also contains links to other important email threads
> > > > and to Dave Airlie's BOF summary in LPC2022.
> > > >
> > > > Thanks,
> > > > Oded
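
The queueing behaviour described in the thread (a second inference task goes to the "pending list" and switches to the "scheduled list" once the previous task ends) can be illustrated with a small user-space model. This is only a sketch under assumptions, not the ADLA driver's code, which has not been posted: every name below (task_queue, queue_invoke, queue_complete, queue_state, MAX_TASKS) is hypothetical, the two lists are modelled as a state field on a fixed array, and the completion interrupt is modelled as a plain function call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the driver. */
#define MAX_TASKS 8

enum task_state { TASK_PENDING, TASK_SCHEDULED, TASK_DONE };

struct task {
    int id;                 /* each inference task has an id */
    enum task_state state;
};

struct task_queue {
    struct task tasks[MAX_TASKS];   /* kept in submission order */
    size_t count;                   /* tasks submitted so far */
    int next_id;
};

static void queue_init(struct task_queue *q)
{
    assert(q != NULL);
    q->count = 0;
    q->next_id = 1;
}

/* Index of the task currently on the "scheduled list", or -1 if none. */
static int queue_find_scheduled(const struct task_queue *q)
{
    for (size_t i = 0; i < q->count; i++)
        if (q->tasks[i].state == TASK_SCHEDULED)
            return (int)i;
    return -1;
}

/* Models the "inference" step: the task runs at once if the hardware is
 * idle, otherwise it waits as pending. Returns the task id, or -1 if full. */
static int queue_invoke(struct task_queue *q)
{
    assert(q != NULL);
    if (q->count == MAX_TASKS)
        return -1;
    struct task *t = &q->tasks[q->count];
    t->id = q->next_id++;
    t->state = (queue_find_scheduled(q) < 0) ? TASK_SCHEDULED : TASK_PENDING;
    q->count++;
    return t->id;
}

/* Models the completion interrupt: the scheduled task is marked done and
 * the oldest pending task is promoted. Returns the finished id, or -1. */
static int queue_complete(struct task_queue *q)
{
    assert(q != NULL);
    int running = queue_find_scheduled(q);
    if (running < 0)
        return -1;
    q->tasks[running].state = TASK_DONE;
    for (size_t i = 0; i < q->count; i++) {
        if (q->tasks[i].state == TASK_PENDING) {
            q->tasks[i].state = TASK_SCHEDULED;
            break;
        }
    }
    return q->tasks[running].id;
}

/* State of the task with the given id, or -1 if the id is unknown. */
static int queue_state(const struct task_queue *q, int id)
{
    assert(q != NULL);
    for (size_t i = 0; i < q->count; i++)
        if (q->tasks[i].id == id)
            return (int)q->tasks[i].state;
    return -1;
}
```

In this model, two back-to-back queue_invoke() calls leave the first task scheduled and the second pending; one queue_complete() call then finishes the first and promotes the second. A real driver would additionally need locking and a way for the "wait until finish" call to block on a specific task id, neither of which is modelled here.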