Date: Tue, 9 Aug 2022 09:43:41 -0300
From: Jason Gunthorpe
To: Oded Gabbay
Cc: Dave Airlie, dri-devel, Greg Kroah-Hartman, Yuji Ishikawa, Jiho Chu, Arnd Bergmann, "Linux-Kernel@Vger. Kernel. Org"
Subject: Re: New subsystem for acceleration devices
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Aug 08, 2022 at 11:26:11PM +0300, Oded Gabbay wrote:

> So if you want a common uAPI and a common userspace library to use it,
> you need to expose the same device character files for every device,
> regardless of the driver. e.g. you need all devices to be called
> /dev/accelX and not /dev/habanaX or /dev/nvidiaX

So, this is an interesting idea. One of the things we did in RDMA that
turned out very well is to have the user side of the kernel/user API
in a single git repo for all the drivers, including the lowest layer
of the driver-specific APIs.

It gives a reasonable target for a DRM-like test of "you must have a
userspace". I.e. send your userspace and userspace documentation/tests
before your kernel side can be merged.

Even if it is just a git repo collecting and curating driver-specific
libraries under the "accel" banner it could be quite a useful
activity.

But, probably this boils down to things that look like:

  device = habana_open_device()
  habana_mooo(device)

  device = nvidia_open_device()
  nvidia_baaa(device)

> That's what I mean by abstracting all this kernel API from the
> drivers. Not because it is an API that is hard to use, but because the
> drivers should *not* use it at all.
>
> I think drm did that pretty well. Their code defines objects for
> driver, device and minors, with resource manager that will take care
> of releasing the objects automatically (it is based on devres.c).

We have lots of examples of subsystems doing this - the main thing
unique about accel is that there is really no shared uAPI between the
drivers, and no 'abstraction' provided by the kernel. Maybe that is
the point..

> So actually I do want an ioctl but as you said, not for the main
> device char, but to an accompanied control device char.

There is a general problem across all these "thick" devices in the
kernel to support their RAS & configuration requirements and IMHO we
don't have a good answer at all.

We've been talking on and off here about having some kind of
subsystem/methodology specifically for this area - how to monitor,
configure, service, etc. a very complicated off-CPU device. I think
there would be a lot of interest in this and maybe it shouldn't be
coupled to this accel idea.

E.g. we already have some established mechanisms - I would expect any
accel device to be able to introspect and upgrade its flash FW using
the 'devlink flash' common API.

> an application only has access to the information ioctl through this
> device char (so it can't submit anything, allocate memory, etc.) and
> can only retrieve metrics which do not leak information about the
> compute application.

This is often being done over a netlink socket as the "second char".

Jason
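
For illustration only, the "driver-specific libraries in one repo"
pattern from the pseudocode above could be sketched roughly as below.
This is a minimal sketch under assumptions, not any real library: the
Device class, the function bodies and the returned strings are all
hypothetical, and only the function names and the /dev/accelX naming
come from the thread. Each driver keeps its own entry points; the only
thing shared is the device node naming.

```python
# Hypothetical sketch: none of these functions exist in a real library.
from dataclasses import dataclass


@dataclass
class Device:
    driver: str  # which driver-specific library opened the device
    node: str    # shared naming scheme, e.g. /dev/accel0 (assumed)


def habana_open_device(index: int = 0) -> Device:
    # A real library would open() the char device here; this only records it.
    return Device(driver="habana", node=f"/dev/accel{index}")


def habana_mooo(device: Device) -> str:
    # Driver-specific operation: it only makes sense on a habana device.
    if device.driver != "habana":
        raise ValueError("habana_mooo() needs a habana device")
    return f"habana: mooo on {device.node}"


def nvidia_open_device(index: int = 0) -> Device:
    return Device(driver="nvidia", node=f"/dev/accel{index}")


def nvidia_baaa(device: Device) -> str:
    # No shared abstraction: a habana device is rejected here.
    if device.driver != "nvidia":
        raise ValueError("nvidia_baaa() needs a nvidia device")
    return f"nvidia: baaa on {device.node}"


if __name__ == "__main__":
    print(habana_mooo(habana_open_device(0)))
    print(nvidia_baaa(nvidia_open_device(1)))
```

The point of the sketch is that passing a device opened by one library
to the other library's call fails, i.e. the common repo curates the
libraries without providing a common API on top of them.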