Date: Tue, 5 Mar 2019 08:13:31 +0100
From: Greg KH
To: Parav Pandit
Cc: "netdev@vger.kernel.org", "linux-kernel@vger.kernel.org", "michal.lkml@markovi.net", "davem@davemloft.net", Jiri Pirko, Jakub Kicinski
Subject: Re: [RFC net-next 8/8] net/mlx5: Add subdev driver to bind to subdev devices
Message-ID: <20190305071331.GA2060@kroah.com>
References: <1551418672-12822-1-git-send-email-parav@mellanox.com>
 <1551418672-12822-9-git-send-email-parav@mellanox.com> <20190301072158.GC8975@kroah.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 01, 2019 at 05:21:13PM +0000, Parav Pandit wrote:
>
>
> > -----Original Message-----
> > From: Greg KH
> > Sent: Friday, March 1, 2019 1:22 AM
> > To: Parav Pandit
> > Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> > michal.lkml@markovi.net; davem@davemloft.net; Jiri Pirko
> > Subject: Re: [RFC net-next 8/8] net/mlx5: Add subdev driver to bind to
> > subdev devices
> >
> > On Thu, Feb 28, 2019 at 11:37:52PM -0600, Parav Pandit wrote:
> > > Add a subdev driver to probe the subdev devices and create fake
> > > netdevice for it.
> >
> > So I'm guessing here is the "meat" of the whole goal here?
> >
> > You just want multiple netdevices per PCI device? Why can't you do that
> > today in your PCI driver?
> >
> Yes, but it is not just multiple netdevices.
> Let me please elaborate in detail.
>
> There is a switchdev mode of a PCI function for netdevices.
> In this mode a given netdev has an additional control netdev (called representor netdevice = rep-ndev).
> This rep-ndev is attached to OVS for adding rules, offloads etc. using standard tc, netfilter infra.
> Currently this rep-ndev controls the switch side of the settings, but not the host side of the netdev.
> So there is discussion to create another netdev or devlink port.
>
> Additionally this subdev has an optional rdma device too.
>
> And when we are in switchdev mode, this rdma dev has a similar rdma rep device for control.
>
> In some cases we actually don't create a netdev when it is in InfiniBand mode.
> Here there is PCI device->rdma_device.
>
> In other cases, a given sub device for rdma is a dual port device, having a netdevice for each port that can use the existing netdev->dev_port.
>
> Creating 4 devices of two different classes using one iproute2/ip or iproute2/rdma command is a horrible thing to do.

Why is that?

> If this sub device has to be a passthrough device, the ip link command will fail badly that day, because we are creating some sub device which is not even a netdevice.

But it is a network device, right?

> So iproute2/devlink, which works on bus+device, mainly PCI today, seems the right abstraction point to create sub devices.
> This also extends to mapping ports of the device, health, registers debug, etc., rich infrastructure that is already built.
>
> Additionally, we don't want the mlx driver and other drivers to go through their child devices (split logic in netdev and rdma) for power management.

And how is power management going to work with your new devices?  All
you have here is a tiny shim around a driver bus, I do not see any new
functionality, and as others have said, no way to actually share, or
split up, the PCI resources.

> Kernel core code does that well today, and we would like to leverage it through subdev bus or mfd pm callbacks.
>
> So it is a lot more than just creating netdevices.

But that's all you are showing here :)

> > What problem are you trying to solve that others also are having that
> > requires all of this?
> >
> > Adding a new bus type and subsystem is fine, but usually we want more
> > than just one user of it, as this does not really show how it is exercised very
> > well.
> This subdev and devlink infrastructure solves this problem of creating smaller sub devices out of one PCI device.
> Someone has to start.. :-)

That is what a mfd should allow you to do.

> To my knowledge, currently Netronome, Broadcom and Mellanox are actively using this devlink and switchdev infra today.

Where are they "using it"?  This patchset does not show that.

> > Ideally 3 users would be there as that is when it proves itself that it is
> > flexible enough.
> >
>
> We were looking at drivers/visorbus to see if we can repurpose it, but the GUID device naming scheme is just not user friendly.

You can always change the naming scheme if needed.  But why isn't a GUID
ok?  It's very easy to reserve properly, and you do not need a central
naming "authority".

> > Would just using the mfd subsystem work better for you?  That provides
> > core support for "multi-function" drivers/devices already.  What is missing
> > from that subsystem that does not work for you here?
> >
> We were not aware of mfd until now. I have only looked at it at a very high level. It's a wrapper around platform devices and seems widely used.
> Before the subdev proposal, Jason suggested an alternative: create platform devices and have the driver attach to them.
>
> When I read the kernel documentation [1], it says "platform devices typically appear as autonomous entities"
> Here, instead of autonomy, it is in the user's control.
> Platform devices probably don't disappear a lot in a live system, as opposed to subdevices, which are created and removed dynamically far more often.
>
> Not sure if a platform device is abuse for this purpose or not.

No, do not abuse a platform device.  You should be able to just use a
normal PCI device for this just fine, and if not, we should be able to
make the needed changes to mfd for that.

thanks,

greg k-h