Date: Thu, 3 Mar 2022 14:22:32 +0100
From: Greg KH
To: Iouri Tarassov
Cc: Wei Liu, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, spronovo@microsoft.com, spronovo@linux.microsoft.com
Subject: Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading

On Wed, Mar 02, 2022 at 05:09:21PM -0800, Iouri Tarassov wrote:
>
> On 3/1/2022 11:53 PM, Greg KH wrote:
> > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > +struct dxgglobal *dxgglobal;
> > > >
> > > > No, make this per-device, NEVER have a single device for your driver.
> > > > The Linux driver model makes it harder to do it this way than to do it
> > > > correctly. Do it correctly please and have no global structures like
> > > > this.
> > > >
> > >
> > > This may not be as big an issue as you thought. The device discovery is
> > > still done via the normal VMBus probing routine. For all intents and
> > > purposes the dxgglobal structure can be broken down into per device
> > > fields and a global structure which contains the protocol versioning
> > > information -- my understanding is there will always be a global
> > > structure to hold information related to the backend, regardless of how
> > > many devices there are.
> >
> > Then that is wrong and needs to be fixed. Drivers should almost never
> > have any global data, that is not how Linux drivers work. What happens
> > when you get a second device in your system for this? Major rework
> > would have to happen and the code will break. Handle that all now as it
> > takes less work to make this per-device than it does to have a global
> > variable.
> >
> > > I definitely think splitting is doable, but I also understand why Iouri
> > > does not want to do it _now_ given there is no such a model for multiple
> > > devices yet, so anything we put into the per-device structure could be
> > > incomplete and it requires further changing when such a model arrives
> > > later.
> > >
> > > Iouri, please correct me if I have the wrong mental model here.
> > >
> > > All in all, I hope this is not going to be a deal breaker for the
> > > acceptance of this driver.
> >
> > For my reviews, yes it will be.
> >
> > Again, it should be easier to keep things in a per-device state than
> > not as the proper lifetime rules and the like are automatically handled
> > for you. If you have global data, you have to manage that all on your
> > own and it is _MUCH_ harder to review that you got it correct.
>
> Hi Greg,
>
> I do not really see how the driver be written without the global data. Let's review the design.

I see it the other way around. It's easier to make it without a static
structure, it is more work to keep it as you have done so here. Do it
correctly to start with and you will not have any of these issues going
forward.

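To make the per-device argument above concrete, here is a minimal userspace sketch in C. It is not code from the dxgkrnl patch and it is not kernel code: `struct model_device`, `struct dxg_state`, and the `model_*` functions are hypothetical stand-ins. The `drvdata` slot only imitates the role that `dev_set_drvdata()` and `dev_get_drvdata()` play in the kernel, where the bus hands the driver a device and the driver attaches its state to it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bus-provided device object (userspace model). */
struct model_device {
	int id;
	void *drvdata;	/* slot where the driver keeps its per-device state */
};

/* Hypothetical per-device state, replacing fields of one global struct. */
struct dxg_state {
	int channel_id;
	int open_count;
};

/* Model of a probe callback: allocate state and attach it to the device. */
static int model_probe(struct model_device *dev, int channel_id)
{
	struct dxg_state *st = calloc(1, sizeof(*st));

	if (!st)
		return -1;
	st->channel_id = channel_id;
	dev->drvdata = st;
	return 0;
}

/* Model of a remove callback: the state is freed together with the device. */
static void model_remove(struct model_device *dev)
{
	free(dev->drvdata);
	dev->drvdata = NULL;
}

/* Later callbacks recover their state from the device they were handed. */
static struct dxg_state *model_state(struct model_device *dev)
{
	return dev->drvdata;
}
```

Because each probed device carries its own state, a second device needs no extra bookkeeping in this model: no global pointer, no global list, and no global mutex to guard either.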
> Dxgkrnl acts as the aggregator of all virtual compute devices, projected by the host. It needs to do operations, which do not belong to a particular compute device. For example, cross device synchronization and resource sharing.

Then hang your data off of your device node structure that you created.
Why ignore that?

> A PCI device device is created for each virtual compute device. Therefore, there should be a global list of objects and a mutex to synchronize access to the list.

Woah, what? You create a fake PCI device for each virtual device?

If so, great, then you are now a PCI bus and create the PCI devices
properly so that the PCI core can handle and manage them and then assign
them to your driver. You should NEVER have a global list of these
devices, as that is what the driver model should be managing. Not you!

> A VMBus channel is offered by the host for each compute device. The list of the VMBus channels should be global.

The vmbus channels are already handled by the driver core. Use those
devices that are given to you. You don't need to manage them at all.

> A global VMBus channel is offered by the host. The channel does not belong to any particular compute device, so it must be global.

That channel is attached to your driver, use the device given to your
driver by the bus. It's not "global" in any sense of the word.

And what's up with your lack of line wrapping?

> IO space is shared by all compute devices, so its parameters should be global.

Huh? If that's the case then you have bigger problems.

Use the aux bus for devices that share io space. That is what it was
created for, do not ignore the functionality that Linux already provides
you by trying to go around it and writing your own code. Use the
frameworks we have already debugged and support. This is why your Linux
driver should be at least 1/3 smaller than drivers for other operating
systems.

> Dxgkrnl needs to maintain a list of processes, which opened compute device objects. Dxgkrnl maintains private state for each process and when a process opens the /dev/dxg device, Dxgkrnl needs to find if the process state is already created by walking the global process list.

That "list" is handled by the device node structure that was opened.
It's not "global" at all. Again, just like any other device node in
Linux, this isn't a new thing or anything special at all.

> Now, where to keep this global state? It could be kept in the /dev/dxg private device structure. But this structure is not available when, for example, dxg_pci_probe_device() or dxg_probe_vmbus() is called.

Then your design is wrong. It's as simple as that. Fix it.

> Can there be multiple /dev/dxg devices? No. Because the /dev/dxg device represents the driver itself, not a particular compute device.

Then fix this. Make your compute devices store the needed information
when they are created. Again, we have loads of examples in the kernel,
this is nothing new.

> I am not sure what design model you have in mind when saying there should be no global data. Could you please explain keeping in mind the above requirements?

Please see all of my responses above, and please use more \n characters
in the future :)

good luck!

greg k-h
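
The point above about the opened device node can be sketched the same way. This is a userspace model in C under stated assumptions, not kernel code and not the dxgkrnl design: every `model_*` name is hypothetical, and `private_data` only imitates the `file->private_data` pointer that a kernel `open()` handler can fill in. Note that it keeps state per open file, which may or may not match the per-process state the driver actually needs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-device state (userspace model). */
struct model_dev_state {
	int open_count;
};

/* Hypothetical stand-in for an open file; mirrors file->private_data. */
struct model_file {
	void *private_data;
};

/* Hypothetical state created once per open() of the device node. */
struct model_process_state {
	struct model_dev_state *dev;
	int handles;
};

/* Model of an open handler: build the state and hang it off the file. */
static int model_open(struct model_dev_state *dev, struct model_file *f)
{
	struct model_process_state *ps = calloc(1, sizeof(*ps));

	if (!ps)
		return -1;
	ps->dev = dev;
	dev->open_count++;
	f->private_data = ps;
	return 0;
}

/* Model of a release handler: the state goes away when the file closes. */
static void model_release(struct model_file *f)
{
	struct model_process_state *ps = f->private_data;

	ps->dev->open_count--;
	free(ps);
	f->private_data = NULL;
}
```

In this model, every later call on an open file reaches its state through the file it was given, so nothing has to walk a global process list to find it.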