Date: Tue, 2 Jul 2019 07:08:56 -0600
From: Alex Williamson
To: Kirti Wankhede
Cc: Parav Pandit, cohuck@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mdev: Send uevents around parent device registration
Message-ID: <20190702070856.75c23a0c@x1.home>
References: <156199271955.1646.13321360197612813634.stgit@gimli.home>
 <08597ab4-cc37-3973-8927-f1bc430f6185@nvidia.com>
 <20190701112442.176a8407@x1.home>
 <3b338e73-7929-df20-ca2b-3223ba4ead39@nvidia.com>
 <20190701140436.45eabf07@x1.home>
 <14783c81-0236-2f25-6193-c06aa83392c9@nvidia.com>
 <20190701234201.47b6f23a@x1.home>
Organization: Red Hat

On Tue, 2 Jul 2019 18:17:41 +0530
Kirti Wankhede wrote:

> On 7/2/2019 12:43 PM, Parav Pandit wrote:
> >
> >
> >> -----Original Message-----
> >> From: linux-kernel-owner@vger.kernel.org On Behalf Of Alex Williamson
> >> Sent: Tuesday, July 2, 2019 11:12 AM
> >> To: Kirti Wankhede
> >> Cc: cohuck@redhat.com; kvm@vger.kernel.org; linux-kernel@vger.kernel.org
> >> Subject: Re: [PATCH v2] mdev: Send uevents around parent device registration
> >>
> >> On Tue, 2 Jul 2019 10:25:04 +0530
> >> Kirti Wankhede wrote:
> >>
> >>> On 7/2/2019 1:34 AM, Alex Williamson wrote:
> >>>> On Mon, 1 Jul 2019 23:20:35 +0530
> >>>> Kirti Wankhede wrote:
> >>>>
> >>>>> On 7/1/2019 10:54 PM, Alex Williamson wrote:
> >>>>>> On Mon, 1 Jul 2019 22:43:10 +0530
> >>>>>> Kirti Wankhede wrote:
> >>>>>>
> >>>>>>> On 7/1/2019 8:24 PM, Alex Williamson wrote:
> >>>>>>>> This allows udev to trigger rules when a parent device is
> >>>>>>>> registered or unregistered from mdev.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Alex Williamson
> >>>>>>>> ---
> >>>>>>>>
> >>>>>>>> v2: Don't remove the dev_info(), Kirti requested they stay and
> >>>>>>>>     removing them is only tangential to the goal of this change.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Thanks.
> >>>>>>>
> >>>>>>>
> >>>>>>>>  drivers/vfio/mdev/mdev_core.c |    8 ++++++++
> >>>>>>>>  1 file changed, 8 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> >>>>>>>> index ae23151442cb..7fb268136c62 100644
> >>>>>>>> --- a/drivers/vfio/mdev/mdev_core.c
> >>>>>>>> +++ b/drivers/vfio/mdev/mdev_core.c
> >>>>>>>> @@ -146,6 +146,8 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
> >>>>>>>>  {
> >>>>>>>>          int ret;
> >>>>>>>>          struct mdev_parent *parent;
> >>>>>>>> +        char *env_string = "MDEV_STATE=registered";
> >>>>>>>> +        char *envp[] = { env_string, NULL };
> >>>>>>>>
> >>>>>>>>          /* check for mandatory ops */
> >>>>>>>>          if (!ops || !ops->create || !ops->remove ||
> >>>>>>>>              !ops->supported_type_groups)
> >>>>>>>> @@ -197,6 +199,8 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
> >>>>>>>>          mutex_unlock(&parent_list_lock);
> >>>>>>>>
> >>>>>>>>          dev_info(dev, "MDEV: Registered\n");
> >>>>>>>> +        kobject_uevent_env(&dev->kobj, KOBJ_CHANGE, envp);
> >>>>>>>> +
> >>>>>>>>          return 0;
> >>>>>>>>
> >>>>>>>>  add_dev_err:
> >>>>>>>> @@ -220,6 +224,8 @@ EXPORT_SYMBOL(mdev_register_device);
> >>>>>>>>  void mdev_unregister_device(struct device *dev)
> >>>>>>>>  {
> >>>>>>>>          struct mdev_parent *parent;
> >>>>>>>> +        char *env_string = "MDEV_STATE=unregistered";
> >>>>>>>> +        char *envp[] = { env_string, NULL };
> >>>>>>>>
> >>>>>>>>          mutex_lock(&parent_list_lock);
> >>>>>>>>          parent = __find_parent_device(dev);
> >>>>>>>> @@ -243,6 +249,8 @@ void mdev_unregister_device(struct device *dev)
> >>>>>>>>          up_write(&parent->unreg_sem);
> >>>>>>>>
> >>>>>>>>          mdev_put_parent(parent);
> >>>>>>>> +
> >>>>>>>> +        kobject_uevent_env(&dev->kobj, KOBJ_CHANGE, envp);
> >>>>>>>
> >>>>>>> mdev_put_parent() calls put_device(dev). If this is the last
> >>>>>>> instance holding device, then on put_device(dev) dev would get freed.
> >>>>>>>
> >>>>>>> This event should be before mdev_put_parent()
> >>>>>>
> >>>>>> So you're suggesting the vendor driver is calling
> >>>>>> mdev_unregister_device() without a reference to the struct device
> >>>>>> that it's passing to unregister?  Sounds bogus to me.  We take a
> >>>>>> reference to the device so that it can't disappear out from under
> >>>>>> us, the caller cannot rely on our reference and the caller
> >>>>>> provided the struct device.  Thanks,
> >>>>>>
> >>>>>
> >>>>> 1. Register uevent is sent after mdev holding reference to device,
> >>>>> then ideally, unregister path should be mirror of register path,
> >>>>> send uevent and then release the reference to device.
> >>>>
> >>>> I don't see the relevance here.  We're marking an event, not
> >>>> unwinding state of the device from the registration process.
> >>>> Additionally, the event we're trying to mark is the completion of
> >>>> each process, so the notion that we need to mirror the ordering
> >>>> between the two is invalid.
> >>>>
> >>>>> 2. I agree that vendor driver shouldn't call
> >>>>> mdev_unregister_device() without holding reference to device. But
> >>>>> to be on safer side, if ever such case occur, to avoid any
> >>>>> segmentation fault in kernel, better to send event before mdev
> >>>>> release the reference to device.
> >>>>
> >>>> I know that get_device() and put_device() are GPL symbols and that's
> >>>> a bit of an issue, but I don't think we should be kludging the code
> >>>> for a vendor driver that might have problems with that.  A) we're
> >>>> using the caller provided device for the uevent, B) we're only
> >>>> releasing our own reference to the device that was acquired during
> >>>> registration, the vendor driver must have other references,
> >>>
> >>> Are you going to assume that someone/vendor driver is always going to
> >>> do right thing?
> >>
> >> mdev is a kernel driver, we make reasonable assumptions that other drivers
> >> interact with it correctly.
> >>
> > That is right.
> > Vendor drivers must invoke mdev_register_device() and mdev_unregister_device() only once.
> > And it must have a valid reference to the device for which it is invoking it.
> > This is basic programming practice that a given driver has to follow.
> > mdev_register_device() has a loop to check. It needs to WARN_ON there if there are duplicate registration.
> > Similarly on mdev_unregister_device() to have WARN_ON if device is not found.
>
> If assumption is vendor driver is always going to do right way, then why
> need check for duplicate registration? vendor driver is always going to
> do it right way, right?

Are we intentionally misinterpreting "reasonable assumptions" here?

> > It was in my TODO list to submit those patches.
> > I was still thinking to that mdev_register_device() should return mdev_parent and mdev_unregister_device() should accept mdev_parent pointer, instead of WARN_ON on unregister().
> >
> >
> >>>> C) the parent device
> >>>> generally lives on a bus, with a vendor driver, there's an entire
> >>>> ecosystem of references to the device below mdev.  Is this a
> >>>> paranoia request or are you really concerned that your PCI device suddenly
> >>>> disappears when mdev's reference to it disappears.
> >>>
> >>> mdev infrastructure is not always used by PCI devices. It is designed
> >>> to be generic, so that other devices (other than PCI devices) can also
> >>> use this framework.
> >>
> >> Obviously mdev is not PCI specific, I only mention it because I'm asking if you
> >> have a specific concern in mind.  If you did, I'd assume it's related to a PCI
> >> backed vGPU.
>
> Its not always good to assume certain things.

It was only an attempt to relate to a specific issue that might concern
you.

> >> Any physical parent device of an mdev is likely to have some sort
> >> of bus infrastructure behind it holding references to the device (ie. a probe and
> >> release where an implicit reference is held between these points).  A virtual
> >> device would be similar, it's created as part of a module init and destroyed as
> >> part of a module exit, where mdev registration would exist between these
> >> points.
> >>
> >>> If there is a assumption that user of mdev framework or vendor drivers
> >>> are always going to use mdev in right way, then there is no need for
> >>> mdev core to held reference of the device?
> >>> This is not a "paranoia request". This is more of a ideal scenario,
> >>> mdev should use device by holding its reference rather than assuming
> >>> (or relying on) someone else holding the reference of device.
> >>
> >> In fact, at one point Parav was proposing removing these references entirely,
> >> but Connie and I both felt uncomfortable about that.  I think it's good practice
> >> that mdev indicates the use of the parent device by incrementing the reference
> >> count, with each child mdev device also taking a reference, but those
> >> references balance out within the mdev core.  Their purpose is not to maintain
> >> the device for outside callers, nor should outside callers assume mdev's use of
> >> references to release their own.  I don't think it's unreasonable to assume that
> >> the caller should have a legitimate reference to the object it's providing to this
> >> function and therefore we should be able to use it after mdev's internal
> >> references are balanced out.  Thanks,
> >>
>
> I'm not fully convinced with what is the advantage of sending uevent
> after releasing reference to device or disadvantage of sending uevent
> before releasing reference to device.

If mdev-core still holds a reference to the device, is it fully
unregistered?  Why not send the uevent at the point where the
notification is actually true?

> Still if you want to go ahead with this change, please add a check or
> assert if (dev != NULL) and add an comment highlighting the assumption.

If CONFIG_DEBUG_KOBJECT_RELEASE is enabled, the kobject is deleted from a
workqueue at some random delay after the last reference is dropped, so
such a test would only introduce a false sense of security for an issue
that should not exist anyway.  Thanks,

Alex
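
The reference-counting argument in this thread can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct fake_device` and every `fake_*` helper are hypothetical stand-ins invented for illustration, not the kernel's `struct device`, `get_device()`, `put_device()` or the mdev API, and nothing is actually freed so the model stays well defined. It shows that when the caller holds its own reference, the device is still alive at the point where the unregister uevent would be sent, and that a `dev != NULL` test cannot observe a release either way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct device (not the kernel type). */
struct fake_device {
    int refcount;   /* models the kobject reference count */
    bool released;  /* set once the last reference has been dropped */
};

static void fake_get_device(struct fake_device *dev)
{
    assert(!dev->released);
    dev->refcount++;
}

static void fake_put_device(struct fake_device *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount == 0)
        dev->released = true;
}

/* Models registration: the core takes its own reference on the parent. */
static void fake_mdev_register(struct fake_device *dev)
{
    fake_get_device(dev);
}

/*
 * Models unregistration with the ordering from the patch: drop the core's
 * reference first, then "send the uevent". Returns whether the device was
 * still alive at the point where the uevent would be sent.
 */
static bool fake_mdev_unregister(struct fake_device *dev)
{
    fake_put_device(dev);
    return !dev->released;
}

/* Caller that holds its own reference across register and unregister. */
static bool well_behaved_caller(void)
{
    struct fake_device dev = { .refcount = 1, .released = false };
    bool alive_at_uevent;

    fake_mdev_register(&dev);                      /* refcount: 2 */
    alive_at_uevent = fake_mdev_unregister(&dev);  /* refcount: 1 */
    fake_put_device(&dev);                         /* caller's own put: 0 */
    return alive_at_uevent && dev.released;
}

/* Caller with no reference of its own: the case the thread calls bogus. */
static bool unreferenced_caller(void)
{
    struct fake_device dev = { .refcount = 0, .released = false };

    fake_mdev_register(&dev);           /* refcount: 1, the core's only */
    return fake_mdev_unregister(&dev);  /* refcount: 0, already released */
}

/* A NULL test cannot see a release: the pointer value never changes. */
static bool null_check_sees_release(void)
{
    struct fake_device dev = { .refcount = 1, .released = false };
    struct fake_device *p = &dev;

    fake_put_device(p);  /* last reference dropped */
    return p == NULL;
}
```

In this model `well_behaved_caller()` reports the device alive when the uevent would fire, `unreferenced_caller()` does not, and `null_check_sees_release()` returns false in both situations, which is the point made above about such a check giving a false sense of security.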