Received: by 2002:ac0:bc90:0:0:0:0:0 with SMTP id a16csp3607511img; Mon, 25 Mar 2019 13:51:15 -0700 (PDT) X-Google-Smtp-Source: APXvYqwNKRxDTf5wfYtt4pdLg3+Ei3Xd7ofRfWR1ICXwktlK8cghxh4I1u1Ph0iWxlqBC08XoDDg X-Received: by 2002:a63:1ce:: with SMTP id 197mr25259996pgb.47.1553547075778; Mon, 25 Mar 2019 13:51:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553547075; cv=none; d=google.com; s=arc-20160816; b=BuWcIWVBuB7400qo1+y4YYXVw4pF99f4oLSwi5jX5k6bR4Wd8E6qvEAI37umX2gE6v S5KvIgcjPfSyG9fFPsDcmAHUdzHiUf6wI9SBLwtC5LeTf0mknM65+G6iZAbPMax5SDxp 4QOnvlUMkb2qkVWyGUjIn9+2hxoEeHUO/K1J0gKhfZhQNWPuBiOuvXJgMEL4GKGYwn8/ EdYoFkZQO0kmE1OrZ/fju9W2mop1WSrFahAaZohJ++mlpD0HlYVPBxvL0sToOmFVkRYk ZlWz4rj/HQCJWr8EG2byV0ad2Z1jU55ZDTIUW3NzFrw8UcIUZ4m2iOiKJgSPk4ZjrGga 8VGw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :organization:references:in-reply-to:message-id:subject:cc:to:from :date; bh=v/kUFTm+j53jqp9DZE3mgrNO9wS3cJIcSrV8mkpzIwA=; b=FfdhR++Vo8IAHYFHVvVc9Ol0zseJQ8iFsu5Bb1YU+6RZVKa2jfGz3TRy5lHaMk9suc 3tjvYX5Q/3moG+3mJ5O3mcJH3mF72DQgtBRKeNNWw+g1xO/LUGO0+8xSzfu65aDMPb7A FiAO3LCMbuYe5U2DkUZneE8X/VqNOzWtOupHFeyKkNCuBc+KzyxAZcZTiauUC2uZhyLf kHyF65TG+tU/a9wIDrPS2u5j0QxnzC9i3gRzSUGqMC4IgjbFuQYd3AFw1WyUwaiK0GHW AL8TaBNmelfmTEscfn2aw35hbtFy+HMUVNg6GlmKonZQVtyVLLHg9AAGeOokFEi1PUoj 1LUg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id g31si10824698pgm.141.2019.03.25.13.51.00; Mon, 25 Mar 2019 13:51:15 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730363AbfCYUtl (ORCPT + 99 others); Mon, 25 Mar 2019 16:49:41 -0400 Received: from mx1.redhat.com ([209.132.183.28]:33394 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729610AbfCYUtl (ORCPT ); Mon, 25 Mar 2019 16:49:41 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 23766C056792; Mon, 25 Mar 2019 20:49:41 +0000 (UTC) Received: from x1.home (ovpn-116-33.phx2.redhat.com [10.3.116.33]) by smtp.corp.redhat.com (Postfix) with ESMTP id A6AC45DAAE; Mon, 25 Mar 2019 20:49:40 +0000 (UTC) Date: Mon, 25 Mar 2019 14:49:35 -0600 From: Alex Williamson To: Kirti Wankhede Cc: Parav Pandit , "kvm@vger.kernel.org" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH 7/8] vfio/mdev: Fix aborting mdev child device removal if one fails Message-ID: <20190325144935.0696668a@x1.home> In-Reply-To: References: <1553296835-37522-1-git-send-email-parav@mellanox.com> <1553296835-37522-8-git-send-email-parav@mellanox.com> Organization: Red Hat MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Mon, 25 Mar 2019 20:49:41 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 26 Mar 2019 01:05:34 +0530 Kirti Wankhede wrote: > On 3/23/2019 4:50 AM, Parav Pandit wrote: > > device_for_each_child() stops executing callback function for remaining > > child devices, if callback hits an error. > > Each child mdev device is independent of each other. > > While unregistering parent device, mdev core must remove all child mdev > > devices. > > Therefore, mdev_device_remove_cb() always returns success so that > > device_for_each_child doesn't abort if one child removal hits error. > > > > When unregistering parent device, force_remove is set to true amd > mdev_device_remove_ops() always returns success. Can we know that? mdev_device_remove() doesn't guarantee to return zero. > > While at it, improve remove and unregister functions for below simplicity. > > > > There isn't need to pass forced flag pointer during mdev parent > > removal which invokes mdev_device_remove(). > > There is a need to pass the flag, pasting here the comment above > mdev_device_remove_ops() which explains why the flag is needed: > > /* > * mdev_device_remove_ops gets called from sysfs's 'remove' and when parent > * device is being unregistered from mdev device framework. > * - 'force_remove' is set to 'false' when called from sysfs's 'remove' > which > * indicates that if the mdev device is active, used by VMM or userspace > * application, vendor driver could return error then don't remove the > device. > * - 'force_remove' is set to 'true' when called from > mdev_unregister_device() > * which indicate that parent device is being removed from mdev device > * framework so remove mdev device forcefully. > */ I don't see that this changes the force behavior, it's simply noting that in order to continue the device_for_each_child() iterator, we need to return zero, regardless of what mdev_device_remove() returns, and the parent remove path is the only caller of mdev_device_remove_cb(), so we can assume force = true when calling mdev_device_remove(). Aside from maybe a WARN_ON if mdev_device_remove() returns non-zero, that much looks reasonable to me. > So simplify the flow. > > > > mdev_device_remove() is called from two paths. > > 1. mdev_unregister_driver() > > mdev_device_remove_cb() > > mdev_device_remove() > > 2. remove_store() > > mdev_device_remove() > > > > When device is removed by user using remote_store(), device under > > removal is mdev device. > > When device is removed during parent device removal using generic child > > iterator, mdev check is already done using dev_is_mdev(). > > > > Hence, remove the unnecessary loop in mdev_device_remove(). I don't think knowing the device type is the only reason for this loop though. Both paths you mention above can race with each other, so we need to serialize them and pick a winner. The mdev_list_lock allows us to do that. Additionally... > > > > Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") > > Signed-off-by: Parav Pandit > > --- > > drivers/vfio/mdev/mdev_core.c | 24 +++++------------------- > > 1 file changed, 5 insertions(+), 19 deletions(-) > > > > diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c > > index ab05464..944a058 100644 > > --- a/drivers/vfio/mdev/mdev_core.c > > +++ b/drivers/vfio/mdev/mdev_core.c > > @@ -150,10 +150,10 @@ static int mdev_device_remove_ops(struct mdev_device *mdev, bool force_remove) > > > > static int mdev_device_remove_cb(struct device *dev, void *data) > > { > > - if (!dev_is_mdev(dev)) > > - return 0; > > + if (dev_is_mdev(dev)) > > + mdev_device_remove(dev, true); > > > > - return mdev_device_remove(dev, data ? *(bool *)data : true); > > + return 0; > > } > > > > /* > > @@ -241,7 +241,6 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops) > > void mdev_unregister_device(struct device *dev) > > { > > struct mdev_parent *parent; > > - bool force_remove = true; > > > > mutex_lock(&parent_list_lock); > > parent = __find_parent_device(dev); > > @@ -255,8 +254,7 @@ void mdev_unregister_device(struct device *dev) > > list_del(&parent->next); > > class_compat_remove_link(mdev_bus_compat_class, dev, NULL); > > > > - device_for_each_child(dev, (void *)&force_remove, > > - mdev_device_remove_cb); > > + device_for_each_child(dev, NULL, mdev_device_remove_cb); > > > > parent_remove_sysfs_files(parent); > > > > @@ -346,24 +344,12 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) > > > > int mdev_device_remove(struct device *dev, bool force_remove) > > { > > - struct mdev_device *mdev, *tmp; > > + struct mdev_device *mdev; > > struct mdev_parent *parent; > > struct mdev_type *type; > > int ret; > > > > mdev = to_mdev_device(dev); > > - > > - mutex_lock(&mdev_list_lock); Acquiring the lock is removed, but... > > - list_for_each_entry(tmp, &mdev_list, next) { > > - if (tmp == mdev) > > - break; > > - } > > - > > - if (tmp != mdev) { > > - mutex_unlock(&mdev_list_lock); > > - return -ENODEV; > > - } > > - > > if (!mdev->active) { > > mutex_unlock(&mdev_list_lock); > > return -EAGAIN; > > We still release it in this path and the code below here. If we don't find the device on the list under lock, then we're working with a stale device and playing with the 'active' flag of that device outside of any sort of mutual exclusion is racy. Thanks, Alex