Received: by 2002:ac0:a594:0:0:0:0:0 with SMTP id m20-v6csp2791756imm; Wed, 16 May 2018 20:32:58 -0700 (PDT) X-Google-Smtp-Source: AB8JxZp2XOSbi2ncB13njAVY3xyol36MT+5BkNtw4dEl0hQHUSP2Io+mFfmYpTzwcOLUCU1l9xC2 X-Received: by 2002:a63:3104:: with SMTP id x4-v6mr2793330pgx.167.1526527978666; Wed, 16 May 2018 20:32:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1526527978; cv=none; d=google.com; s=arc-20160816; b=XzjLlSjyUygxRAD9KBViFIV2i5QMbnNvgGKX/wifOz48FkEv4gD50i1b0JyhUhbM8k 1zNEaeVh2r0qnW9G/eR7IaiU3laxE+qsxSoyBlJ6ecqFKZobzU+Cp2IAMQnLoHS/sqIW 7PfnlLieo59w/0BYYEbEodjkZP8oJ2l0ISAtcubCXGDIsgryGr0YJTih+Wr2ohlAEuvS L7SOj2bbbYChu/0Q4XB1sZqJek3PTE11gTT9H2bQ9IlEYCTK79+WNQi6MFVNJWDPPwPY PUbxVLZTmeLBmHS5OozciFz+rH3Jqdb2FTIbNCRZAc2vtsUi+BP4woYktjadeuugnOtB QpPg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :arc-authentication-results; bh=cja4IKk1WPi7i6l7i43/KwoR685bR4qwWloQmCBUb8M=; b=dtGo2qSnHNwkpPpwthameUtNXINDBbzHThY9bhs8lAk+r1PXfzFkacEcW3qsF0imTH PErHWjSuVBCWOx0wIx+sJ4P3I7GHNZ9BBBAWfCLtAxpzd3BKBtMM83qFJOTbpnInCPWo bYyI59txO2vO2lkhSeusmu3hxkfJbVA54ezLkvJvX+Cmfg1S9Bmkfw5HMkWV5NE6FRIm dtcccQe6OkIVJABBLgdnhpoOqS9w+06IBtL2ht0V9c/VMApDp/1m/Gjq+l8vt8Sg3Sf0 mltBojR1tgULuUPW1G9fMIvUPZLaZblNGogC16aBABNfRFbGen0n7zq767wjixGS3/IL 9/Cw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id bh1-v6si3986803plb.481.2018.05.16.20.32.44; Wed, 16 May 2018 20:32:58 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752081AbeEQDaj (ORCPT + 99 others); Wed, 16 May 2018 23:30:39 -0400 Received: from mx1.redhat.com ([209.132.183.28]:52238 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751844AbeEQDah (ORCPT ); Wed, 16 May 2018 23:30:37 -0400 Received: from smtp.corp.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.26]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 96A173118CE7; Thu, 17 May 2018 03:30:37 +0000 (UTC) Received: from w520.home (ovpn-116-135.phx2.redhat.com [10.3.116.135]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3DF4C30012BE; Thu, 17 May 2018 03:30:37 +0000 (UTC) Date: Wed, 16 May 2018 21:30:36 -0600 From: Alex Williamson To: Kirti Wankhede Cc: , , Subject: Re: [PATCH v2] vfio/mdev: Check globally for duplicate devices Message-ID: <20180516213036.1847082d@w520.home> In-Reply-To: <2c45cf24-f67d-bae6-59bf-aedb434ee843@nvidia.com> References: <20180516152228.12123.34883.stgit@gimli.home> <2c45cf24-f67d-bae6-59bf-aedb434ee843@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.84 on 10.5.11.26 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.47]); Thu, 17 May 2018 03:30:37 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 17 May 2018 01:01:40 +0530 Kirti Wankhede wrote: > On 5/16/2018 8:53 PM, Alex Williamson wrote: > > When we create an mdev device, we check for duplicates against the > > parent device and return -EEXIST if found, but the mdev device > > namespace is global since we'll link all devices from the bus. We do > > catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > > with it comes a kernel warning and stack trace for trying to create > > duplicate sysfs links, which makes it an undesirable response. > > > > Therefore we should really be looking for duplicates across all mdev > > parent devices, or as implemented here, against our mdev device list. > > > > Notably, mdev_parent.lock really only seems to be serializing device > > creation and removal per parent. I'm not sure if this is necessary, > > mdev vendor drivers could easily provide this serialization if it > > is required, but a side-effect of holding the mdev_list_lock to > > protect the namespace is actually greater serialization across the > > create and remove paths, > > Exactly for this reason more granular lock is used and that's the reason > mdev_parent.lock was introduced. Consider the max supported config for > vGPU: 8 GPUs in a system with 16 mdev devices on each GPUs, i.e. 128 > mdev devices need to be created in a system, and this count will > increase in future, all mdev device creation/removal will get serialized > with this change. > I agree with your concern that if there are duplicates across parents, > its not caught earlier. Right, thus the concern, but how often are trying to simultaneously create or remove all those mdev devices. Anyway... > > so mdev_parent.lock is removed. If we can > > show that vendor drivers handle the create/remove paths themselves, > > perhaps we can refine the locking granularity. > > > > Here lock is not for create/remove routines of vendor driver, its about > mdev device creation and device registration, which is a common code > path, and so is part of mdev core module. Ok, if mdev_parent.lock was only to protect the per parent device namespace and not meant as a serialization guarantee to the vendor drivers, then we can fix the bug and improve the parallelism. > > Reviewed-by: Cornelia Huck > > Signed-off-by: Alex Williamson > > --- > > > > v2: Remove unnecessary ret init per Cornelia's review > > > > drivers/vfio/mdev/mdev_core.c | 77 +++++++++----------------------------- > > drivers/vfio/mdev/mdev_private.h | 1 > > 2 files changed, 19 insertions(+), 59 deletions(-) > > > > diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c > > index 126991046eb7..aaab3ef93e1c 100644 > > --- a/drivers/vfio/mdev/mdev_core.c > > +++ b/drivers/vfio/mdev/mdev_core.c > > @@ -66,34 +66,6 @@ uuid_le mdev_uuid(struct mdev_device *mdev) > > } > > EXPORT_SYMBOL(mdev_uuid); > > > > -static int _find_mdev_device(struct device *dev, void *data) > > -{ > > - struct mdev_device *mdev; > > - > > - if (!dev_is_mdev(dev)) > > - return 0; > > - > > - mdev = to_mdev_device(dev); > > - > > - if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0) > > - return 1; > > - > > - return 0; > > -} > > - > > -static bool mdev_device_exist(struct mdev_parent *parent, uuid_le uuid) > > -{ > > - struct device *dev; > > - > > - dev = device_find_child(parent->dev, &uuid, _find_mdev_device); > > - if (dev) { > > - put_device(dev); > > - return true; > > - } > > - > > - return false; > > -} > > - > > /* Should be called holding parent_list_lock */ > > static struct mdev_parent *__find_parent_device(struct device *dev) > > { > > @@ -221,7 +193,6 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops) > > } > > > > kref_init(&parent->ref); > > - mutex_init(&parent->lock); > > > > parent->dev = dev; > > parent->ops = ops; > > @@ -304,7 +275,7 @@ static void mdev_device_release(struct device *dev) > > int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) > > { > > int ret; > > - struct mdev_device *mdev; > > + struct mdev_device *mdev, *tmp; > > struct mdev_parent *parent; > > struct mdev_type *type = to_mdev_type(kobj); > > > > @@ -312,12 +283,14 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) > > if (!parent) > > return -EINVAL; > > > > - mutex_lock(&parent->lock); > > + mutex_lock(&mdev_list_lock); > > > > /* Check for duplicate */ > > - if (mdev_device_exist(parent, uuid)) { > > - ret = -EEXIST; > > - goto create_err; > > + list_for_each_entry(tmp, &mdev_list, next) { > > + if (!uuid_le_cmp(tmp->uuid, uuid)) { > > + ret = -EEXIST; > > + goto create_err; > > + } > > } > > > Is it possible to use mdev_list_lock for as minimal portion as possible? > By adding mdev device to mdev_list just after: > memcpy(&mdev->uuid, &uuid, sizeof(uuid_le)); > and then unlock mdev_list_lock, but at the same time all later error > cases need to be handled properly in this function. We also need to differentiate a mdev device placeholder for namespace protection for an active device such that we can't race a remove during the creation, seems do-able. v3... Thanks, Alex