Date: Fri, 16 Dec 2022 09:13:03 +0100
From: Stefano Garzarella
To: Jason Wang
Cc: virtualization@lists.linux-foundation.org, Andrey Zhadchenko,
 linux-kernel@vger.kernel.org, kvm@vger.kernel.org, "Michael S. Tsirkin",
 eperezma@redhat.com, stefanha@redhat.com, netdev@vger.kernel.org
Subject: Re: [RFC PATCH 6/6] vdpa_sim: add support for user VA
Message-ID: <20221216081303.p4pcveclfa5n4slw@sgarzare-redhat>
References: <20221214163025.103075-1-sgarzare@redhat.com>
 <20221214163025.103075-7-sgarzare@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 16, 2022 at 03:26:46PM +0800, Jason Wang wrote:
>On Thu, Dec 15, 2022 at 12:31 AM Stefano Garzarella wrote:
>>
>> The new "use_va" module parameter (default: false) is used in
>> vdpa_alloc_device() to inform the vDPA framework that the device
>> supports VA.
>>
>> vringh is initialized to use VA only when "use_va" is true and the
>> user's mm has been bound. So, only when the bus supports user VA
>> (e.g. vhost-vdpa).
>>
>> vdpasim_mm_work_fn work is used to attach the kthread to the user
>> address space when the .bind_mm callback is invoked, and to detach
>> it when the device is reset.
>
>One thing in my mind is that the current datapath is running under
>spinlock which prevents us from using iov_iter (which may have page
>faults).
>
>We need to get rid of the spinlock first.

Right! I already have a patch for that, since I used it for the
vdpa-blk software device in-kernel PoC, where I had the same issue.
I'll add it to the series!

>
>>
>> Signed-off-by: Stefano Garzarella
>> ---
>>  drivers/vdpa/vdpa_sim/vdpa_sim.h |   1 +
>>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 104 ++++++++++++++++++++++++++++++-
>>  2 files changed, 103 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
>> index 07ef53ea375e..1b010e5c0445 100644
>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
>> @@ -55,6 +55,7 @@ struct vdpasim {
>>         struct vdpasim_virtqueue *vqs;
>>         struct kthread_worker *worker;
>>         struct kthread_work work;
>> +       struct mm_struct *mm_bound;
>>         struct vdpasim_dev_attr dev_attr;
>>         /* spinlock to synchronize virtqueue state */
>>         spinlock_t lock;
>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> index 36a1d2e0a6ba..6e07cedef30c 100644
>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> @@ -36,10 +36,90 @@ module_param(max_iotlb_entries, int, 0444);
>>  MODULE_PARM_DESC(max_iotlb_entries,
>>                  "Maximum number of iotlb entries for each address space. 0 means unlimited. (default: 2048)");
>>
>> +static bool use_va;
>> +module_param(use_va, bool, 0444);
>> +MODULE_PARM_DESC(use_va, "Enable the device's ability to use VA");
>> +
>>  #define VDPASIM_QUEUE_ALIGN PAGE_SIZE
>>  #define VDPASIM_QUEUE_MAX 256
>>  #define VDPASIM_VENDOR_ID 0
>>
>> +struct vdpasim_mm_work {
>> +       struct kthread_work work;
>> +       struct task_struct *owner;
>> +       struct mm_struct *mm;
>> +       bool bind;
>> +       int ret;
>> +};
>> +
>> +static void vdpasim_mm_work_fn(struct kthread_work *work)
>> +{
>> +       struct vdpasim_mm_work *mm_work =
>> +               container_of(work, struct vdpasim_mm_work, work);
>> +
>> +       mm_work->ret = 0;
>> +
>> +       if (mm_work->bind) {
>> +               kthread_use_mm(mm_work->mm);
>> +#if 0
>> +               if (mm_work->owner)
>> +                       mm_work->ret = cgroup_attach_task_all(mm_work->owner,
>> +                                                             current);
>> +#endif
>> +       } else {
>> +#if 0
>> +               //TODO: check it
>> +               cgroup_release(current);
>> +#endif
>> +               kthread_unuse_mm(mm_work->mm);
>> +       }
>> +}
>> +
>> +static void vdpasim_worker_queue_mm(struct vdpasim *vdpasim,
>> +                                   struct vdpasim_mm_work *mm_work)
>> +{
>> +       struct kthread_work *work = &mm_work->work;
>> +
>> +       kthread_init_work(work, vdpasim_mm_work_fn);
>> +       kthread_queue_work(vdpasim->worker, work);
>> +
>> +       spin_unlock(&vdpasim->lock);
>> +       kthread_flush_work(work);
>> +       spin_lock(&vdpasim->lock);
>> +}
>> +
>> +static int vdpasim_worker_bind_mm(struct vdpasim *vdpasim,
>> +                                 struct mm_struct *new_mm,
>> +                                 struct task_struct *owner)
>> +{
>> +       struct vdpasim_mm_work mm_work;
>> +
>> +       mm_work.owner = owner;
>> +       mm_work.mm = new_mm;
>> +       mm_work.bind = true;
>> +
>> +       vdpasim_worker_queue_mm(vdpasim, &mm_work);
>> +
>
>Should we wait for the work to be finished?

Yep, I'm already waiting inside vdpasim_worker_queue_mm() by calling
kthread_flush_work(). If we switch to a mutex, I think we can avoid
releasing the lock around that call.
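
For readers less familiar with the queue-then-flush pattern, here is a minimal userspace sketch using pthreads. It is not kernel code and not part of the patch: `demo_worker`, `demo_work` and the `demo_*` helpers are hypothetical stand-ins for `kthread_worker`, `kthread_work`, `kthread_queue_work()` and `kthread_flush_work()`. The point it tries to show is that a work item living on the caller's stack is only safe if the caller blocks until the worker has finished with it.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kthread_work. */
struct demo_work {
	void (*fn)(struct demo_work *work);
	bool done;	/* set by the worker once fn() has returned */
	int ret;	/* result filled in by fn() */
};

/* Hypothetical single-slot worker; a real kthread_worker keeps a list. */
struct demo_worker {
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	struct demo_work *pending;
	bool stop;
};

static void *demo_worker_main(void *arg)
{
	struct demo_worker *w = arg;

	pthread_mutex_lock(&w->lock);
	for (;;) {
		struct demo_work *work;

		while (!w->pending && !w->stop)
			pthread_cond_wait(&w->cond, &w->lock);
		if (!w->pending)
			break;	/* stop requested and nothing left to run */
		work = w->pending;
		w->pending = NULL;
		/* Run the callback without holding the lock. */
		pthread_mutex_unlock(&w->lock);
		work->fn(work);
		pthread_mutex_lock(&w->lock);
		work->done = true;
		pthread_cond_broadcast(&w->cond);
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Contract in this sketch: flush an item before queueing another one. */
static void demo_queue_work(struct demo_worker *w, struct demo_work *work)
{
	pthread_mutex_lock(&w->lock);
	work->done = false;
	w->pending = work;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/* Block until the worker has finished running this item. */
static void demo_flush_work(struct demo_worker *w, struct demo_work *work)
{
	pthread_mutex_lock(&w->lock);
	while (!work->done)
		pthread_cond_wait(&w->cond, &w->lock);
	pthread_mutex_unlock(&w->lock);
}

static void demo_bind_fn(struct demo_work *work)
{
	work->ret = 0;	/* pretend the bind succeeded */
}

/* The work item lives on the stack, so flush before returning. */
static int demo_bind(struct demo_worker *w)
{
	struct demo_work work = { .fn = demo_bind_fn, .ret = -1 };

	demo_queue_work(w, &work);
	demo_flush_work(w, &work);
	return work.ret;
}

/* Start a worker, push one bind request through it, stop the worker. */
static int demo_run(void)
{
	struct demo_worker w = { .pending = NULL, .stop = false };
	int ret;

	pthread_mutex_init(&w.lock, NULL);
	pthread_cond_init(&w.cond, NULL);
	if (pthread_create(&w.thread, NULL, demo_worker_main, &w))
		return -1;
	ret = demo_bind(&w);
	pthread_mutex_lock(&w.lock);
	w.stop = true;
	pthread_cond_broadcast(&w.cond);
	pthread_mutex_unlock(&w.lock);
	pthread_join(w.thread, NULL);
	return ret;
}
```

Calling `demo_run()` from a test harness returns the value that `demo_bind_fn()` stored in `ret`. Note that `demo_flush_work()` sleeps on a condition variable, which is the userspace analogue of why the patch drops the spinlock around `kthread_flush_work()`: the flush may sleep, and sleeping is not allowed while holding a spinlock.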

>
>> +       if (!mm_work.ret)
>> +               vdpasim->mm_bound = new_mm;
>> +
>> +       return mm_work.ret;
>> +}
>> +
>> +static void vdpasim_worker_unbind_mm(struct vdpasim *vdpasim)
>> +{
>> +       struct vdpasim_mm_work mm_work;
>> +
>> +       if (!vdpasim->mm_bound)
>> +               return;
>> +
>> +       mm_work.mm = vdpasim->mm_bound;
>> +       mm_work.bind = false;
>> +
>> +       vdpasim_worker_queue_mm(vdpasim, &mm_work);
>> +
>> +       vdpasim->mm_bound = NULL;
>> +}
>>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>>  {
>>         return container_of(vdpa, struct vdpasim, vdpa);
>> @@ -66,8 +146,10 @@ static void vdpasim_vq_notify(struct vringh *vring)
>>  static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
>>  {
>>         struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
>> +       bool va_enabled = use_va && vdpasim->mm_bound;
>>
>> -       vringh_init_iotlb(&vq->vring, vdpasim->features, vq->num, false, false,
>> +       vringh_init_iotlb(&vq->vring, vdpasim->features, vq->num, false,
>> +                         va_enabled,
>>                           (struct vring_desc *)(uintptr_t)vq->desc_addr,
>>                           (struct vring_avail *)
>>                           (uintptr_t)vq->driver_addr,
>> @@ -96,6 +178,9 @@ static void vdpasim_do_reset(struct vdpasim *vdpasim)
>>  {
>>         int i;
>>
>> +       //TODO: should we cancel the works?
>> +       vdpasim_worker_unbind_mm(vdpasim);
>
>We probably don't need this since it's the virtio level reset so we
>need to keep the mm bound in this case. Otherwise we may break the
>guest. It should be the responsibility of the driver to call
>config_ops->unbind if it needs to do that.

Got it. My biggest concern was the case where we go from vhost-vdpa to
virtio-vdpa, but as you said, in vhost-vdpa I can call unbind before
releasing the device.

Thanks,
Stefano
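
As a compact summary of the semantics agreed on above, the following toy model in plain C may help. It is only a sketch under stated assumptions: `toy_dev`, `toy_mm` and the `toy_*` functions are hypothetical names, not the vdpa_sim code. It assumes that a virtio-level reset leaves the bound mm alone, that only an explicit unbind clears it, and that a queue uses VA only when `use_va` is set and an mm is currently bound.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct. */
struct toy_mm {
	int id;
};

/* Hypothetical, heavily reduced model of the simulator device state. */
struct toy_dev {
	bool use_va;			/* mirrors the "use_va" module parameter */
	struct toy_mm *mm_bound;	/* NULL when no mm is bound */
	bool vq_ready;
	bool vq_uses_va;		/* how the ring was last initialised */
};

static void toy_bind_mm(struct toy_dev *d, struct toy_mm *mm)
{
	d->mm_bound = mm;
}

/* Only an explicit unbind drops the mm. */
static void toy_unbind_mm(struct toy_dev *d)
{
	d->mm_bound = NULL;
}

/* Virtio-level reset: clears queue state, keeps the mm bound. */
static void toy_reset(struct toy_dev *d)
{
	d->vq_ready = false;
	d->vq_uses_va = false;
}

/* Same condition as va_enabled in the quoted vdpasim_queue_ready(). */
static void toy_queue_ready(struct toy_dev *d)
{
	d->vq_uses_va = d->use_va && d->mm_bound != NULL;
	d->vq_ready = true;
}
```

In this model a sequence of bind, queue-ready, reset, queue-ready keeps the queue on VA, while unbind followed by reset and queue-ready falls back to the non-VA path.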