Date: Thu, 20 Apr 2023 16:16:08 +0200
From: Maxime Coquelin
To: Jason Wang
Cc: xieyongji@bytedance.com, mst@redhat.com, david.marchand@redhat.com,
    linux-kernel@vger.kernel.org,
    virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
    xuanzhuo@linux.alibaba.com, eperezma@redhat.com, Peter Xu
References: <20230419134329.346825-1-maxime.coquelin@redhat.com>
Subject: Re: [RFC 0/2] vduse: add support for networking devices

On 4/20/23 06:34, Jason Wang wrote:
> On Wed, Apr 19, 2023 at 9:43 PM Maxime Coquelin
> wrote:
>>
>> This small series enables virtio-net device type in VDUSE.
>> With it, basic operations have been tested, both with
>> virtio-vdpa and vhost-vdpa, using the DPDK Vhost library series
>> adding VDUSE support [0] with the split ring layout.
>>
>> Control queue support (and so multiqueue) has also been
>> tested, but requires a kernel series from Jason Wang
>> relaxing control queue polling [1] to function reliably.
>>
>> Other than that, we have identified a few gaps:
>>
>> 1. Reconnection:
>>    a. VDUSE_VQ_GET_INFO ioctl() always returns 0 for the avail
>>       index, even after the virtqueue has already been
>>       processed. Is that expected? I have tried instead to
>>       get the driver's avail index directly from the avail
>>       ring, but it does not seem reliable as I sometimes get
>>       "id %u is not a head!\n" warnings. Also, such a solution
>>       would not be possible with packed ring, as we need to
>>       know the wrap counter values.
>
> Looking at the code, it only returns the value that is set via
> set_vq_state(). I think it is expected to be called before the
> datapath runs.
>
> So when bound to virtio-vdpa, it is expected to return 0. But we need
> to fix the packed virtqueue case; I wonder if we need to call
> set_vq_state() explicitly in virtio-vdpa before starting the device.
>
> When bound to vhost-vdpa, Qemu will call VHOST_SET_VRING_BASE, which
> will end up calling set_vq_state(). Unfortunately, it doesn't
> support packed ring, which needs some extension.
>
>>
>>    b. Missing IOCTLs: it would be handy to have new IOCTLs to
>>       query Virtio device status,
>
> What's the use case of this ioctl?
> It looks to me like userspace is
> notified on each status change now:
>
> static int vduse_dev_set_status(struct vduse_dev *dev, u8 status)
> {
>         struct vduse_dev_msg msg = { 0 };
>
>         msg.req.type = VDUSE_SET_STATUS;
>         msg.req.s.status = status;
>
>         return vduse_dev_msg_sync(dev, &msg);
> }

The idea was to be able to query the status at reconnect time, without
having to either assume its value or store it in a file (the status
could change while the VDUSE application is stopped, though maybe the
application would receive the notification at reconnect).

I will prototype using a tmpfs file to save the needed information, and
see if it works.

>>       and retrieve the config
>>       space set at VDUSE_CREATE_DEV time.
>
> In order to be safe, VDUSE avoids writable config space. Otherwise
> drivers could block on config writing forever. That's why we don't do
> it now.

The idea was not to make the config space writable, but just to be able
to fetch what was filled in at VDUSE_CREATE_DEV time.

With the tmpfs file, we can avoid doing that and just save the config
space there.

> We need to harden the config write before we can proceed to this I think.
>
>>
>> 2. VDUSE application as non-root:
>>    We need to run the VDUSE application as non-root. There
>>    is some race between the time the UDEV rule is applied
>>    and the time the device starts being used. Discussing
>>    with Jason, he suggested we may have a VDUSE daemon run
>>    as root that would create the VDUSE device, manage its
>>    rights and then pass its file descriptor to the VDUSE
>>    app. However, with current IOCTLs, it means the VDUSE
>>    daemon would need to know several pieces of information that
>>    belong to the VDUSE app implementing the device, such
>>    as supported Virtio features, config space, etc...
>>    If we go that route, maybe we should have a control
>>    IOCTL to create the device which would just pass the
>>    device type. Then another device IOCTL to perform the
>>    initialization. Would that make sense?
>
> I think so.
> We can hear from others.
>
>>
>> 3. Coredump:
>>    In order to be able to perform post-mortem analysis, the DPDK
>>    Vhost library marks pages used for vrings and descriptor
>>    buffers as MADV_DODUMP using madvise(). However, with
>>    VDUSE it fails with -EINVAL. My understanding is that we
>>    set the VM_DONTEXPAND flag on the VMAs and madvise's
>>    MADV_DODUMP fails if it is present. I'm not sure I
>>    understand why madvise would prevent MADV_DODUMP if
>>    VM_DONTEXPAND is set. Any thoughts?
>
> Adding Peter who may know the answer.

Thanks!
Maxime

> Thanks
>
>>
>> [0]: https://patchwork.dpdk.org/project/dpdk/list/?series=27594&state=%2A&archive=both
>> [1]: https://lore.kernel.org/lkml/CACGkMEtgrxN3PPwsDo4oOsnsSLJfEmBEZ0WvjGRr3whU+QasUg@mail.gmail.com/T/
>>
>> Maxime Coquelin (2):
>>   vduse: validate block features only with block devices
>>   vduse: enable Virtio-net device type
>>
>>  drivers/vdpa/vdpa_user/vduse_dev.c | 11 +++++++----
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> --
>> 2.39.2
>>
>
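
The tmpfs approach mentioned in the replies above (saving the device
status and the config space filled in at VDUSE_CREATE_DEV time so both
can be recovered at reconnect) could look roughly like the sketch
below. The struct layout, the magic value, the 256-byte cap and the
function names are all assumptions made for illustration; none of them
is part of the VDUSE UAPI or of the DPDK series.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical file layout, NOT part of the VDUSE UAPI. */
#define STATE_MAGIC      0x56445553u /* arbitrary tag, "VDUS" */
#define STATE_CONFIG_MAX 256         /* arbitrary cap for this sketch */

struct vduse_saved_state {
	uint32_t magic;
	uint8_t  status;      /* last status received via VDUSE_SET_STATUS */
	uint32_t config_size; /* number of valid bytes in config[] */
	uint8_t  config[STATE_CONFIG_MAX];
};

/* Write the status and config space to 'path'. Returns 0 or -1. */
static int state_save(const char *path, uint8_t status,
		      const void *config, uint32_t config_size)
{
	struct vduse_saved_state st;
	FILE *f;
	int ok;

	if (config_size > STATE_CONFIG_MAX)
		return -1;

	/* Zero everything first so padding bytes are deterministic. */
	memset(&st, 0, sizeof(st));
	st.magic = STATE_MAGIC;
	st.status = status;
	st.config_size = config_size;
	memcpy(st.config, config, config_size);

	f = fopen(path, "wb");
	if (!f)
		return -1;
	ok = fwrite(&st, sizeof(st), 1, f) == 1;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read back a state written by state_save(). Returns 0 or -1. */
static int state_load(const char *path, struct vduse_saved_state *st)
{
	FILE *f = fopen(path, "rb");
	int ok;

	if (!f)
		return -1;
	ok = fread(st, sizeof(*st), 1, f) == 1;
	fclose(f);

	/* Reject short, foreign or corrupted files. */
	if (!ok || st->magic != STATE_MAGIC ||
	    st->config_size > STATE_CONFIG_MAX)
		return -1;
	return 0;
}
```

A VDUSE application would presumably call state_save() after creating
the device and again from its VDUSE_SET_STATUS handler, then
state_load() at reconnect. A status change that happens while the
application is stopped would still be missed by this scheme.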
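
For the coredump point (3), a small helper such as the one below can
be used to check whether a given mapping accepts MADV_DODUMP. This is
only a sketch: the helper name is made up, and it does not reproduce
the VDUSE case by itself. On an ordinary anonymous mapping the call is
expected to succeed, whereas the report above is that it fails with
EINVAL on the VDUSE-backed mappings.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MADV_DODUMP / MAP_ANONYMOUS on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: ask the kernel to include [addr, addr + len)
 * in core dumps. Returns 0 on success, or the errno value reported
 * by madvise() (EINVAL in the case described in this thread).
 */
static int try_mark_dumpable(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_DODUMP) == 0)
		return 0;
	return errno;
}
```

Calling it on the vring and buffer mappings and logging a non-zero
return makes the failure visible without aborting the application.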