From: Eugenio Perez Martin
Date: Wed, 1 Jun 2022 12:48:27 +0200
Subject: Re: [PATCH v4 0/4] Implement vdpasim stop operation
To: Parav Pandit
Cc: Jason Wang, "Michael S. Tsirkin", "kvm@vger.kernel.org",
 "virtualization@lists.linux-foundation.org",
 "linux-kernel@vger.kernel.org", "netdev@vger.kernel.org",
 "martinh@xilinx.com", Stefano Garzarella, "martinpo@xilinx.com",
 "lvivier@redhat.com", "pabloc@xilinx.com", Eli Cohen, Dan Carpenter,
 Xie Yongji, Christophe JAILLET, Zhang Min, Wu Zongyong,
 "lulu@redhat.com", Zhu Lingshan, "Piotr.Uminski@intel.com", Si-Wei Liu,
 "ecree.xilinx@gmail.com", "gautam.dawar@amd.com",
 "habetsm.xilinx@gmail.com", "tanuj.kamde@amd.com", "hanand@xilinx.com",
 "dinang@xilinx.com", Longpeng
References: <20220526124338.36247-1-eperezma@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 31, 2022 at 10:26 PM Parav Pandit wrote:
>
> > From: Eugenio Perez Martin
> > Sent: Friday, May 27, 2022 3:55 AM
> >
> > On Fri, May 27, 2022 at 4:26 AM Jason Wang wrote:
> > >
> > > On Thu, May 26, 2022 at 8:54 PM Parav Pandit wrote:
> > > >
> > > > > From: Eugenio Pérez
> > > > > Sent: Thursday, May 26, 2022 8:44 AM
> > > > >
> > > > > Implement stop operation for vdpa_sim devices, so vhost-vdpa will
> > > > > offer that backend feature and userspace can effectively stop the
> > > > > device.
> > > > >
> > > > > This is a must before get virtqueue indexes (base) for live
> > > > > migration, since the device could modify them after userland gets
> > > > > them. There are individual ways to perform that action for some
> > > > > devices (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...)
> > > > > but there was no way to perform it for any vhost device (and, in
> > > > > particular, vhost-vdpa).
> > > > >
> > > > > After the return of ioctl with stop != 0, the device MUST finish
> > > > > any pending operations like in flight requests. It must also
> > > > > preserve all the necessary state (the virtqueue vring base plus the
> > > > > possible device specific states) that is required for restoring in
> > > > > the future. The device must not change its configuration after that
> > > > > point.
> > > > >
> > > > > After the return of ioctl with stop == 0, the device can continue
> > > > > processing buffers as long as typical conditions are met (vq is
> > > > > enabled, DRIVER_OK status bit is enabled, etc).
> > > >
> > > > Just to be clear, we are adding vdpa level new ioctl() that doesn’t map to
> > > > any mechanism in the virtio spec.
> > >
> > > We try to provide forward compatibility to VIRTIO_CONFIG_S_STOP. That
> > > means it is expected to implement at least a subset of
> > > VIRTIO_CONFIG_S_STOP.
> > >
> >
> > Appending a link to the proposal, just for reference [1].
> >
> > > >
> > > > Why can't we use this ioctl() to indicate driver to start/stop the device
> > > > instead of driving it through the driver_ok?
> > >
> >
> > Parav, I'm not sure I follow you here.
> >
> > By the proposal, the resume of the device is (From qemu POV):
> > 1. To configure all data vqs and cvq (addr, num, ...)
> > 2. To enable only CVQ, not data vqs
> > 3. To send DRIVER_OK
> > 4. Wait for all buffers of CVQ to be used
> > 5. To enable all others data vqs (individual ioctl at the moment)
> >
> > Where can we fit the resume (as "stop(false)") here? If the device is stopped
> > (as if we send stop(true) before DRIVER_OK), we don't read CVQ first.
> > If we send it right after (or instead) DRIVER_OK, data buffers can reach
> > data vqs before configuring RSS.
> >
> It doesn’t make sense with currently proposed way of using cvq to replay the config.

The stop/resume part is not intended to restore the config through the
CVQ. The stop call is issued to be able to retrieve the vq status
(base, in vhost terminology). The symmetric operation (resume) was
added on demand; it was never intended to be part of either the config
restore or the virtqueue state restore workflow.

The configuration restore workflow was modelled after the device
initialization, so that each part needed to add as little as possible
and only qemu needed to be changed. From the device POV, there is no
need to learn new tricks for this. The support of .set_vq_ready and
.get_vq_ready is already in the kernel in every vdpa backend driver.

> Need to continue with currently proposed temporary method that subsequently to be replaced with optimized flow as we discussed.

Back then, you noted that enabling each data vq individually after
DRIVER_OK is slow on mlx5 devices. The solution was to batch these
enable calls in the kernel, with no growth in the vdpa uAPI layer. The
proposed solution did not involve the resume operation.

After that, you proposed in this thread "Why can't we use this ioctl()
to indicate driver to start/stop the device instead of driving it
through the driver_ok?". As I understand it, that is a mistake, since
it requires the device, the vdpa layer, etc. to learn new tricks. It
also requires qemu to duplicate the initialization layer (it's now
common for start and restore config).

But I might not have seen the whole picture, and may be missing
advantages of using the resume call for this workflow. Can you describe
the workflow you have in mind? How does that new workflow affect this
proposal?

I'm ok to change the proposal as long as we find we obtain a net gain.

Thanks!
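
To make the ordering argument above easier to check, here is a minimal
toy model in C. It is only a sketch, not kernel or qemu code: every name
in it (sim_dev, sim_poll, sim_stop, sim_get_base, sim_restore) is
hypothetical, and it assumes the device processes buffers synchronously,
so nothing is ever in flight when a stop request arrives. It models the
five restore steps quoted above (CVQ enabled first, data vqs enabled only
after the CVQ buffers are used) and the stop semantics (the vq base does
not move while the device is stopped).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: two data vqs plus one control vq. */
enum { SIM_DATA_VQ0, SIM_DATA_VQ1, SIM_CVQ, SIM_NUM_VQS };

struct sim_vq {
    bool ready;      /* what a set_vq_ready-style call would toggle */
    uint16_t base;   /* next index the device will process */
    uint16_t avail;  /* buffers exposed by the driver, not yet used */
};

struct sim_dev {
    bool driver_ok;
    bool stopped;
    struct sim_vq vq[SIM_NUM_VQS];
};

/*
 * Use every available buffer on each vq that is allowed to run: the
 * device must not be stopped, DRIVER_OK must be set and the vq enabled.
 * Returns the number of buffers used.
 */
static unsigned sim_poll(struct sim_dev *d)
{
    unsigned used = 0;

    for (size_t i = 0; i < SIM_NUM_VQS; i++) {
        struct sim_vq *v = &d->vq[i];

        if (d->stopped || !d->driver_ok || !v->ready)
            continue;
        used += v->avail;
        v->base = (uint16_t)(v->base + v->avail);
        v->avail = 0;
    }
    return used;
}

/* stop != 0 freezes the device; stop == 0 lets it run again. */
static void sim_stop(struct sim_dev *d, bool stop)
{
    d->stopped = stop;
}

/* Read the vq base; meant to be called once the device is stopped. */
static uint16_t sim_get_base(const struct sim_dev *d, size_t i)
{
    assert(i < SIM_NUM_VQS);
    return d->vq[i].base;
}

/*
 * Restore following the five steps from the thread. Returns 0 on
 * success, -1 if the CVQ buffers were not used (for example because
 * the device was stopped before DRIVER_OK).
 */
static int sim_restore(struct sim_dev *d, const uint16_t base[SIM_NUM_VQS])
{
    /* 1. Configure all vqs (only the base is modelled here). */
    for (size_t i = 0; i < SIM_NUM_VQS; i++) {
        d->vq[i].base = base[i];
        d->vq[i].ready = false;
    }
    /* 2. Enable only the CVQ. */
    d->vq[SIM_CVQ].ready = true;
    /* 3. Send DRIVER_OK. */
    d->driver_ok = true;
    /* 4. Wait for all CVQ buffers to be used. */
    sim_poll(d);
    if (d->vq[SIM_CVQ].avail != 0)
        return -1;
    /* 5. Enable the data vqs. */
    d->vq[SIM_DATA_VQ0].ready = true;
    d->vq[SIM_DATA_VQ1].ready = true;
    return 0;
}
```

In this model, a device that is stopped before DRIVER_OK never reads the
CVQ, so sim_restore() fails at step 4. That is the case described above
with "we don't read CVQ first".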