Subject: Re: [PATCH net-next v2 3/5] virtio_ring: add packed ring support
To: "Michael S.
 Tsirkin", Tiwei Bie
Cc: virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
 netdev@vger.kernel.org, virtio-dev@lists.oasis-open.org, wexu@redhat.com,
 jfreimann@redhat.com
References: <20180711022711.7090-1-tiwei.bie@intel.com>
 <20180711022711.7090-4-tiwei.bie@intel.com>
 <20181107123933-mutt-send-email-mst@kernel.org>
 <20181108013759.GA20591@debian>
 <2d46a41e-bc00-276a-e19a-105c9dffc75a@redhat.com>
 <20181108115148.GA15701@debian>
 <20181108103155-mutt-send-email-mst@kernel.org>
From: Jason Wang
Message-ID: <67bd6a88-00f2-ed13-ad13-bdfe92ceeffc@redhat.com>
Date: Fri, 9 Nov 2018 10:30:50 +0800
In-Reply-To: <20181108103155-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018/11/8 11:56 PM, Michael S. Tsirkin wrote:
> On Thu, Nov 08, 2018 at 07:51:48PM +0800, Tiwei Bie wrote:
>> On Thu, Nov 08, 2018 at 04:18:25PM +0800, Jason Wang wrote:
>>> On 2018/11/8 9:38 AM, Tiwei Bie wrote:
>>>>>> +
>>>>>> +	if (vq->vq.num_free < descs_used) {
>>>>>> +		pr_debug("Can't add buf len %i - avail = %i\n",
>>>>>> +			 descs_used, vq->vq.num_free);
>>>>>> +		/* FIXME: for historical reasons, we force a notify here if
>>>>>> +		 * there are outgoing parts to the buffer.  Presumably the
>>>>>> +		 * host should service the ring ASAP. */
>>>>> I don't think we have a reason to do this for packed ring.
>>>>> No historical baggage there, right?
>>>> Based on the original commit log, it seems that the notify here
>>>> is just an "optimization".
>>>> But I don't quite understand what "the heuristics which KVM uses"
>>>> refers to. If it's safe to drop this in packed ring, I'd like to do it.
>>>
>>> According to the commit log, it seems like a workaround for the lguest
>>> networking backend.
>> Do you know why removing this notify in Tx will break "the
>> heuristics which KVM uses"? Or what does "the heuristics
>> which KVM uses" refer to?
> Yes. QEMU has a mode where it disables notifications and processes the TX
> ring periodically from a timer. It's off by default but used to be on
> by default a long time ago. If the ring becomes full this causes traffic
> stalls.

Do you mean tx-timer? If so, we can still enable it for packed ring: the
timer will eventually fire and we can make progress.

> As a work-around Rusty put in this hack to kick on ring full
> even with notifications disabled.

From the commit log it looks more like a performance workaround than a
bug fix.

> It's easy enough to make sure QEMU
> does not combine devices with packed ring support with the timer hack.
> And I am guessing it's safe enough to also block that option completely
> e.g. when virtio 1.0 is enabled.

I agree.

Thanks

>>> I agree to drop it, we should not have such a burden.
>>>
>>> But we should note that, with this removed, the comparison between packed
>>> and split is kind of unfair. Considering the recent removal of lguest
>>> support, maybe we can drop this for split ring as well?
>>>
>>> Thanks
>>>
>>>
>>>> commit 44653eae1407f79dff6f52fcf594ae84cb165ec4
>>>> Author: Rusty Russell
>>>> Date: Fri Jul 25 12:06:04 2008 -0500
>>>>
>>>> virtio: don't always force a notification when ring is full
>>>> We force notification when the ring is full, even if the host has
>>>> indicated it doesn't want to know. This seemed like a good idea at
>>>> the time: if we fill the transmit ring, we should tell the host
>>>> immediately.
>>>> Unfortunately this logic also applies to the receiving ring, which is
>>>> refilled constantly.
>>>> We should introduce real notification thresholds
>>>> to replace this logic. Meanwhile, removing the logic altogether breaks
>>>> the heuristics which KVM uses, so we use a hack: only notify if there are
>>>> outgoing parts of the new buffer.
>>>> Here are the number of exits with lguest's crappy network implementation:
>>>> Before:
>>>> network xmit 7859051 recv 236420
>>>> After:
>>>> network xmit 7858610 recv 118136
>>>> Signed-off-by: Rusty Russell
>>>>
>>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>>> index 72bf8bc09014..21d9a62767af 100644
>>>> --- a/drivers/virtio/virtio_ring.c
>>>> +++ b/drivers/virtio/virtio_ring.c
>>>> @@ -87,8 +87,11 @@ static int vring_add_buf(struct virtqueue *_vq,
>>>>  	if (vq->num_free < out + in) {
>>>>  		pr_debug("Can't add buf len %i - avail = %i\n",
>>>>  			 out + in, vq->num_free);
>>>> -		/* We notify *even if* VRING_USED_F_NO_NOTIFY is set here. */
>>>> -		vq->notify(&vq->vq);
>>>> +		/* FIXME: for historical reasons, we force a notify here if
>>>> +		 * there are outgoing parts to the buffer.  Presumably the
>>>> +		 * host should service the ring ASAP. */
>>>> +		if (out)
>>>> +			vq->notify(&vq->vq);
>>>>  		END_USE(vq);
>>>>  		return -ENOSPC;
>>>>  	}
>>>>
>>>>
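
For reference, here is a minimal userspace sketch of the ring-full path
discussed in this thread. It is only an illustration under stated
assumptions, not kernel code: toy_vq, toy_notify and toy_add_buf are
hypothetical names, and the "packed" flag simply models the proposal to
drop the forced notify for packed ring while the split ring keeps the
legacy "kick if there are outgoing parts" hack.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a virtqueue (not the kernel struct). */
struct toy_vq {
	unsigned int num_free;	/* free descriptors left in the ring */
	bool packed;		/* true: packed ring, false: split ring */
	unsigned int kicks;	/* number of host notifications sent */
};

/* Stand-in for vq->notify(): just count the kick. */
static void toy_notify(struct toy_vq *vq)
{
	vq->kicks++;
}

/*
 * Try to reserve out + in descriptors.
 * Returns 0 on success, -ENOSPC when the ring is full.
 * On a full split ring the legacy hack kicks the host only if the buffer
 * has outgoing parts; the packed ring (as proposed here) never does.
 */
static int toy_add_buf(struct toy_vq *vq, unsigned int out, unsigned int in)
{
	unsigned int descs_used = out + in;

	assert(vq != NULL);

	if (vq->num_free < descs_used) {
		if (!vq->packed && out)
			toy_notify(vq);
		return -ENOSPC;
	}

	vq->num_free -= descs_used;
	return 0;
}
```

With one free descriptor, a two-descriptor transmit request on the split
ring returns -ENOSPC and counts one kick, a receive-only request on the
same ring returns -ENOSPC without a kick, and the packed ring returns
-ENOSPC without a kick in both cases.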