Date: Thu, 11 Jun 2020 07:30:14 -0400
From: "Michael S. Tsirkin"
To: Eugenio Perez Martin
Cc: linux-kernel@vger.kernel.org, kvm list, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, Jason Wang
Subject: Re: [PATCH RFC v7 03/14] vhost: use batched get_vq_desc version
Message-ID: <20200611072702-mutt-send-email-mst@kernel.org>
References: <20200610113515.1497099-1-mst@redhat.com> <20200610113515.1497099-4-mst@redhat.com> <20200610111147-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 10, 2020 at 06:18:32PM +0200, Eugenio Perez Martin wrote:
> On Wed, Jun 10, 2020 at 5:13 PM Michael S. Tsirkin wrote:
> >
> > On Wed, Jun 10, 2020 at 02:37:50PM +0200, Eugenio Perez Martin wrote:
> > > > +/* This function returns a value > 0 if a descriptor was found, or 0 if none were found.
> > > > + * A negative code is returned on error. */
> > > > +static int fetch_descs(struct vhost_virtqueue *vq)
> > > > +{
> > > > +	int ret;
> > > > +
> > > > +	if (unlikely(vq->first_desc >= vq->ndescs)) {
> > > > +		vq->first_desc = 0;
> > > > +		vq->ndescs = 0;
> > > > +	}
> > > > +
> > > > +	if (vq->ndescs)
> > > > +		return 1;
> > > > +
> > > > +	for (ret = 1;
> > > > +	     ret > 0 && vq->ndescs <= vhost_vq_num_batch_descs(vq);
> > > > +	     ret = fetch_buf(vq))
> > > > +		;
> > >
> > > (Expanding comment in V6):
> > >
> > > We get an infinite loop this way:
> > > * vq->ndescs == 0, so we call fetch_buf() here
> > > * fetch_buf gets less than vhost_vq_num_batch_descs(vq); descriptors. ret = 1
> > > * This loop calls again fetch_buf, but vq->ndescs > 0 (and avail_vq ==
> > > last_avail_vq), so it just return 1
> >
> > That's what
> >  [PATCH RFC v7 08/14] fixup! vhost: use batched get_vq_desc version
> > is supposed to fix.
> >
>
> Sorry, I forgot to include that fixup.
>
> With it I don't see CPU stalls, but with that version latency has
> increased a lot and I see packet lost:
> + ping -c 5 10.200.0.1
> PING 10.200.0.1 (10.200.0.1) 56(84) bytes of data.
> From 10.200.0.2 icmp_seq=1 Destination Host Unreachable
> From 10.200.0.2 icmp_seq=2 Destination Host Unreachable
> From 10.200.0.2 icmp_seq=3 Destination Host Unreachable
> 64 bytes from 10.200.0.1: icmp_seq=5 ttl=64 time=6848 ms
>
> --- 10.200.0.1 ping statistics ---
> 5 packets transmitted, 1 received, +3 errors, 80% packet loss, time 76ms
> rtt min/avg/max/mdev = 6848.316/6848.316/6848.316/0.000 ms, pipe 4
> --
>
> I cannot even use netperf.

OK so that's the bug to try to find and fix I think.

> If I modify with my proposed version:
> + ping -c 5 10.200.0.1
> PING 10.200.0.1 (10.200.0.1) 56(84) bytes of data.
> 64 bytes from 10.200.0.1: icmp_seq=1 ttl=64 time=7.07 ms
> 64 bytes from 10.200.0.1: icmp_seq=2 ttl=64 time=0.358 ms
> 64 bytes from 10.200.0.1: icmp_seq=3 ttl=64 time=5.35 ms
> 64 bytes from 10.200.0.1: icmp_seq=4 ttl=64 time=2.27 ms
> 64 bytes from 10.200.0.1: icmp_seq=5 ttl=64 time=0.426 ms

Not sure which version this is.

> [root@localhost ~]# netperf -H 10.200.0.1 -p 12865 -l 10 -t TCP_STREAM
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 10.200.0.1 () port 0 AF_INET
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
>
> 131072  16384  16384    10.01    4742.36
> [root@localhost ~]# netperf -H 10.200.0.1 -p 12865 -l 10 -t UDP_STREAM
> MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 10.200.0.1 () port 0 AF_INET
> Socket  Message  Elapsed      Messages
> Size    Size     Time         Okay Errors   Throughput
> bytes   bytes    secs            #      #   10^6bits/sec
>
> 212992   65507   10.00        9214      0     482.83
> 212992           10.00        9214            482.83
>
> I will compare with the non-batch version for reference, but the
> difference between the two is noticeable. Maybe it's worth finding a
> good value for the if() inside fetch_buf?
>
> Thanks!
>

I don't think it's performance, I think it's a bug somewhere,
e.g. maybe we corrupt a packet, or stall the queue, or
something like this.

Let's do this, I will squash the fixups and post v8 so you can bisect
and then debug cleanly.

> > --
> > MST
> >
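
For anyone bisecting, the stall reported against the fetch_descs() loop earlier in this thread can be reproduced in a small userspace model. This is only a sketch under assumptions, not the vhost code: struct model_vq, model_fetch_buf(), model_fetch_descs() and the max_iters cap are hypothetical names made up for illustration, batch_limit stands in for vhost_vq_num_batch_descs(vq), and the "return 1 when nothing new is available but descriptors are already cached" behaviour is taken from the bug description, not from the actual fetch_buf() source.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the virtqueue state in the thread. */
struct model_vq {
	int ndescs;      /* descriptors already fetched into the batch */
	int batch_limit; /* stand-in for vhost_vq_num_batch_descs(vq) */
	int avail;       /* buffers made available but not yet fetched */
};

/*
 * Model of fetch_buf() as described in the report: when nothing new is
 * available it still returns 1 if descriptors are already cached.
 */
static int model_fetch_buf(struct model_vq *vq)
{
	if (vq->avail == 0)
		return vq->ndescs > 0 ? 1 : 0;

	vq->avail--;
	vq->ndescs++;
	return 1;
}

/*
 * Same loop shape as the quoted fetch_descs(), plus an iteration cap so
 * the model terminates. Returns -1 when the cap is hit, i.e. when the
 * uncapped loop would keep spinning.
 */
static int model_fetch_descs(struct model_vq *vq, int max_iters, int *iters)
{
	int ret;

	assert(max_iters > 0);
	*iters = 0;

	for (ret = 1;
	     ret > 0 && vq->ndescs <= vq->batch_limit;
	     ret = model_fetch_buf(vq)) {
		if (++*iters > max_iters)
			return -1;
	}

	return ret;
}
```

With avail = 2 and batch_limit = 4 the model fetches two descriptors and then hits the cap, because ret stays at 1 while ndescs never exceeds the limit. With avail = 8 it exits normally once ndescs passes the limit, and with an empty queue it returns 0 on the first call. Whether the 08/14 fixup changes the return value or the loop condition is not modelled here.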