Date: Tue, 02 Dec 2014 09:59:48 +0800
From: Jason Wang
Subject: Re: [PATCH RFC v4 net-next 0/5] virtio_net: enabling tx interrupts
To: "Michael S. Tsirkin"
Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, davem@davemloft.net, pagupta@redhat.com
Message-Id: <1417513908.16540.0@smtp.corp.redhat.com>
In-Reply-To: <20141202094318.GB7732@redhat.com>
References: <1417429028-11971-1-git-send-email-jasowang@redhat.com>
 <20141201104223.GB16108@redhat.com>
 <1417490120.4405.2@smtp.corp.redhat.com>
 <1417507622.12638.0@smtp.corp.redhat.com>
 <20141202094318.GB7732@redhat.com>

On Tue, Dec 2, 2014 at 5:43 PM, Michael S. Tsirkin wrote:
> On Tue, Dec 02, 2014 at 08:15:02AM +0800, Jason Wang wrote:
>> On Tue, Dec 2, 2014 at 11:15 AM, Jason Wang wrote:
>>> On Mon, Dec 1, 2014 at 6:42 PM, Michael S. Tsirkin wrote:
>>>> On Mon, Dec 01, 2014 at 06:17:03PM +0800, Jason Wang wrote:
>>>>> Hello:
>>>>>
>>>>> We used to orphan packets before transmission for virtio-net.
>>>>> This breaks socket accounting and causes several functions to
>>>>> stop working, e.g.:
>>>>>
>>>>> - Byte Queue Limits depends on tx completion notification to
>>>>>   work.
>>>>> - Packet Generator depends on tx completion notification for
>>>>>   the last transmitted packet to complete.
>>>>> - TCP Small Queues depends on proper accounting of
>>>>>   sk_wmem_alloc to work.
>>>>>
>>>>> This series tries to solve the issue by enabling tx interrupts.
>>>>> To minimize the performance impact of this, several
>>>>> optimizations were used:
>>>>>
>>>>> - On the guest side, virtqueue_enable_cb_delayed() was used to
>>>>>   delay the tx interrupt until 3/4 of the pending packets were
>>>>>   sent.
>>>>> - On the host side, interrupt coalescing was used to reduce tx
>>>>>   interrupts.
>>>>>
>>>>> Performance test results[1] (tx-frames 16 tx-usecs 16) show:
>>>>>
>>>>> - For guest receiving: no obvious regression on throughput was
>>>>>   noticed. More cpu utilization was noticed in a few cases.
>>>>> - For guest transmission: a very large improvement in
>>>>>   throughput for small packet transmission was noticed. This is
>>>>>   expected, since TSQ and other optimizations for small packet
>>>>>   transmission work after tx interrupts are enabled, but it
>>>>>   will use more cpu for large packets.
>>>>> - For TCP_RR, a regression (10% on transaction rate and cpu
>>>>>   utilization) was found. Tx interrupts don't help but only add
>>>>>   overhead in this case. Using more aggressive coalescing
>>>>>   parameters may help to reduce the regression.
>>>>
>>>> OK, you do have posted coalescing patches - does it help any?
>>>
>>> Helps a lot.
>>>
>>> For RX, it saves about 5% - 10% cpu (reduces tx interrupts by
>>> 60%-90%).
>>> For small packet TX, it increases throughput by 33% - 245%
>>> (reduces interrupts by about 60%).
>>> For TCP_RR, it increases the transaction rate by 3%-10% (reduces
>>> tx interrupts by 40%-80%).
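As a rough illustration of the guest-side scheme described in the
cover letter above (reconstructed for this discussion, not taken from
the posted patches: the simplified signature of virtnet_xmit_sketch()
and the xmit_skb() placeholder are assumptions; the virtqueue calls
are the real virtio core API):

#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/virtio.h>

/* Simplified per-queue state; the real driver's send_queue is larger. */
struct send_queue {
	struct virtqueue *vq;
};

/* Reclaim completed tx buffers.  Freeing the skbs here, instead of
 * orphaning them at xmit time, is what keeps the socket accounting
 * that BQL, pktgen and TSQ rely on correct. */
static void free_old_xmit_skbs(struct send_queue *sq)
{
	struct sk_buff *skb;
	unsigned int len;

	while ((skb = virtqueue_get_buf(sq->vq, &len)) != NULL)
		dev_consume_skb_any(skb);
}

/* Hypothetical, simplified tx path (not the real ndo_start_xmit
 * signature). */
static netdev_tx_t virtnet_xmit_sketch(struct send_queue *sq,
				       struct netdev_queue *txq,
				       struct sk_buff *skb, bool kick)
{
	free_old_xmit_skbs(sq);

	/* xmit_skb(sq, skb) would add the skb to the virtqueue here;
	 * descriptor setup and error handling are omitted. */

	/* Re-arm the tx callback in delayed mode: the device is asked
	 * to interrupt only after about 3/4 of the pending buffers
	 * have been used, batching several completions into one
	 * interrupt.  A false return means buffers were consumed
	 * while re-arming, so reclaim them immediately. */
	if (!virtqueue_enable_cb_delayed(sq->vq))
		free_old_xmit_skbs(sq);

	/* The kick policy whose condition the experiment below
	 * removes. */
	if (kick || netif_xmit_stopped(txq))
		virtqueue_kick(sq->vq);

	return NETDEV_TX_OK;
}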
>>>> I'm not sure the regression is due to interrupts.
>>>> It would make sense for CPU, but why would it hurt the
>>>> transaction rate?
>>>
>>> Anyway, the guest needs to take some cycles to handle tx
>>> interrupts, and the transaction rate does increase if we coalesce
>>> more tx interrupts.
>>>
>>>> It's possible that we are deferring kicks too much due to BQL.
>>>>
>>>> As an experiment: do we get any of it back if we do
>>>> -	if (kick || netif_xmit_stopped(txq))
>>>> -		virtqueue_kick(sq->vq);
>>>> +	virtqueue_kick(sq->vq);
>>>> ?
>>>
>>> I will try, but during TCP_RR at most 1 packet was pending,
>>> so I doubt BQL can help in this case.
>>
>> Looks like this helps a lot in multiple sessions of TCP_RR.
>
> So what's faster:
>   BQL + kick each packet
>   no BQL
> ?

Quick and manual tests (TCP_RR 64, TCP_STREAM 512) do not show obvious
differences. A complete benchmark may be needed to tell.

>> How about moving the BQL patch out of this series?
>> Let's first converge on tx interrupts and then introduce it
>> (e.g. with kicking after queuing X bytes)?
>
> Sounds good.
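For concreteness, one way the "kicking after queuing X bytes" idea
floated above could look. Everything here is hypothetical: the
VIRTNET_KICK_BYTES threshold, the bytes_since_kick counter and the
virtnet_maybe_kick() helper are invented for illustration and were not
part of the posted series.

#include <linux/skbuff.h>
#include <linux/virtio.h>

#define VIRTNET_KICK_BYTES	(64 * 1024)	/* made-up threshold */

struct send_queue {
	struct virtqueue *vq;
	unsigned int bytes_since_kick;		/* hypothetical counter */
};

/* Called after an skb has been queued on sq->vq: kick the host once
 * enough bytes have accumulated, or when the stack marks the end of
 * a burst (skb->xmit_more clear), instead of kicking per packet. */
static void virtnet_maybe_kick(struct send_queue *sq, struct sk_buff *skb)
{
	sq->bytes_since_kick += skb->len;

	if (sq->bytes_since_kick >= VIRTNET_KICK_BYTES || !skb->xmit_more) {
		virtqueue_kick(sq->vq);
		sq->bytes_since_kick = 0;
	}
}

Compared with BQL, a scheme like this would keep kicks batched using
only a local byte count, without depending on completion accounting.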