Date: Wed, 16 Aug 2017 06:59:14 +0300
From: "Michael S. Tsirkin"
To: Jason Wang
Cc: Eric Dumazet, davem@davemloft.net, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, kubakici@wp.pl
Subject: Re: [PATCH net-next V2 1/3] tap: use build_skb() for small packet
Message-ID: <20170816065837-mutt-send-email-mst@kernel.org>
In-Reply-To: <5280f66f-85cf-fa4f-1a1c-7acbac2c9ab7@redhat.com>

On Wed, Aug 16, 2017 at 11:57:51AM +0800, Jason Wang wrote:
>
>
> On 2017-08-16 11:55, Michael S. Tsirkin wrote:
> > On Tue, Aug 15, 2017 at 08:45:20PM -0700, Eric Dumazet wrote:
> > > On Fri, 2017-08-11 at 19:41 +0800, Jason Wang wrote:
> > > > We use tun_alloc_skb() which calls sock_alloc_send_pskb() to allocate
> > > > skb in the past. This socket based method is not suitable for high
> > > > speed userspace like virtualization which usually:
> > > >
> > > > - ignore sk_sndbuf (INT_MAX) and expect to receive the packet as fast as
> > > >   possible
> > > > - don't want to be block at sendmsg()
> > > >
> > > > To eliminate the above overheads, this patch tries to use build_skb()
> > > > for small packet. We will do this only when the following conditions
> > > > are all met:
> > > >
> > > > - TAP instead of TUN
> > > > - sk_sndbuf is INT_MAX
> > > > - caller don't want to be blocked
> > > > - zerocopy is not used
> > > > - packet size is smaller enough to use build_skb()
> > > >
> > > > Pktgen from guest to host shows ~11% improvement for rx pps of tap:
> > > >
> > > > Before: ~1.70Mpps
> > > > After : ~1.88Mpps
> > > >
> > > > What's more important, this makes it possible to implement XDP for tap
> > > > before creating skbs.
> > > Well well well.
> > >
> > > You do realize that tun_build_skb() is not thread safe ?
> > The issue is alloc frag, isn't it?
> > I guess for now we can limit this to XDP mode only, and
> > just allocate full pages in that mode.
> >
> >
>
> Limit this to XDP mode only does not prevent user from sending packets to
> same queue in parallel I think?
>
> Thanks

Yes, but then you can just drop the page frag allocator since XDP is
assumed not to care about truesize for most packets.

-- 
MST
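
To make the trade-off discussed above concrete, here is a minimal userspace
sketch in C. It is not the tun driver code: struct frag_cursor, frag_alloc(),
full_page_alloc() and MODEL_PAGE_SIZE are hypothetical names invented for
illustration. It only models why a fragment cursor shared by parallel senders
needs synchronization, and why handing out one full page per packet avoids
that shared state at the cost of memory per packet.

```c
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size for this model */

/* Hypothetical stand-in for a per-queue page fragment cursor. */
struct frag_cursor {
    unsigned char *page;   /* current backing page, NULL if none yet */
    size_t offset;         /* next free byte within that page */
};

/*
 * Carve len bytes out of the shared page. The read of c->offset and the
 * write back below are separate steps, so two unsynchronized callers on
 * the same cursor could be handed overlapping regions.
 */
static void *frag_alloc(struct frag_cursor *c, size_t len)
{
    void *buf;

    if (len > MODEL_PAGE_SIZE)
        return NULL;
    if (c->page == NULL || c->offset + len > MODEL_PAGE_SIZE) {
        /* The old page is not freed here: earlier fragments may still
         * be in use. This model does no reference counting. */
        c->page = aligned_alloc(MODEL_PAGE_SIZE, MODEL_PAGE_SIZE);
        if (c->page == NULL)
            return NULL;
        c->offset = 0;
    }
    buf = c->page + c->offset;
    c->offset += len;
    return buf;
}

/*
 * Full-page alternative: no shared cursor, one page per packet. Callers
 * do not race on an offset, but a small packet still occupies a whole
 * page.
 */
static void *full_page_alloc(size_t len)
{
    if (len > MODEL_PAGE_SIZE)
        return NULL;
    return aligned_alloc(MODEL_PAGE_SIZE, MODEL_PAGE_SIZE);
}
```

In this model, two sequential frag_alloc() calls on one cursor return
back-to-back regions of the same page, while each full_page_alloc() call
returns its own page-aligned buffer. The second form trades memory for the
absence of shared allocator state, which is the trade-off behind the
truesize remark.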