Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753422AbYL3VgW (ORCPT ); Tue, 30 Dec 2008 16:36:22 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752099AbYL3VgF (ORCPT ); Tue, 30 Dec 2008 16:36:05 -0500
Received: from cs-studio.ru ([195.178.208.66]:48854 "EHLO tservice.net.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751969AbYL3VgC (ORCPT ); Tue, 30 Dec 2008 16:36:02 -0500
Date: Wed, 31 Dec 2008 00:35:59 +0300
From: Evgeniy Polyakov
To: Vladislav Bolkhovitin
Cc: Herbert Xu , Jeremy Fitzhardinge , linux-scsi@vger.kernel.org, James Bottomley , Andrew Morton , FUJITA Tomonori , Mike Christie , Jeff Garzik , Boaz Harrosh , Linus Torvalds , linux-kernel@vger.kernel.org, scst-devel@lists.sourceforge.net, Bart Van Assche , "Nicholas A. Bellinger" , netdev@vger.kernel.org, Rusty Russell , David Miller , Alexey Kuznetsov
Subject: Re: [PATCH][RFC 23/23]: Support for zero-copy TCP transmit of user space data
Message-ID: <20081230213559.GD20238@ioremap.net>
References: <494CA226.9000200@goop.org> <20081220081045.GA17439@gondor.apana.org.au> <20081220103209.GA23632@ioremap.net> <49513909.1050100@vlnb.net> <20081223213817.GB16883@ioremap.net> <4952493F.10508@vlnb.net> <20081224144422.GA25089@ioremap.net> <49527590.7090909@vlnb.net> <20081224180841.GA615@ioremap.net> <495A5C3C.8090006@vlnb.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <495A5C3C.8090006@vlnb.net>
User-Agent: Mutt/1.5.13 (2006-08-11)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2688
Lines: 54

Hi Vlad.

On Tue, Dec 30, 2008 at 08:37:00PM +0300, Vladislav Bolkhovitin (vst@vlnb.net) wrote:
> Although I agree that any additional allocation is something, which
> should be avoided, *if possible*.
> But you shouldn't overestimate the
> overhead of the sk_transaction_token allocation in cases, when it would
> be needed. At first, sk_transaction_token is quite small, so a single
> page in the kmem cache would keep about 100 of them, hence the slow
> allocation path would be called only once per 100 objects. Second, in
> many cases ->sendpages() needs to allocate a new skb, so already there
> is at least one such allocations on the fast path.

Once per 100 objects? With millions of packets per second in extreme
cases this does not scale. Even the more common case of thousands of
ordinary packets per second with a 1.5k MTU will show it (especially
the freeing, actually). Any additional overhead has to be avoided if
possible, even if it looks innocent. The BSD guys already learned this
lesson with packet processing tags at every layer.

> Actually, it doesn't look like the skb shared info destructor alone
> can't solve the task we are solving, because we need to know not when an
> skb transmittion finished, but when transmittion of our *set of pages*
> finished. Hence, with skb shared info destructor we would need also to
> invent some way to track set of pages <-> set of skbs translation (you
> refer it as combining tag and separate destructor), which would bring
> this solution on the entire new complexity level for no gain over the
> sk_transaction_token solution.

You really do not need to know when transmission is over, but when the
remote side acks it (or the connection is reset by the timeout). There
is no way to know when transmission is over without creating your own
skbs and submitting them, bypassing the usual TCP/IP stack machinery.
You do not need to know which skbs contain which pages; the system
only has to track the page pointers freed at skb destruction (shared
info destruction, actually) time, no matter who owns those pages (since
new pages can be added into the page and some of the old ones can be
freed early).
This will be effectively the same token, but it does not mean that
everyone who needs notification will have to perform an additional
allocation. Put in two pointers, a destructor and a token, and do
whatever you like if one of them is non-empty, but try to avoid
unneeded overhead whenever possible.

--
	Evgeniy Polyakov
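
The two-pointer arrangement suggested above can be sketched as a small
userspace model. This is only an illustration under stated assumptions,
not kernel code: struct shared_info stands in for the skb shared info,
and every name below (tx_token, tx_page_released, tx_attach_page,
shared_info_release) is hypothetical. The token is owned by the caller,
so nothing is allocated per packet, and ordinary traffic pays only for
one pointer test when the shared info is destroyed.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; none of these names are real kernel symbols. */

struct page { int id; };		/* stand-in for struct page */

#define MAX_FRAGS 4

/* Hypothetical per-request state, owned by whoever wants notification. */
struct tx_token {
	int pending;	/* page references not yet released by the stack */
	int completed;	/* set once the last reference has been released */
};

/* Model of the data shared between all clones of one skb. */
struct shared_info {
	struct page *frags[MAX_FRAGS];
	size_t nr_frags;
	/* The two optional pointers; both stay NULL for ordinary traffic. */
	void (*destructor)(struct page *page, void *token);
	void *token;
};

/* Example destructor: one page reference of the request was released. */
static void tx_page_released(struct page *page, void *token)
{
	struct tx_token *tok = token;

	(void)page;	/* a real user could record which page came back */
	assert(tok->pending > 0);
	if (--tok->pending == 0)
		tok->completed = 1;
}

/* Attach a page to the shared info and account for it in the token. */
static int tx_attach_page(struct shared_info *shi, struct tx_token *tok,
			  struct page *page)
{
	if (shi->nr_frags == MAX_FRAGS)
		return -1;
	shi->frags[shi->nr_frags++] = page;
	shi->destructor = tx_page_released;
	shi->token = tok;
	tok->pending++;
	return 0;
}

/*
 * Models shared info destruction, which in the real stack happens after
 * the data was acked or the connection was reset.
 */
static void shared_info_release(struct shared_info *shi)
{
	size_t i;

	/* Only users that asked for notification take this branch. */
	if (shi->destructor) {
		for (i = 0; i < shi->nr_frags; i++)
			shi->destructor(shi->frags[i], shi->token);
	}
	shi->nr_frags = 0;
}
```

In this model the pages of one request may be spread over several
shared info instances that are released at different times. The token
counts page references rather than skbs, so no mapping between pages
and skbs has to be kept, and the request is complete once the counter
reaches zero.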