Message-ID: <49513864.7030306@vlnb.net>
Date: Tue, 23 Dec 2008 22:13:40 +0300
From: Vladislav Bolkhovitin
To: Jeremy Fitzhardinge
CC: linux-scsi@vger.kernel.org, James Bottomley, Andrew Morton, FUJITA Tomonori, Mike Christie, Jeff Garzik, Boaz Harrosh, Linus Torvalds, linux-kernel@vger.kernel.org, scst-devel@lists.sourceforge.net, Bart Van Assche, "Nicholas A. Bellinger", netdev@vger.kernel.org, Rusty Russell, Herbert Xu
Subject: Re: [PATCH][RFC 23/23]: Support for zero-copy TCP transmit of user space data
References: <494009D7.4020602@vlnb.net> <494012C4.7090304@vlnb.net> <494C0255.8010208@goop.org>
In-Reply-To: <494C0255.8010208@goop.org>

Jeremy Fitzhardinge, on 12/19/2008 11:21 PM wrote:

[...]

> As with your case, we can simply copy the page data if this mechanism
> isn't available. But it would be nice if it were.
>
>> 1. Add net_priv analog in struct sk_buff, not in struct page. But then
>> it would be required that all the pages in each skb must be from the
>> same originator, i.e. with the same net_priv. It is unpractical to
>> change all the operations with skb's to forbid merging them, if they
>> have different net_priv. I tried, but quickly gave up. There are too
>> many such places in very not obvious code pieces.
>>
>
> I think Rusty has a patch to put some kind of put notifier in struct
> skb_shared_info, but I'm not sure of the details.
>
>> 2. Have in iSCSI-SCST a hashed list to translate page to iSCSI cmd by a
>> simple search function. This approach was rejected, because to copy a
>> page a modern CPU needs using MMX about 1500 ticks.
>
> Is that the cold cache timing?

It should be the L2-cache-hot timing, which is almost always the case
if FILEIO is used, because the data are just copied from the page cache.
Although, frankly, at the moment I can't find where I got that number
from.

>> It was observed,
>> that each page can be referenced by TCP during transmit about 20 times
>> or even more. So, if each search needs, say, 20 ticks, the overall
>> search time will be 20*20*2 (to get() and put()) = 800 ticks. So, this
>> approach would considerably worse performance-wise to the chosen
>> approach and provide not too much benefit.
>
> Wouldn't you only need to do the lookup on the last put?

No, because you can't tell which put is the last one. E.g., a page can
be mmapped into another process while it's being transmitted. So the
only possible way is to track all gets and puts done by networking
using some external reference counting (net_ref_cnt in the case of
iscsi-scst).

Vlad
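
To illustrate the external reference counting described in the reply above, here is a minimal user-space sketch in C. It is not the iscsi-scst code: the struct names, the `tx_done` flag, and all function names are hypothetical, and only `net_priv` and `net_ref_cnt` come from the thread. The counter is a plain int for brevity; real kernel code would presumably need an atomic counter. The last function only reproduces the 20*20*2 = 800 ticks estimate quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a command that owns pages being transmitted. */
struct cmd_sketch {
    int net_ref_cnt; /* references currently held by the networking path */
    int tx_done;     /* set once networking has dropped its last reference */
};

/* Hypothetical stand-in for a page carrying a back-pointer to its owner. */
struct page_sketch {
    struct cmd_sketch *net_priv; /* NULL if the page is not tracked */
};

/* Called for every get done by networking on a tracked page. */
static void net_get_page_sketch(struct page_sketch *pg)
{
    if (pg->net_priv != NULL)
        pg->net_priv->net_ref_cnt++;
}

/* Called for every put done by networking. Page references held by other
 * users (e.g. an mmap in another process) are never counted here, so a
 * count of zero means networking specifically is finished with the page. */
static void net_put_page_sketch(struct page_sketch *pg)
{
    struct cmd_sketch *cmd = pg->net_priv;

    if (cmd == NULL)
        return;
    assert(cmd->net_ref_cnt > 0);
    if (--cmd->net_ref_cnt == 0)
        cmd->tx_done = 1;
}

/* Estimate for the rejected hash-lookup approach: one search on every
 * get() and one on every put(), for each reference taken on the page. */
static int lookup_cost_ticks(int refs_per_page, int ticks_per_search)
{
    return refs_per_page * ticks_per_search * 2;
}
```

Because only gets and puts issued by the networking path touch the counter, the owner learns when transmission is finished without having to guess which put on the page was the last one.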