Message-ID: <4AFE3FD2.6030403@gmail.com>
Date: Sat, 14 Nov 2009 00:27:46 -0500
From: Gregory Haskins
To: Stephen Hemminger
CC: Herbert Xu, Gregory Haskins, "Michael S. Tsirkin",
 alacrityvm-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org,
 netdev@vger.kernel.org
Subject: Re: [RFC PATCH] net: add dataref destructor to sk_buff
References: <20091002141407.30224.54207.stgit@dev.haskins.net>
 <20091110115335.GC6989@redhat.com>
 <4AF919020200005A000586A9@sinclair.provo.novell.com>
 <20091110131722.GA19645@redhat.com> <4AF9747E.8020408@novell.com>
 <20091110143652.GB19645@redhat.com> <4AF98A8C.9040201@novell.com>
 <20091114011229.GA18580@gondor.apana.org.au> <4AFE08EF.2030308@gmail.com>
 <20091114022103.GA19020@gondor.apana.org.au> <4AFE15AD.6000208@gmail.com>
 <20091113184503.13f6d447@s6510>
In-Reply-To: <20091113184503.13f6d447@s6510>
X-Mailing-List: linux-kernel@vger.kernel.org

Stephen Hemminger wrote:
> On Fri, 13 Nov 2009 21:27:57 -0500
> Gregory Haskins wrote:
>
>> Herbert Xu wrote:
>>> On Fri, Nov 13, 2009 at 08:33:35PM -0500, Gregory Haskins wrote:
>>>> Well, not with respect to the overall protocol, of course not.  But with
>>>> respect to the buffer in question, it _has_ to be.  Or am I missing
>>>> something?
>>> sendfile() has never guaranteed that the kernel is finished with
>>> the underlying pages when it returns.
>>>
>>> Cheers,
>> Clearly there must be _some_ mechanism to synchronize (e.g.
>> flush/barrier) though, right?  Otherwise, that interface would seem to
>> be quite prone to races and would likely be unusable.  So what does
>> said flush use to know when the buffer is free?
>
> No all the interfaces require a copy.

I'm sorry, but I do not think that is correct.
As others have pointed out, that would not appear to be true for at
least sendfile.

> Actually, sendfile makes no guarantee about synchronization
> because the receiver of said file could be arbitrarily slow, and any attempt at locking down
> current contents of file is a denial of service exposure.

I think you are inverting the problem space.  It is fully expected that
changing the "file", or the pages that represent the file, before the
packet is queued would constitute a modification of the stream on the
wire.  I am thinking more about the applications of mmap+sendfile to
implement a zero-copy interface.  As David mentions in another mail, it
seems that there is no sync mechanism available, so this would not
appear to be a viable use case today, unfortunately.

>
> People have tried doing copy-less send by page flipping, but the overhead of the IPI to
> invalidate the TLB exceeded the overhead of the copy. There was an Intel paper on this in
> at Linux Symposium (Ottawa) several years ago.

I think you are confusing copy-less tx with copy-less rx.  You can try
to do copy-less rx with page flipping, which has the IPI/TLB thrashing
properties you mention, and which I agree is problematic.  We are
talking about copy-less tx here, however, and therefore no page-flipping
is involved.  Rather, we are just posting SG lists of pages directly to
the NIC (assuming the NIC supports HIGH_DMA, etc.).  To my knowledge,
you do not need to flip the page or invalidate the TLB (and thus IPI the
other cores) to do this.
Kind Regards,
-Greg
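
For readers following the sendfile part of the thread, the userspace side
can be sketched roughly as below.  This is only an illustration in Python
and not code from the patch under discussion; the names send_whole_file and
demo are hypothetical.  It assumes a blocking stream socket, and it relies
on socket.sendfile(), which uses os.sendfile() where the platform provides
it and falls back to plain send() otherwise.  The point it tries to show is
the one made above: the call returning tells the caller the data was
queued, and nothing more.

```python
import os
import socket
import tempfile


def send_whole_file(sock: socket.socket, path: str) -> int:
    """Queue the contents of `path` on `sock`; return the byte count.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        # A successful return only means the data was handed to the
        # kernel.  As discussed in the thread, it does not tell the
        # caller that the kernel is finished with the underlying pages.
        return sock.sendfile(f)


def demo() -> bytes:
    """Send a small temporary file across a local socket pair."""
    payload = b"zero-copy tx sketch\n" * 64
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name

    tx, rx = socket.socketpair()
    try:
        sent = send_whole_file(tx, path)
        assert sent == len(payload)
        tx.close()  # signal end-of-stream to the reader

        chunks = []
        while True:
            chunk = rx.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        tx.close()  # closing twice is harmless
        rx.close()
        os.unlink(path)


if __name__ == "__main__":
    data = demo()
    print(len(data), "bytes received")
```

Nothing in this sketch gives the sender a completion signal for the pages
themselves, which is the gap the proposed dataref destructor is meant to
address on the kernel side.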