2007-08-20 00:49:39

by Felix Marti

Subject: RE: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.



> -----Original Message-----
> From: David Miller [mailto:[email protected]]
> Sent: Sunday, August 19, 2007 5:40 PM
> To: Felix Marti
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]
> Subject: Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate
> PS_TCP ports from the host TCP port space.
>
> From: "Felix Marti" <[email protected]>
> Date: Sun, 19 Aug 2007 17:32:39 -0700
>
> [ Why do you put that "[Felix Marti]" everywhere you say something?
> It's annoying and superfluous. The quoting done by your mail client
> makes clear who is saying what. ]
>
> > Hmmm, interesting... I guess it is impossible to even have
> > a discussion on the subject.
>
> Nice try, Herbert Xu gave a great explanation.
[Felix Marti] David and Herbert, so you agree that the user<>kernel
space memory copy overhead is a significant overhead and we want to
enable zero-copy in both the receive and transmit path? - Yes, copy
avoidance is mainly an API issue and unfortunately the so widely used
(synchronous) sockets API doesn't make copy avoidance easy, which is one
area where protocol offload can help. Yes, some apps can resort to
sendfile() but there are many apps which seem to have trouble switching
to that API... and what about the receive path?


2007-08-20 01:05:53

by David Miller

Subject: Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.

From: "Felix Marti" <[email protected]>
Date: Sun, 19 Aug 2007 17:47:59 -0700

> [Felix Marti]

Please stop using this to start your replies, thank you.

> David and Herbert, so you agree that the user<>kernel
> space memory copy overhead is a significant overhead and we want to
> enable zero-copy in both the receive and transmit path? - Yes, copy
> avoidance is mainly an API issue and unfortunately the so widely used
> (synchronous) sockets API doesn't make copy avoidance easy, which is one
> area where protocol offload can help. Yes, some apps can resort to
> sendfile() but there are many apps which seem to have trouble switching
> to that API... and what about the receive path?

On the send side none of this is an issue. You either are sending
static content, in which case using sendfile() is trivial, or you're
generating data dynamically, in which case the data copy is in the
noise or too small to do zerocopy on. If not, you can use a shared
mmap to generate your data into, and then sendfile out from that file,
to avoid the copy that way.
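
A minimal sketch of the mmap-then-sendfile approach described above, in C
on Linux. The helper names map_output_file() and send_mapped() and the
file path are hypothetical, chosen only for illustration; error handling
is reduced to the bare minimum.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) a file of `len` bytes and
 * map it MAP_SHARED, so the application can generate its payload
 * straight into the page cache. Returns the mapping or NULL on failure;
 * *fd_out receives the file descriptor backing the mapping. */
char *map_output_file(const char *path, size_t len, int *fd_out)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return p;
}

/* Hypothetical helper: unmap the buffer and push the file's first `len`
 * bytes to out_fd with sendfile(), so the payload is never copied
 * through a user-space send buffer. Closes file_fd. Returns the number
 * of bytes sent, or -1 on error. */
ssize_t send_mapped(int out_fd, int file_fd, char *map, size_t len)
{
    off_t off = 0;

    munmap(map, len);
    while ((size_t)off < len) {
        /* sendfile() advances `off` by the number of bytes it sent. */
        ssize_t n = sendfile(out_fd, file_fd, &off, len - (size_t)off);
        if (n <= 0) {
            close(file_fd);
            return -1;
        }
    }
    close(file_fd);
    return (ssize_t)off;
}
```

The caller writes its generated data into the pointer returned by
map_output_file() and then hands the same descriptor to send_mapped();
out_fd would normally be a connected TCP socket.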

splice() helps a lot too.

Splice has the capability to do away with the copy on the receive side
too, and there are a few receivefile() implementations that could get
cleaned up and merged in.
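
As a rough illustration of the receive-side idea, the sketch below moves
data from an input descriptor into a file through a pipe using splice(),
without passing it through a user-space buffer. It assumes a Linux kernel
where splice() is supported on the input descriptor (for example a TCP
socket or another pipe); the function name splice_to_file() is
hypothetical, and whether the kernel avoids every internal copy depends
on the kernel version and the descriptors involved.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: move up to `len` bytes from in_fd to file_fd via
 * an intermediate pipe. Stops early on end of input. Returns the number
 * of bytes moved, or -1 on error. */
ssize_t splice_to_file(int in_fd, int file_fd, size_t len)
{
    int p[2];
    size_t total = 0;

    if (pipe(p) < 0)
        return -1;

    while (total < len) {
        /* Stage 1: input descriptor -> pipe. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len - total,
                           SPLICE_F_MOVE);
        if (n < 0)
            goto fail;
        if (n == 0)
            break; /* end of input */

        /* Stage 2: drain the pipe into the file at its current offset. */
        ssize_t left = n;
        while (left > 0) {
            ssize_t m = splice(p[0], NULL, file_fd, NULL, (size_t)left,
                               SPLICE_F_MOVE);
            if (m <= 0)
                goto fail;
            left -= m;
        }
        total += (size_t)n;
    }

    close(p[0]);
    close(p[1]);
    return (ssize_t)total;

fail:
    close(p[0]);
    close(p[1]);
    return -1;
}
```

One end of every splice() call must be a pipe, which is why the data
takes two hops; the file must not be opened with O_APPEND.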

Also, the I/O bus, more than main memory bandwidth, is still the
limiting factor in all of this; it is the smallest data pipe for
communications out to and from the network. So the protocol header
avoidance gains of TSO and LRO are still a very worthwhile savings.

But even if RDMA increases performance 100 fold, it still doesn't
avoid the issue that it doesn't fit in with the rest of the networking
stack and feature set.

Any monkey can change the rules around ("ok I can make it go fast as
long as you don't need firewalling, packet scheduling, classification,
and you only need to talk to specific systems that speak this same
special protocol") to make things go faster. On the other hand well
designed solutions can give performance gains within the constraints
of the full system design and without sacrificing functionality.

2007-08-20 09:44:21

by Evgeniy Polyakov

Subject: Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.

On Sun, Aug 19, 2007 at 05:47:59PM -0700, Felix Marti ([email protected]) wrote:
> [Felix Marti] David and Herbert, so you agree that the user<>kernel
> space memory copy overhead is a significant overhead and we want to
> enable zero-copy in both the receive and transmit path? - Yes, copy

It depends. If you need to access that data after it is received, you
will get a cache miss, and performance will not be much better (if at
all) than with a copy.

> avoidance is mainly an API issue and unfortunately the so widely used
> (synchronous) sockets API doesn't make copy avoidance easy, which is one
> area where protocol offload can help. Yes, some apps can resort to
> sendfile() but there are many apps which seem to have trouble switching
> to that API... and what about the receive path?

There are a number of implementations, and all they are suitable for is
recvfile(), since this is likely the only case that can work without
the cache.

And actually the RDMA stack exists, and no one said it should be thrown
away _until_ it messes with the main stack. It started to steal ports.
What will happen when it gets all the port space and no new legal
network connection can be opened, although there is no way to show the
user who took it? What will happen if a hardware RDMA connection gets
terminated and software cannot free the port? Will RDMA ask to export
connection reset functions out of the stack to drop network connections
on the ports that are supposed to be used by new RDMA connections?
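
For illustration only, the user-space sketch below shows what sharing
the host TCP port space means in practice: once a port is bound through
the regular stack, a second bind to the same address and port is refused
until the first holder releases it. The function name
reserve_tcp_port() is hypothetical, and this is not the RDMA CMA code
under discussion, only a loopback demonstration of the port-space
behaviour.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a TCP socket on the loopback address.
 * port == 0 asks the kernel to pick a free port. Returns the socket fd,
 * or -1 with errno set; *port_out receives the port actually bound, in
 * host byte order. The port stays reserved until the fd is closed. */
int reserve_tcp_port(unsigned short port, unsigned short *port_out)
{
    struct sockaddr_in sa;
    socklen_t slen = sizeof(sa);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        getsockname(fd, (struct sockaddr *)&sa, &slen) < 0) {
        int saved = errno; /* keep the bind/getsockname error */
        close(fd);
        errno = saved;
        return -1;
    }

    *port_out = ntohs(sa.sin_port);
    return fd;
}
```

A second call with the port returned by the first is expected to fail
with EADDRINUSE while the first descriptor is still open, which is the
visibility that a separate, hidden port space would not provide.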

RDMA is not a problem, but how it influences the network stack is.
Let's rather think about how to work correctly with the network stack
(since we already have that cr^Wdifferent hardware) instead of saying
that others do bad work and do not allow a shiny new feature to exist.

--
Evgeniy Polyakov