2007-05-21 07:11:44

by NeilBrown

Subject: Re: [RFC,PATCH 11/15] knfsd: RDMA transport core

On Friday May 18, [email protected] wrote:
>
> This file implements the core transport data management and I/O
> path. The I/O path for RDMA involves receiving callbacks on interrupt
> context. Since all the svc transport locks are _bh locks we enqueue the
> transport on a list, schedule a tasklet to dequeue data indications from
> the RDMA completion queue. The tasklet in turn takes _bh locks to
> enqueue receive data indications on a list for the transport. The
> svc_rdma_recvfrom transport function dequeues data from this list in an
> NFSD thread context.

Cannot we simply change the usage of ->sp_lock to always disable
interrupts?
That would make this much simpler. How much would it cost?

Alternatively, why can the network layer deliver these notifications in
"bh" context, but the ib layer wants to deliver them in irq context?
Does doing it in irq context have lower latency or something?

NeilBrown

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
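
The hand-off described in the patch summary quoted above (interrupt
callback, then tasklet, then NFSD thread) can be pictured with a small
userspace model. This is only a sketch under stated assumptions, not
code from the patch: the names ring, xprt_model, model_irq_callback,
model_tasklet and model_recvfrom are hypothetical, and there is no real
locking or interrupt handling here. Comments mark where the real code
would be in irq context, where it would take a _bh lock, and where it
would run in thread context.

```c
#include <stddef.h>

#define QLEN 8  /* power of two, so the free-running indices wrap safely */

/* Tiny fixed-size FIFO used for both queues in the model. */
struct ring {
    int item[QLEN];
    size_t head, tail;
};

/* Returns 1 on success, 0 if the ring is full. */
static int ring_put(struct ring *r, int v)
{
    if (r->tail - r->head == QLEN)
        return 0;
    r->item[r->tail++ % QLEN] = v;
    return 1;
}

/* Returns 1 and stores the oldest item in *v, or 0 if the ring is empty. */
static int ring_get(struct ring *r, int *v)
{
    if (r->head == r->tail)
        return 0;
    *v = r->item[r->head++ % QLEN];
    return 1;
}

/* Hypothetical stand-in for one transport. */
struct xprt_model {
    struct ring cq;         /* models the RDMA completion queue */
    struct ring rq;         /* per-transport list of receive indications */
    int tasklet_scheduled;  /* models "tasklet has been scheduled" */
};

/*
 * Models the callback delivered in irq context. It must not take a
 * _bh lock, so it only records that work is pending.
 */
static void model_irq_callback(struct xprt_model *x)
{
    x->tasklet_scheduled = 1;
}

/*
 * Models the tasklet (bh context). This is where the real code could
 * take the _bh lock. It drains completions into the transport's
 * receive list and returns how many were moved.
 */
static int model_tasklet(struct xprt_model *x)
{
    int v, moved = 0;

    if (!x->tasklet_scheduled)
        return 0;
    x->tasklet_scheduled = 0;
    /* Check for room first so a completion is never dequeued and lost. */
    while (x->rq.tail - x->rq.head < QLEN && ring_get(&x->cq, &v)) {
        ring_put(&x->rq, v);
        moved++;
    }
    return moved;
}

/* Models the dequeue done in NFSD thread context. */
static int model_recvfrom(struct xprt_model *x, int *v)
{
    return ring_get(&x->rq, v);
}
```

In this model a completion placed in cq with ring_put() stays invisible
to model_recvfrom() until model_irq_callback() and then model_tasklet()
have both run. That extra hop is what taking ->sp_lock with interrupts
disabled would remove.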


2007-05-21 10:03:08

by Greg Banks

Subject: Re: [RFC,PATCH 11/15] knfsd: RDMA transport core

On Mon, May 21, 2007 at 05:11:34PM +1000, Neil Brown wrote:
> On Friday May 18, [email protected] wrote:
> >
> > This file implements the core transport data management and I/O
> > path. The I/O path for RDMA involves receiving callbacks on interrupt
> > context. Since all the svc transport locks are _bh locks we enqueue the
> > transport on a list, schedule a tasklet to dequeue data indications from
> > the RDMA completion queue. The tasklet in turn takes _bh locks to
> > enqueue receive data indications on a list for the transport. The
> > svc_rdma_recvfrom transport function dequeues data from this list in an
> > NFSD thread context.
>
> Cannot we simply change the usage of ->sp_lock to always disable
> interrupts?

We could, but the question then is what impact will locking out
interrupts more often have on throughput or latency over Ethernet?

> That would make this much simpler.

Yes it would.

> How much would it cost?

???

> Alternatively, why can the network layer deliver these notifications in
> "bh" context, but the ib layer wants to deliver them in irq context?
> Does doing it in irq context have lower latency or something?

The problem is wider than it appears. The kernel verbs interface
reports a *lot* of stuff in irq context, not just incoming data.
Many of those events would be best handled by disconnecting the
svc_sock, which implies setting SK_CLOSE and causing svc_sock_enqueue()
to be called. The former happens; the latter doesn't. For example,
see qp_event_handler() in net/sunrpc/svc_rdma_transport.c. This
means that error conditions are not notified until some other event
causes svc_sock_enqueue() to be called.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.


2007-05-21 15:58:40

by Tom Tucker

Subject: Re: [RFC,PATCH 11/15] knfsd: RDMA transport core

On Mon, 2007-05-21 at 17:11 +1000, Neil Brown wrote:
> On Friday May 18, [email protected] wrote:
> >
> > This file implements the core transport data management and I/O
> > path. The I/O path for RDMA involves receiving callbacks on interrupt
> > context. Since all the svc transport locks are _bh locks we enqueue the
> > transport on a list, schedule a tasklet to dequeue data indications from
> > the RDMA completion queue. The tasklet in turn takes _bh locks to
> > enqueue receive data indications on a list for the transport. The
> > svc_rdma_recvfrom transport function dequeues data from this list in an
> > NFSD thread context.
>
> Cannot we simply change the usage of ->sp_lock to always disable
> interrupts?
> That would make this much simpler. How much would it cost?

I don't think they are particularly expensive to acquire, but they add
incremental interrupt latency to the system. My impression is that the
general guidance from Linus et al is to avoid them if not absolutely
necessary. Trond's comments I think reflected this viewpoint as well,
but perhaps I'm just solving a problem that doesn't exist.

>
> Alternatively, why can the network layer deliver these notifications in
> "bh" context, but the ib layer wants to deliver them in irq context?
> Does doing it in irq context have lower latency or something?
>

Yes, latency was certainly the design criterion in OFA. In practice, I
think we're talking nanoseconds (i.e. interrupt handler vs. tasklet),
however, when you're working with minimum latencies in the 2.8us range,
nanoseconds become statistically significant.

For our purposes, I don't think the latency is an issue.

> NeilBrown

