2009-05-12 03:06:29

by Steve Wise

Subject: Re: [PATCH 2.6.30] xprtrdma: The frmr iova_start values are truncated by the nfs rdma client.

Trond Myklebust wrote:
> On Mon, 2009-05-11 at 21:14 -0400, Tom Talpey wrote:
>
>> At 08:44 PM 5/11/2009, Trond Myklebust wrote:
>>
>>> On Mon, 2009-05-11 at 19:13 -0500, Steve Wise wrote:
>>>
>>>> Trond Myklebust wrote:
>>>>
>>>>> On Mon, 2009-05-11 at 17:25 -0500, Steve Wise wrote:
>>>>>
>>>>>
>>>>>> Hey Trond,
>>>>>>
>>>>>> Will this bug fix make 2.6.30?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Steve.
>>>>>>
>>>>>>
>>>>> Not in the form it is in now. As I've said earlier, I'm not happy about
>>>>> the sunrpc layer having to circumvent ordinary type checking on
>>>>> non-sunrpc structures.
>>>>>
>>>>> Cheers
>>>>> Trond
>>>>>
>>>> How is it circumventing? It's currently incorrectly casting a pointer
>>>> into a u64. That seems just broken to me. Also, it's really the sunrpc
>>>> rdma transport layer. It deals specifically with rdma. It _should_
>>>> know about rdma interfaces and types.
>>>>
>>> The fact is that I'm simply not interested enough in rdma to tolerate
>>> hacks. If it isn't done cleanly, in a manner that I can maintain, then
>>> the whole transport layer comes out...
>>>
>> I know exactly what you want - it's not what the code does now and
>> it's not an accessor function to set the hardware's u64 field. What's
>> needed is a new function to manage the entire RDMA triplet, and the
>> memory registration behind it, in the OFA code side. Put the hardware
>> goop below the line, IOW. I'll dust up Steve on this.
>>
>
> This does indeed sound like what I'm looking for.
>
> There is a huge difference between having code that depends on well
> defined rdma interfaces, and code that depends on rdma hacks. A piece of
> code that requires casts from a non-local opaque type into another
> protocol-dependent non-local type will definitely fall in the latter
> category. I really don't care what the current code does, but a fix for
> that code is something that does it _correctly_; it is not yet another
> hack, whether or not it fixes a bug in the short term.
>
> Trond
>
>

Trond, I get your point, and we can certainly work on improving this
with the rdma developer community. But removing the broken one-line
cast will resolve a current crash situation for 2.6.30. Can't we get
this fix in 2.6.30 and work on the API improvements for 2.6.31? I've
CCed Roland and the ofa general list to get everyone involved in this
thread so we can get this API design change going.

I agree we can clean this up moving forward, but let's fix the broken
2.6.30 code.

Will this work?

Steve.
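
For readers following the thread, here is a minimal user-space sketch of the kind of truncation the subject line describes. It assumes a 32-bit kernel where unsigned long is 32 bits wide while DMA addresses are 64 bits wide. The type and function names (ulong32_t, dma_addr64_t, iova_via_ulong, iova_direct) are hypothetical stand-ins for illustration, not the xprtrdma code itself.

```c
#include <stdint.h>

/* Hypothetical stand-ins modelling a 32-bit kernel with 64-bit DMA addresses. */
typedef uint32_t ulong32_t;     /* models a 32-bit unsigned long */
typedef uint64_t dma_addr64_t;  /* models a 64-bit dma_addr_t */

/* Narrowing cast before the assignment: the upper 32 bits are discarded. */
static uint64_t iova_via_ulong(dma_addr64_t dma)
{
    return (ulong32_t)dma;
}

/* Direct assignment: the full 64-bit address is preserved. */
static uint64_t iova_direct(dma_addr64_t dma)
{
    return dma;
}
```

With a DMA address above 4 GB, such as 0x0000000123456000, iova_via_ulong() yields 0x23456000 while iova_direct() returns the address unchanged.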



2009-05-12 16:11:33

by Steve Wise

Subject: Re: [ofa-general] Re: [PATCH 2.6.30] xprtrdma: The frmr iova_start values are truncated by the nfs rdma client.

Steve Wise wrote:

>Trond Myklebust wrote (earlier in this thread):
>
> All I should need to know is that I can advertise either dma handles or
> kernel VAs, and know that I can choose between two functions, say,
> ib_send_wr_fastreg_dma_init() and ib_send_wr_fastreg_kva_init() to
> initialise the ib_send_wr structure correctly.


To align more with the rest of the fast_reg API in ib_verbs.h, I propose:

static inline void ib_init_fast_reg_iova_start_dma(struct ib_send_wr *send_wr,
                                                   dma_addr_t dma);
static inline void ib_init_fast_reg_iova_start_kva(struct ib_send_wr *send_wr,
                                                   void *kva);

Thoughts?



2009-05-12 16:23:29

by Steve Wise

Subject: Re: [ofa-general] Re: [PATCH 2.6.30] xprtrdma: The frmr iova_start values are truncated by the nfs rdma client.

Steve Wise wrote:
> Steve Wise wrote:
>
> >Trond Myklebust wrote (earlier in this thread):
> >
> > All I should need to know is that I can advertise either dma handles or
> > kernel VAs, and know that I can choose between two functions, say,
> > ib_send_wr_fastreg_dma_init() and ib_send_wr_fastreg_kva_init() to
> > initialise the ib_send_wr structure correctly.
>
>
> To align more with the rest of the fast_reg API in ib_verbs.h, I propose:
>
> static inline void ib_init_fast_reg_iova_start_dma(struct ib_send_wr *send_wr,
>                                                    dma_addr_t dma);
> static inline void ib_init_fast_reg_iova_start_kva(struct ib_send_wr *send_wr,
>                                                    void *kva);
>
> Thoughts?
>
>
uncompiled patch:

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index c179318..fb56930 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1940,6 +1940,30 @@ static inline void ib_update_fast_reg_key(struct ib_mr *mr, u8 newkey)
 }

 /**
+ * ib_init_fast_reg_iova_start_dma - initializes the iova_start field
+ * based on a dma address supplied by the user.
+ * @wr - struct ib_send_wr pointer to be initialized
+ * @addr - dma_addr_t value to be used as the iova_start
+ */
+static inline void ib_init_fast_reg_iova_start_dma(struct ib_send_wr *wr,
+						   dma_addr_t addr)
+{
+	wr->wr.fast_reg.iova_start = addr;
+}
+
+/**
+ * ib_init_fast_reg_iova_start_kva - initializes the iova_start field
+ * based on a kernel virtual address supplied by the user.
+ * @wr - struct ib_send_wr pointer to be initialized
+ * @addr - void * address to be used as the iova_start
+ */
+static inline void ib_init_fast_reg_iova_start_kva(struct ib_send_wr *wr,
+						   void *addr)
+{
+	wr->wr.fast_reg.iova_start = (unsigned long)addr;
+}
+
+/**
  * ib_alloc_mw - Allocates a memory window.
  * @pd: The protection domain associated with the memory window.
  */
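
To make the intent of the two helpers concrete, here is a user-space sketch of how a caller might choose between them. It assumes simplified stand-in types: struct fast_reg_wr, the u64 and dma_addr_t typedefs, and the shortened function names are hypothetical and only model the iova_start field of the real ib_send_wr. The kernel patch casts through unsigned long; the sketch uses uintptr_t, the portable user-space equivalent.

```c
#include <stdint.h>

/* Hypothetical user-space stand-ins for the kernel types in the patch. */
typedef uint64_t u64;
typedef uint64_t dma_addr_t;

/* Models only the field the helpers touch. */
struct fast_reg_wr {
    u64 iova_start;
};

/* Caller advertises a DMA handle: assign it without any narrowing cast. */
static inline void init_iova_start_dma(struct fast_reg_wr *wr, dma_addr_t addr)
{
    wr->iova_start = addr;
}

/* Caller advertises a kernel VA: convert via an unsigned pointer-sized
 * integer so that widening to 64 bits zero-extends. */
static inline void init_iova_start_kva(struct fast_reg_wr *wr, void *addr)
{
    wr->iova_start = (uintptr_t)addr;
}
```

The caller never writes a cast itself; picking the helper that matches the address type is the only decision left at the call site.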


2009-05-13 21:35:16

by Roland Dreier

Subject: Re: [ofa-general] Re: [PATCH 2.6.30] xprtrdma: The frmr iova_start values are truncated by the nfs rdma client.

> Trond Myklebust wrote (earlier in this thread):
> >
> > All I should need to know is that I can advertise either dma handles or
> > kernel VAs, and know that I can choose between two functions, say,
> > ib_send_wr_fastreg_dma_init() and ib_send_wr_fastreg_kva_init() to
> > initialise the ib_send_wr structure correctly.

I skimmed the earlier thread, and I have to say that I don't quite see
what the problem with assigning things to a u64 directly is. You can
use any address you want, and I don't quite understand why using the
correct cast to avoid sign extension or truncation problems is such a
big maintenance burden?

The code below really just looks like obfuscation to me -- are we going
to want to add something like

/**
 * ib_init_fast_reg_iova_start_u64 - initializes the iova_start field
 * based on a 64-bit address supplied by the user.
 * @wr - struct ib_send_wr pointer to be initialized
 * @addr - u64 value to be used as the iova_start
 */
static inline void ib_init_fast_reg_iova_start_u64(struct ib_send_wr *wr,
						   u64 addr)
{
	wr->wr.fast_reg.iova_start = addr;
}

next, to make sure we don't get confused about assigning a u64 to a u64?
It all looks a bit overcomplicated to me.

- R.
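
As an illustration of the sign extension concern raised above, here is a small user-space sketch. It assumes a 32-bit address value being widened to 64 bits, and it relies on the usual two's complement behaviour of GCC and Clang when converting an out-of-range value to int32_t. The function names widen_signed and widen_unsigned are hypothetical.

```c
#include <stdint.h>

/* Widening through a signed 32-bit type: the top bit of the address is
 * replicated into the upper 32 bits of the result. */
static uint64_t widen_signed(uint32_t addr32)
{
    return (uint64_t)(int32_t)addr32;
}

/* Widening through an unsigned type: the upper 32 bits are zero. */
static uint64_t widen_unsigned(uint32_t addr32)
{
    return (uint64_t)addr32;
}
```

For an address with the top bit set, such as 0xC0000000, widen_signed() gives 0xFFFFFFFFC0000000 on those compilers, while widen_unsigned() gives 0x00000000C0000000. This is why the kernel virtual address case goes through an unsigned cast before being stored in the u64 field.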

>  /**
> + * ib_init_fast_reg_iova_start_dma - initializes the iova_start field
> + * based on a dma address supplied by the user.
> + * @wr - struct ib_send_wr pointer to be initialized
> + * @addr - dma_addr_t value to be used as the iova_start
> + */
> +static inline void ib_init_fast_reg_iova_start_dma(struct ib_send_wr *wr,
> +						   dma_addr_t addr)
> +{
> +	wr->wr.fast_reg.iova_start = addr;
> +}
> +
> +/**
> + * ib_init_fast_reg_iova_start_kva - initializes the iova_start field
> + * based on a kernel virtual address supplied by the user.
> + * @wr - struct ib_send_wr pointer to be initialized
> + * @addr - void * address to be used as the iova_start
> + */
> +static inline void ib_init_fast_reg_iova_start_kva(struct ib_send_wr *wr,
> +						   void *addr)
> +{
> +	wr->wr.fast_reg.iova_start = (unsigned long)addr;
> +}
> +
> +/**
>   * ib_alloc_mw - Allocates a memory window.
>   * @pd: The protection domain associated with the memory window.
>   */