2017-12-09 00:02:29

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH 1/2] ntb_transport: Fix bug with max_mw_size parameter

When using the max_mw_size parameter of ntb_transport to limit the size of
the Memory windows, communication cannot be established and the queues
freeze.

This is because the mw_size that's reported to the peer is correctly
limited but the size used locally is not. So the MW is initialized
with a buffer smaller than the window but the TX side is using the
full window. This means the TX side will be writing to a region of the
window that points nowhere.

This is easily fixed by applying the same limit to tx_size in
ntb_transport_init_queue().

Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Logan Gunthorpe <[email protected]>
Cc: Jon Mason <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Allen Hubbe <[email protected]>
---
drivers/ntb/ntb_transport.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
index 045e3dd4750e..9878c48826e3 100644
--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -1003,6 +1003,9 @@ static int ntb_transport_init_queue(struct ntb_transport_ctx *nt,
mw_base = nt->mw_vec[mw_num].phys_addr;
mw_size = nt->mw_vec[mw_num].phys_size;

+ if (max_mw_size && mw_size > max_mw_size)
+ mw_size = max_mw_size;
+
tx_size = (unsigned int)mw_size / num_qps_mw;
qp_offset = tx_size * (qp_num / mw_count);

--
2.11.0


2017-12-09 00:02:31

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()

With Switchtec hardware, the buffer used for a memory window must be
aligned to its size (the hardware only replaces the lower bits). In
certain circumstances dma_alloc_coherent() will not provide a buffer
that adheres to this requirement like when using the CMA and
CONFIG_CMA_ALIGNMENT is set lower than the buffer size.

When we get an unaligned buffer mw_set_trans() should return an error.
We also log an error so we know the cause of the problem.

Signed-off-by: Logan Gunthorpe <[email protected]>
Cc: Jon Mason <[email protected]>
---
drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
index 709f37fbe232..984b83bc7dd3 100644
--- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
+++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
@@ -315,6 +315,19 @@ static int switchtec_ntb_mw_set_trans(struct ntb_dev *ntb, int pidx, int widx,
if (xlate_pos < 12)
return -EINVAL;

+ if (addr & ((1 << xlate_pos) - 1)) {
+ /*
+ * In certain circumstances we can get a buffer that is
+ * not aligned to its size. (Most of the time
+ * dma_alloc_coherent ensures this). This can happen when
+ * using large buffers allocated by the CMA
+ * (see CMA_CONFIG_ALIGNMENT)
+ */
+ dev_err(&sndev->stdev->dev,
+ "ERROR: Memory window address is not aligned to it's size!\n");
+ return -EINVAL;
+ }
+
rc = switchtec_ntb_part_op(sndev, ctl, NTB_CTRL_PART_OP_LOCK,
NTB_CTRL_PART_STATUS_LOCKED);
if (rc)
--
2.11.0

2017-12-11 14:55:51

by Allen Hubbe

[permalink] [raw]
Subject: RE: [PATCH 1/2] ntb_transport: Fix bug with max_mw_size parameter

From: Logan Gunthorpe
> When using the max_mw_size parameter of ntb_transport to limit the size of
> the Memory windows, communication cannot be established and the queues
> freeze.
>
> This is because the mw_size that's reported to the peer is correctly
> limited but the size used locally is not. So the MW is initialized
> with a buffer smaller than the window but the TX side is using the
> full window. This means the TX side will be writing to a region of the
> window that points nowhere.
>
> This is easily fixed by applying the same limit to tx_size in
> ntb_transport_init_queue().
>
> Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
> Signed-off-by: Logan Gunthorpe <[email protected]>
> Cc: Jon Mason <[email protected]>
> Cc: Dave Jiang <[email protected]>
> Cc: Allen Hubbe <[email protected]>

Acked-by: Allen Hubbe <[email protected]>

> ---
> drivers/ntb/ntb_transport.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
> index 045e3dd4750e..9878c48826e3 100644
> --- a/drivers/ntb/ntb_transport.c
> +++ b/drivers/ntb/ntb_transport.c
> @@ -1003,6 +1003,9 @@ static int ntb_transport_init_queue(struct ntb_transport_ctx *nt,
> mw_base = nt->mw_vec[mw_num].phys_addr;
> mw_size = nt->mw_vec[mw_num].phys_size;
>
> + if (max_mw_size && mw_size > max_mw_size)
> + mw_size = max_mw_size;
> +
> tx_size = (unsigned int)mw_size / num_qps_mw;
> qp_offset = tx_size * (qp_num / mw_count);
>
> --
> 2.11.0

2017-12-11 14:58:56

by Allen Hubbe

[permalink] [raw]
Subject: RE: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()

From: Logan Gunthorpe
> With Switchtec hardware, the buffer used for a memory window must be
> aligned to its size (the hardware only replaces the lower bits). In
> certain circumstances dma_alloc_coherent() will not provide a buffer
> that adheres to this requirement like when using the CMA and
> CONFIG_CMA_ALIGNMENT is set lower than the buffer size.
>
> When we get an unaligned buffer mw_set_trans() should return an error.
> We also log an error so we know the cause of the problem.
>
> Signed-off-by: Logan Gunthorpe <[email protected]>
> Cc: Jon Mason <[email protected]>
> ---
> drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
> index 709f37fbe232..984b83bc7dd3 100644
> --- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
> +++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
> @@ -315,6 +315,19 @@ static int switchtec_ntb_mw_set_trans(struct ntb_dev *ntb, int pidx, int widx,
> if (xlate_pos < 12)
> return -EINVAL;
>
> + if (addr & ((1 << xlate_pos) - 1)) {

!IS_ALIGNED(addr, BIT_ULL(xlate_pos))

> + /*
> + * In certain circumstances we can get a buffer that is
> + * not aligned to its size. (Most of the time
> + * dma_alloc_coherent ensures this). This can happen when
> + * using large buffers allocated by the CMA
> + * (see CMA_CONFIG_ALIGNMENT)
> + */
> + dev_err(&sndev->stdev->dev,
> + "ERROR: Memory window address is not aligned to it's size!\n");

This would be the only ntb hw driver that prints an error in this situation. The ntb_mw_get_align() should provide enough information to client drivers to determine the alignment requirements before calling ntb_mw_set_trans(). IMO no need to print here, but let's see what others say.

> + return -EINVAL;
> + }
> +
> rc = switchtec_ntb_part_op(sndev, ctl, NTB_CTRL_PART_OP_LOCK,
> NTB_CTRL_PART_STATUS_LOCKED);
> if (rc)
> --
> 2.11.0


2017-12-11 17:15:01

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()



On 11/12/17 07:58 AM, Allen Hubbe wrote:
> !IS_ALIGNED(addr, BIT_ULL(xlate_pos))
>

Ok, yes, that's much better. I'll change it. Thanks.

>> + /*
>> + * In certain circumstances we can get a buffer that is
>> + * not aligned to its size. (Most of the time
>> + * dma_alloc_coherent ensures this). This can happen when
>> + * using large buffers allocated by the CMA
>> + * (see CMA_CONFIG_ALIGNMENT)
>> + */
>> + dev_err(&sndev->stdev->dev,
>> + "ERROR: Memory window address is not aligned to it's size!\n");
>
> This would be the only ntb hw driver that prints an error in this situation. The ntb_mw_get_align() should provide enough information to client drivers to determine the alignment requirements before calling ntb_mw_set_trans(). IMO no need to print here, but let's see what others say.


mw_get_align doesn't communicate the fact that the buffer has to be
aligned by its size. It may also be that all hardware does not have this
restriction (ie. if the hardware adds to the base address instead of
just replacing the lower bits).

There is definitely a need to print this error somewhere as I hit this
case and it caused very weird behavior. It was a huge pain to debug.
Also, it's a security issue and huge bug if we end up mapping the memory
we didn't think we were mapping. I don't think it's a good idea for us
to require clients to check this as that requires a number of checks and
a client author may forget to add it to their driver. I'd maybe go with
a check in ntb_mw_set_trans before calling the driver, but that only
makes sense if all hardware has the same requirement.

Logan

2017-12-11 19:18:08

by Allen Hubbe

[permalink] [raw]
Subject: RE: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()

From: Logan Gunthorpe

> mw_get_align doesn't communicate the fact that the buffer has to be
> aligned by its size.

Is that not the purpose of the addr_align out parameter of ntb_mw_get_align()?

> It may also be that all hardware does not have this
> restriction (ie. if the hardware adds to the base address instead of
> just replacing the lower bits).
>
> There is definitely a need to print this error somewhere as I hit this
> case and it caused very weird behavior. It was a huge pain to debug.
> Also, it's a security issue and huge bug if we end up mapping the memory
> we didn't think we were mapping.

Of course the driver should validate its parameters not allow bad mappings. I was only commenting on the dev_err() message to the console.

What makes best sense for client drivers with regards to ntb api changes is a fair argument. Let's see what others say.

> I don't think it's a good idea for us
> to require clients to check this as that requires a number of checks and
> a client author may forget to add it to their driver. I'd maybe go with
> a check in ntb_mw_set_trans before calling the driver, but that only
> makes sense if all hardware has the same requirement.
>
> Logan


2017-12-11 19:28:39

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()



On 11/12/17 12:17 PM, Allen Hubbe wrote:
> From: Logan Gunthorpe
>
>> mw_get_align doesn't communicate the fact that the buffer has to be
>> aligned by its size.
>
> Is that not the purpose of the addr_align out parameter of ntb_mw_get_align()?

addr_align provides the minimum alignment required by the device but it
has no idea how big a buffer the caller is trying to create so it can't
express that it needs to be aligned by its size.

To be clear, the minimum alignment the Switchtec device requires is 4KB
so it will return 4k in addr_align. Thus, if you have a 4KB buffer it
may be aligned to 4KB. But if you have a 1MB buffer it must be aligned
to the nearest 1M.

>> It may also be that all hardware does not have this
>> restriction (ie. if the hardware adds to the base address instead of
>> just replacing the lower bits).
>>
>> There is definitely a need to print this error somewhere as I hit this
>> case and it caused very weird behavior. It was a huge pain to debug.
>> Also, it's a security issue and huge bug if we end up mapping the memory
>> we didn't think we were mapping.
>
> Of course the driver should validate its parameters not allow bad mappings. I was only commenting on the dev_err() message to the console.

Ok. I still feel like it would be difficult to debug if ntb_transport
simply was unable to establish a connection without some message in
dmesg telling the user why.

Also, keep in mind this is a somewhat unusual occurrence. In most cases
dma_alloc_coherent() always provides a buffer that is aligned to it's
size. It's just that the CMA (if used) provides a tunable config option
which allows for larger buffers to not be aligned to their size.

Logan

2017-12-11 20:07:01

by Allen Hubbe

[permalink] [raw]
Subject: RE: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()

From: Logan Gunthorpe
> On 11/12/17 12:17 PM, Allen Hubbe wrote:
> >> mw_get_align doesn't communicate the fact that the buffer has to be
> >> aligned by its size.
> >
> > Is that not the purpose of the addr_align out parameter of ntb_mw_get_align()?
>
> addr_align provides the minimum alignment required by the device but it
> has no idea how big a buffer the caller is trying to create so it can't
> express that it needs to be aligned by its size.
>
> To be clear, the minimum alignment the Switchtec device requires is 4KB
> so it will return 4k in addr_align. Thus, if you have a 4KB buffer it
> may be aligned to 4KB. But if you have a 1MB buffer it must be aligned
> to the nearest 1M.

In switchtec_ntb_mw_get_align, for the lut case it seems to require alignment the same as Intel, aligned to mw size, but for the non-lut case you are saying that SZ_4K is not necessarily correct. The SZ_4K is the minimum, but the actual alignment restriction depends on the size of the buffer actually translated. Right?

Also, for the lut case, it looks like the size also has to be the same size as the mw. So, a client can't allocate a smaller buffer, assume we can get one that is aligned, point the start of the mw at it, and limit the size of the mw?

For the non-lut case I wonder, with the restriction that addr needs to be aligned to the size of the buffer, does the size of the buffer also need to be some power of two? That would make sense, if it determines the alignment. If so, SZ_4K wouldn't be correct for size_align, either.

Do you need the intended buffer size passed in as another parameter to ntb_mw_get_align? The point of ntb_mw_get_align is to figure out all the alignment restrictions before allocating memory.

> >> It may also be that all hardware does not have this
> >> restriction (ie. if the hardware adds to the base address instead of
> >> just replacing the lower bits).
> >>
> >> There is definitely a need to print this error somewhere as I hit this
> >> case and it caused very weird behavior. It was a huge pain to debug.
> >> Also, it's a security issue and huge bug if we end up mapping the memory
> >> we didn't think we were mapping.
> >
> > Of course the driver should validate its parameters not allow bad mappings. I was only commenting
> on the dev_err() message to the console.
>
> Ok. I still feel like it would be difficult to debug if ntb_transport
> simply was unable to establish a connection without some message in
> dmesg telling the user why.
>
> Also, keep in mind this is a somewhat unusual occurrence. In most cases
> dma_alloc_coherent() always provides a buffer that is aligned to it's
> size. It's just that the CMA (if used) provides a tunable config option
> which allows for larger buffers to not be aligned to their size.
>
> Logan


2017-12-11 20:38:37

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()



On 11/12/17 01:06 PM, Allen Hubbe wrote:
> In switchtec_ntb_mw_get_align, for the lut case it seems to require alignment the same as Intel, aligned to mw size, but for the non-lut case you are saying that SZ_4K is not necessarily correct. The SZ_4K is the minimum, but the actual alignment restriction depends on the size of the buffer actually translated. Right?

Yes, that is correct.

> Also, for the lut case, it looks like the size also has to be the same size as the mw. So, a client can't allocate a smaller buffer, assume we can get one that is aligned, point the start of the mw at it, and limit the size of the mw?

The LUT case is much simpler. The size must be exactly 64K (as chosen by
the driver... it may be a module parameter later) and therefore the
alignment must also be exactly 64k.

> For the non-lut case I wonder, with the restriction that addr needs to be aligned to the size of the buffer, does the size of the buffer also need to be some power of two? That would make sense, if it determines the alignment. If so, SZ_4K wouldn't be correct for size_align, either.

It would be weird not to make the size a power of two but this is not a
requirement of the hardware. The alignment must be the next highest
power of two after the size. For example, you could have a 768KB buffer
but it would need to be aligned to 1M. This is how dma_alloc_coherent()
behaves as well.

Think of it this way: in the hardware we program the number of
translation bits (xlate_pos in the code). That number of low bits in the
destination address are replaced with the same bits in the source
address. So if any of the translated bits in the destination address
were not zero, they will be after the replacement. Do you know if Intel
hardware does something similar?

> Do you need the intended buffer size passed in as another parameter to ntb_mw_get_align? The point of ntb_mw_get_align is to figure out all the alignment restrictions before allocating memory.

We could, but I don't really see the point of doing that. There's really
nothing the client would/could do differently if we added that. Remember
this restriction is already (mostly) satisfied by dma_alloc_coherent and
if that wasn't the case then all the tricks that the client currently
does to try and obey ntb_mw_get_align would not work.

Actually, if we were going to change anything, I'd suggest creating an
API that's something like:

void *ntb_mw_alloc(struct ntb_dev *ntb, int pidx, int widx,
size_t buf_size, dma_addr_t *dma_addr, int flags);

This would do the DMA allocation and adjust the size as necessary to
satisfy the alignment requirements.

Then there would be a common place that enforces the alignment issues
instead of expecting every client to get that right. Takes a lot of the
boiler plate out of the clients as well.

Logan