Current code is assuming that the address returned by dma_alloc_coherent
is a logical address. This is not true on ARM/ARM64 systems. This patch
replaces dma_alloc_coherent with dma_map_page API. The address returned
can later be virtually mapped from the CPU side with vmap API.
Signed-off-by: Sinan Kaya <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/alloc.c | 37 ++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/alloc.c b/drivers/net/ethernet/mellanox/mlx4/alloc.c
index 0c51c69..22a7ae7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx4/alloc.c
@@ -618,13 +618,26 @@ int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
return -ENOMEM;
for (i = 0; i < buf->nbufs; ++i) {
- buf->page_list[i].buf =
- dma_alloc_coherent(&dev->persist->pdev->dev,
- PAGE_SIZE,
- &t, gfp);
- if (!buf->page_list[i].buf)
+ struct page *page;
+
+ page = alloc_page(GFP_KERNEL);
+ if (!page)
goto err_free;
+ t = dma_map_page(&dev->persist->pdev->dev, page, 0,
+ PAGE_SIZE, DMA_BIDIRECTIONAL);
+
+ if (dma_mapping_error(&dev->persist->pdev->dev, t)) {
+ __free_page(page);
+ goto err_free;
+ }
+
+ buf->page_list[i].buf = page_address(page);
+ if (!buf->page_list[i].buf) {
+ __free_page(page);
+ goto err_free;
+ }
+
buf->page_list[i].map = t;
memset(buf->page_list[i].buf, 0, PAGE_SIZE);
@@ -666,11 +679,15 @@ void mlx4_buf_free(struct mlx4_dev *dev, int size, struct mlx4_buf *buf)
vunmap(buf->direct.buf);
for (i = 0; i < buf->nbufs; ++i)
- if (buf->page_list[i].buf)
- dma_free_coherent(&dev->persist->pdev->dev,
- PAGE_SIZE,
- buf->page_list[i].buf,
- buf->page_list[i].map);
+ if (buf->page_list[i].buf) {
+ struct page *page;
+
+ page = virt_to_page(buf->page_list[i].buf);
+ dma_unmap_page(&dev->persist->pdev->dev,
+ buf->page_list[i].map,
+ PAGE_SIZE, DMA_BIDIRECTIONAL);
+ __free_page(page);
+ }
kfree(buf->page_list);
}
}
--
1.8.2.1
From: Sinan Kaya <[email protected]>
Date: Sat, 16 Apr 2016 18:23:32 -0400
> Current code is assuming that the address returned by dma_alloc_coherent
> is a logical address. This is not true on ARM/ARM64 systems. This patch
> replaces dma_alloc_coherent with dma_map_page API. The address returned
> can later be virtually mapped from the CPU side with vmap API.
>
> Signed-off-by: Sinan Kaya <[email protected]>
You can't do this.
The DMA map page API gives non-coherent mappings, and thus requires
proper flushing.
So a straight conversion like this is never legitimate.
On 2016-04-18 00:00, David Miller wrote:
> From: Sinan Kaya <[email protected]>
> Date: Sat, 16 Apr 2016 18:23:32 -0400
>
>> Current code is assuming that the address returned by
>> dma_alloc_coherent
>> is a logical address. This is not true on ARM/ARM64 systems. This
>> patch
>> replaces dma_alloc_coherent with dma_map_page API. The address
>> returned
>> can later be virtually mapped from the CPU side with vmap API.
>>
>> Signed-off-by: Sinan Kaya <[email protected]>
>
> You can't do this.
>
> The DMA map page API gives non-coherent mappings, and thus requires
> proper flushing.
>
> So a straight conversion like this is never legitimate.
I would agree on proper DMA API usage. However, the code is already
assuming a coherent architecture by mapping the CPU pages as PAGE_KERNEL.
dma_map_page returns cached buffers, and you don't need cache flushes on
a coherent architecture to make the data visible.
Sinan,
if we get rid of this part of the code:
if (BITS_PER_LONG == 64) {
struct page **pages;
pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
if (!pages)
goto err_free;
...
...
if (!buf->direct.buf)
goto err_free;
}
Does that solve the arm issue?
The other parts of the driver using the allocated buffer can still
work with a list of coherent chunks.
On Sat, Apr 16, 2016 at 06:23:32PM -0400, Sinan Kaya wrote:
> Current code is assuming that the address returned by dma_alloc_coherent
> is a logical address. This is not true on ARM/ARM64 systems. This patch
> replaces dma_alloc_coherent with dma_map_page API. The address returned
> can later be virtually mapped from the CPU side with vmap API.
>
> Signed-off-by: Sinan Kaya <[email protected]>
> ---
> drivers/net/ethernet/mellanox/mlx4/alloc.c | 37 ++++++++++++++++++++++--------
> 1 file changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/alloc.c b/drivers/net/ethernet/mellanox/mlx4/alloc.c
> index 0c51c69..22a7ae7 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/alloc.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/alloc.c
> @@ -618,13 +618,26 @@ int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
> return -ENOMEM;
>
> for (i = 0; i < buf->nbufs; ++i) {
> - buf->page_list[i].buf =
> - dma_alloc_coherent(&dev->persist->pdev->dev,
> - PAGE_SIZE,
> - &t, gfp);
> - if (!buf->page_list[i].buf)
> + struct page *page;
> +
> + page = alloc_page(GFP_KERNEL);
> + if (!page)
> goto err_free;
>
> + t = dma_map_page(&dev->persist->pdev->dev, page, 0,
> + PAGE_SIZE, DMA_BIDIRECTIONAL);
> +
> + if (dma_mapping_error(&dev->persist->pdev->dev, t)) {
> + __free_page(page);
> + goto err_free;
> + }
> +
> + buf->page_list[i].buf = page_address(page);
> + if (!buf->page_list[i].buf) {
> + __free_page(page);
> + goto err_free;
> + }
> +
> buf->page_list[i].map = t;
>
> memset(buf->page_list[i].buf, 0, PAGE_SIZE);
> @@ -666,11 +679,15 @@ void mlx4_buf_free(struct mlx4_dev *dev, int size, struct mlx4_buf *buf)
> vunmap(buf->direct.buf);
>
> for (i = 0; i < buf->nbufs; ++i)
> - if (buf->page_list[i].buf)
> - dma_free_coherent(&dev->persist->pdev->dev,
> - PAGE_SIZE,
> - buf->page_list[i].buf,
> - buf->page_list[i].map);
> + if (buf->page_list[i].buf) {
> + struct page *page;
> +
> + page = virt_to_page(buf->page_list[i].buf);
> + dma_unmap_page(&dev->persist->pdev->dev,
> + buf->page_list[i].map,
> + PAGE_SIZE, DMA_BIDIRECTIONAL);
> + __free_page(page);
> + }
> kfree(buf->page_list);
> }
> }
> --
> 1.8.2.1
>
On Sat, Apr 16, 2016 at 06:23:32PM -0400, Sinan Kaya wrote:
> Current code is assuming that the address returned by dma_alloc_coherent
> is a logical address. This is not true on ARM/ARM64 systems.
Can you explain what you mean with a 'logical address' and what actual
issue you're trying to solve?
On 2016-04-18 08:12, Christoph Hellwig wrote:
> On Sat, Apr 16, 2016 at 06:23:32PM -0400, Sinan Kaya wrote:
>> Current code is assuming that the address returned by
>> dma_alloc_coherent
>> is a logical address. This is not true on ARM/ARM64 systems.
>
> Can you explain what you mean with a 'logical address' and what actual
> issue you're trying to solve?
The vmap call is failing on arm64 systems because the DMA alloc API already
returns an address mapped with vmap.
Please see the arch/arm64/mm directory.
On Mon, Apr 18, 2016 at 09:06:18AM -0400, [email protected] wrote:
> On 2016-04-18 08:12, Christoph Hellwig wrote:
> >On Sat, Apr 16, 2016 at 06:23:32PM -0400, Sinan Kaya wrote:
> >>Current code is assuming that the address returned by dma_alloc_coherent
> >>is a logical address. This is not true on ARM/ARM64 systems.
> >
> >Can you explain what you mean with a 'logical address' and what actual
> >issue you're trying to solve?
>
> Vmap call is failing on arm64 systems because dma alloc api already returns
> an address mapped with vmap.
Please state your problem clearly. What I'm reverse engineering from
your posts is: because dma_alloc_coherent uses vmap-like mappings on
arm64 (all, or only some systems?), a driver using a lot of them might run into
limits of the vmap pool size.
Is this correct?
>
> Please see arch/arm64/mm directory.
---end quoted text---
On 4/18/2016 9:10 AM, Christoph Hellwig wrote:
> On Mon, Apr 18, 2016 at 09:06:18AM -0400, [email protected] wrote:
>> On 2016-04-18 08:12, Christoph Hellwig wrote:
>>> On Sat, Apr 16, 2016 at 06:23:32PM -0400, Sinan Kaya wrote:
>>>> Current code is assuming that the address returned by dma_alloc_coherent
>>>> is a logical address. This is not true on ARM/ARM64 systems.
>>>
>>> Can you explain what you mean with a 'logical address' and what actual
>>> issue you're trying to solve?
>>
Here is a good description of logical address vs. virtual address.
https://www.quora.com/What-is-the-Kernel-logical-and-virtual-addresses-What-is-the-difference-between-them-What-is-the-type-of-addresses-listed-in-the-System-map
>> Vmap call is failing on arm64 systems because dma alloc api already returns
>> an address mapped with vmap.
>
> Please state your problem clearly. What I'm reverse engineering from
> your posts is: because dma_alloc_coherent uses vmap-like mappings on
> arm64 (all, some systems?)
All arm64 systems.
>a driver using a lot of them might run into
> limits of the vmap pool size.
>
> Is this correct?
>
No, the driver is plain broken without this patch. It causes a kernel panic
during driver probe.
This is the definition of vmap API.
https://www.kernel.org/doc/htmldocs/kernel-api/API-vmap.html
VMAP allows you to make several pages look contiguous to the CPU.
It can only be used against logical addresses returned from kmalloc
or alloc_page.
You cannot take several virtually mapped addresses returned by dma_alloc_coherent
and try to make them virtually contiguous again.
The code happens to work on other architectures by pure luck. AFAIK, dma_alloc_coherent
returns logical addresses on Intel systems until it runs out of DMA memory. After
that intel arch will also start returning virtually mapped addresses and this code
will also fail. ARM64 on the other hand always returns a virtually mapped address.
The goal of this code is to allocate a bunch of page-sized memory chunks and make
them look contiguous. It is just using the wrong API. The correct approach is
kmalloc or alloc_page followed by dma_map_page, not dma_alloc_coherent.
Proper usage of dma_map_page requires the code to call the dma_sync API in the
correct places to be compatible with non-coherent systems. This code is already
assuming coherency. It would be nice to have dma_sync calls in the right places.
There is no harm in calling the dma_sync API on coherent systems, as the calls are
no-ops in the DMA mapping layer, whereas they are cache flushes on non-coherent
systems.
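To illustrate what I mean (a rough sketch only, not driver code; the helper names
below are made up and the dma_sync placement is just the generic streaming-DMA
pattern):

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/mm.h>

/* Hypothetical helper: allocate one page and map it for streaming DMA. */
static void *example_alloc_mapped_page(struct device *dev, dma_addr_t *map)
{
        struct page *page = alloc_page(GFP_KERNEL);

        if (!page)
                return NULL;

        *map = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
        if (dma_mapping_error(dev, *map)) {
                __free_page(page);
                return NULL;
        }
        return page_address(page);
}

/*
 * With streaming mappings, buffer ownership has to be handed back and forth.
 * On cache-coherent systems these calls are no-ops; on non-coherent systems
 * they perform the required cache maintenance.
 */
static void example_cpu_access(struct device *dev, dma_addr_t map, void *buf)
{
        dma_sync_single_for_cpu(dev, map, PAGE_SIZE, DMA_BIDIRECTIONAL);
        /* ... CPU reads/writes buf here ... */
        dma_sync_single_for_device(dev, map, PAGE_SIZE, DMA_BIDIRECTIONAL);
}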
>>
>> Please see arch/arm64/mm directory.
> ---end quoted text---
>
I hope it is clear now. The previous email was the most I could type on my phone.
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
On 4/18/2016 2:54 AM, Eli Cohen wrote:
> Sinan,
>
> if we get rid of this part of the code:
>
> if (BITS_PER_LONG == 64) {
> struct page **pages;
> pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
> if (!pages)
> goto err_free;
> ...
> ...
> if (!buf->direct.buf)
> goto err_free;
> }
>
> Does that solve the arm issue?
I will test. As far as I know, there is one more place where these DMA addresses
are passed to vmap. This is in mlx4_en_map_buffer.
I was trying to rearrange the allocation so that vmap actually works.
What do you think about mlx4_en_map_buffer?
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
On Mon, Apr 18, 2016 at 09:49:10AM -0400, Sinan Kaya wrote:
> Here is a good description of logical address vs. virtual address.
>
> https://www.quora.com/What-is-the-Kernel-logical-and-virtual-addresses-What-is-the-difference-between-them-What-is-the-type-of-addresses-listed-in-the-System-map
That's not how we use the terms in Linux. But it's not really the point
of my question either.
> > Is this correct?
> >
> No, the driver is plain broken without this patch. It causes a kernel panic
> during driver probe.
>
> This is the definition of vmap API.
>
> https://www.kernel.org/doc/htmldocs/kernel-api/API-vmap.html
Thanks for the pointer, but I'm actually the person who introduced vmap
to Linux a long time ago, and this is once again not my question.
> You cannot take several virtually mapped addresses returned by dma_alloc_coherent
> and try to make them virtually contiguous again.
But now we're getting closer to the issue: the mlx4_en driver is using
vmap on buffers allocated using dma_alloc_coherent if on a 64-bit
architecture, and that's obviously broken.
Now the big question is: why does it do that, given that
dma_alloc_coherent can be used for high-order allocations anyway (and in
fact many architectures implement it using a version of vmap)?
Let's get some answers to these questions from the Mellanox folks and
work from there.
Sure, this is not the complete patch. As far as I know, the problem
you're facing with ARM is that virt_to_page() does not provide the
correct page descriptor, so my suggestion will eliminate the need for
it.
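For reference, virt_to_page() is only defined for addresses in the kernel's
linear ("logical") mapping; something like the sketch below would be needed for
a vmap'd address (illustrative helper only, not a proposed fix, and coherent
memory is not even guaranteed to be backed by struct pages at all):

#include <linux/mm.h>
#include <linux/vmalloc.h>

/* Illustration: translating a kernel virtual address to its struct page. */
static struct page *example_addr_to_page(void *addr)
{
        if (is_vmalloc_addr(addr))              /* vmap()/vmalloc() area */
                return vmalloc_to_page(addr);
        return virt_to_page(addr);              /* linear mapping only */
}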
On Mon, Apr 18, 2016 at 09:53:30AM -0400, Sinan Kaya wrote:
> On 4/18/2016 2:54 AM, Eli Cohen wrote:
> > Sinan,
> >
> > if we get rid of this part of the code:
> >
> > if (BITS_PER_LONG == 64) {
> > struct page **pages;
> > pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
> > if (!pages)
> > goto err_free;
> > ...
> > ...
> > if (!buf->direct.buf)
> > goto err_free;
> > }
> >
> > Does that solve the arm issue?
>
> I will test. As far as I know, there is one more place these DMA addresses
> are called with vmap. This is in mlx4_en_map_buffer.
>
> I was trying to rearrange the allocation so that vmap actually works.
>
> What do you think about mlx4_en_map_buffer?
>
>
> --
> Sinan Kaya
> Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
On Mon, Apr 18, 2016 at 09:54:47AM +0300, Eli Cohen wrote:
> Sinan,
>
> if we get rid of this part of the code:
>
> if (BITS_PER_LONG == 64) {
> struct page **pages;
> pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
> if (!pages)
> goto err_free;
> ...
> ...
> if (!buf->direct.buf)
> goto err_free;
> }
>
> Does that solve the arm issue?
Not quite. You still have code in mlx4_en_map_buffer that performs
this mapping later if it wasn't mapped in mlx4_buf_alloc.
You'll need to get rid of that by ensuring the max_direct path is taken for all
the cases currently using mlx4_en_map_buffer as well.
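For illustration, forcing the direct path at one of those call sites could look
roughly like the snippet below (the mlx4_alloc_hwq_res() call and its arguments
are quoted from memory, so treat them as an assumption):

        /*
         * Sketch: instead of capping max_direct at 2 * PAGE_SIZE, pass the
         * full buffer size so mlx4_buf_alloc() always takes the nbufs == 1
         * direct path and mlx4_en_map_buffer() has nothing left to vmap.
         */
        err = mlx4_alloc_hwq_res(mdev->dev, &ring->wqres, ring->buf_size,
                                 ring->buf_size);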
Right, I did not suggest this as a patch but just wanted to pinpoint the problematic issue, which is that virt_to_page does not give you the correct pointer to the page.
-----Original Message-----
From: Christoph Hellwig [mailto:[email protected]]
Sent: Monday, April 18, 2016 9:33 AM
To: Eli Cohen <[email protected]>
Cc: Sinan Kaya <[email protected]>; [email protected]; [email protected]; [email protected]; Yishai Hadas <[email protected]>; [email protected]; [email protected]
Subject: Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
On Mon, Apr 18, 2016 at 09:54:47AM +0300, Eli Cohen wrote:
> Sinan,
>
> if we get rid of this part of the code:
>
> if (BITS_PER_LONG == 64) {
> struct page **pages;
> pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
> if (!pages)
> goto err_free;
> ...
> ...
> if (!buf->direct.buf)
> goto err_free;
> }
>
> Does that solve the arm issue?
Not quite. You still have code in mlx4_en_map_buffer that performs this mapping later if it wasn't mapped in mlx4_buf_alloc.
You'll need to get rid of that by ensuring the max_direct path is taken for all the cases currently using mlx4_en_map_buffer as well.
Sinan Kaya wrote:
>
> VMAP allows you to make several pages look contiguous to the CPU.
> It can only be used against logical addresses returned from kmalloc
> or alloc_page.
>
> You cannot take several virtually mapped addresses returned by dma_alloc_coherent
> and try to make them virtually contiguous again.
>
> The code happens to work on other architectures by pure luck. AFAIK, dma_alloc_coherent
> returns logical addresses on Intel systems until it runs out of DMA memory. After
> that intel arch will also start returning virtually mapped addresses and this code
> will also fail. ARM64 on the other hand always returns a virtually mapped address.
>
> The goal of this code is to allocate a bunch of page-sized memory chunks and make
> them look contiguous. It is just using the wrong API. The correct approach is
> kmalloc or alloc_page followed by dma_map_page, not dma_alloc_coherent.
>
> The proper usage of dma_map_page requires the code to call the dma_sync API in the
> correct places to be compatible with non-coherent systems. This code is already
> assuming coherency. It would be nice to have dma_sync calls in the right places.
> There is no harm in calling the dma_sync API on coherent systems, as the calls are
> no-ops in the DMA mapping layer, whereas they are cache flushes on non-coherent systems.
The text would be a great addition to the patch description.
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation collaborative project.
On Mon, Apr 18, 2016 at 02:39:36PM +0000, Eli Cohen wrote:
> Right, I did not suggest this as a patch but just wanted to pinpoint the problematic issue which is that virt_to_page does not give you the correct pointer to the page.
Removing both the virt_to_page + vmap calls would solve the issue
indeed.
On 4/18/2016 11:17 AM, Christoph Hellwig wrote:
> On Mon, Apr 18, 2016 at 02:39:36PM +0000, Eli Cohen wrote:
>> Right, I did not suggest this as a patch but just wanted to pinpoint the problematic issue which is that virt_to_page does not give you the correct pointer to the page.
>
> Removing both the virt_to_page + vmap calls would solve the issue
> indeed.
>
I was looking at the code. I don't see how removing virt_to_page + vmap
would solve the issue.
The code is trying to access the buffer space through the direct.buf member
from the CPU side. This member would become NULL when this code is
removed, and also in mlx4_en_map_buffer.
if (BITS_PER_LONG == 64) {
struct page **pages;
pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
if (!pages)
goto err_free;
...
...
if (!buf->direct.buf)
goto err_free;
}
drivers/net/ethernet/mellanox/mlx4/en_tx.c (2 hits)
Line 110: ring->buf = ring->wqres.buf.direct.buf;
Line 114: (unsigned long long) ring->wqres.buf.direct.map);
drivers/net/ethernet/mellanox/mlx4/en_rx.c (1 hit)
Line 404: ring->buf = ring->wqres.buf.direct.buf;
drivers/net/ethernet/mellanox/mlx4/en_cq.c (1 hit)
Line 85: cq->buf = (struct mlx4_cqe *)cq->wqres.buf.direct.buf;
What am I missing?
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
On 4/18/2016 11:15 AM, Timur Tabi wrote:
> Sinan Kaya wrote:
>>
>> VMAP allows you to make several pages look contiguous to the CPU.
>> It can only be used against logical addresses returned from kmalloc
>> or alloc_page.
>>
>> You cannot take several virtually mapped addresses returned by dma_alloc_coherent
>> and try to make them virtually contiguous again.
>>
>> The code happens to work on other architectures by pure luck. AFAIK, dma_alloc_coherent
>> returns logical addresses on Intel systems until it runs out of DMA memory. After
>> that intel arch will also start returning virtually mapped addresses and this code
>> will also fail. ARM64 on the other hand always returns a virtually mapped address.
>>
>> The goal of this code is to allocate a bunch of page-sized memory chunks and make
>> them look contiguous. It is just using the wrong API. The correct approach is
>> kmalloc or alloc_page followed by dma_map_page, not dma_alloc_coherent.
>>
>> The proper usage of dma_map_page requires the code to call the dma_sync API in the
>> correct places to be compatible with non-coherent systems. This code is already
>> assuming coherency. It would be nice to have dma_sync calls in the right places.
>> There is no harm in calling the dma_sync API on coherent systems, as the calls are
>> no-ops in the DMA mapping layer, whereas they are cache flushes on non-coherent systems.
>
> The text would be a great addition to the patch description.
>
I can do that on the next version.
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
On Mon, Apr 18, 2016 at 11:21:12AM -0400, Sinan Kaya wrote:
> I was looking at the code. I don't see how removing virt_to_page + vmap
> would solve the issue.
>
> The code is trying to access the buffer space with direct.buf member
> from the CPU side. This member would become NULL, when this code is
> removed and also in mlx4_en_map_buffer.
>
> ...
>
> What am I missing?
As mentioned before, you'll also need to enforce that you hit the nbufs = 1
case for these. In fact, most callers should simply switch to a plain
dma_zalloc_coherent call without all these wrappers. If we have a case
where we really want multiple buffers that don't have to be contiguous
(maybe the MTT case), I'd rather open-code that instead of building this
confusing interface on top of it.
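A minimal sketch of that direction, assuming illustrative field names on the
ring structure (dma_zalloc_coherent being the zeroing variant available at the
time):

        /* Sketch only: one contiguous, zeroed, coherent allocation per ring. */
        ring->buf = dma_zalloc_coherent(&dev->persist->pdev->dev,
                                        ring->buf_size, &ring->dma_addr,
                                        GFP_KERNEL);
        if (!ring->buf)
                return -ENOMEM;

        /* ... and the matching teardown ... */
        dma_free_coherent(&dev->persist->pdev->dev, ring->buf_size,
                          ring->buf, ring->dma_addr);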
From: [email protected]
Date: Mon, 18 Apr 2016 01:06:27 -0400
> On 2016-04-18 00:00, David Miller wrote:
>> From: Sinan Kaya <[email protected]>
>> Date: Sat, 16 Apr 2016 18:23:32 -0400
>>
>>> Current code is assuming that the address returned by
>>> dma_alloc_coherent
>>> is a logical address. This is not true on ARM/ARM64 systems. This
>>> patch
>>> replaces dma_alloc_coherent with dma_map_page API. The address
>>> returned
>>> can later be virtually mapped from the CPU side with vmap API.
>>> Signed-off-by: Sinan Kaya <[email protected]>
>> You can't do this.
>> The DMA map page API gives non-coherent mappings, and thus requires
>> proper flushing.
>> So a straight conversion like this is never legitimate.
>
> I would agree on proper DMA API usage. However, the code is already
> assuming a coherent architecture by mapping the CPU pages as
> PAGE_KERNEL.
>
> dma_map_page returns cached buffers, and you don't need cache flushes
> on a coherent architecture to make the data visible.
All you are telling me is that there are two bugs instead of one, so now
both need to be fixed.
On 4/18/2016 11:59 AM, David Miller wrote:
> From: [email protected]
> Date: Mon, 18 Apr 2016 01:06:27 -0400
>
>> On 2016-04-18 00:00, David Miller wrote:
>>> From: Sinan Kaya <[email protected]>
>>> Date: Sat, 16 Apr 2016 18:23:32 -0400
>>>
>>>> Current code is assuming that the address returned by
>>>> dma_alloc_coherent
>>>> is a logical address. This is not true on ARM/ARM64 systems. This
>>>> patch
>>>> replaces dma_alloc_coherent with dma_map_page API. The address
>>>> returned
>>>> can later be virtually mapped from the CPU side with vmap API.
>>>> Signed-off-by: Sinan Kaya <[email protected]>
>>> You can't do this.
>>> The DMA map page API gives non-coherent mappings, and thus requires
>>> proper flushing.
>>> So a straight conversion like this is never legitimate.
>>
>> I would agree on proper DMA API usage. However, the code is already
>> assuming a coherent architecture by mapping the CPU pages as
>> PAGE_KERNEL.
>>
>> dma_map_page returns cached buffers, and you don't need cache flushes
>> on a coherent architecture to make the data visible.
>
> All you are telling me is that there are two bugs instead of one, so now
> both need to be fixed.
>
The removal of vmap also fixes the coherency assumption. It is one fix
for both.
I was thinking of submitting another patch to change the PAGE_KERNEL argument
to vmap based on the coherency support of the architecture. I don't need
to do that anymore if the other experiment works.
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
On 4/18/2016 11:40 AM, Christoph Hellwig wrote:
> On Mon, Apr 18, 2016 at 11:21:12AM -0400, Sinan Kaya wrote:
>> I was looking at the code. I don't see how removing virt_to_page + vmap
>> would solve the issue.
>>
>> The code is trying to access the buffer space with direct.buf member
>> from the CPU side. This member would become NULL, when this code is
>> removed and also in mlx4_en_map_buffer.
>>
>> ...
>>
>> What am I missing?
>
> As mentioned before you'll also need to enforce you hit the nbufs = 1
> case for these. In fact most callers should simply switch to a plain
> dma_zalloc_coherent call without all these wrappers. If we have a case
> where we really want multiple buffers that don't have to be contiguous
> (maybe the MTT case) I'd rather opencode that instead of building this
> confusing interface on top of it.
>
I hit the first problem with the CQEs. The alloc routine is allocating pages,
but the CQ code is trying to do linear access through the direct.buf member.
I see that the code below implements page_list support. I'd like to do the same
thing for the CQEs. Let me know if I'm on the right path.
static struct mlx4_eqe *get_eqe(struct mlx4_eq *eq, u32 entry, u8 eqe_factor,
u8 eqe_size)
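A CQE accessor along the same lines might look roughly like this sketch (not
existing driver code; it mirrors what mlx4_buf_offset() already does for the
page_list case and assumes cqe_size divides PAGE_SIZE so entries never straddle
pages):

static struct mlx4_cqe *get_cqe_from_pages(struct mlx4_buf *buf, u32 entry,
                                           int cqe_size)
{
        unsigned long offset = (unsigned long)entry * cqe_size;

        /* Pick the backing page, then the offset within that page. */
        return buf->page_list[offset >> PAGE_SHIFT].buf +
               (offset & (PAGE_SIZE - 1));
}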
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
On 4/18/2016 11:40 AM, Christoph Hellwig wrote:
> On Mon, Apr 18, 2016 at 11:21:12AM -0400, Sinan Kaya wrote:
>> I was looking at the code. I don't see how removing virt_to_page + vmap
>> would solve the issue.
>>
>> The code is trying to access the buffer space with direct.buf member
>> from the CPU side. This member would become NULL, when this code is
>> removed and also in mlx4_en_map_buffer.
>>
>> ...
>>
>> What am I missing?
>
> As mentioned before you'll also need to enforce you hit the nbufs = 1
> case for these. In fact most callers should simply switch to a plain
> dma_zalloc_coherent call without all these wrappers. If we have a case
> where we really want multiple buffers that don't have to be contiguous
> (maybe the MTT case) I'd rather opencode that instead of building this
> confusing interface on top of it.
>
So, I did my fair share of investigation. As I pointed out in my previous email,
the code is allocating a bunch of page-sized arrays and using them for receive,
transmit and control descriptors.
I'm unable to limit nbufs to 1 because none of these allocations makes a single
contiguous allocation by default. They all take the multiple-page approach due to
the 2 * PAGE_SIZE max_direct parameter being passed.
I tried changing the code to handle page_list vs. single allocation. I was able
to do this for the CQE and receive queues, since both of them allocate fixed-size
chunks. However, I couldn't do this for the transmit queue.
The transmit code uses the array of descriptors for variable-sized transfers, and
it also assumes that the descriptors are contiguous.
When backed by individual pages, one TX descriptor can spill beyond the first page
and cause illegal writes.
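Roughly what the TX path assumes today (paraphrased from en_tx.c from memory, so
take the exact expression as an assumption):

        /*
         * The ring is addressed linearly: a descriptor spanning several
         * TXBBs is written as one contiguous region starting here.  With
         * per-page backing, a descriptor near a page boundary would spill
         * into unrelated memory.
         */
        tx_desc = ring->buf + index * TXBB_SIZE;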
In the end, my proposed code in this patch is much simpler than what I tried to
achieve by removing the vmap API.
Another alternative is to force the code to use a single DMA allocation on all
64-bit architectures. Something like this:
-- a/drivers/net/ethernet/mellanox/mlx4/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx4/alloc.c
@@ -588,7 +588,7 @@ int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
{
dma_addr_t t;
- if (size <= max_direct) {
+ if ((size <= max_direct) || (BITS_PER_LONG == 64)){
buf->nbufs = 1;
buf->npages = 1;
buf->page_shift = get_order(size) + PAGE_SHIFT;
This also works on arm64. My proposal is more scalable in terms of memory
consumption compared to this one.
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
What I think we need is something like the patch below. In the long
run we should also kill the mlx4_buf structure, which now is pretty
pointless.
---
From a493881d2a6c90152d3daabb7b6b3afd1d254d78 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <[email protected]>
Date: Tue, 19 Apr 2016 14:12:14 -0400
Subject: mlx4_en: don't try to split and vmap dma coherent allocations
The memory returned by dma_alloc_coherent is not suitable for calling
virt_to_page on it, as it might for example come from vmap allocator.
Remove the code that calls virt_to_page and vmap on dma coherent
allocations from the mlx4 drivers, and replace them with a single
high-order dma_alloc_coherent call.
Signed-off-by: Christoph Hellwig <[email protected]>
Reported-by: Sinan Kaya <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/alloc.c | 89 ++++-------------------
drivers/net/ethernet/mellanox/mlx4/en_cq.c | 7 --
drivers/net/ethernet/mellanox/mlx4/en_resources.c | 31 --------
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 8 --
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 9 ---
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 -
drivers/net/ethernet/mellanox/mlx4/mr.c | 5 +-
include/linux/mlx4/device.h | 8 +-
8 files changed, 16 insertions(+), 143 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/alloc.c b/drivers/net/ethernet/mellanox/mlx4/alloc.c
index 0c51c69..8d15b35 100644
--- a/drivers/net/ethernet/mellanox/mlx4/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx4/alloc.c
@@ -588,91 +588,30 @@ int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
{
dma_addr_t t;
- if (size <= max_direct) {
- buf->nbufs = 1;
- buf->npages = 1;
- buf->page_shift = get_order(size) + PAGE_SHIFT;
- buf->direct.buf = dma_alloc_coherent(&dev->persist->pdev->dev,
- size, &t, gfp);
- if (!buf->direct.buf)
- return -ENOMEM;
-
- buf->direct.map = t;
-
- while (t & ((1 << buf->page_shift) - 1)) {
- --buf->page_shift;
- buf->npages *= 2;
- }
+ buf->npages = 1;
+ buf->page_shift = get_order(size) + PAGE_SHIFT;
+ buf->direct.buf = dma_alloc_coherent(&dev->persist->pdev->dev,
+ size, &t, gfp);
+ if (!buf->direct.buf)
+ return -ENOMEM;
- memset(buf->direct.buf, 0, size);
- } else {
- int i;
-
- buf->direct.buf = NULL;
- buf->nbufs = (size + PAGE_SIZE - 1) / PAGE_SIZE;
- buf->npages = buf->nbufs;
- buf->page_shift = PAGE_SHIFT;
- buf->page_list = kcalloc(buf->nbufs, sizeof(*buf->page_list),
- gfp);
- if (!buf->page_list)
- return -ENOMEM;
-
- for (i = 0; i < buf->nbufs; ++i) {
- buf->page_list[i].buf =
- dma_alloc_coherent(&dev->persist->pdev->dev,
- PAGE_SIZE,
- &t, gfp);
- if (!buf->page_list[i].buf)
- goto err_free;
-
- buf->page_list[i].map = t;
-
- memset(buf->page_list[i].buf, 0, PAGE_SIZE);
- }
+ buf->direct.map = t;
- if (BITS_PER_LONG == 64) {
- struct page **pages;
- pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
- if (!pages)
- goto err_free;
- for (i = 0; i < buf->nbufs; ++i)
- pages[i] = virt_to_page(buf->page_list[i].buf);
- buf->direct.buf = vmap(pages, buf->nbufs, VM_MAP, PAGE_KERNEL);
- kfree(pages);
- if (!buf->direct.buf)
- goto err_free;
- }
+ while (t & ((1 << buf->page_shift) - 1)) {
+ --buf->page_shift;
+ buf->npages *= 2;
}
+ memset(buf->direct.buf, 0, size);
return 0;
-
-err_free:
- mlx4_buf_free(dev, size, buf);
-
- return -ENOMEM;
}
EXPORT_SYMBOL_GPL(mlx4_buf_alloc);
void mlx4_buf_free(struct mlx4_dev *dev, int size, struct mlx4_buf *buf)
{
- int i;
-
- if (buf->nbufs == 1)
- dma_free_coherent(&dev->persist->pdev->dev, size,
- buf->direct.buf,
- buf->direct.map);
- else {
- if (BITS_PER_LONG == 64)
- vunmap(buf->direct.buf);
-
- for (i = 0; i < buf->nbufs; ++i)
- if (buf->page_list[i].buf)
- dma_free_coherent(&dev->persist->pdev->dev,
- PAGE_SIZE,
- buf->page_list[i].buf,
- buf->page_list[i].map);
- kfree(buf->page_list);
- }
+ dma_free_coherent(&dev->persist->pdev->dev, size,
+ buf->direct.buf,
+ buf->direct.map);
}
EXPORT_SYMBOL_GPL(mlx4_buf_free);
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index af975a2..16ef8cf 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -78,17 +78,11 @@ int mlx4_en_create_cq(struct mlx4_en_priv *priv,
if (err)
goto err_cq;
- err = mlx4_en_map_buffer(&cq->wqres.buf);
- if (err)
- goto err_res;
-
cq->buf = (struct mlx4_cqe *)cq->wqres.buf.direct.buf;
*pcq = cq;
return 0;
-err_res:
- mlx4_free_hwq_res(mdev->dev, &cq->wqres, cq->buf_size);
err_cq:
kfree(cq);
*pcq = NULL;
@@ -177,7 +171,6 @@ void mlx4_en_destroy_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq **pcq)
struct mlx4_en_dev *mdev = priv->mdev;
struct mlx4_en_cq *cq = *pcq;
- mlx4_en_unmap_buffer(&cq->wqres.buf);
mlx4_free_hwq_res(mdev->dev, &cq->wqres, cq->buf_size);
if (mlx4_is_eq_vector_valid(mdev->dev, priv->port, cq->vector) &&
cq->is_tx == RX)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_resources.c b/drivers/net/ethernet/mellanox/mlx4/en_resources.c
index 02e925d..a6b0db0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_resources.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_resources.c
@@ -107,37 +107,6 @@ int mlx4_en_change_mcast_lb(struct mlx4_en_priv *priv, struct mlx4_qp *qp,
return ret;
}
-int mlx4_en_map_buffer(struct mlx4_buf *buf)
-{
- struct page **pages;
- int i;
-
- if (BITS_PER_LONG == 64 || buf->nbufs == 1)
- return 0;
-
- pages = kmalloc(sizeof *pages * buf->nbufs, GFP_KERNEL);
- if (!pages)
- return -ENOMEM;
-
- for (i = 0; i < buf->nbufs; ++i)
- pages[i] = virt_to_page(buf->page_list[i].buf);
-
- buf->direct.buf = vmap(pages, buf->nbufs, VM_MAP, PAGE_KERNEL);
- kfree(pages);
- if (!buf->direct.buf)
- return -ENOMEM;
-
- return 0;
-}
-
-void mlx4_en_unmap_buffer(struct mlx4_buf *buf)
-{
- if (BITS_PER_LONG == 64 || buf->nbufs == 1)
- return;
-
- vunmap(buf->direct.buf);
-}
-
void mlx4_en_sqp_event(struct mlx4_qp *qp, enum mlx4_event event)
{
return;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 86bcfe5..bc6c14a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -396,11 +396,6 @@ int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
if (err)
goto err_info;
- err = mlx4_en_map_buffer(&ring->wqres.buf);
- if (err) {
- en_err(priv, "Failed to map RX buffer\n");
- goto err_hwq;
- }
ring->buf = ring->wqres.buf.direct.buf;
ring->hwtstamp_rx_filter = priv->hwtstamp_config.rx_filter;
@@ -408,8 +403,6 @@ int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
*pring = ring;
return 0;
-err_hwq:
- mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
err_info:
vfree(ring->rx_info);
ring->rx_info = NULL;
@@ -513,7 +506,6 @@ void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
struct mlx4_en_dev *mdev = priv->mdev;
struct mlx4_en_rx_ring *ring = *pring;
- mlx4_en_unmap_buffer(&ring->wqres.buf);
mlx4_free_hwq_res(mdev->dev, &ring->wqres, size * stride + TXBB_SIZE);
vfree(ring->rx_info);
ring->rx_info = NULL;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index c0d7b72..3c6b59c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -101,12 +101,6 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
goto err_bounce;
}
- err = mlx4_en_map_buffer(&ring->wqres.buf);
- if (err) {
- en_err(priv, "Failed to map TX buffer\n");
- goto err_hwq_res;
- }
-
ring->buf = ring->wqres.buf.direct.buf;
en_dbg(DRV, priv, "Allocated TX ring (addr:%p) - buf:%p size:%d buf_size:%d dma:%llx\n",
@@ -155,8 +149,6 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
err_reserve:
mlx4_qp_release_range(mdev->dev, ring->qpn, 1);
err_map:
- mlx4_en_unmap_buffer(&ring->wqres.buf);
-err_hwq_res:
mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
err_bounce:
kfree(ring->bounce_buf);
@@ -182,7 +174,6 @@ void mlx4_en_destroy_tx_ring(struct mlx4_en_priv *priv,
mlx4_qp_remove(mdev->dev, &ring->qp);
mlx4_qp_free(mdev->dev, &ring->qp);
mlx4_qp_release_range(priv->mdev->dev, ring->qpn, 1);
- mlx4_en_unmap_buffer(&ring->wqres.buf);
mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
kfree(ring->bounce_buf);
ring->bounce_buf = NULL;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index d12ab6a..a70e2d0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -671,8 +671,6 @@ void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride,
int is_tx, int rss, int qpn, int cqn, int user_prio,
struct mlx4_qp_context *context);
void mlx4_en_sqp_event(struct mlx4_qp *qp, enum mlx4_event event);
-int mlx4_en_map_buffer(struct mlx4_buf *buf);
-void mlx4_en_unmap_buffer(struct mlx4_buf *buf);
int mlx4_en_change_mcast_lb(struct mlx4_en_priv *priv, struct mlx4_qp *qp,
int loopback);
void mlx4_en_calc_rx_buf(struct net_device *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c
index 9319519..b61052f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -802,10 +802,7 @@ int mlx4_buf_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
return -ENOMEM;
for (i = 0; i < buf->npages; ++i)
- if (buf->nbufs == 1)
- page_list[i] = buf->direct.map + (i << buf->page_shift);
- else
- page_list[i] = buf->page_list[i].map;
+ page_list[i] = buf->direct.map + (i << buf->page_shift);
err = mlx4_write_mtt(dev, mtt, 0, buf->npages, page_list);
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 8541a91..536b547 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -618,8 +618,6 @@ struct mlx4_buf_list {
struct mlx4_buf {
struct mlx4_buf_list direct;
- struct mlx4_buf_list *page_list;
- int nbufs;
int npages;
int page_shift;
};
@@ -1051,11 +1049,7 @@ int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
void mlx4_buf_free(struct mlx4_dev *dev, int size, struct mlx4_buf *buf);
static inline void *mlx4_buf_offset(struct mlx4_buf *buf, int offset)
{
- if (BITS_PER_LONG == 64 || buf->nbufs == 1)
- return buf->direct.buf + offset;
- else
- return buf->page_list[offset >> PAGE_SHIFT].buf +
- (offset & (PAGE_SIZE - 1));
+ return buf->direct.buf + offset;
}
int mlx4_pd_alloc(struct mlx4_dev *dev, u32 *pdn);
--
2.1.4
On 4/19/2016 2:22 PM, Christoph Hellwig wrote:
> What I think we need is something like the patch below. In the long
> run we should also kill the mlx4_buf structure, which now is pretty
> pointless.
Maybe; this could be the correct approach if we can guarantee that the
architecture can allocate the requested amount of memory with
dma_alloc_coherent.
When I brought this issue up a year ago, the objection was that my code
didn't compile on Intel (dma_to_phys) and also that some arches run out of
DMA memory with the existing customer base.
If there is a real need to maintain compatibility with the existing
architectures due to a limited amount of DMA memory, we need to correct this
code to make proper use of vmap via alloc_pages and also insert dma_sync
calls in the proper places for DMA API conformance.
Also, the TX descriptors always have to be allocated from a single DMA region,
or the TX code needs to be corrected to support page_list.
If we can live with just using dma_alloc_coherent, your solution is
better. I was trying to add this support for 64-bit arches only while
maintaining support for the existing code base.
>
> ---
> From a493881d2a6c90152d3daabb7b6b3afd1d254d78 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <[email protected]>
> Date: Tue, 19 Apr 2016 14:12:14 -0400
> Subject: mlx4_en: don't try to split and vmap dma coherent allocations
>
> The memory returned by dma_alloc_coherent is not suitable for calling
> virt_to_page on it, as it might for example come from vmap allocator.
>
> Remove the code that calls virt_to_page and vmap on dma coherent
> allocations from the mlx4 drivers, and replace them with a single
> high-order dma_alloc_coherent call.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Reported-by: Sinan Kaya <[email protected]>
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Hi Sinan,
We are working at Mellanox on a solution which
removes the vmap call and allocates contiguous memory (using dma_alloc_coherent).
Thanks,
Eran
On Tue, Apr 19, 2016 at 9:37 PM, Sinan Kaya <[email protected]> wrote:
> On 4/19/2016 2:22 PM, Christoph Hellwig wrote:
>> What I think we need is something like the patch below. In the long
>> run we should also kill the mlx4_buf structure, which now is pretty
>> pointless.
>
> Maybe; this could be the correct approach if we can guarantee that the
> architecture can allocate the requested amount of memory with
> dma_alloc_coherent.
>
> When I brought this issue a year ago, the objection was that my code
> doesn't compile on intel (dma_to_phys) and also some arches run out of
> DMA memory with existing customer base.
>
> If there is a real need to maintain compatibility with the existing
> architectures due to limited amount of DMA memory, we need to correct this
> code to make proper use of vmap via alloc_pages and also insert the
> dma_sync in proper places for DMA API conformance.
>
> Also, the tx descriptors always has to be allocated from a single DMA region
> or the tx code needs to be corrected to support page_list.
>
> If we can live with just using dma_alloc_coherent, your solution is
> better. I was trying to put this support for 64bit arches only while
> maintaining support for the existing code base.
>
>>
>> ---
>> From a493881d2a6c90152d3daabb7b6b3afd1d254d78 Mon Sep 17 00:00:00 2001
>> From: Christoph Hellwig <[email protected]>
>> Date: Tue, 19 Apr 2016 14:12:14 -0400
>> Subject: mlx4_en: don't try to split and vmap dma coherent allocations
>>
>> The memory returned by dma_alloc_coherent is not suitable for calling
>> virt_to_page on it, as it might for example come from vmap allocator.
>>
>> Remove the code that calls virt_to_page and vmap on dma coherent
>> allocations from the mlx4 drivers, and replace them with a single
>> high-order dma_alloc_coherent call.
>>
>> Signed-off-by: Christoph Hellwig <[email protected]>
>> Reported-by: Sinan Kaya <[email protected]>
>
>
> --
> Sinan Kaya
> Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
On 4/19/2016 2:22 PM, Christoph Hellwig wrote:
> What I think we need is something like the patch below. In the long
> run we should also kill the mlx4_buf structure, which now is pretty
> pointless.
>
It has been 1.5 years since I reported the problem. We came up with three
different solutions this week. I'd like to see a version of the solution
to get merged until Mellanox comes up with a better solution with another
patch. My proposal is to use this one.
-- a/drivers/net/ethernet/mellanox/mlx4/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx4/alloc.c
@@ -588,7 +588,7 @@ int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
{
dma_addr_t t;
- if (size <= max_direct) {
+ if ((size <= max_direct) || (BITS_PER_LONG == 64)){
buf->nbufs = 1;
buf->npages = 1;
buf->page_shift = get_order(size) + PAGE_SHIFT;
Of course, this is assuming that you are not ready to submit your patch yet. If you
are, feel free to post.
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Apologies,
Replied to an older post by mistake. I was trying to reply to Eran.
>Hi Sinan,
>
>We are working at Mellanox on a solution which
>removes the vmap call and allocates contiguous memory (using dma_alloc_coherent).
>
>Thanks,
>Eran
>
>
> On 4/20/2016 9:35 AM, Sinan Kaya wrote:
> On 4/19/2016 2:22 PM, Christoph Hellwig wrote:
>> What I think we need is something like the patch below. In the long
>> run we should also kill the mlx4_buf structure, which now is pretty
>> pointless.
>>
>
> It has been 1.5 years since I reported the problem. We came up with three
> different solutions this week. I'd like to see a version of the solution
> to get merged until Mellanox comes up with a better solution with another
> patch. My proposal is to use this one.
>
> -- a/drivers/net/ethernet/mellanox/mlx4/alloc.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/alloc.c
> @@ -588,7 +588,7 @@ int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
> {
> dma_addr_t t;
>
> - if (size <= max_direct) {
> + if ((size <= max_direct) || (BITS_PER_LONG == 64)){
> buf->nbufs = 1;
> buf->npages = 1;
> buf->page_shift = get_order(size) + PAGE_SHIFT;
>
> Of course, this is assuming that you are not ready to submit your patch yet. If you
> are, feel free to post.
>
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Sinan Kaya wrote:
> I'd like to see a version of the solution
> to get merged until Mellanox comes up with a better solution with another
> patch.
Yes, I agree 100%.
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation collaborative project.
>
> It has been 1.5 years since I reported the problem. We came up with three
> different solutions this week. I'd like to see a version of the solution
> to get merged until Mellanox comes up with a better solution with another
> patch. My proposal is to use this one.
>
We will post our suggestion here in the following days.
On 4/20/2016 2:40 PM, Eran Ben Elisha wrote:
>>
>> It has been 1.5 years since I reported the problem. We came up with three
>> different solutions this week. I'd like to see a version of the solution
>> to get merged until Mellanox comes up with a better solution with another
>> patch. My proposal is to use this one.
>>
>
> We will post our suggestion here in the following days.
>
Thanks, please keep me in CC; I'm not normally subscribed to this list.
I can post a Tested-by after testing.
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project