2019-08-07 02:51:40

by Firo Yang

[permalink] [raw]
Subject: [PATCH v2 1/1] ixgbe: sync the first fragment unconditionally

In a Xen environment, if Xen-swiotlb is enabled, the ixgbe driver
may allocate a page as the DMA buffer for the first fragment which
is not suitable for Xen-swiotlb to perform DMA operations on.
Xen-swiotlb then has to allocate another page internally for the DMA
operation, which requires syncing between the two pages. However,
since commit f3213d932173 ("ixgbe: Update driver to make use of DMA
attributes in Rx path"), the unmap operation is performed with
DMA_ATTR_SKIP_CPU_SYNC. As a result, the sync is not performed.

To fix this problem, always sync before possibly performing a page
unmap operation.

Fixes: f3213d932173 ("ixgbe: Update driver to make use of DMA attributes in Rx path")
Reviewed-by: Alexander Duyck <[email protected]>
Signed-off-by: Firo Yang <[email protected]>
---

Changes from v1:
* Improved the patch description.
* Added Reviewed-by: and Fixes: tags as suggested by Alexander Duyck

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index cbaf712d6529..200de9838096 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1825,13 +1825,7 @@ static void ixgbe_pull_tail(struct ixgbe_ring *rx_ring,
static void ixgbe_dma_sync_frag(struct ixgbe_ring *rx_ring,
struct sk_buff *skb)
{
- /* if the page was released unmap it, else just sync our portion */
- if (unlikely(IXGBE_CB(skb)->page_released)) {
- dma_unmap_page_attrs(rx_ring->dev, IXGBE_CB(skb)->dma,
- ixgbe_rx_pg_size(rx_ring),
- DMA_FROM_DEVICE,
- IXGBE_RX_DMA_ATTR);
- } else if (ring_uses_build_skb(rx_ring)) {
+ if (ring_uses_build_skb(rx_ring)) {
unsigned long offset = (unsigned long)(skb->data) & ~PAGE_MASK;

dma_sync_single_range_for_cpu(rx_ring->dev,
@@ -1848,6 +1842,14 @@ static void ixgbe_dma_sync_frag(struct ixgbe_ring *rx_ring,
skb_frag_size(frag),
DMA_FROM_DEVICE);
}
+
+ /* If the page was released, just unmap it. */
+ if (unlikely(IXGBE_CB(skb)->page_released)) {
+ dma_unmap_page_attrs(rx_ring->dev, IXGBE_CB(skb)->dma,
+ ixgbe_rx_pg_size(rx_ring),
+ DMA_FROM_DEVICE,
+ IXGBE_RX_DMA_ATTR);
+ }
}

/**
--
2.16.4


2019-08-07 08:18:29

by Jacob Wen

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] ixgbe: sync the first fragment unconditionally

I think the description is not correct. Consider using something like below.


In a Xen environment, due to memory fragmentation ixgbe may allocate a
'DMA' buffer with pages that are not physically contiguous.

A NIC doesn't support writing to such a buffer directly, so xen-swiotlb
uses physically contiguous pages from the swiotlb buffer for the NIC
instead.

The unmap operation is then used to copy the swiotlb buffer back to the
pages allocated by ixgbe.

On 8/7/19 10:49 AM, Firo Yang wrote:
> In Xen environment, if Xen-swiotlb is enabled, ixgbe driver
> could possibly allocate a page, DMA memory buffer, for the first
> fragment which is not suitable for Xen-swiotlb to do DMA operations.
> Xen-swiotlb have to internally allocate another page for doing DMA
> operations. It requires syncing between those two pages. However,
> since commit f3213d932173 ("ixgbe: Update driver to make use of DMA
> attributes in Rx path"), the unmap operation is performed with
> DMA_ATTR_SKIP_CPU_SYNC. As a result, the sync is not performed.
>
> To fix this problem, always sync before possibly performing a page
> unmap operation.
>
> Fixes: f3213d932173 ("ixgbe: Update driver to make use of DMA
> attributes in Rx path")
> Reviewed-by: Alexander Duyck <[email protected]>
> Signed-off-by: Firo Yang <[email protected]>
> [...]

2019-08-07 09:11:36

by Firo Yang

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] ixgbe: sync the first fragment unconditionally

On 08/07/2019 15:56, Jacob Wen wrote:
> I think the description is not correct. Consider using something like below.
Thank you for comments.

>
> In Xen environment, due to memory fragmentation ixgbe may allocate a 'DMA'
> buffer with pages that are not physically contiguous.
Actually, I didn't look into the reason why ixgbe got a DMA buffer which
was mapped to Xen-swiotlb area.

But I don't think this issue relates to physical memory contiguity because, in
our case, one ixgbe_rx_buffer is associated with at most one page.

If you take a look at the related code, you will find there are several reasons
for mapping a DMA buffer to the Xen-swiotlb area:

static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, ...)
	...
	if (dma_capable(dev, dev_addr, size) &&
	    !range_straddles_page_boundary(phys, size) &&
	    !xen_arch_need_swiotlb(dev, phys, dev_addr) &&
	    swiotlb_force != SWIOTLB_FORCE)
		goto done;

// Firo
>
> A NIC doesn't support writing to such a buffer directly, so xen-swiotlb
> uses physically contiguous pages from the swiotlb buffer for the NIC
> instead.
>
> The unmap operation is then used to copy the swiotlb buffer back to the
> pages allocated by ixgbe.
>
> On 8/7/19 10:49 AM, Firo Yang wrote:
> > [...]

2019-08-08 01:59:28

by Jacob Wen

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] ixgbe: sync the first fragment unconditionally


On 8/7/19 4:38 PM, Firo Yang wrote:
> The 08/07/2019 15:56, Jacob Wen wrote:
>> I think the description is not correct. Consider using something like below.
> Thank you for comments.
>
>> In Xen environment, due to memory fragmentation ixgbe may allocate a 'DMA'
>> buffer with pages that are not physically contiguous.
> Actually, I didn't look into the reason why ixgbe got a DMA buffer which
> was mapped to Xen-swiotlb area.
Yes. I was wrong. You don't need to tell the exact reason.
>
> But I don't think this issue relates to physical memory contiguity because, in
> our case, one ixgbe_rx_buffer is associated with at most one page.

This is interesting.

I guess the performance of the NIC in your environment is not good due
to the extra cost of the bounce buffer.

>
> If you take a look at the related code, you will find there are several reasons
> for mapping a DMA buffer to Xen-swiotlb area:
> static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, ...)
> 	...
> 	if (dma_capable(dev, dev_addr, size) &&
> 	    !range_straddles_page_boundary(phys, size) &&
> 	    !xen_arch_need_swiotlb(dev, phys, dev_addr) &&
> 	    swiotlb_force != SWIOTLB_FORCE)
> 		goto done;
>
> // Firo
>> A NIC doesn't support directly write such buffer. So xen-swiotlb would use
>> the pages, which are physically contiguous, from the swiotlb buffer for the
>> NIC.
>>
>> The unmap operation is used to copy the swiotlb buffer to the pages that are
>> allocated by ixgbe.
>>
>> On 8/7/19 10:49 AM, Firo Yang wrote:
>>> [...]

2019-08-08 04:02:06

by Alexander Duyck

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] ixgbe: sync the first fragment unconditionally

On Wed, Aug 7, 2019 at 6:58 PM Jacob Wen <[email protected]> wrote:
>
>
> On 8/7/19 4:38 PM, Firo Yang wrote:
> > The 08/07/2019 15:56, Jacob Wen wrote:
> >> I think the description is not correct. Consider using something like below.
> > Thank you for comments.
> >
> >> In Xen environment, due to memory fragmentation ixgbe may allocate a 'DMA'
> >> buffer with pages that are not physically contiguous.
> > Actually, I didn't look into the reason why ixgbe got a DMA buffer which
> > was mapped to Xen-swiotlb area.
> Yes. I was wrong. You don't need to tell the exact reason.
> >
> > But I don't think this issue relates to physical memory contiguity because, in
> > our case, one ixgbe_rx_buffer is associated with at most one page.
>
> This is interesting.
>
> I guess the performance of the NIC in your environment is not good due
> to the extra cost of bounce buffer.

If I recall correctly, the Rx performance for ixgbe shouldn't be too
bad even with a bounce buffer. The costs of map/unmap are high in a
bounce-buffer setup, but the syncs are just copies, so they are
pretty cheap in comparison. The driver can take advantage of that on
Rx since it leaves the pages mapped and just syncs the portion of the
pages that is used.

Now if you are hitting the bounce buffer on the Tx side that is
another matter as you have to allocate the buffer on demand and that
is quite expensive.