The code calls wmb() followed by writel() in multiple places. writel()
already includes a barrier on some architectures, such as arm64.
As a result, the CPU executes two barriers back to back before the
register write.
Since the code already has an explicit barrier call, change writel() to
writel_relaxed().
Signed-off-by: Sinan Kaya <[email protected]>
---
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 815cb1a..9e684b1 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -725,7 +725,12 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
* such as IA-64).
*/
wmb();
- writel(i, rx_ring->tail);
+ writel_relaxed(i, rx_ring->tail);
+
+ /* We need this if more than one processor can write to our tail
+ * at a time, it synchronizes IO on IA64/Altix systems
+ */
+ mmiowb();
}
}
@@ -1232,7 +1237,12 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
* know there are new descriptors to fetch.
*/
wmb();
- writel(xdp_ring->next_to_use, xdp_ring->tail);
+ writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
+
+ /* We need this if more than one processor can write to our tail
+ * at a time, it synchronizes IO on IA64/Altix systems
+ */
+ mmiowb();
}
u64_stats_update_begin(&rx_ring->syncp);
@@ -4004,7 +4014,12 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
tx_ring->next_to_use = i;
/* notify HW of packet */
- writel(i, tx_ring->tail);
+ writel_relaxed(i, tx_ring->tail);
+
+ /* We need this if more than one processor can write to our tail
+ * at a time, it synchronizes IO on IA64/Altix systems
+ */
+ mmiowb();
return;
dma_error:
--
2.7.4
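The double-barrier situation described in the commit message can be sketched in userspace. This is only a rough model under stated assumptions: `model_wmb()`, `model_writel()`, `model_writel_relaxed()`, `bump_tail_old()` and `bump_tail_new()` are hypothetical stand-ins invented for illustration, not the kernel primitives, and `model_writel()` simply assumes an architecture where writel() carries its own barrier. mmiowb() is not modelled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model: these are NOT the kernel primitives.
 * The counter records how many write barriers were executed. */
static unsigned int barriers_executed;

/* Stand-in for wmb(): a release fence plus a counter bump. */
static void model_wmb(void)
{
	atomic_thread_fence(memory_order_release);
	barriers_executed++;
}

/* Stand-in for writel_relaxed(): a plain register store, no barrier. */
static void model_writel_relaxed(uint32_t val, volatile uint32_t *reg)
{
	assert(reg);
	*reg = val;
}

/* Stand-in for writel() on an architecture where it includes a barrier
 * (assumption: behaves like the arm64 case the commit message mentions). */
static void model_writel(uint32_t val, volatile uint32_t *reg)
{
	model_wmb();
	model_writel_relaxed(val, reg);
}

/* Old pattern: explicit wmb() followed by writel().
 * Returns the number of barriers executed for one tail update. */
static unsigned int bump_tail_old(volatile uint32_t *tail, uint32_t i)
{
	barriers_executed = 0;
	model_wmb();           /* order descriptor writes before the tail write */
	model_writel(i, tail); /* a second barrier is hidden in here */
	return barriers_executed;
}

/* New pattern: explicit wmb() followed by writel_relaxed(). */
static unsigned int bump_tail_new(volatile uint32_t *tail, uint32_t i)
{
	barriers_executed = 0;
	model_wmb();
	model_writel_relaxed(i, tail);
	return barriers_executed;
}
```

In this model, bump_tail_old() on a local uint32_t reports two barriers and bump_tail_new() reports one, while both leave the same value in the tail.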
On Fri, Mar 23, 2018 at 11:21 AM, Sinan Kaya <[email protected]> wrote:
> The code calls wmb() followed by writel() in multiple places. writel()
> already includes a barrier on some architectures, such as arm64.
>
> As a result, the CPU executes two barriers back to back before the
> register write.
>
> Since the code already has an explicit barrier call, change writel() to
> writel_relaxed().
>
> Signed-off-by: Sinan Kaya <[email protected]>
> ---
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 21 ++++++++++++++++++---
> 1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index 815cb1a..9e684b1 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -725,7 +725,12 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
> * such as IA-64).
> */
> wmb();
> - writel(i, rx_ring->tail);
> + writel_relaxed(i, rx_ring->tail);
> +
> + /* We need this if more than one processor can write to our tail
> + * at a time, it synchronizes IO on IA64/Altix systems
> + */
> + mmiowb();
> }
The mmiowb shouldn't be needed for Rx. Only one CPU will be running
NAPI for the queue and we will synchronize this with a full writel
anyway when we re-enable the interrupts.
> }
>
> @@ -1232,7 +1237,12 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
> * know there are new descriptors to fetch.
> */
> wmb();
> - writel(xdp_ring->next_to_use, xdp_ring->tail);
> + writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
> +
> + /* We need this if more than one processor can write to our tail
> + * at a time, it synchronizes IO on IA64/Altix systems
> + */
> + mmiowb();
> }
>
> u64_stats_update_begin(&rx_ring->syncp);
> @@ -4004,7 +4014,12 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
> tx_ring->next_to_use = i;
>
> /* notify HW of packet */
> - writel(i, tx_ring->tail);
> + writel_relaxed(i, tx_ring->tail);
> +
> + /* We need this if more than one processor can write to our tail
> + * at a time, it synchronizes IO on IA64/Altix systems
> + */
> + mmiowb();
>
> return;
> dma_error:
> --
> 2.7.4
>
On 3/23/2018 2:25 PM, Alexander Duyck wrote:
>> + /* We need this if more than one processor can write to our tail
>> + * at a time, it synchronizes IO on IA64/Altix systems
>> + */
>> + mmiowb();
>> }
> The mmiowb shouldn't be needed for Rx. Only one CPU will be running
> NAPI for the queue and we will synchronize this with a full writel
> anyway when we re-enable the interrupts.
>
OK. I can fix this on the next version. I did a blanket search and replace for
my writel_relaxed() changes as I don't know the code well enough.
Please point me to the redundant ones.
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
On Fri, Mar 23, 2018 at 11:27 AM, Sinan Kaya <[email protected]> wrote:
> On 3/23/2018 2:25 PM, Alexander Duyck wrote:
>>> + /* We need this if more than one processor can write to our tail
>>> + * at a time, it synchronizes IO on IA64/Altix systems
>>> + */
>>> + mmiowb();
>>> }
>> The mmiowb shouldn't be needed for Rx. Only one CPU will be running
>> NAPI for the queue and we will synchronize this with a full writel
>> anyway when we re-enable the interrupts.
>>
>
> OK. I can fix this on the next version. I did a blanket search and replace for
> my writel_relaxed() changes as I don't know the code well enough.
>
> Please point me to the redundant ones.
So from what I can tell only this file and i40e needed any additional
mmiowb calls added. The rest are not needed.
- Alex
On 3/23/2018 2:31 PM, Alexander Duyck wrote:
>> Please point me to the redundant ones.
> So from what I can tell only this file and i40e needed any additional
> mmiowb calls added. The rest are not needed.
Thanks, I'll clean up patches 2 through 6 and then make your suggested
changes to patches 1 and 7.