2016-04-26 07:36:13

by Noam Camus

[permalink] [raw]
Subject: [PATCH 0/3] Net driver bugs fix and code style

From: Elad Kanfi <[email protected]>

Summary:
1. Bug description: TX done interrupts that arrives while interrupts
are masked, during NAPI poll, will not trigger an interrupt handling.
Since TX interrupt is of level edge we will lose the TX done interrupt.
As a result all pending tx frames will get no service.

Solution: Check if there is a pending tx request after unmasking the
interrupt and if answer is yes then re-add ourselves to
the NAPI poll list.

2. Bug description: CPU-A before sending a frame will set a variable
to true. CPU-B that executes the tx done interrupt service routine
might read a non valid value of that variable.

Solution: Add a memory barrier at the tx sending function.

3. In addition, the check of a valid pending tx request was switched
to an inline function to avoid duplicated code.

Elad Kanfi (3):
net: nps_enet: code style
net: nps_enet: Sync access to packet sent flag
net: nps_enet: bug fix - handle lost tx interrupts

drivers/net/ethernet/ezchip/nps_enet.c | 34 +++++++++++++++++++++++++++----
1 files changed, 29 insertions(+), 5 deletions(-)


2016-04-26 07:36:18

by Noam Camus

[permalink] [raw]
Subject: [PATCH 1/3] net: nps_enet: code style

From: Elad Kanfi <[email protected]>

Add an inline function that returns true if a packet
was sent and requires driver handling,
or returns false otherwise.
It avoids code duplication in current driver,
and will be used also in the next bugs fix.

Signed-off-by: Elad Kanfi <[email protected]>

Acked-by: Noam Camus <[email protected]>
---
drivers/net/ethernet/ezchip/nps_enet.c | 15 ++++++++++-----
1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ezchip/nps_enet.c b/drivers/net/ethernet/ezchip/nps_enet.c
index 1f23845..5ed06a6 100644
--- a/drivers/net/ethernet/ezchip/nps_enet.c
+++ b/drivers/net/ethernet/ezchip/nps_enet.c
@@ -24,6 +24,14 @@

#define DRV_NAME "nps_mgt_enet"

+static inline bool nps_enet_is_tx_pending(struct nps_enet_priv *priv)
+{
+ u32 tx_ctrl_value = nps_enet_reg_get(priv, NPS_ENET_REG_TX_CTL);
+ u32 tx_ctrl_ct = (tx_ctrl_value & TX_CTL_CT_MASK) >> TX_CTL_CT_SHIFT;
+
+ return (!tx_ctrl_ct && priv->tx_packet_sent);
+}
+
static void nps_enet_clean_rx_fifo(struct net_device *ndev, u32 frame_len)
{
struct nps_enet_priv *priv = netdev_priv(ndev);
@@ -140,12 +148,11 @@ static void nps_enet_tx_handler(struct net_device *ndev)
{
struct nps_enet_priv *priv = netdev_priv(ndev);
u32 tx_ctrl_value = nps_enet_reg_get(priv, NPS_ENET_REG_TX_CTL);
- u32 tx_ctrl_ct = (tx_ctrl_value & TX_CTL_CT_MASK) >> TX_CTL_CT_SHIFT;
u32 tx_ctrl_et = (tx_ctrl_value & TX_CTL_ET_MASK) >> TX_CTL_ET_SHIFT;
u32 tx_ctrl_nt = (tx_ctrl_value & TX_CTL_NT_MASK) >> TX_CTL_NT_SHIFT;

/* Check if we got TX */
- if (!priv->tx_packet_sent || tx_ctrl_ct)
+ if (!nps_enet_is_tx_pending(priv))
return;

/* Ack Tx ctrl register */
@@ -213,11 +220,9 @@ static irqreturn_t nps_enet_irq_handler(s32 irq, void *dev_instance)
struct net_device *ndev = dev_instance;
struct nps_enet_priv *priv = netdev_priv(ndev);
u32 rx_ctrl_value = nps_enet_reg_get(priv, NPS_ENET_REG_RX_CTL);
- u32 tx_ctrl_value = nps_enet_reg_get(priv, NPS_ENET_REG_TX_CTL);
- u32 tx_ctrl_ct = (tx_ctrl_value & TX_CTL_CT_MASK) >> TX_CTL_CT_SHIFT;
u32 rx_ctrl_cr = (rx_ctrl_value & RX_CTL_CR_MASK) >> RX_CTL_CR_SHIFT;

- if ((!tx_ctrl_ct && priv->tx_packet_sent) || rx_ctrl_cr)
+ if (nps_enet_is_tx_pending(priv) || rx_ctrl_cr)
if (likely(napi_schedule_prep(&priv->napi))) {
nps_enet_reg_set(priv, NPS_ENET_REG_BUF_INT_ENABLE, 0);
__napi_schedule(&priv->napi);
--
1.7.1

2016-04-26 07:36:32

by Noam Camus

[permalink] [raw]
Subject: [PATCH 2/3] net: nps_enet: Sync access to packet sent flag

From: Elad Kanfi <[email protected]>

Below is a description of a possible problematic
sequence. CPU-A is sending a frame and CPU-B handles
the interrupt that indicates the frame was sent. CPU-B
reads an invalid value of tx_packet_sent.

CPU-A CPU-B
----- -----
nps_enet_send_frame
.
.
tx_packet_sent = true
order HW to start tx
.
.
HW complete tx
------> get tx complete interrupt
.
.
if(tx_packet_sent == true)

end memory transaction
(tx_packet_sent actually
written)

Problem solution:

Add a memory barrier after setting tx_packet_sent,
in order to make sure that it is written before
the packet is sent.

Signed-off-by: Elad Kanfi <[email protected]>

Acked-by: Noam Camus <[email protected]>
---
drivers/net/ethernet/ezchip/nps_enet.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/ezchip/nps_enet.c b/drivers/net/ethernet/ezchip/nps_enet.c
index 5ed06a6..8cf3cde 100644
--- a/drivers/net/ethernet/ezchip/nps_enet.c
+++ b/drivers/net/ethernet/ezchip/nps_enet.c
@@ -394,6 +394,13 @@ static void nps_enet_send_frame(struct net_device *ndev,

/* Indicate SW is done */
priv->tx_packet_sent = true;
+
+ /* before the frame is sent we have to make
+ * sure that priv->tx_packet_sent will be valid
+ * for the CPU'S that handles the ISR and NAPI poll
+ */
+ smp_wmb();
+
tx_ctrl_value |= NPS_ENET_ENABLE << TX_CTL_CT_SHIFT;
/* Send Frame */
nps_enet_reg_set(priv, NPS_ENET_REG_TX_CTL, tx_ctrl_value);
--
1.7.1

2016-04-26 07:52:09

by Noam Camus

[permalink] [raw]
Subject: [PATCH 3/3] net: nps_enet: bug fix - handle lost tx interrupts

From: Elad Kanfi <[email protected]>

The tx interrupt is of edge type, and in case such interrupt is triggered
while it is masked it will not be handled even after tx interrupts are
re-enabled in the end of NAPI poll.
This will cause tx network to stop in the following scenario:
* Rx is being handled, hence interrupts are masked.
* Tx interrupt is triggered after checking if there is some tx to handle
and before re-enabling the interrupts.
In this situation only rx transaction will release tx requests.

In order to handle the tx that was missed( if there was one ),
a NAPI reschdule was added after enabling the interrupts.

Signed-off-by: Elad Kanfi <[email protected]>

Acked-by: Noam Camus <[email protected]>
---
drivers/net/ethernet/ezchip/nps_enet.c | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/ezchip/nps_enet.c b/drivers/net/ethernet/ezchip/nps_enet.c
index 8cf3cde..5eaede1 100644
--- a/drivers/net/ethernet/ezchip/nps_enet.c
+++ b/drivers/net/ethernet/ezchip/nps_enet.c
@@ -199,6 +199,18 @@ static int nps_enet_poll(struct napi_struct *napi, int budget)

nps_enet_reg_set(priv, NPS_ENET_REG_BUF_INT_ENABLE,
buf_int_enable_value);
+
+ /* in case we will get a tx interrupt while interrupts
+ * are masked, we will lose it since the tx is edge interrupt.
+ * specifically, while executing the code section above,
+ * between nps_enet_tx_handler and the interrupts enable, all
+ * tx requests will be stuck until we will get an rx interrupt.
+ * the two code lines below will solve this situation by
+ * re-adding ourselves to the poll list.
+ */
+
+ if (nps_enet_is_tx_pending(priv))
+ napi_reschedule(napi);
}

return work_done;
--
1.7.1

2016-04-26 15:57:59

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 0/3] Net driver bugs fix and code style


It is not appropriate to mix coding style fixes with bug fixes.

Bug fixes must be targetted at the 'net' tree, and anything else
must be targetted at the 'net-next' tree.

2016-04-26 17:23:33

by Noam Camus

[permalink] [raw]
Subject: Re: [PATCH 0/3] Net driver bugs fix and code style

>From: David Miller <[email protected]>
>Sent: Tuesday, April 26, 2016 6:57 PM

>It is not appropriate to mix coding style fixes with bug fixes.
The code style in patch number 1 is actually prevents code duplication that stem from bug fix at patch number 3.

>Bug fixes must be targetted at the 'net' tree, and anything else
>must be targetted at the 'net-next' tree.
Should we handle bugs fix first (more urgent then code style) and once it will be accepted and applied to net-next proceed with the code style patch?

Noam

2016-04-26 17:57:41

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 0/3] Net driver bugs fix and code style

From: Noam Camus <[email protected]>
Date: Tue, 26 Apr 2016 17:23:26 +0000

>>From: David Miller <[email protected]>
>>Sent: Tuesday, April 26, 2016 6:57 PM
>>
>>Bug fixes must be targetted at the 'net' tree, and anything else
>>must be targetted at the 'net-next' tree.
> Should we handle bugs fix first (more urgent then code style) and once it will be accepted and applied to net-next proceed with the code style patch?

That's exactly what I am telling you, yes.

2016-04-26 17:58:19

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 0/3] Net driver bugs fix and code style


BTW, you should also CC: the correct mailing lists for networking patch
submissions.

You failed to CC: [email protected]