2019-02-06 10:14:38

by Vadim Lomovtsev

[permalink] [raw]
Subject: [PATCH 0/8] nic: thunderx: fix communication races betwen VF & PF

The ThunderX CN88XX NIC Virtual Function driver uses mailbox interface
to communicate to physical function driver. Each of VF has it's own pair
of mailbox registers to read from and write to. The mailbox registers
has no protection from possible races, so it has to be implemented
at software side.

After long term testing by loop of 'ip link set <ifname> up/down'
command it was found that there are two possible scenarios when
race condition appears:
1. VF receives link change message from PF and VF send RX mode
configuration message to PF in the same time from separate thread.
2. PF receives RX mode configuration from VF and in the same time,
in separate thread PF detects link status change and sends appropriate
message to particular VF.

Both cases leads to mailbox data to be rewritten, NIC VF messaging control
data to be updated incorrectly and communication sequence gets broken.

This patch series is to address race condition with VF & PF communication.

Vadim Lomovtsev (8):
net: thunderx: correct typo in macro name
net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs
to private for each of them.
net: thunderx: make CFG_DONE message to run through generic send-ack
sequence
net: thunderx: add nicvf_send_msg_to_pf result check for
set_rx_mode_task
net: thunderx: rework xcast message structure to make it fit into 64
bit
net: thunderx: add mutex to protect mailbox from concurrent calls for
same VF
net: thunderx: implement helpers to read mailbox IRQ status
net: thunderx: check status of mailbox IRQ before sending a message

drivers/net/ethernet/cavium/thunder/nic.h | 12 +--
.../net/ethernet/cavium/thunder/nic_main.c | 58 +++++++------
.../net/ethernet/cavium/thunder/nicvf_main.c | 82 +++++++++++++------
.../ethernet/cavium/thunder/nicvf_queues.c | 14 ++++
.../ethernet/cavium/thunder/nicvf_queues.h | 1 +
.../net/ethernet/cavium/thunder/thunder_bgx.c | 2 +-
.../net/ethernet/cavium/thunder/thunder_bgx.h | 2 +-
7 files changed, 112 insertions(+), 59 deletions(-)

--
2.17.2


2019-02-06 10:14:47

by Vadim Lomovtsev

[permalink] [raw]
Subject: [PATCH 6/8] net: thunderx: add mutex to protect mailbox from concurrent calls for same VF

In some cases it could happen that nicvf_send_msg_to_pf() could be called
concurrently for the same NIC VF, and thus re-writing mailbox contents and
breaking messaging sequence with PF by re-writing NICVF data.

This commit is to implement mutex for NICVF to protect mailbox registers
and NICVF messaging control data from concurrent access.

Signed-off-by: Vadim Lomovtsev <[email protected]>
---
drivers/net/ethernet/cavium/thunder/nic.h | 2 ++
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 13 ++++++++++---
2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 227343625e83..86cda3f4b37b 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -329,6 +329,8 @@ struct nicvf {
spinlock_t rx_mode_wq_lock;
/* workqueue for handling kernel ndo_set_rx_mode() calls */
struct workqueue_struct *nicvf_rx_mode_wq;
+ /* mutex to protect VF's mailbox contents from concurrent access */
+ struct mutex rx_mode_mtx;

/* PTP timestamp */
struct cavium_ptp *ptp_clock;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 30c7f54b4f17..a05e2989ec76 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -124,6 +124,9 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx)
{
int timeout = NIC_MBOX_MSG_TIMEOUT;
int sleep = 10;
+ int ret = 0;
+
+ mutex_lock(&nic->rx_mode_mtx);

nic->pf_acked = false;
nic->pf_nacked = false;
@@ -136,7 +139,8 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx)
netdev_err(nic->netdev,
"PF NACK to mbox msg 0x%02x from VF%d\n",
(mbx->msg.msg & 0xFF), nic->vf_id);
- return -EINVAL;
+ ret = -EINVAL;
+ break;
}
msleep(sleep);
if (nic->pf_acked)
@@ -146,10 +150,12 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx)
netdev_err(nic->netdev,
"PF didn't ACK to mbox msg 0x%02x from VF%d\n",
(mbx->msg.msg & 0xFF), nic->vf_id);
- return -EBUSY;
+ ret = -EBUSY;
+ break;
}
}
- return 0;
+ mutex_unlock(&nic->rx_mode_mtx);
+ return ret;
}

/* Checks if VF is able to comminicate with PF
@@ -2211,6 +2217,7 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
nic->vf_id);
INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task);
spin_lock_init(&nic->rx_mode_wq_lock);
+ mutex_init(&nic->rx_mode_mtx);

err = register_netdev(netdev);
if (err) {
--
2.17.2

2019-02-06 10:14:55

by Vadim Lomovtsev

[permalink] [raw]
Subject: [PATCH 7/8] net: thunderx: implement helpers to read mailbox IRQ status

This commit is to implement routines to read mailbox IRQ status
for particular VF at PF side, and for mailbox IRQ status
from PF at VF side.

Signed-off-by: Vadim Lomovtsev <[email protected]>
---
drivers/net/ethernet/cavium/thunder/nic_main.c | 13 +++++++++++++
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 14 ++++++++++++++
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 1 +
3 files changed, 28 insertions(+)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 620dbe082ca0..a32c1bd75794 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -104,6 +104,19 @@ static u64 nic_reg_read(struct nicpf *nic, u64 offset)
return readq_relaxed(nic->reg_base + offset);
}

+static int nic_is_mbox_intr_active(struct nicpf *nic, int vf_id)
+{
+ int ret = 0;
+
+ if (vf_id < NIC_VF_PER_MBX_REG) {
+ ret = nic_reg_read(nic, NIC_PF_MAILBOX_INT) & BIT_ULL(vf_id);
+ } else {
+ ret = nic_reg_read(nic, NIC_PF_MAILBOX_INT + sizeof(u64)) &
+ BIT_ULL(vf_id - NIC_VF_PER_MBX_REG);
+ }
+ return ret;
+}
+
/* PF -> VF mailbox communication APIs */
static void nic_enable_mbx_intr(struct nicpf *nic)
{
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 5b4d3badcb73..e7ee7005657c 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1801,6 +1801,20 @@ void nicvf_clear_intr(struct nicvf *nic, int int_type, int q_idx)
nicvf_reg_write(nic, NIC_VF_INT, mask);
}

+/* Check if interrupt is active */
+int nicvf_check_is_intr_active(struct nicvf *nic, int int_type, int q_idx)
+{
+ u64 mask = nicvf_int_type_to_mask(int_type, q_idx);
+
+ if (!mask) {
+ netdev_dbg(nic->netdev,
+ "Failed to read interrupt status: unknown type\n");
+ return 0;
+ }
+
+ return (mask & nicvf_reg_read(nic, NIC_VF_INT));
+}
+
/* Check if interrupt is enabled */
int nicvf_is_intr_enabled(struct nicvf *nic, int int_type, int q_idx)
{
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 5e9a03cf1b4d..58f6fbe48bce 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -358,6 +358,7 @@ void nicvf_enable_intr(struct nicvf *nic, int int_type, int q_idx);
void nicvf_disable_intr(struct nicvf *nic, int int_type, int q_idx);
void nicvf_clear_intr(struct nicvf *nic, int int_type, int q_idx);
int nicvf_is_intr_enabled(struct nicvf *nic, int int_type, int q_idx);
+int nicvf_check_is_intr_active(struct nicvf *nic, int int_type, int q_idx);

/* Register access APIs */
void nicvf_reg_write(struct nicvf *nic, u64 offset, u64 val);
--
2.17.2

2019-02-06 10:15:05

by Vadim Lomovtsev

[permalink] [raw]
Subject: [PATCH 8/8] net: thunderx: check status of mailbox IRQ before sending a message

In order to prevent mailbox data re-writing at VF side we need to check if
there is an active mailbox IRQ from PF, and if there is no one proceed with
sending message to PF. Having spinlock at irq handler and message send
routing wont help since by the moment when code flow would reach the irq
handler and acquire spinlock the message send routine could be already
invoked and thus re-write data in the mailbox.

The same is true for PF while sending messages to VF.

This commit is to implement mailbox IRQ status check before sending
message to VF from PF. Same is for sending message to PF from VF.

Signed-off-by: Vadim Lomovtsev <[email protected]>
---
.../net/ethernet/cavium/thunder/nic_main.c | 39 ++++++++-----------
.../net/ethernet/cavium/thunder/nicvf_main.c | 3 ++
2 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index a32c1bd75794..e0041692caef 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -64,7 +64,6 @@ struct nicpf {
u32 *speed;
u16 cpi_base[MAX_NUM_VFS_SUPPORTED];
u16 rssi_base[MAX_NUM_VFS_SUPPORTED];
- bool mbx_lock[MAX_NUM_VFS_SUPPORTED];

/* MSI-X */
u8 num_vec;
@@ -954,8 +953,6 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
int i;
int ret = 0;

- nic->mbx_lock[vf] = true;
-
mbx_addr = nic_get_mbx_addr(vf);
mbx_data = (u64 *)&mbx;

@@ -975,7 +972,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
nic->duplex[vf] = 0;
nic->speed[vf] = 0;
}
- goto unlock;
+ return;
case NIC_MBOX_MSG_QS_CFG:
reg_addr = NIC_PF_QSET_0_127_CFG |
(mbx.qs.num << NIC_QS_ID_SHIFT);
@@ -1044,7 +1041,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
break;
case NIC_MBOX_MSG_RSS_SIZE:
nic_send_rss_size(nic, vf);
- goto unlock;
+ return;
case NIC_MBOX_MSG_RSS_CFG:
case NIC_MBOX_MSG_RSS_CFG_CONT:
nic_config_rss(nic, &mbx.rss_cfg);
@@ -1062,19 +1059,19 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
break;
case NIC_MBOX_MSG_ALLOC_SQS:
nic_alloc_sqs(nic, &mbx.sqs_alloc);
- goto unlock;
+ return;
case NIC_MBOX_MSG_NICVF_PTR:
nic->nicvf[vf] = mbx.nicvf.nicvf;
break;
case NIC_MBOX_MSG_PNICVF_PTR:
nic_send_pnicvf(nic, vf);
- goto unlock;
+ return;
case NIC_MBOX_MSG_SNICVF_PTR:
nic_send_snicvf(nic, &mbx.nicvf);
- goto unlock;
+ return;
case NIC_MBOX_MSG_BGX_STATS:
nic_get_bgx_stats(nic, &mbx.bgx_stats);
- goto unlock;
+ return;
case NIC_MBOX_MSG_LOOPBACK:
ret = nic_config_loopback(nic, &mbx.lbk);
break;
@@ -1083,7 +1080,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
break;
case NIC_MBOX_MSG_PFC:
nic_pause_frame(nic, vf, &mbx.pfc);
- goto unlock;
+ return;
case NIC_MBOX_MSG_PTP_CFG:
nic_config_timestamp(nic, vf, &mbx.ptp);
break;
@@ -1134,8 +1131,6 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
mbx.msg.msg, vf);
nic_mbx_send_nack(nic, vf);
}
-unlock:
- nic->mbx_lock[vf] = false;
}

static irqreturn_t nic_mbx_intr_handler(int irq, void *nic_irq)
@@ -1313,18 +1308,18 @@ static void nic_poll_for_link(struct work_struct *work)
if (nic->link[vf] == link.link_up)
continue;

- if (!nic->mbx_lock[vf]) {
- nic->link[vf] = link.link_up;
- nic->duplex[vf] = link.duplex;
- nic->speed[vf] = link.speed;
+ nic->link[vf] = link.link_up;
+ nic->duplex[vf] = link.duplex;
+ nic->speed[vf] = link.speed;

- /* Send a mbox message to VF with current link status */
- mbx.link_status.link_up = link.link_up;
- mbx.link_status.duplex = link.duplex;
- mbx.link_status.speed = link.speed;
- mbx.link_status.mac_type = link.mac_type;
+ /* Send a mbox message to VF with current link status */
+ mbx.link_status.link_up = link.link_up;
+ mbx.link_status.duplex = link.duplex;
+ mbx.link_status.speed = link.speed;
+ mbx.link_status.mac_type = link.mac_type;
+
+ if (!nic_is_mbox_intr_active(nic, vf))
nic_send_msg_to_vf(nic, vf, &mbx);
- }
}
queue_delayed_work(nic->check_link, &nic->dwork, HZ * 2);
}
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index a05e2989ec76..66e19c207467 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -128,6 +128,9 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx)

mutex_lock(&nic->rx_mode_mtx);

+ while (nicvf_check_is_intr_active(nic, NICVF_INTR_MBOX, 0))
+ msleep(1);
+
nic->pf_acked = false;
nic->pf_nacked = false;

--
2.17.2

2019-02-06 10:15:29

by Vadim Lomovtsev

[permalink] [raw]
Subject: [PATCH 5/8] net: thunderx: rework xcast message structure to make it fit into 64 bit

To communicate to PF each of ThunderX NIC VF uses mailbox which is
pair of 64 bit registers available to both VFn and PF.

This commit is to change the xcast message structure in order to
fit it into 64 bit.

Signed-off-by: Vadim Lomovtsev <[email protected]>
---
drivers/net/ethernet/cavium/thunder/nic.h | 6 ++----
drivers/net/ethernet/cavium/thunder/nic_main.c | 4 ++--
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 6 +++---
3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 376a96bce33f..227343625e83 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -577,10 +577,8 @@ struct set_ptp {

struct xcast {
u8 msg;
- union {
- u8 mode;
- u64 mac;
- } data;
+ u8 mode;
+ u64 mac:48;
};

/* 128 bit shared memory between PF and each VF */
diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 90497a27df18..620dbe082ca0 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -1094,7 +1094,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
bgx_set_dmac_cam_filter(nic->node, bgx, lmac,
- mbx.xcast.data.mac,
+ mbx.xcast.mac,
vf < NIC_VF_PER_MBX_REG ? vf :
vf - NIC_VF_PER_MBX_REG);
break;
@@ -1106,7 +1106,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
}
bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
- bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.data.mode);
+ bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.mode);
break;
default:
dev_err(&nic->pdev->dev,
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index dbd8862d60d6..30c7f54b4f17 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1964,7 +1964,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
* its' own LMAC to the filter to accept packets for it.
*/
mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST;
- mbx.xcast.data.mac = 0;
+ mbx.xcast.mac = 0;
if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
goto free_mc;
}
@@ -1974,7 +1974,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
/* now go through kernel list of MACs and add them one by one */
for (idx = 0; idx < mc_addrs->count; idx++) {
mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST;
- mbx.xcast.data.mac = mc_addrs->mc[idx];
+ mbx.xcast.mac = mc_addrs->mc[idx];
if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
goto free_mc;
}
@@ -1982,7 +1982,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,

/* and finally set rx mode for PF accordingly */
mbx.xcast.msg = NIC_MBOX_MSG_SET_XCAST;
- mbx.xcast.data.mode = mode;
+ mbx.xcast.mode = mode;

nicvf_send_msg_to_pf(nic, &mbx);
free_mc:
--
2.17.2

2019-02-06 10:16:04

by Vadim Lomovtsev

[permalink] [raw]
Subject: [PATCH 2/8] net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them.

Having one work queue for receive mode configuration ndo_set_rx_mode()
call for all VFs results in making each of them wait till the
set_rx_mode() call completes for another VF if any of close, set
receive mode and change flags calls being already invoked. Potentially
this could cause device state change before appropriate call of receive
mode configuration completes, so the call itself became meaningless,
corrupt data or break configuration sequence.

We don't need any delays in NIC VF configuration sequence so having delayed
work call with 0 delay has no sense.

This commit is to implement one work queue for each NIC VF for set_rx_mode
task and to let them work independently and replacing delayed_work
with work_struct.

Signed-off-by: Vadim Lomovtsev <[email protected]>
---
drivers/net/ethernet/cavium/thunder/nic.h | 4 ++-
.../net/ethernet/cavium/thunder/nicvf_main.c | 30 ++++++++++---------
2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index f4d81765221e..376a96bce33f 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -271,7 +271,7 @@ struct xcast_addr_list {
};

struct nicvf_work {
- struct delayed_work work;
+ struct work_struct work;
u8 mode;
struct xcast_addr_list *mc;
};
@@ -327,6 +327,8 @@ struct nicvf {
struct nicvf_work rx_mode_work;
/* spinlock to protect workqueue arguments from concurrent access */
spinlock_t rx_mode_wq_lock;
+ /* workqueue for handling kernel ndo_set_rx_mode() calls */
+ struct workqueue_struct *nicvf_rx_mode_wq;

/* PTP timestamp */
struct cavium_ptp *ptp_clock;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 88f8a8fa93cd..abf24e7dff2d 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -68,9 +68,6 @@ module_param(cpi_alg, int, 0444);
MODULE_PARM_DESC(cpi_alg,
"PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)");

-/* workqueue for handling kernel ndo_set_rx_mode() calls */
-static struct workqueue_struct *nicvf_rx_mode_wq;
-
static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx)
{
if (nic->sqs_mode)
@@ -1311,6 +1308,9 @@ int nicvf_stop(struct net_device *netdev)
struct nicvf_cq_poll *cq_poll = NULL;
union nic_mbx mbx = {};

+ /* wait till all queued set_rx_mode tasks completes */
+ drain_workqueue(nic->nicvf_rx_mode_wq);
+
mbx.msg.msg = NIC_MBOX_MSG_SHUTDOWN;
nicvf_send_msg_to_pf(nic, &mbx);

@@ -1418,6 +1418,9 @@ int nicvf_open(struct net_device *netdev)
struct nicvf_cq_poll *cq_poll = NULL;
union nic_mbx mbx = {};

+ /* wait till all queued set_rx_mode tasks completes if any */
+ drain_workqueue(nic->nicvf_rx_mode_wq);
+
netif_carrier_off(netdev);

err = nicvf_register_misc_interrupt(nic);
@@ -1973,7 +1976,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
static void nicvf_set_rx_mode_task(struct work_struct *work_arg)
{
struct nicvf_work *vf_work = container_of(work_arg, struct nicvf_work,
- work.work);
+ work);
struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work);
u8 mode;
struct xcast_addr_list *mc;
@@ -2030,7 +2033,7 @@ static void nicvf_set_rx_mode(struct net_device *netdev)
kfree(nic->rx_mode_work.mc);
nic->rx_mode_work.mc = mc_list;
nic->rx_mode_work.mode = mode;
- queue_delayed_work(nicvf_rx_mode_wq, &nic->rx_mode_work.work, 0);
+ queue_work(nic->nicvf_rx_mode_wq, &nic->rx_mode_work.work);
spin_unlock(&nic->rx_mode_wq_lock);
}

@@ -2187,7 +2190,10 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

INIT_WORK(&nic->reset_task, nicvf_reset_task);

- INIT_DELAYED_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task);
+ nic->nicvf_rx_mode_wq = alloc_ordered_workqueue("nicvf_rx_mode_wq_VF%d",
+ WQ_MEM_RECLAIM,
+ nic->vf_id);
+ INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task);
spin_lock_init(&nic->rx_mode_wq_lock);

err = register_netdev(netdev);
@@ -2228,13 +2234,15 @@ static void nicvf_remove(struct pci_dev *pdev)
nic = netdev_priv(netdev);
pnetdev = nic->pnicvf->netdev;

- cancel_delayed_work_sync(&nic->rx_mode_work.work);
-
/* Check if this Qset is assigned to different VF.
* If yes, clean primary and all secondary Qsets.
*/
if (pnetdev && (pnetdev->reg_state == NETREG_REGISTERED))
unregister_netdev(pnetdev);
+ if (nic->nicvf_rx_mode_wq) {
+ destroy_workqueue(nic->nicvf_rx_mode_wq);
+ nic->nicvf_rx_mode_wq = NULL;
+ }
nicvf_unregister_interrupts(nic);
pci_set_drvdata(pdev, NULL);
if (nic->drv_stats)
@@ -2261,17 +2269,11 @@ static struct pci_driver nicvf_driver = {
static int __init nicvf_init_module(void)
{
pr_info("%s, ver %s\n", DRV_NAME, DRV_VERSION);
- nicvf_rx_mode_wq = alloc_ordered_workqueue("nicvf_generic",
- WQ_MEM_RECLAIM);
return pci_register_driver(&nicvf_driver);
}

static void __exit nicvf_cleanup_module(void)
{
- if (nicvf_rx_mode_wq) {
- destroy_workqueue(nicvf_rx_mode_wq);
- nicvf_rx_mode_wq = NULL;
- }
pci_unregister_driver(&nicvf_driver);
}

--
2.17.2

2019-02-06 10:16:31

by Vadim Lomovtsev

[permalink] [raw]
Subject: [PATCH 3/8] net: thunderx: make CFG_DONE message to run through generic send-ack sequence

At the end of NIC VF initialization it send CFG_DONE message to PF without
using nicvf_msg_send_to_pf routine. This potentially could re-write data in
mailbox. This commit is to implement common way of sending CFG_DONE message
by the same way with other configuration messages by using
nicvf_send_msg_to_pf() routine.

Signed-off-by: Vadim Lomovtsev <[email protected]>
---
drivers/net/ethernet/cavium/thunder/nic_main.c | 2 +-
.../net/ethernet/cavium/thunder/nicvf_main.c | 18 +++++++++++++++---
2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 6c8dcb65ff03..90497a27df18 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -1039,7 +1039,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
case NIC_MBOX_MSG_CFG_DONE:
/* Last message of VF config msg sequence */
nic_enable_vf(nic, vf, true);
- goto unlock;
+ break;
case NIC_MBOX_MSG_SHUTDOWN:
/* First msg in VF teardown sequence */
if (vf >= nic->num_vf_en)
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index abf24e7dff2d..b0e8a04e0f1e 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -169,6 +169,20 @@ static int nicvf_check_pf_ready(struct nicvf *nic)
return 1;
}

+static int nicvf_send_cfg_done(struct nicvf *nic)
+{
+ union nic_mbx mbx = {};
+
+ mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE;
+ if (nicvf_send_msg_to_pf(nic, &mbx)) {
+ netdev_err(nic->netdev,
+ "PF didn't respond to CFG DONE msg\n");
+ return 0;
+ }
+
+ return 1;
+}
+
static void nicvf_read_bgx_stats(struct nicvf *nic, struct bgx_stats_msg *bgx)
{
if (bgx->rx)
@@ -1416,7 +1430,6 @@ int nicvf_open(struct net_device *netdev)
struct nicvf *nic = netdev_priv(netdev);
struct queue_set *qs = nic->qs;
struct nicvf_cq_poll *cq_poll = NULL;
- union nic_mbx mbx = {};

/* wait till all queued set_rx_mode tasks completes if any */
drain_workqueue(nic->nicvf_rx_mode_wq);
@@ -1515,8 +1528,7 @@ int nicvf_open(struct net_device *netdev)
nicvf_enable_intr(nic, NICVF_INTR_RBDR, qidx);

/* Send VF config done msg to PF */
- mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE;
- nicvf_write_to_mbx(nic, &mbx);
+ nicvf_send_cfg_done(nic);

return 0;
cleanup:
--
2.17.2

2019-02-06 10:16:45

by Vadim Lomovtsev

[permalink] [raw]
Subject: [PATCH 4/8] net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task

The rx_set_mode invokes number of messages to be send to PF for receive
mode configuration. In case if there any issues we need to stop sending
messages and release allocated memory.

This commit is to implement check of nicvf_msg_send_to_pf() result.

Signed-off-by: Vadim Lomovtsev <[email protected]>
---
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index b0e8a04e0f1e..dbd8862d60d6 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1956,7 +1956,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,

/* flush DMAC filters and reset RX mode */
mbx.xcast.msg = NIC_MBOX_MSG_RESET_XCAST;
- nicvf_send_msg_to_pf(nic, &mbx);
+ if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
+ goto free_mc;

if (mode & BGX_XCAST_MCAST_FILTER) {
/* once enabling filtering, we need to signal to PF to add
@@ -1964,7 +1965,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
*/
mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST;
mbx.xcast.data.mac = 0;
- nicvf_send_msg_to_pf(nic, &mbx);
+ if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
+ goto free_mc;
}

/* check if we have any specific MACs to be added to PF DMAC filter */
@@ -1973,9 +1975,9 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
for (idx = 0; idx < mc_addrs->count; idx++) {
mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST;
mbx.xcast.data.mac = mc_addrs->mc[idx];
- nicvf_send_msg_to_pf(nic, &mbx);
+ if (nicvf_send_msg_to_pf(nic, &mbx) < 0)
+ goto free_mc;
}
- kfree(mc_addrs);
}

/* and finally set rx mode for PF accordingly */
@@ -1983,6 +1985,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
mbx.xcast.data.mode = mode;

nicvf_send_msg_to_pf(nic, &mbx);
+free_mc:
+ kfree(mc_addrs);
}

static void nicvf_set_rx_mode_task(struct work_struct *work_arg)
--
2.17.2

2019-02-06 10:17:11

by Vadim Lomovtsev

[permalink] [raw]
Subject: [PATCH 1/8] net: thunderx: correct typo in macro name

Correct STREERING to STEERING at macro name for BGX steering register.

Signed-off-by: Vadim Lomovtsev <[email protected]>
---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 +-
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index e337da6ba2a4..673c57b8023f 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -1217,7 +1217,7 @@ static void bgx_init_hw(struct bgx *bgx)

/* Disable MAC steering (NCSI traffic) */
for (i = 0; i < RX_TRAFFIC_STEER_RULE_COUNT; i++)
- bgx_reg_write(bgx, 0, BGX_CMR_RX_STREERING + (i * 8), 0x00);
+ bgx_reg_write(bgx, 0, BGX_CMR_RX_STEERING + (i * 8), 0x00);
}

static u8 bgx_get_lane2sds_cfg(struct bgx *bgx, struct lmac *lmac)
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
index cbdd20b9ee6f..5cbc54e9eb19 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
@@ -60,7 +60,7 @@
#define RX_DMACX_CAM_EN BIT_ULL(48)
#define RX_DMACX_CAM_LMACID(x) (((u64)x) << 49)
#define RX_DMAC_COUNT 32
-#define BGX_CMR_RX_STREERING 0x300
+#define BGX_CMR_RX_STEERING 0x300
#define RX_TRAFFIC_STEER_RULE_COUNT 8
#define BGX_CMR_CHAN_MSK_AND 0x450
#define BGX_CMR_BIST_STATUS 0x460
--
2.17.2

2019-02-06 14:20:41

by Vadim Lomovtsev

[permalink] [raw]
Subject: Re: [PATCH 0/8] nic: thunderx: fix communication races betwen VF & PF

self-NACK here, some emails get's corrupted for some reasons,
along with some typos found.

sorry for inconvenience.

Vadim

On Wed, Feb 06, 2019 at 10:13:54AM +0000, Vadim Lomovtsev wrote:
> The ThunderX CN88XX NIC Virtual Function driver uses mailbox interface
> to communicate to physical function driver. Each of VF has it's own pair
> of mailbox registers to read from and write to. The mailbox registers
> has no protection from possible races, so it has to be implemented
> at software side.
>
> After long term testing by loop of 'ip link set <ifname> up/down'
> command it was found that there are two possible scenarios when
> race condition appears:
> 1. VF receives link change message from PF and VF send RX mode
> configuration message to PF in the same time from separate thread.
> 2. PF receives RX mode configuration from VF and in the same time,
> in separate thread PF detects link status change and sends appropriate
> message to particular VF.
>
> Both cases leads to mailbox data to be rewritten, NIC VF messaging control
> data to be updated incorrectly and communication sequence gets broken.
>
> This patch series is to address race condition with VF & PF communication.
>
> Vadim Lomovtsev (8):
> net: thunderx: correct typo in macro name
> net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs
> to private for each of them.
> net: thunderx: make CFG_DONE message to run through generic send-ack
> sequence
> net: thunderx: add nicvf_send_msg_to_pf result check for
> set_rx_mode_task
> net: thunderx: rework xcast message structure to make it fit into 64
> bit
> net: thunderx: add mutex to protect mailbox from concurrent calls for
> same VF
> net: thunderx: implement helpers to read mailbox IRQ status
> net: thunderx: check status of mailbox IRQ before sending a message
>
> drivers/net/ethernet/cavium/thunder/nic.h | 12 +--
> .../net/ethernet/cavium/thunder/nic_main.c | 58 +++++++------
> .../net/ethernet/cavium/thunder/nicvf_main.c | 82 +++++++++++++------
> .../ethernet/cavium/thunder/nicvf_queues.c | 14 ++++
> .../ethernet/cavium/thunder/nicvf_queues.h | 1 +
> .../net/ethernet/cavium/thunder/thunder_bgx.c | 2 +-
> .../net/ethernet/cavium/thunder/thunder_bgx.h | 2 +-
> 7 files changed, 112 insertions(+), 59 deletions(-)
>
> --
> 2.17.2