Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932727AbaBDTyt (ORCPT ); Tue, 4 Feb 2014 14:54:49 -0500 Received: from smtp02.citrix.com ([66.165.176.63]:16412 "EHLO SMTP02.CITRIX.COM" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932115AbaBDTyp (ORCPT ); Tue, 4 Feb 2014 14:54:45 -0500 X-IronPort-AV: E=Sophos;i="4.95,781,1384300800"; d="scan'208";a="97926588" From: Zoltan Kiss To: , , , , , CC: Zoltan Kiss Subject: [PATCH net v3] xen-netback: Fix Rx stall due to race condition Date: Tue, 4 Feb 2014 19:54:37 +0000 Message-ID: <1391543677-24039-1-git-send-email-zoltan.kiss@citrix.com> X-Mailer: git-send-email 1.7.9.5 MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.80.2.133] X-DLP: MIA2 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The recent patch to fix receive side flow control (11b57f90257c1d6a91cee720151b69e0c2020cf6: xen-netback: stop vif thread spinning if frontend is unresponsive) solved the spinning thread problem, however caused an another one. The receive side can stall, if: - [THREAD] xenvif_rx_action sets rx_queue_stopped to true - [INTERRUPT] interrupt happens, and sets rx_event to true - [THREAD] then xenvif_kthread sets rx_event to false - [THREAD] rx_work_todo doesn't return true anymore Also, if interrupt sent but there is still no room in the ring, it take quite a long time until xenvif_rx_action realize it. This patch ditch that two variable, and rework rx_work_todo. If the thread finds it can't fit more skb's into the ring, it saves the last slot estimation into rx_last_skb_slots, otherwise it's kept as 0. Then rx_work_todo will check if: - there is something to send to the ring (like before) - there is space for the topmost packet in the queue I think that's more natural and optimal thing to test than two bool which are set somewhere else. Signed-off-by: Zoltan Kiss Reviewed-by: Paul Durrant Acked-by: Wei Liu --- v2: - Improve description of the problem v3: - Change subject and resend against net to emphasize this is a bugfix drivers/net/xen-netback/common.h | 6 +----- drivers/net/xen-netback/interface.c | 1 - drivers/net/xen-netback/netback.c | 16 ++++++---------- 3 files changed, 7 insertions(+), 16 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 4c76bcb..ae413a2 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -143,11 +143,7 @@ struct xenvif { char rx_irq_name[IFNAMSIZ+4]; /* DEVNAME-rx */ struct xen_netif_rx_back_ring rx; struct sk_buff_head rx_queue; - bool rx_queue_stopped; - /* Set when the RX interrupt is triggered by the frontend. - * The worker thread may need to wake the queue. - */ - bool rx_event; + RING_IDX rx_last_skb_slots; /* This array is allocated seperately as it is large */ struct gnttab_copy *grant_copy_op; diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index b9de31e..7669d49 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -100,7 +100,6 @@ static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id) { struct xenvif *vif = dev_id; - vif->rx_event = true; xenvif_kick_thread(vif); return IRQ_HANDLED; diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 2738563..bb241d0 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -477,7 +477,6 @@ static void xenvif_rx_action(struct xenvif *vif) unsigned long offset; struct skb_cb_overlay *sco; bool need_to_notify = false; - bool ring_full = false; struct netrx_pending_operations npo = { .copy = vif->grant_copy_op, @@ -487,7 +486,7 @@ static void xenvif_rx_action(struct xenvif *vif) skb_queue_head_init(&rxq); while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) { - int max_slots_needed; + RING_IDX max_slots_needed; int i; /* We need a cheap worse case estimate for the number of @@ -510,9 +509,10 @@ static void xenvif_rx_action(struct xenvif *vif) if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) { skb_queue_head(&vif->rx_queue, skb); need_to_notify = true; - ring_full = true; + vif->rx_last_skb_slots = max_slots_needed; break; - } + } else + vif->rx_last_skb_slots = 0; sco = (struct skb_cb_overlay *)skb->cb; sco->meta_slots_used = xenvif_gop_skb(skb, &npo); @@ -523,8 +523,6 @@ static void xenvif_rx_action(struct xenvif *vif) BUG_ON(npo.meta_prod > ARRAY_SIZE(vif->meta)); - vif->rx_queue_stopped = !npo.copy_prod && ring_full; - if (!npo.copy_prod) goto done; @@ -1727,8 +1725,8 @@ static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif, static inline int rx_work_todo(struct xenvif *vif) { - return (!skb_queue_empty(&vif->rx_queue) && !vif->rx_queue_stopped) || - vif->rx_event; + return !skb_queue_empty(&vif->rx_queue) && + xenvif_rx_ring_slots_available(vif, vif->rx_last_skb_slots); } static inline int tx_work_todo(struct xenvif *vif) @@ -1814,8 +1812,6 @@ int xenvif_kthread(void *data) if (!skb_queue_empty(&vif->rx_queue)) xenvif_rx_action(vif); - vif->rx_event = false; - if (skb_queue_empty(&vif->rx_queue) && netif_queue_stopped(vif->dev)) xenvif_start_queue(vif); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/