From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Grygorii Strashko, Ivan Khoronzhuk, "David S. Miller"
Subject: [PATCH 4.14 039/110] net: ethernet: ti: cpsw: fix net watchdog timeout
Date: Wed, 7 Mar 2018 11:38:22 -0800
Message-Id: <20180307191044.746922564@linuxfoundation.org>
In-Reply-To: <20180307191039.748351103@linuxfoundation.org>
References: <20180307191039.748351103@linuxfoundation.org>
X-Mailer: git-send-email 2.16.2
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
X-Mailing-List: linux-kernel@vger.kernel.org

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Grygorii Strashko

[ Upstream commit 62f94c2101f35cd45775df00ba09bde77580e26a ]

It was discovered that a simple program which indefinitely sends 200b UDP
packets and runs on TI AM574x SoC (SMP) under RT Kernel triggers a network
watchdog timeout in the TI CPSW driver (<6 hours run). The network watchdog
timeout is triggered due to a race between cpsw_ndo_start_xmit() and
cpsw_tx_handler() [NAPI]

cpsw_ndo_start_xmit()
	if (unlikely(!cpdma_check_free_tx_desc(txch))) {
		txq = netdev_get_tx_queue(ndev, q_idx);
		netif_tx_stop_queue(txq);

		^^ as per [1] barrier has to be used after set_bit() otherwise new value
		might not be visible to other cpus
	}

cpsw_tx_handler()
	if (unlikely(netif_tx_queue_stopped(txq)))
		netif_tx_wake_queue(txq);

and when it happens the ndev TX queue becomes disabled forever while the
driver's HW TX queue is empty.

Fix this by adding smp_mb__after_atomic() after netif_tx_stop_queue()
calls and double check for free TX descriptors after stopping the ndev TX
queue - if there are free TX descriptors wake up the ndev TX queue.

[1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html

Signed-off-by: Grygorii Strashko
Reviewed-by: Ivan Khoronzhuk
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
---
 drivers/net/ethernet/ti/cpsw.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1618,6 +1618,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(s
 	q_idx = q_idx % cpsw->tx_ch_num;
 
 	txch = cpsw->txv[q_idx].ch;
+	txq = netdev_get_tx_queue(ndev, q_idx);
 	ret = cpsw_tx_packet_submit(priv, skb, txch);
 	if (unlikely(ret != 0)) {
 		cpsw_err(priv, tx_err, "desc submit failed\n");
@@ -1628,15 +1629,26 @@ static netdev_tx_t cpsw_ndo_start_xmit(s
 	 * tell the kernel to stop sending us tx frames.
 	 */
 	if (unlikely(!cpdma_check_free_tx_desc(txch))) {
-		txq = netdev_get_tx_queue(ndev, q_idx);
 		netif_tx_stop_queue(txq);
+
+		/* Barrier, so that stop_queue visible to other cpus */
+		smp_mb__after_atomic();
+
+		if (cpdma_check_free_tx_desc(txch))
+			netif_tx_wake_queue(txq);
 	}
 
 	return NETDEV_TX_OK;
 fail:
 	ndev->stats.tx_dropped++;
-	txq = netdev_get_tx_queue(ndev, skb_get_queue_mapping(skb));
 	netif_tx_stop_queue(txq);
+
+	/* Barrier, so that stop_queue visible to other cpus */
+	smp_mb__after_atomic();
+
+	if (cpdma_check_free_tx_desc(txch))
+		netif_tx_wake_queue(txq);
+
 	return NETDEV_TX_BUSY;
 }
 
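
For readers who want to experiment with the stop / barrier / re-check pattern
outside the kernel, the following is a minimal userspace sketch using C11
atomics. It is not driver code: struct tx_model and the tx_model_*() helpers
are hypothetical names invented for this illustration, it assumes a single
transmitting thread, and atomic_thread_fence(memory_order_seq_cst) is only a
rough stand-in for smp_mb__after_atomic(). The fence on the completion side is
an assumption of the model; the patch above adds a barrier on the transmit
side only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of one TX channel; not the cpsw/cpdma structures. */
struct tx_model {
	atomic_int  free_desc;     /* free descriptors left in the ring */
	atomic_bool queue_stopped; /* stands in for the TX queue "stopped" bit */
};

static void tx_model_init(struct tx_model *m, int ndesc)
{
	assert(ndesc > 0);
	atomic_init(&m->free_desc, ndesc);
	atomic_init(&m->queue_stopped, false);
}

/*
 * Transmit path (single producer assumed): take one descriptor and, if
 * none are left, stop the queue, issue a full fence, then re-check.
 * Returns false when no descriptor was available ("busy").
 */
static bool tx_model_xmit(struct tx_model *m)
{
	if (atomic_load_explicit(&m->free_desc, memory_order_relaxed) == 0)
		return false;

	int left = atomic_fetch_sub_explicit(&m->free_desc, 1,
					     memory_order_relaxed) - 1;
	if (left == 0) {
		atomic_store_explicit(&m->queue_stopped, true,
				      memory_order_relaxed);

		/* Make the stop visible before re-reading free_desc. */
		atomic_thread_fence(memory_order_seq_cst);

		/* A completion may have raced with the stop: undo it. */
		if (atomic_load_explicit(&m->free_desc,
					 memory_order_relaxed) > 0)
			atomic_store_explicit(&m->queue_stopped, false,
					      memory_order_relaxed);
	}
	return true;
}

/* Completion path: return one descriptor, wake the queue if it is stopped. */
static void tx_model_complete(struct tx_model *m)
{
	atomic_fetch_add_explicit(&m->free_desc, 1, memory_order_relaxed);

	/* Order the descriptor release before reading the stopped flag. */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&m->queue_stopped, memory_order_relaxed))
		atomic_store_explicit(&m->queue_stopped, false,
				      memory_order_relaxed);
}
```

With both fences in place, a transmit that stops the queue and a concurrent
completion cannot both miss each other's store, so in this model the queue
does not end up stopped while descriptors are free. Dropping the fence in
tx_model_xmit() would reintroduce the window described in the commit message.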