From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Ido Schimmel, Petr Machata, Jakub Kicinski
Subject: [PATCH 5.10 56/77] mlxsw: pci: Recycle received packet upon allocation failure
Date: Mon, 1 Nov 2021 10:17:44 +0100
Message-Id: <20211101082523.466579976@linuxfoundation.org>
X-Mailer: git-send-email 2.33.1
In-Reply-To: <20211101082511.254155853@linuxfoundation.org>
References: <20211101082511.254155853@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain;
 charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Ido Schimmel

commit 759635760a804b0d8ad0cc677b650f1544cae22f upstream.

When the driver fails to allocate a new Rx buffer, it passes an empty Rx
descriptor (contains zero address and size) to the device and marks it
as invalid by setting the skb pointer in the descriptor's metadata to
NULL.

After processing enough Rx descriptors, the driver will try to process
the invalid descriptor, but will return immediately seeing that the skb
pointer is NULL. Since the driver no longer passes new Rx descriptors to
the device, the Rx queue will eventually become full and the device will
start to drop packets.

Fix this by recycling the received packet if allocation of the new
packet failed. This means that allocation is no longer performed at the
end of the Rx routine, but at the start, before tearing down the DMA
mapping of the received packet.

Remove the comment about the descriptor being zeroed as it is no longer
correct. This is OK because we either use the descriptor as-is (when
recycling) or overwrite its address and size fields with that of the
newly allocated Rx buffer.

The issue was discovered when a process ("perf") consumed too much
memory and put the system under memory pressure. It can be reproduced by
injecting slab allocation failures [1]. After the fix, the Rx queue no
longer comes to a halt.

[1]
 # echo 10 > /sys/kernel/debug/failslab/times
 # echo 1000 > /sys/kernel/debug/failslab/interval
 # echo 100 > /sys/kernel/debug/failslab/probability

 FAULT_INJECTION: forcing a failure.
 name failslab, interval 1000, probability 100, space 0, times 8
 [...]
 Call Trace:
  dump_stack_lvl+0x34/0x44
  should_fail.cold+0x32/0x37
  should_failslab+0x5/0x10
  kmem_cache_alloc_node+0x23/0x190
  __alloc_skb+0x1f9/0x280
  __netdev_alloc_skb+0x3a/0x150
  mlxsw_pci_rdq_skb_alloc+0x24/0x90
  mlxsw_pci_cq_tasklet+0x3dc/0x1200
  tasklet_action_common.constprop.0+0x9f/0x100
  __do_softirq+0xb5/0x252
  irq_exit_rcu+0x7a/0xa0
  common_interrupt+0x83/0xa0
  asm_common_interrupt+0x1e/0x40
 RIP: 0010:cpuidle_enter_state+0xc8/0x340
 [...]
 mlxsw_spectrum2 0000:06:00.0: Failed to alloc skb for RDQ

Fixes: eda6500a987a ("mlxsw: Add PCI bus implementation")
Signed-off-by: Ido Schimmel
Reviewed-by: Petr Machata
Link: https://lore.kernel.org/r/20211024064014.1060919-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman
---
 drivers/net/ethernet/mellanox/mlxsw/pci.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -353,13 +353,10 @@ static int mlxsw_pci_rdq_skb_alloc(struc
 	struct sk_buff *skb;
 	int err;
 
-	elem_info->u.rdq.skb = NULL;
 	skb = netdev_alloc_skb_ip_align(NULL, buf_len);
 	if (!skb)
 		return -ENOMEM;
 
-	/* Assume that wqe was previously zeroed. */
-
 	err = mlxsw_pci_wqe_frag_map(mlxsw_pci, wqe, 0, skb->data,
 				     buf_len, DMA_FROM_DEVICE);
 	if (err)
@@ -548,21 +545,26 @@ static void mlxsw_pci_cqe_rdq_handle(str
 	struct pci_dev *pdev = mlxsw_pci->pdev;
 	struct mlxsw_pci_queue_elem_info *elem_info;
 	struct mlxsw_rx_info rx_info = {};
-	char *wqe;
+	char wqe[MLXSW_PCI_WQE_SIZE];
 	struct sk_buff *skb;
 	u16 byte_count;
 	int err;
 
 	elem_info = mlxsw_pci_queue_elem_info_consumer_get(q);
-	skb = elem_info->u.sdq.skb;
-	if (!skb)
-		return;
-	wqe = elem_info->elem;
-	mlxsw_pci_wqe_frag_unmap(mlxsw_pci, wqe, 0, DMA_FROM_DEVICE);
+	skb = elem_info->u.rdq.skb;
+	memcpy(wqe, elem_info->elem, MLXSW_PCI_WQE_SIZE);
 
 	if (q->consumer_counter++ != consumer_counter_limit)
 		dev_dbg_ratelimited(&pdev->dev, "Consumer counter does not match limit in RDQ\n");
 
+	err = mlxsw_pci_rdq_skb_alloc(mlxsw_pci, elem_info);
+	if (err) {
+		dev_err_ratelimited(&pdev->dev, "Failed to alloc skb for RDQ\n");
+		goto out;
+	}
+
+	mlxsw_pci_wqe_frag_unmap(mlxsw_pci, wqe, 0, DMA_FROM_DEVICE);
+
 	if (mlxsw_pci_cqe_lag_get(cqe_v, cqe)) {
 		rx_info.is_lag = true;
 		rx_info.u.lag_id = mlxsw_pci_cqe_lag_id_get(cqe_v, cqe);
@@ -594,10 +596,7 @@ static void mlxsw_pci_cqe_rdq_handle(str
 	skb_put(skb, byte_count);
 	mlxsw_core_skb_receive(mlxsw_pci->core, skb, &rx_info);
 
-	memset(wqe, 0, q->elem_size);
-	err = mlxsw_pci_rdq_skb_alloc(mlxsw_pci, elem_info);
-	if (err)
-		dev_dbg_ratelimited(&pdev->dev, "Failed to alloc skb for RDQ\n");
+out:
 	/* Everything is set up, ring doorbell to pass elem to HW */
 	q->producer_counter++;
 	mlxsw_pci_queue_doorbell_producer_ring(mlxsw_pci, q);
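
The ordering change described in the commit message (allocate the replacement buffer first, and recycle the received one if that fails) can be illustrated outside the driver. What follows is a minimal userspace sketch, not the mlxsw code: struct rx_slot, rx_handle(), alloc_buf() and the fail_alloc switch are hypothetical names introduced only for this illustration, and plain malloc() stands in for skb allocation and DMA mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BUF_LEN 2048

/* Hypothetical stand-in for one Rx descriptor plus its metadata. */
struct rx_slot {
	void *buf;	/* buffer currently posted to the "device" */
	size_t len;	/* size of that buffer */
};

/* Illustration-only switch: non-zero simulates memory pressure. */
static int fail_alloc;

/* Assumed allocator wrapper; the real driver allocates an skb instead. */
static void *alloc_buf(size_t len)
{
	return fail_alloc ? NULL : malloc(len);
}

/*
 * Handle one received buffer.
 *
 * Returns the received buffer to pass up the stack (the caller owns and
 * frees it), or NULL if the packet was dropped and its buffer recycled.
 * Either way the slot holds a valid buffer afterwards, so it can always
 * be handed back to the device.
 */
static void *rx_handle(struct rx_slot *slot)
{
	void *received;
	void *fresh;

	assert(slot && slot->buf);
	received = slot->buf;

	/* Allocate the replacement before giving up the old buffer. */
	fresh = alloc_buf(slot->len);
	if (!fresh)
		return NULL;	/* recycle: slot->buf is left untouched */

	slot->buf = fresh;
	return received;
}
```

In this sketch a failed allocation costs one dropped packet, but the slot never ends up without a buffer. That is the property the commit message relies on to keep the Rx queue from filling up with invalid descriptors. A caller would post the slot back to the device after every rx_handle() call, whether or not a packet was returned.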