Received: by 2002:a05:7412:8521:b0:e2:908c:2ebd with SMTP id t33csp629678rdf; Fri, 3 Nov 2023 10:08:38 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEzGnQs5/0mA3lCuQmjh3ML4oeN4ONm4gfw+qKoCev39D1zZoPE8HSl48Ga9c52sT8l83IP X-Received: by 2002:a05:6820:2016:b0:581:6a56:f3bf with SMTP id by22-20020a056820201600b005816a56f3bfmr24817551oob.5.1699031318108; Fri, 03 Nov 2023 10:08:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1699031318; cv=none; d=google.com; s=arc-20160816; b=0x52afEX+YWvU0MOlq6tc6OcBocMULIDHk10dXw7ayLDO9qODoWH1yaRP7vOF1uyzV sGtIuxtac/IccrvOEs6uBWkD3pZtEWyx2TKSAmHPObayEQMonESA2ckCqN/2YHUTOivj B+vAOIBaogwVZThrhgxgOMAWreYb7iX+KgoDC6ynH5kiS3D/IC+3BS5Q9/8Ax9zFe/JC Tx1PBlFuvDF0/kuOSe+tgohiPjxsRrEBLxvpRej+ohDApgIafBh1oxWYAOdn7noM6lv4 JrFnMmgdKRPkbfiLSKJ+a5Bwe7+1wzyZqN/PSZ5ixwY4n3AOzX2GrEeooYCoXaFerE35 D38w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Jqxpo9JM76UDQUoSzJ4bWswVG1V5NnihjPx+B3E+hw0=; fh=fzUzNlsRbtNnuVrnJt6rFasEegbbjpmF9ZEpDOzRc3U=; b=f6DGM1LJduR09+m95h1invi7U4YE/M223dgrYvQkrBER66IH4iURE+ZbpUXTLHjYrf eddTgIXGWAlL8PP7Ei156uZB1VcajTa8Oq99DQlrjTPsJ/9uIERRYylugbzBjVM5GMiI tXQLYUUafUM0dmlO85t6SeDOKVWqt8mgotKpEZugWpgScA8yOwKStJjG4B+1km2nH6/p 8plc4V1YKPi/aNMjPofe4FSwF+76lA9jzjBU12WX7qLRa+pFOaRDDqQxn1WIwWxvZFRh y0JgoEYcO7zx1+ZVk7ggx2TziCanxUmBSbBWxXU37SYOk8Degw63xEzV6McVUwoOn+W3 8XQw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@spacex.com header.s=dkim header.b=WAWiV5Aa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=spacex.com Return-Path: Received: from fry.vger.email (fry.vger.email. [2620:137:e000::3:8]) by mx.google.com with ESMTPS id f193-20020a4a58ca000000b005840ccf210fsi772145oob.95.2023.11.03.10.08.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Nov 2023 10:08:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) client-ip=2620:137:e000::3:8; Authentication-Results: mx.google.com; dkim=pass header.i=@spacex.com header.s=dkim header.b=WAWiV5Aa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=spacex.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by fry.vger.email (Postfix) with ESMTP id 12DF781ED518; Fri, 3 Nov 2023 10:07:49 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at fry.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1376478AbjKCRHb (ORCPT + 99 others); Fri, 3 Nov 2023 13:07:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345483AbjKCRHa (ORCPT ); Fri, 3 Nov 2023 13:07:30 -0400 Received: from mx4.spacex.com (mx4.spacex.com [192.31.242.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B08488E; Fri, 3 Nov 2023 10:07:24 -0700 (PDT) Received: from pps.filterd (mx4.spacex.com [127.0.0.1]) by mx4.spacex.com (8.17.1.19/8.17.1.19) with ESMTP id 3A3GiWa6010607; Fri, 3 Nov 2023 10:07:21 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=spacex.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=dkim; bh=Jqxpo9JM76UDQUoSzJ4bWswVG1V5NnihjPx+B3E+hw0=; b=WAWiV5AaBYkTA4iJHnjMAkZMV6Lf8ZaXT+Yp/SkJ7eBNhVAKLlZqFnaW3MDSckmnVfaX hRDRrFTngcT2bfbuck8/Nqb3V5s1XbgJHTufXKqsv623zjxVCYSNlriizbLbmfD97PXl darkarofvWANMH50AKqSnlDjykKL+pHV9EUibPbJKUeqwAnNqI+PQ61ezeA1n5WX5cWA bMBx2V5y49mvL/WJqRhhhcr8AhydpYXInetS3lYSd9oMyydzSb+Njg1XfLP1NBZ//QnL UtzRJz58Z0QdOpUvqqtk5I+rHU826dAaW+u9O6tnxsUAYHHegyXDo44O9Yt+Gw5J3EhP lQ== Received: from smtp.spacex.corp ([10.34.3.234]) by mx4.spacex.com (PPS) with ESMTPS id 3u0yqn8mwh-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT); Fri, 03 Nov 2023 10:07:21 -0700 Received: from apakhunov-z4.spacex.corp (10.1.32.161) by HT-DC-EX-D2-N2.spacex.corp (10.34.3.234) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Fri, 3 Nov 2023 10:07:21 -0700 From: Alex Pakhunov To: CC: , , , , , , Subject: Re: [PATCH v2 1/2] tg3: Increment tx_dropped in tg3_tso_bug() Date: Fri, 3 Nov 2023 10:07:11 -0700 Message-ID: <20231103170711.4006756-1-alexey.pakhunov@spacex.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain X-ClientProxiedBy: ht-dc-ex-d4-n3.spacex.corp (10.34.3.241) To HT-DC-EX-D2-N2.spacex.corp (10.34.3.234) X-Proofpoint-ORIG-GUID: Slrv5khXa-Qp2xS2LvVH4u5DDRv_8ShW X-Proofpoint-GUID: Slrv5khXa-Qp2xS2LvVH4u5DDRv_8ShW X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 mlxscore=0 phishscore=0 mlxlogscore=972 adultscore=0 spamscore=0 malwarescore=0 suspectscore=0 priorityscore=1501 clxscore=1015 bulkscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2310240000 definitions=main-2311030143 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Fri, 03 Nov 2023 10:07:49 -0700 (PDT) > This is prone to race conditions if we have more than one TX queue. Yes, indeed. > The original driver code only supported one TX queue and the counters > were never modified properly to support multiple queues. We should > convert them to per queue counters by moving tx_dropped and rx_dropped > to the tg3_napi struct. I'm not super familiar with the recommended approach for handling locks in network drivers, so I spent a bit of tme looking at what tg3 does. It seems that there are a few ways to remove the race condition when working with these counters: 1. Use atomic increments. It is easy but every update is more expensive than it needs to be. We might be able to say that there specific counters are updated rarely, so maybe we don't care too much. 2. netif_tx_lock is already taken when tx_droped is incremented - wrap rx_dropped increment and reading both counters in netif_tx_lock. This seems legal since tg3_tx() can take netif_tx_lock. I'm not sure how to order netif_tx_lock and tp->lock, since tg3_get_stats64() takes the latter. Should netif_tx_lock be takes inside tp->lock? Should they be not nested? 3. Using tp->lock to protect rx_dropped (tg3_poll_link() already takes it so it must be legal) and netif_tx_lock to protect tx_dropped. There are probably other options. Can you recommend an aproach? Also, this seems like a larger change that should be done separately from fixing the TX stall. Should we land just "[PATCH v2 2/2]"? Should we land the whole patch (since it does not make race condition much worse) and fix the race condition separately? Alex.