From: Vinicius Costa Gomes
To: Florian Kauer, Jesse Brandeburg, Tony Nguyen, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Vedang Patel, Maciej Fijalkowski, Jithu Joseph, Andre Guedes, Simon Horman
Cc: netdev@vger.kernel.org, kurt@linutronix.de, intel-wired-lan@lists.osuosl.org, linux-kernel@vger.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH net v2] igc: Prevent garbled TX queue with XDP ZEROCOPY
References: <20230628091148.62256-1-florian.kauer@linutronix.de> <87a5wjqnjk.fsf@intel.com>
Date: Thu, 29 Jun 2023 09:25:43 -0700
Message-ID: <87edlup75k.fsf@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Florian Kauer writes:

> Hi Vinicius,
>
> On 28.06.23 23:34, Vinicius Costa Gomes wrote:
>> Florian Kauer writes:
>>
>>> In normal operation, each populated queue item has
>>> next_to_watch pointing to the last TX desc of the packet,
>>> while each cleaned item has it set to 0. In particular,
>>> next_to_use that points to the next (necessarily clean)
>>> item to use has next_to_watch set to 0.
>>>
>>> When the TX queue is used both by an application using
>>> AF_XDP with ZEROCOPY as well as a second non-XDP application
>>> generating high traffic, the queue pointers can get in
>>> an invalid state where next_to_use points to an item
>>> where next_to_watch is NOT set to 0.
>>>
>>> However, the implementation assumes at several places
>>> that this is never the case, so if it does hold,
>>> bad things happen. In particular, within the loop inside
>>> of igc_clean_tx_irq(), next_to_clean can overtake next_to_use.
>>> Finally, this prevents any further transmission via
>>> this queue and it never gets unblocked or signaled.
>>> Secondly, if the queue is in this garbled state,
>>> the inner loop of igc_clean_tx_ring() will never terminate,
>>> completely hogging a CPU core.
>>>
>>> The reason is that igc_xdp_xmit_zc() reads next_to_use
>>> before acquiring the lock, and writing it back
>>> (potentially unmodified) later. If it got modified
>>> before locking, the outdated next_to_use is written
>>> pointing to an item that was already used elsewhere
>>> (and thus next_to_watch got written).
>>>
>>> Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy")
>>> Signed-off-by: Florian Kauer
>>> Reviewed-by: Kurt Kanzenbach
>>> Tested-by: Kurt Kanzenbach
>>> ---
>>
>> This patch doesn't directly apply because there's a small conflict with
>> commit 95b681485563 ("igc: Avoid transmit queue timeout for XDP"),
>> but really easy to solve.
>>
>> Anyway, good catch:
>>
>> Acked-by: Vinicius Costa Gomes
>
> I am sorry, that was bad timing. I prepared the initial patch on Friday and overlooked the merge.
> Shall I send a v3 or will someone else take care of the conflict
> resolution?

I think it's easier if you send a v3.


Cheers,
--
Vinicius
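
For readers following the thread, the stale-read problem in the quoted
description can be illustrated with a toy model. This is not driver code:
it is a single-threaded Python sketch that replays one unlucky interleaving,
and TxRing, regular_xmit, xdp_xmit_zc, is_garbled and the read_before_lock
flag are all hypothetical names invented for the illustration. It also
ignores ring-full handling and descriptor contents. Reading next_to_use
under the lock is shown only as one plausible ordering that avoids the
problem, assuming the description above is the whole story.

```python
from dataclasses import dataclass, field

RING_SIZE = 8  # arbitrary size for the toy model


@dataclass
class TxRing:
    """Toy TX ring; field names follow the patch description."""
    next_to_use: int = 0
    # next_to_watch[i] == 0 means slot i is clean, nonzero means populated.
    next_to_watch: list = field(default_factory=lambda: [0] * RING_SIZE)


def regular_xmit(ring, packets):
    """Non-XDP sender; assumed to run entirely under the queue lock."""
    for _ in range(packets):
        slot = ring.next_to_use
        ring.next_to_watch[slot] = slot + 1  # any nonzero marker: slot in use
        ring.next_to_use = (slot + 1) % RING_SIZE


def xdp_xmit_zc(ring, budget, concurrent_packets, read_before_lock):
    """Zero-copy sender. concurrent_packets models traffic from the other
    sender that gets in just before this path acquires the lock."""
    if read_before_lock:
        ntu = ring.next_to_use  # problematic order: snapshot without the lock
    regular_xmit(ring, concurrent_packets)  # the other sender wins the lock
    # --- lock acquired here (modelled, not a real lock) ---
    if not read_before_lock:
        ntu = ring.next_to_use  # alternative order: read under the lock
    for _ in range(budget):
        ring.next_to_watch[ntu] = ntu + 1
        ntu = (ntu + 1) % RING_SIZE
    ring.next_to_use = ntu  # written back, possibly unmodified
    # --- lock released here ---


def is_garbled(ring):
    """Invariant from the description: next_to_use must point at a clean slot."""
    return ring.next_to_watch[ring.next_to_use] != 0


if __name__ == "__main__":
    for early_read in (True, False):
        ring = TxRing()
        xdp_xmit_zc(ring, budget=0, concurrent_packets=2,
                    read_before_lock=early_read)
        print(early_read, ring.next_to_use, is_garbled(ring))
```

With the early read, the zero-copy path writes back its outdated index even
though it sent nothing, so next_to_use ends up pointing at a slot the other
sender already populated. With the read moved under the modelled lock, the
index written back is the current one and the invariant holds.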