From: Steffen Trumtrar
Date: Mon, 18 Dec 2023 12:37:08 +0100
Subject: [PATCH RFC 1/4] virtio-net: support transmit hash report
Message-Id: <20231218-v6-7-topic-virtio-net-ptp-v1-1-cac92b2d8532@pengutronix.de>
References: <20231218-v6-7-topic-virtio-net-ptp-v1-0-cac92b2d8532@pengutronix.de>
In-Reply-To: <20231218-v6-7-topic-virtio-net-ptp-v1-0-cac92b2d8532@pengutronix.de>
To: "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, "David S. Miller",
    Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran
Cc: virtualization@lists.linux.dev, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, Willem de Bruijn

From: Willem de Bruijn

Virtio-net supports sharing the flow hash from device to driver on rx.
Do the same in the other direction for robust routing and telemetry.

Linux derives the IPv6 flowlabel and ECMP multipath selection from
sk->sk_txhash, and updates this field on error with sk_rethink_txhash.
Allow the host stack to do the same.

Concrete examples of error conditions that are resolved are mentioned
in the commits that add sk_rethink_txhash calls, such as commit
7788174e8726 ("tcp: change IPv6 flow-label upon receiving spurious
retransmission").

Experimental results mirror what the theory suggests: where IPv6
FlowLabel is included in path selection (e.g., LAG/ECMP), flowlabel
rotation on TCP timeout avoids the vast majority of TCP disconnects
that would otherwise have occurred during link failures in long-haul
backbones, when an alternative path is available.

Rotation can be applied to various bad connection signals, such as
timeouts and spurious retransmissions. In aggregate, such flow-level
signals can help locate network issues. Reserve field hash_state to
share this info. For now, always set VIRTIO_NET_HASH_STATE_DEFAULT.
Passing information between TCP stack and driver is future work.

Changes RFC->RFCv2
  - drop unused VIRTIO_NET_HASH_STATE_TIMEOUT_BIT
  - convert from cpu_to_virtioXX to cpu_to_leXX

Signed-off-by: Willem de Bruijn
Signed-off-by: Steffen Trumtrar
---
 drivers/net/virtio_net.c        | 26 +++++++++++++++++++++++---
 include/uapi/linux/virtio_net.h |  9 ++++++++-
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d16f592c2061f..8c9de31af8942 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -271,6 +271,9 @@ struct virtnet_info {
 	u32 rss_hash_types_supported;
 	u32 rss_hash_types_saved;
 
+	/* Driver will pass tx path info to the device */
+	bool has_tx_hash;
+
 	/* Has control virtqueue */
 	bool has_cvq;
 
@@ -512,9 +515,9 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 
 	hdr_len = vi->hdr_len;
 	if (vi->mergeable_rx_bufs)
-		hdr_padded_len = hdr_len;
+		hdr_padded_len = max_t(unsigned int, hdr_len, sizeof(*hdr));
 	else
-		hdr_padded_len = sizeof(struct padded_vnet_hdr);
+		hdr_padded_len = ALIGN(hdr_len, 16);
 
 	buf = p - headroom;
 	len -= hdr_len;
@@ -2282,6 +2285,7 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 	struct virtio_net_hdr_mrg_rxbuf *hdr;
 	const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
 	struct virtnet_info *vi = sq->vq->vdev->priv;
+	struct virtio_net_hdr_v1_hash *ht;
 	int num_sg;
 	unsigned hdr_len = vi->hdr_len;
 	bool can_push;
@@ -2306,6 +2310,16 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 	if (vi->mergeable_rx_bufs)
 		hdr->num_buffers = 0;
 
+	ht = (void *)hdr;
+	if (vi->has_tx_hash) {
+		u16 report = skb->l4_hash ? VIRTIO_NET_HASH_REPORT_L4 :
+					    VIRTIO_NET_HASH_REPORT_OTHER;
+
+		ht->hash_value = cpu_to_le32(skb->hash);
+		ht->hash_report = cpu_to_le16(report);
+		ht->hash_state = cpu_to_le16(VIRTIO_NET_HASH_STATE_DEFAULT);
+	}
+
 	sg_init_table(sq->sg, skb_shinfo(skb)->nr_frags + (can_push ? 1 : 2));
 	if (can_push) {
 		__skb_push(skb, hdr_len);
@@ -4511,6 +4525,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 	else
 		vi->hdr_len = sizeof(struct virtio_net_hdr);
 
+	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_TX_HASH)) {
+		vi->has_tx_hash = true;
+		vi->hdr_len = sizeof(struct virtio_net_hdr_v1_hash);
+	}
+
 	if (virtio_has_feature(vdev, VIRTIO_F_ANY_LAYOUT) ||
 	    virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
 		vi->any_header_sg = true;
@@ -4753,7 +4772,8 @@ static struct virtio_device_id id_table[] = {
 	VIRTIO_NET_F_SPEED_DUPLEX, VIRTIO_NET_F_STANDBY, \
 	VIRTIO_NET_F_RSS, VIRTIO_NET_F_HASH_REPORT, VIRTIO_NET_F_NOTF_COAL, \
 	VIRTIO_NET_F_VQ_NOTF_COAL, \
-	VIRTIO_NET_F_GUEST_HDRLEN
+	VIRTIO_NET_F_GUEST_HDRLEN, \
+	VIRTIO_NET_F_TX_HASH
 
 static unsigned int features[] = {
 	VIRTNET_FEATURES,
diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index cc65ef0f3c3e2..698a11f8c6ab9 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -56,6 +56,7 @@
 #define VIRTIO_NET_F_MQ	22	/* Device supports Receive Flow
 					 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* Set MAC address */
+#define VIRTIO_NET_F_TX_HASH	  51	/* Driver sends hash report */
 #define VIRTIO_NET_F_VQ_NOTF_COAL 52	/* Device supports virtqueue notification coalescing */
 #define VIRTIO_NET_F_NOTF_COAL	53	/* Device supports notifications coalescing */
 #define VIRTIO_NET_F_GUEST_USO4	54	/* Guest can handle USOv4 in. */
@@ -176,8 +177,14 @@ struct virtio_net_hdr_v1_hash {
 #define VIRTIO_NET_HASH_REPORT_IPv6_EX         7
 #define VIRTIO_NET_HASH_REPORT_TCPv6_EX        8
 #define VIRTIO_NET_HASH_REPORT_UDPv6_EX        9
+#define VIRTIO_NET_HASH_REPORT_L4              10
+#define VIRTIO_NET_HASH_REPORT_OTHER           11
 	__le16 hash_report;
-	__le16 padding;
+	union {
+		__le16 padding;
+#define VIRTIO_NET_HASH_STATE_DEFAULT          0
+		__le16 hash_state;
+	};
 };
 
 #ifndef VIRTIO_NET_NO_LEGACY
-- 
2.42.0
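
For readers who want to look at the resulting header layout in
isolation, here is a minimal user-space sketch. It is not part of the
patch. It assumes the 20-byte little-endian layout of
struct virtio_net_hdr_v1_hash (hash_value at offset 12, hash_report at
16, hash_state at 18, in the slot previously named padding) and mirrors
the three field assignments made in xmit_skb. All demo_* functions and
DEMO_* constants are hypothetical names; the numeric values are copied
from the uapi hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants; values copied from the uapi hunk of the patch. */
#define DEMO_HASH_REPORT_L4	10
#define DEMO_HASH_REPORT_OTHER	11
#define DEMO_HASH_STATE_DEFAULT	0

/* Assumed byte offsets inside struct virtio_net_hdr_v1_hash. */
enum {
	DEMO_OFF_HASH_VALUE  = 12,	/* __le32, follows the 12-byte v1 header */
	DEMO_OFF_HASH_REPORT = 16,	/* __le16 */
	DEMO_OFF_HASH_STATE  = 18,	/* __le16, shares storage with padding */
	DEMO_HDR_LEN         = 20,
};

/* Store a 16-bit value in little-endian byte order. */
static inline void demo_put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

/* Store a 32-bit value in little-endian byte order. */
static inline void demo_put_le32(uint8_t *p, uint32_t v)
{
	demo_put_le16(p, (uint16_t)(v & 0xffff));
	demo_put_le16(p + 2, (uint16_t)(v >> 16));
}

/* Load a 16-bit little-endian value. */
static inline uint16_t demo_get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Load a 32-bit little-endian value. */
static inline uint32_t demo_get_le32(const uint8_t *p)
{
	return (uint32_t)demo_get_le16(p) |
	       ((uint32_t)demo_get_le16(p + 2) << 16);
}

/* Fill the three hash fields in the same order as the xmit_skb hunk. */
static inline void demo_fill_tx_hash(uint8_t *hdr, size_t len,
				     uint32_t hash, int l4_hash)
{
	uint16_t report = l4_hash ? DEMO_HASH_REPORT_L4 :
				    DEMO_HASH_REPORT_OTHER;

	assert(hdr != NULL && len >= DEMO_HDR_LEN);
	demo_put_le32(hdr + DEMO_OFF_HASH_VALUE, hash);
	demo_put_le16(hdr + DEMO_OFF_HASH_REPORT, report);
	demo_put_le16(hdr + DEMO_OFF_HASH_STATE, DEMO_HASH_STATE_DEFAULT);
}
```

A caller would pass a zeroed DEMO_HDR_LEN-byte buffer, for example
demo_fill_tx_hash(buf, sizeof(buf), hash, 1), and a device-side consumer
could read the fields back with demo_get_le32() and demo_get_le16().
Writing bytes explicitly keeps the sketch independent of host
endianness, which is the same goal the cpu_to_leXX conversions serve in
the driver.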