Date: Mon, 9 Oct 2023 19:11:22 +0900
Subject: Re: [RFC PATCH 5/7] tun: Introduce virtio-net hashing feature
To: Willem de Bruijn
Cc: Jason Wang, "Michael S. Tsirkin", netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    virtualization@lists.linux-foundation.org,
    linux-kselftest@vger.kernel.org, bpf@vger.kernel.org,
    davem@davemloft.net, kuba@kernel.org, ast@kernel.org,
    daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com,
    songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
    kpsingh@kernel.org, rdunlap@infradead.org, willemb@google.com,
    gustavoars@kernel.org, herbert@gondor.apana.org.au,
    steffen.klassert@secunet.com, nogikh@google.com, pablo@netfilter.org,
    decui@microsoft.com, cai@lca.pw, jakub@cloudflare.com,
    elver@google.com, pabeni@redhat.com, Yuri Benditovich
References: <20231008052101.144422-1-akihiko.odaki@daynix.com>
    <20231008052101.144422-6-akihiko.odaki@daynix.com>
    <48e20be1-b658-4117-8856-89ff1df6f48f@daynix.com>
    <6a698c99-6f02-4cfb-a709-ba02296a05f7@daynix.com>
From: Akihiko Odaki
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023/10/09 19:07, Willem de Bruijn wrote:
> On Mon, Oct 9, 2023 at 3:05 AM Akihiko Odaki wrote:
>>
>>
>>
>> On 2023/10/09 18:54, Willem de Bruijn wrote:
>>> On Mon, Oct 9, 2023 at 3:44 AM Akihiko Odaki wrote:
>>>>
>>>> On 2023/10/09 17:13, Willem de Bruijn wrote:
>>>>> On Sun, Oct 8, 2023 at 12:22 AM Akihiko Odaki wrote:
>>>>>>
>>>>>> virtio-net has two uses for hashes: one is RSS and the other is hash
>>>>>> reporting. Conventionally the hash calculation was done by the VMM.
>>>>>> However, computing the hash after the queue was chosen defeats the
>>>>>> purpose of RSS.
>>>>>>
>>>>>> Another approach is to use eBPF steering program. This approach has
>>>>>> another downside: it cannot report the calculated hash due to the
>>>>>> restrictive nature of eBPF.
>>>>>>
>>>>>> Introduce the code to compute hashes to the kernel in order to overcome
>>>>>> these challenges. An alternative solution is to extend the eBPF steering
>>>>>> program so that it will be able to report to the userspace, but it makes
>>>>>> little sense to allow to implement different hashing algorithms with
>>>>>> eBPF since the hash value reported by virtio-net is strictly defined by
>>>>>> the specification.
>>>>>>
>>>>>> The hash value already stored in sk_buff is not used and computed
>>>>>> independently since it may have been computed in a way not conformant
>>>>>> with the specification.
>>>>>>
>>>>>> Signed-off-by: Akihiko Odaki
>>>>>
>>>>>> @@ -2116,31 +2172,49 @@ static ssize_t tun_put_user(struct tun_struct *tun,
>>>>>>          }
>>>>>>
>>>>>>          if (vnet_hdr_sz) {
>>>>>> -                struct virtio_net_hdr gso;
>>>>>> +                union {
>>>>>> +                        struct virtio_net_hdr hdr;
>>>>>> +                        struct virtio_net_hdr_v1_hash v1_hash_hdr;
>>>>>> +                } hdr;
>>>>>> +                int ret;
>>>>>>
>>>>>>                  if (iov_iter_count(iter) < vnet_hdr_sz)
>>>>>>                          return -EINVAL;
>>>>>>
>>>>>> -                if (virtio_net_hdr_from_skb(skb, &gso,
>>>>>> -                                            tun_is_little_endian(tun), true,
>>>>>> -                                            vlan_hlen)) {
>>>>>> +                if ((READ_ONCE(tun->vnet_hash.flags) & TUN_VNET_HASH_REPORT) &&
>>>>>> +                    vnet_hdr_sz >= sizeof(hdr.v1_hash_hdr) &&
>>>>>> +                    skb->tun_vnet_hash) {
>>>>>
>>>>> Isn't vnet_hdr_sz guaranteed to be >= hdr.v1_hash_hdr, by virtue of
>>>>> the set hash ioctl failing otherwise?
>>>>>
>>>>> Such checks should be limited to control path where possible
>>>>
>>>> There is a potential race since tun->vnet_hash.flags and vnet_hdr_sz are
>>>> not read at once.
>>>
>>> It should not be possible to downgrade the hdr_sz once v1 is selected.
>>
>> I see nothing that prevents shrinking the header size.
>>
>> tun->vnet_hash.flags is read after vnet_hdr_sz so the race can happen
>> even for the case the header size grows though this can be fixed by
>> reordering the two reads.
>
> One option is to fail any control path that tries to re-negotiate
> header size once this hash option is enabled?
>
> There is no practical reason to allow feature re-negotiation at any
> arbitrary time.

I think that would be a somewhat awkward interface design, since tun
otherwise allows any of its parameters to be reconfigured, but it is
certainly possible.
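
To make the suggestion concrete, here is a rough userspace model of that
policy. It is only a sketch, not code from the patch: struct tun_model, the
model_* helpers, the -EBUSY return value and the MODEL_* constants are all
hypothetical names chosen for illustration. The two sizes are meant to
mirror sizeof(struct virtio_net_hdr) and sizeof(struct
virtio_net_hdr_v1_hash). Enabling hash reporting fails unless the header is
already large enough, and resizing the header fails once hash reporting is
enabled, so the data path no longer needs its own size check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed sizes, standing in for sizeof(struct virtio_net_hdr) and
 * sizeof(struct virtio_net_hdr_v1_hash). */
enum { MODEL_HDR_SZ = 10, MODEL_V1_HASH_HDR_SZ = 20 };

/* Hypothetical stand-in for the relevant tun_struct fields. */
struct tun_model {
	int vnet_hdr_sz;
	bool hash_report;	/* models TUN_VNET_HASH_REPORT being set */
};

/* Control path: models a header size change request. */
static int model_set_vnet_hdr_sz(struct tun_model *t, int sz)
{
	if (sz < MODEL_HDR_SZ)
		return -EINVAL;
	/* Refuse to re-negotiate the size while hash reporting is on. */
	if (t->hash_report)
		return -EBUSY;
	t->vnet_hdr_sz = sz;
	return 0;
}

/* Control path: models enabling or disabling hash reporting. */
static int model_set_hash_report(struct tun_model *t, bool on)
{
	/* The header must already have room for the hash fields. */
	if (on && t->vnet_hdr_sz < MODEL_V1_HASH_HDR_SZ)
		return -EINVAL;
	t->hash_report = on;
	return 0;
}

/* Data path: relies on the invariant kept by the two setters above. */
static bool model_put_user_reports_hash(const struct tun_model *t)
{
	assert(!t->hash_report || t->vnet_hdr_sz >= MODEL_V1_HASH_HDR_SZ);
	return t->hash_report;
}
```

This models only the policy. It is single-threaded, so it says nothing
about how the control path and the data path would be ordered or
synchronized in the driver, and disabling hash reporting here simply
unlocks the size again.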