Subject: Re: [PATCH net-next v2 4/4] bonding: balance ICMP echoes in layer3+4 mode
From: Nikolay Aleksandrov
To: Eric Dumazet, Matteo Croce, netdev@vger.kernel.org
Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, "David S. Miller",
    Stanislav Fomichev, Daniel Borkmann, Song Liu, Alexei Starovoitov,
    Paul Blakey, linux-kernel@vger.kernel.org
Date: Wed, 30 Oct 2019 14:48:58 +0200
Message-ID: <4d4f88c6-8b95-5d9b-7e14-3cdc2f660d3f@cumulusnetworks.com>
In-Reply-To: <294b9604-8d43-4a31-9324-6368c584fd63@gmail.com>
References: <20191029135053.10055-1-mcroce@redhat.com>
    <20191029135053.10055-5-mcroce@redhat.com>
    <5be14e4e-807f-486d-d11a-3113901e72fe@cumulusnetworks.com>
    <576a4a96-861b-6a86-b059-6621a22d191c@gmail.com>
    <294b9604-8d43-4a31-9324-6368c584fd63@gmail.com>

On 30/10/2019 01:04, Eric Dumazet wrote:
>
>
> On 10/29/19 2:50 PM, Nikolay Aleksandrov wrote:
>
>> Right, I was just giving it as an example. Your suggestion sounds much better and
>> wouldn't interfere with other layers, plus we already use skb->hash in bond_xmit_hash()
>> and skb_set_owner_w() sets l4_hash if txhash is present which is perfect.
>>
>> One thing - how do we deal with sk_rethink_txhash() ? I guess we'll need some way to
>> signal that the user specified the txhash and it is not to be recomputed ?
>> That can also be used to avoid the connect txhash set as well if SO_TXHASH was set prior
>> to the connect. It's quite late here, I'll look into it more tomorrow. :)
>
> I guess that we have something similar with SO_RCVBUF/SO_SNDBUF
>
> autotuning is disabled when/if they are used :
>
> SOCK_RCVBUF_LOCK & SOCK_SNDBUF_LOCK
>
> We could add a SOCK_TXHASH_LOCK so that sk_rethink_txhash() does nothing if
> user forced a TXHASH value.
>
> Something like the following (probably not complete) patch.
>

Actually I think it's ok. I had a similar change last night sans the userlocks.
I just built and tested a kernel with it successfully using the bonding. The only
case that doesn't seem to work is a raw socket without hdrincl, IIUC due to the
direct alloc_skb() (transhdrlen == 0) in ip_append_data(). Unless you have other
concerns could you please submit it formally ?

> diff --git a/include/net/sock.h b/include/net/sock.h
> index 380312cc67a9d9ee8720eb2db82b1f7f8a5615ab..a8882738710eaa9d9d629e1207837a798401a594 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1354,6 +1354,7 @@ static inline int __sk_prot_rehash(struct sock *sk)
>  #define SOCK_RCVBUF_LOCK	2
>  #define SOCK_BINDADDR_LOCK	4
>  #define SOCK_BINDPORT_LOCK	8
> +#define SOCK_TXHASH_LOCK	16
>
>  struct socket_alloc {
>  	struct socket socket;
> @@ -1852,7 +1853,8 @@ static inline u32 net_tx_rndhash(void)
>
>  static inline void sk_set_txhash(struct sock *sk)
>  {
> -	sk->sk_txhash = net_tx_rndhash();
> +	if (!(sk->sk_userlocks & SOCK_TXHASH_LOCK))
> +		sk->sk_txhash = net_tx_rndhash();
>  }
>
>  static inline void sk_rethink_txhash(struct sock *sk)
> diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
> index 77f7c1638eb1ce7d3e143bbffd60056e472b1122..998be6ee7991de3a76d4ad33df3a38dbe791eae8 100644
> --- a/include/uapi/asm-generic/socket.h
> +++ b/include/uapi/asm-generic/socket.h
> @@ -118,6 +118,7 @@
>  #define SO_SNDTIMEO_NEW         67
>
>  #define SO_DETACH_REUSEPORT_BPF 68
> +#define SO_TXHASH		69
>
>  #if !defined(__KERNEL__)
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 997b352c2a72ee39f00b102a553ac1191202b74f..85b85dffd462bc3b497e0432100ff24b759832e0 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -770,6 +770,10 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
>  	case SO_BROADCAST:
>  		sock_valbool_flag(sk, SOCK_BROADCAST, valbool);
>  		break;
> +	case SO_TXHASH:
> +		sk->sk_txhash = val;
> +		sk->sk_userlocks |= SOCK_TXHASH_LOCK;
> +		break;
>  	case SO_SNDBUF:
>  		/* Don't error on this BSD doesn't and if you think
>  		 * about it this is right. Otherwise apps have to
> @@ -1249,6 +1253,10 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
>  		v.val = sock_flag(sk, SOCK_BROADCAST);
>  		break;
>
> +	case SO_TXHASH:
> +		v.val = sk->sk_txhash;
> +		break;
> +
>  	case SO_SNDBUF:
>  		v.val = sk->sk_sndbuf;
>  		break;
>
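
For anyone following along, the locking behaviour the quoted patch is aiming for can be modelled in plain userspace C. This is only a rough sketch under assumptions, not kernel code: struct model_sock, model_rndhash(), model_set_txhash(), model_setsockopt_txhash() and model_demo() are hypothetical names made up for illustration, the counter stands in for net_tx_rndhash(), and SOCK_TXHASH_LOCK (value 16) exists only in the proposal above.

```c
#include <assert.h>
#include <stdint.h>

/* Value copied from the quoted patch; a proposal, not a released kernel flag. */
#define SOCK_TXHASH_LOCK 16

/* Toy stand-in for the two struct sock fields the patch touches. */
struct model_sock {
	uint32_t sk_txhash;
	unsigned int sk_userlocks;
};

/* Stand-in for net_tx_rndhash(): a counter, so the model is deterministic. */
static uint32_t model_rndhash(void)
{
	static uint32_t next = 1000;

	return ++next;
}

/* Mirrors the patched sk_set_txhash(): a user-forced hash is left alone. */
static void model_set_txhash(struct model_sock *sk)
{
	if (!(sk->sk_userlocks & SOCK_TXHASH_LOCK))
		sk->sk_txhash = model_rndhash();
}

/* Mirrors the SO_TXHASH branch the patch adds to sock_setsockopt(). */
static void model_setsockopt_txhash(struct model_sock *sk, uint32_t val)
{
	sk->sk_txhash = val;
	sk->sk_userlocks |= SOCK_TXHASH_LOCK;
}

/* Walks one socket through both states; returns 0 if the lock holds. */
static int model_demo(void)
{
	struct model_sock sk = { 0, 0 };
	uint32_t first;

	model_set_txhash(&sk);			/* unlocked: fresh hash */
	first = sk.sk_txhash;
	model_set_txhash(&sk);			/* unlocked: hash changes again */
	if (sk.sk_txhash == first)
		return 1;

	model_setsockopt_txhash(&sk, 42);	/* user pins the hash */
	assert(sk.sk_userlocks & SOCK_TXHASH_LOCK);
	model_set_txhash(&sk);			/* locked: must be a no-op */
	return sk.sk_txhash == 42 ? 0 : 2;
}
```

Calling model_demo() from a test program should return 0. The point of the model is that once the option has been set, every later attempt to re-randomise the hash (the connect path, or sk_rethink_txhash() if it goes through sk_set_txhash()) leaves the pinned value in place, the same way SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK turn off autotuning.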