From: Mao Wenan
CC: Mao Wenan
Subject: [PATCH net v2] tcp: avoid creating multiple req socks with the same tuples
Date: Wed, 12 Jun 2019 11:57:15 +0800
Message-ID: <20190612035715.166676-1-maowenan@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The problem occurs with bonding in BOND_MODE_BROADCAST mode when the two
slaves have different affinity, so their packets are handled by different
CPUs. These are the two preconditions for this case.

When the two slaves receive the same SYN packet at the same time, two
request socks (reqsk) are created if the following happens:

1. syn1 arrives in tcp_conn_request(), which creates reqsk1 but has not yet
   called inet_csk_reqsk_queue_hash_add().
2. syn2 arrives in tcp_v4_rcv(), goes on to tcp_conn_request() and creates
   reqsk2, because __inet_lookup_skb() cannot find reqsk1.

Both reqsk1 and reqsk2 are then added to the established hash table, and
two SYN-ACKs with different sequence numbers (seq1 and seq2) are sent to
the client. When the client's ACK arrives, it is processed in tcp_v4_rcv()
and tcp_check_req(). If __inet_lookup_skb() finds reqsk2 but the ACK's
ack_seq acknowledges seq1, the ACK is rejected by the check:

	TCP_SKB_CB(skb)->ack_seq != tcp_rsk(req)->snt_isn + 1

A TCP RST is then sent to the client and the connection is closed.

To fix this, call __inet_lookup_established() before
__sk_nulls_add_node_rcu() in inet_ehash_insert(). If a reqsk with the same
tuple already exists in the established hash table, drop the current
reqsk2 and do not send a SYN-ACK to the client.

Signed-off-by: Mao Wenan
---
v2: move __inet_lookup_established() from tcp_conn_request() to
    inet_ehash_insert(), as Eric suggested.
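
For illustration only, here is a minimal userspace sketch of the idea in
the commit message: do the duplicate lookup and the insert under a single
hold of the bucket lock, so that two racing SYN handlers cannot both hash a
request for the same tuple. It is not kernel code. Every name in it
(struct tuple, struct req, struct bucket, bucket_insert_unique(),
ack_matches()) is hypothetical, and a pthread mutex stands in for the ehash
bucket spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct tuple {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct req {
	struct tuple key;
	uint32_t snt_isn;	/* ISN sent in the SYN-ACK for this request */
	struct req *next;
};

struct bucket {
	pthread_mutex_t lock;	/* stands in for the ehash bucket lock */
	struct req *head;
};

static bool tuple_equal(const struct tuple *a, const struct tuple *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport;
}

/* Insert req unless an entry with the same tuple is already hashed.
 * The lookup and the insert share one lock hold, so of two racing
 * callers with the same tuple only the first one succeeds.
 */
static bool bucket_insert_unique(struct bucket *b, struct req *req)
{
	struct req *it;
	bool ok = true;

	assert(b != NULL && req != NULL);

	pthread_mutex_lock(&b->lock);
	for (it = b->head; it; it = it->next) {
		if (tuple_equal(&it->key, &req->key)) {
			ok = false;	/* duplicate: caller must drop req */
			break;
		}
	}
	if (ok) {
		req->next = b->head;
		b->head = req;
	}
	pthread_mutex_unlock(&b->lock);

	return ok;
}

/* Same shape as the ACK check quoted in the commit message: the ACK is
 * accepted only if it acknowledges snt_isn + 1 of the request found.
 */
static bool ack_matches(const struct req *req, uint32_t ack_seq)
{
	return ack_seq == req->snt_isn + 1;
}
```

With two requests that share a tuple but carry different ISNs, the second
bucket_insert_unique() call returns false, so the caller can drop that
request and skip its SYN-ACK. Only one ISN is then outstanding, and the
client's ACK cannot be matched against the wrong request.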
---
 include/net/inet_connection_sock.h |  2 +-
 net/ipv4/inet_connection_sock.c    | 16 ++++++++++++----
 net/ipv4/inet_hashtables.c         | 13 +++++++++++++
 net/ipv4/tcp_input.c               |  7 ++++---
 4 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index c57d53e7e02c..2d3538e333cb 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -263,7 +263,7 @@ struct dst_entry *inet_csk_route_child_sock(const struct sock *sk,
 struct sock *inet_csk_reqsk_queue_add(struct sock *sk,
 				      struct request_sock *req,
 				      struct sock *child);
-void inet_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
+bool inet_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
 				   unsigned long timeout);
 struct sock *inet_csk_complete_hashdance(struct sock *sk, struct sock *child,
 					 struct request_sock *req,
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 13ec7c3a9c49..fd45ed2fd985 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -749,7 +749,7 @@ static void reqsk_timer_handler(struct timer_list *t)
 	inet_csk_reqsk_queue_drop_and_put(sk_listener, req);
 }
 
-static void reqsk_queue_hash_req(struct request_sock *req,
+static bool reqsk_queue_hash_req(struct request_sock *req,
 				 unsigned long timeout)
 {
 	req->num_retrans = 0;
@@ -759,19 +759,27 @@ static void reqsk_queue_hash_req(struct request_sock *req,
 	timer_setup(&req->rsk_timer, reqsk_timer_handler, TIMER_PINNED);
 	mod_timer(&req->rsk_timer, jiffies + timeout);
 
-	inet_ehash_insert(req_to_sk(req), NULL);
+	if (!inet_ehash_insert(req_to_sk(req), NULL)) {
+		if (timer_pending(&req->rsk_timer))
+			del_timer_sync(&req->rsk_timer);
+		return false;
+	}
 	/* before letting lookups find us, make sure all req fields
 	 * are committed to memory and refcnt initialized.
 	 */
 	smp_wmb();
 	refcount_set(&req->rsk_refcnt, 2 + 1);
+	return true;
 }
 
-void inet_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
+bool inet_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
 				   unsigned long timeout)
 {
-	reqsk_queue_hash_req(req, timeout);
+	if (!reqsk_queue_hash_req(req, timeout))
+		return false;
+
 	inet_csk_reqsk_queue_added(sk);
+	return true;
 }
 EXPORT_SYMBOL_GPL(inet_csk_reqsk_queue_hash_add);
 
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index c4503073248b..b6a1b5334565 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -477,6 +477,7 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk)
 	struct inet_ehash_bucket *head;
 	spinlock_t *lock;
 	bool ret = true;
+	struct sock *reqsk = NULL;
 
 	WARN_ON_ONCE(!sk_unhashed(sk));
 
@@ -486,6 +487,18 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk)
 	lock = inet_ehash_lockp(hashinfo, sk->sk_hash);
 
 	spin_lock(lock);
+	if (!osk)
+		reqsk = __inet_lookup_established(sock_net(sk), &tcp_hashinfo,
+						  sk->sk_daddr, sk->sk_dport,
+						  sk->sk_rcv_saddr, sk->sk_num,
+						  sk->sk_bound_dev_if, sk->sk_bound_dev_if);
+	if (unlikely(reqsk)) {
+		ret = false;
+		reqsk_free(inet_reqsk(sk));
+		spin_unlock(lock);
+		return ret;
+	}
+
 	if (osk) {
 		WARN_ON_ONCE(sk->sk_hash != osk->sk_hash);
 		ret = sk_nulls_del_node_init_rcu(osk);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 38dfc308c0fb..358272394590 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6570,9 +6570,10 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		sock_put(fastopen_sk);
 	} else {
 		tcp_rsk(req)->tfo_listener = false;
-		if (!want_cookie)
-			inet_csk_reqsk_queue_hash_add(sk, req,
-				tcp_timeout_init((struct sock *)req));
+		if (!want_cookie && !inet_csk_reqsk_queue_hash_add(sk, req,
+					tcp_timeout_init((struct sock *)req)))
+			return 0;
+
 		af_ops->send_synack(sk, dst, &fl, req, &foc,
 				    !want_cookie ? TCP_SYNACK_NORMAL :
 						   TCP_SYNACK_COOKIE);
-- 
2.20.1