Subject: Re: [PATCH net v2] tcp: avoid creating multiple req socks with the same tuples
To: Eric Dumazet, Eric Dumazet
References: <20190612035715.166676-1-maowenan@huawei.com> <6de5d6d8-e481-8235-193e-b12e7f511030@huawei.com> <6aa69ab5-ed81-6a7f-2b2b-214e44ff0ada@gmail.com>
CC: David Miller, netdev, LKML
From: maowenan
Message-ID: <52025f94-04d3-2a44-11cd-7aa66ebc7e27@huawei.com>
Date: Fri, 14 Jun 2019 17:35:01 +0800
In-Reply-To: <6aa69ab5-ed81-6a7f-2b2b-214e44ff0ada@gmail.com>
On 2019/6/14 12:28, Eric Dumazet wrote:
>
> On 6/13/19 9:19 PM, maowenan wrote:
>>
>> @Eric, for this issue I only want to check TCP_NEW_SYN_RECV sk, is it OK like below?
>>
>> +	if (!osk && sk->sk_state == TCP_NEW_SYN_RECV)
>> +		reqsk = __inet_lookup_established(sock_net(sk), &tcp_hashinfo,
>> +						  sk->sk_daddr, sk->sk_dport,
>> +						  sk->sk_rcv_saddr, sk->sk_num,
>> +						  sk->sk_bound_dev_if, sk->sk_bound_dev_if);
>> +	if (unlikely(reqsk)) {
>>
>
> Not enough.
>
> If we have many cpus here, there is a chance another cpu has inserted a request socket, then
> replaced it by an ESTABLISH socket for the same 4-tuple.

I am trying to get a clearer picture of the scenario you mention. I have done some testing on this, and it works well when I use multiple CPUs.

The ESTABLISH socket comes from tcp_check_req -> tcp_v4_syn_recv_sock -> tcp_create_openreq_child, and on that path inet_ehash_nolisten passes a non-NULL osk, so my patch does not call __inet_lookup_established in inet_ehash_insert().

When a TCP_NEW_SYN_RECV socket tries to insert itself into the hash table, osk is NULL, and my patch checks whether a reqsk for the same tuple already exists in the table. If one exists, it just drops this reqsk and does not insert it. The SYN-ACK for this reqsk is then never sent to the client, there is no chance to receive the ACK from the client, and so no ESTABLISH socket can replace it in the hash table.

So I don't see the race when there are many CPUs. Could you give me some more clues? Thank you.

> We need to take the per bucket spinlock much sooner.
>
> And this is fine, all what matters is that we do no longer grab the listener spinlock.