Date: Thu, 11 Jun 2020 18:01:35 +0800
From: YangYuxi
To: wensong@linux-vs.org, horms@verge.net.au, ja@ssi.bg, pablo@netfilter.org, kadlec@netfilter.org, fw@strlen.de, davem@davemloft.net, kuba@kernel.org
Cc: netdev@vger.kernel.org, lvs-devel@vger.kernel.org, netfilter-devel@vger.kernel.org, coreteam@netfilter.org, linux-kernel@vger.kernel.org, yx.atom1@gmail.com
Subject: [PATCH] ipvs: avoid drop first packet to reuse conntrack
Message-ID: <20200611100135.GA19243@VM_111_229_centos>

Since commit f719e3754ee2 ("ipvs: drop first packet to redirect
conntrack"), when a new TCP connection meets the conditions that require
rescheduling, the first SYN packet is dropped. This causes one second of
latency for the new connection, because the client has to retransmit the
SYN. More discussion of this problem is easy to find online, for example:

1) One second connection delay in masque
   https://marc.info/?t=151683118100004&r=1&w=2
2) IPVS low throughput? #70747
   https://github.com/kubernetes/kubernetes/issues/70747
3) Apache Bench can fill up ipvs service proxy in seconds #544
   https://github.com/cloudnativelabs/kube-router/issues/544
4) Additional 1s latency in `host -> service IP -> pod`
   https://github.com/kubernetes/kubernetes/issues/90854

The root cause is that when the old session expires, the conntrack
related to the session is dropped by ip_vs_conn_drop_conntrack. The code
is as follows:
```
static void ip_vs_conn_expire(struct timer_list *t)
{
...

	if ((cp->flags & IP_VS_CONN_F_NFCT) &&
	    !(cp->flags & IP_VS_CONN_F_ONE_PACKET)) {
		/* Do not access conntracks during subsys cleanup
		 * because nf_conntrack_find_get can not be used after
		 * conntrack cleanup for the net.
		 */
		smp_rmb();
		if (ipvs->enable)
			ip_vs_conn_drop_conntrack(cp);
	}
...
}
```
As the code shows, ip_vs_conn_drop_conntrack is called only if the
condition (cp->flags & IP_VS_CONN_F_NFCT) is true. So we fix this bug
with the following steps:
1) Clear the IP_VS_CONN_F_NFCT flag (this is safe because no packets
   will use the old session).
2) Call ip_vs_conn_expire_now to release the old session; the related
   conntrack will then not be dropped.
3) IPVS then no longer needs to drop the first SYN packet. It just
   passes the SYN packet on to the next step and creates a new IPVS
   session, and the new session is associated with the old conntrack
   (which conntrack reopens as a new one). Everything after that
   proceeds as if the old session had never existed.

This patch has been verified on thousands of our Kubernetes node servers
at Tencent Inc.

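
For illustration only, the three steps can be modelled in user space as
below. This is a minimal sketch, not kernel code: every `model_` /
`MODEL_` name is hypothetical, the flag values are arbitrary, and it
assumes expiry runs immediately, whereas in the kernel
ip_vs_conn_expire_now only arms the timer and is skipped when
cp->n_control is non-zero. The ipvs->enable check is also left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flag bits; values are arbitrary. */
#define MODEL_CONN_F_NFCT       0x1u
#define MODEL_CONN_F_ONE_PACKET 0x2u

enum model_verdict { MODEL_DROP, MODEL_CONTINUE };

struct model_conn {
	unsigned int flags;
	bool conntrack_alive;	/* true while the related conntrack exists */
	bool expired;		/* true once the session has been released */
};

/* Same condition as the quoted ip_vs_conn_expire() excerpt: the conntrack
 * is dropped only when NFCT is set and ONE_PACKET is not.
 */
static void model_conn_expire(struct model_conn *cp)
{
	assert(cp);	/* model-only sanity check */
	if ((cp->flags & MODEL_CONN_F_NFCT) &&
	    !(cp->flags & MODEL_CONN_F_ONE_PACKET))
		cp->conntrack_alive = false;
	cp->expired = true;
}

/* Behaviour before this patch: expire, then drop the SYN if conntrack
 * was in use.
 */
static enum model_verdict model_resched_old(struct model_conn *cp, bool uses_ct)
{
	model_conn_expire(cp);
	return uses_ct ? MODEL_DROP : MODEL_CONTINUE;
}

/* Behaviour with this patch: clear NFCT first so that expiry leaves the
 * conntrack alone, then let the SYN continue.
 */
static enum model_verdict model_resched_new(struct model_conn *cp, bool uses_ct)
{
	if (uses_ct)
		cp->flags &= ~MODEL_CONN_F_NFCT;
	model_conn_expire(cp);
	return MODEL_CONTINUE;
}
```

In this model, a session that starts with MODEL_CONN_F_NFCT set loses
its conntrack and gets MODEL_DROP on the old path, while on the new path
the session is still released but the conntrack survives and the verdict
is MODEL_CONTINUE.
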
Signed-off-by: YangYuxi
---
 net/netfilter/ipvs/ip_vs_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index aa6a603a2425..2f750145172f 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -2086,11 +2086,11 @@ static int ip_vs_in_icmp_v6(struct netns_ipvs *ipvs, struct sk_buff *skb,
 		}
 
 		if (resched) {
+			if (uses_ct)
+				cp->flags &= ~IP_VS_CONN_F_NFCT;
 			if (!atomic_read(&cp->n_control))
 				ip_vs_conn_expire_now(cp);
 			__ip_vs_conn_put(cp);
-			if (uses_ct)
-				return NF_DROP;
 			cp = NULL;
 		}
 	}
-- 
1.8.3.1