From: Andrew Sy Kim
Cc: kim.andrewsy@gmail.com, "David S.
 Miller", Alexey Kuznetsov, Hideaki YOSHIFUJI, Wensong Zhang,
 Simon Horman, Julian Anastasov, Jakub Kicinski, Pablo Neira Ayuso,
 Jozsef Kadlecsik, Florian Westphal, netdev@vger.kernel.org,
 lvs-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
 netfilter-devel@vger.kernel.org, coreteam@netfilter.org
Subject: [PATCH] netfilter/ipvs: immediately expire UDP connections matching unavailable destination if expire_nodest_conn=1
Date: Sun, 17 May 2020 13:16:53 -0400
Message-Id: <20200517171654.8194-1-kim.andrewsy@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200515013556.5582-1-kim.andrewsy@gmail.com>
References: <20200515013556.5582-1-kim.andrewsy@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
To: unlisted-recipients:; (no To-header on input)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

If expire_nodest_conn=1 and a UDP destination is deleted, IPVS should
also expire all matching connections immediately instead of waiting for
the next matching packet.

This is particularly useful when there are a lot of packets coming from
a small number of clients. Those clients are likely to match against
existing entries if a source port in the connection hash is reused.
When the number of entries in the connection tracker is large, we can
significantly reduce the number of dropped packets by expiring all
connections upon deletion.

Signed-off-by: Andrew Sy Kim
---
 include/net/ip_vs.h             |  7 ++++++
 net/netfilter/ipvs/ip_vs_conn.c | 38 +++++++++++++++++++++++++++++++++
 net/netfilter/ipvs/ip_vs_core.c |  5 -----
 net/netfilter/ipvs/ip_vs_ctl.c  |  9 ++++++++
 4 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 83be2d93b407..deecf1344676 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1049,6 +1049,11 @@ static inline int sysctl_conn_reuse_mode(struct netns_ipvs *ipvs)
 	return ipvs->sysctl_conn_reuse_mode;
 }
 
+static inline int sysctl_expire_nodest_conn(struct netns_ipvs *ipvs)
+{
+	return ipvs->sysctl_expire_nodest_conn;
+}
+
 static inline int sysctl_schedule_icmp(struct netns_ipvs *ipvs)
 {
 	return ipvs->sysctl_schedule_icmp;
@@ -1209,6 +1214,8 @@ struct ip_vs_conn * ip_vs_conn_out_get_proto(struct netns_ipvs *ipvs, int af,
 					     const struct sk_buff *skb,
 					     const struct ip_vs_iphdr *iph);
 
+void ip_vs_conn_flush_dest(struct netns_ipvs *ipvs, struct ip_vs_dest *dest);
+
 /* Get reference to gain full access to conn.
  * By default, RCU read-side critical sections have access only to
  * conn fields and its PE data, see ip_vs_conn_rcu_free() for reference.
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 02f2f636798d..c69dfbbc3416 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -1366,6 +1366,44 @@ static void ip_vs_conn_flush(struct netns_ipvs *ipvs)
 		goto flush_again;
 	}
 }
+
+/* Flush all the connection entries in the ip_vs_conn_tab with a
+ * matching destination.
+ */
+void ip_vs_conn_flush_dest(struct netns_ipvs *ipvs, struct ip_vs_dest *dest)
+{
+	int idx;
+	struct ip_vs_conn *cp, *cp_c;
+
+	rcu_read_lock();
+	for (idx = 0; idx < ip_vs_conn_tab_size; idx++) {
+		hlist_for_each_entry_rcu(cp, &ip_vs_conn_tab[idx], c_list) {
+			if (cp->ipvs != ipvs)
+				continue;
+
+			if (cp->dest != dest)
+				continue;
+
+			/* As timers are expired in LIFO order, restart
+			 * the timer of controlling connection first, so
+			 * that it is expired after us.
+			 */
+			cp_c = cp->control;
+			/* cp->control is valid only with reference to cp */
+			if (cp_c && __ip_vs_conn_get(cp)) {
+				IP_VS_DBG(4, "del controlling connection\n");
+				ip_vs_conn_expire_now(cp_c);
+				__ip_vs_conn_put(cp);
+			}
+			IP_VS_DBG(4, "del connection\n");
+			ip_vs_conn_expire_now(cp);
+		}
+		cond_resched_rcu();
+	}
+	rcu_read_unlock();
+}
+EXPORT_SYMBOL_GPL(ip_vs_conn_flush_dest);
+
 /*
  * per netns init and exit
  */
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index aa6a603a2425..0139fa597d76 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -694,11 +694,6 @@ static int sysctl_nat_icmp_send(struct netns_ipvs *ipvs)
 	return ipvs->sysctl_nat_icmp_send;
 }
 
-static int sysctl_expire_nodest_conn(struct netns_ipvs *ipvs)
-{
-	return ipvs->sysctl_expire_nodest_conn;
-}
-
 #else
 
 static int sysctl_snat_reroute(struct netns_ipvs *ipvs) { return 0; }
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 8d14a1acbc37..f87c03622874 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -1225,6 +1225,15 @@ ip_vs_del_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest)
 	 */
 	__ip_vs_del_dest(svc->ipvs, dest, false);
 
+	/* If expire_nodest_conn is enabled and protocol is UDP,
+	 * attempt best effort flush of all connections with this
+	 * destination.
+	 */
+	if (sysctl_expire_nodest_conn(svc->ipvs) &&
+	    dest->protocol == IPPROTO_UDP) {
+		ip_vs_conn_flush_dest(svc->ipvs, dest);
+	}
+
 	LeaveFunction(2);
 
 	return 0;
-- 
2.20.1
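
For readers who want to see the shape of the flush outside the kernel, here is a rough user-space sketch of the idea behind ip_vs_conn_flush_dest(): walk every bucket of a hash table and mark each entry bound to the deleted destination for immediate expiry. All names in it (toy_dest, toy_conn, toy_table, toy_add, toy_flush_dest, TAB_SIZE) are hypothetical and exist only for illustration. It is not kernel code, and it deliberately leaves out RCU, timers, netns matching, and controlling connections.

```c
#include <assert.h>
#include <stddef.h>

#define TAB_SIZE 8	/* toy bucket count, chosen arbitrarily for the sketch */

/* Hypothetical stand-in for a real server (destination). */
struct toy_dest {
	int id;
};

/* Hypothetical stand-in for a connection entry. */
struct toy_conn {
	struct toy_conn *next;		/* next entry in the same bucket */
	const struct toy_dest *dest;	/* destination this entry is bound to */
	unsigned int timeout;		/* remaining lifetime, 0 = expire now */
};

/* Hypothetical stand-in for the connection hash table. */
struct toy_table {
	struct toy_conn *bucket[TAB_SIZE];
};

/* Link a connection at the head of the bucket selected by hash. */
void toy_add(struct toy_table *t, struct toy_conn *c, unsigned int hash)
{
	assert(t != NULL && c != NULL);
	c->next = t->bucket[hash % TAB_SIZE];
	t->bucket[hash % TAB_SIZE] = c;
}

/* Mark every entry bound to dest for immediate expiry.
 * Entries bound to other destinations are left untouched.
 * Returns the number of entries that were marked.
 */
size_t toy_flush_dest(struct toy_table *t, const struct toy_dest *dest)
{
	size_t flushed = 0;
	size_t idx;
	struct toy_conn *c;

	assert(t != NULL && dest != NULL);
	for (idx = 0; idx < TAB_SIZE; idx++) {
		for (c = t->bucket[idx]; c != NULL; c = c->next) {
			if (c->dest != dest)
				continue;
			c->timeout = 0;
			flushed++;
		}
	}
	return flushed;
}
```

The cost is a full table walk per deleted destination, which is the trade-off the patch accepts in exchange for not dropping packets until each stale entry happens to be hit again.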