Date: Tue, 24 Jul 2018 22:47:12 +0300 (EEST)
From: Julian Anastasov
To: Tan Hu
cc: wensong@linux-vs.org, horms@verge.net.au, pablo@netfilter.org,
    kadlec@blackhole.kfki.hu, fw@strlen.de, davem@davemloft.net,
    netdev@vger.kernel.org, lvs-devel@vger.kernel.org,
    netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
    linux-kernel@vger.kernel.org, zhong.weidong@zte.com.cn,
    jiang.biao2@zte.com.cn
Subject: Re: [PATCH] ipvs: fix race between ip_vs_conn_new() and ip_vs_del_dest()
In-Reply-To: <1532419953-5517-1-git-send-email-tan.hu@zte.com.cn>
References: <1532419953-5517-1-git-send-email-tan.hu@zte.com.cn>
X-Mailing-List: linux-kernel@vger.kernel.org

	Hello,

On Tue, 24 Jul 2018, Tan Hu
wrote:

> We came across infinite loop in ipvs when using ipvs in docker
> env.
>
> When ipvs receives new packets and cannot find an ipvs connection,
> it will create a new connection, then if the dest is unavailable
> (i.e. IP_VS_DEST_F_AVAILABLE), the packet will be dropped silently.
>
> But if the dropped packet is the first packet of this connection,
> the connection control timer never has a chance to start and the
> ipvs connection cannot be released. This will lead to memory leak, or
> infinite loop in cleanup_net() when net namespace is released like
> this:
>
>     ip_vs_conn_net_cleanup at ffffffffa0a9f31a [ip_vs]
>     __ip_vs_cleanup at ffffffffa0a9f60a [ip_vs]
>     ops_exit_list at ffffffff81567a49
>     cleanup_net at ffffffff81568b40
>     process_one_work at ffffffff810a851b
>     worker_thread at ffffffff810a9356
>     kthread at ffffffff810b0b6f
>     ret_from_fork at ffffffff81697a18
>
> race condition:
>     CPU1                                    CPU2
>     ip_vs_in()
>       ip_vs_conn_new()
>                                             ip_vs_del_dest()
>                                               __ip_vs_unlink_dest()
>                                                 ~IP_VS_DEST_F_AVAILABLE
>     cp->dest && !IP_VS_DEST_F_AVAILABLE
>     __ip_vs_conn_put
>     ...
>     cleanup_net  ---> infinite looping
>
> Fix this by checking whether the timer already started.

	Looks like an old bug...

>
> Signed-off-by: Tan Hu
> Reviewed-by: Jiang Biao
> ---
>  net/netfilter/ipvs/ip_vs_core.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
> index 0679dd1..ca9e7cc 100644
> --- a/net/netfilter/ipvs/ip_vs_core.c
> +++ b/net/netfilter/ipvs/ip_vs_core.c
> @@ -1972,13 +1972,17 @@ static int ip_vs_in_icmp_v6(struct netns_ipvs *ipvs, struct sk_buff *skb,
>  	if (cp->dest && !(cp->dest->flags & IP_VS_DEST_F_AVAILABLE)) {
>  		/* the destination server is not available */

	Add:
		u32 flags = cp->flags;

>
> +		/* when timer already started, silently drop the packet.*/
> +		if (timer_pending(&cp->timer))
> +			__ip_vs_conn_put(cp);
> +		else
> +			ip_vs_conn_put(cp);

	When ip_vs_conn_put() is called for a connection with
IP_VS_CONN_F_ONE_PACKET set, it can call ip_vs_conn_expire() and
free cp with ip_vs_conn_rcu_free() immediately, so cp must not be
used after that. What we can do is avoid the
ip_vs_conn_expire_now() call in that case by reading the flags
early (as above) and adding the needed check (as below).

> +
>  		if (sysctl_expire_nodest_conn(ipvs)) {

	Add a !(flags & IP_VS_CONN_F_ONE_PACKET) check to the above 'if'.

>  			/* try to expire the connection immediately */
>  			ip_vs_conn_expire_now(cp);
>  		}
> -		/* don't restart its timer, and silently
> -		   drop the packet. */
> -		__ip_vs_conn_put(cp);
> +
>  		return NF_DROP;
>  	}
>

Regards

--
Julian Anastasov
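
For illustration only, the two suggestions in this reply (read cp->flags
before the put, and skip ip_vs_conn_expire_now() for one-packet
connections) can be modelled in plain userspace C. This is a sketch, not
kernel code: every model_* name and MODEL_* constant is a hypothetical
stand-in, the refcount and timer are plain fields, and model_conn_put()
only approximates the ip_vs_conn_put() behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; values and layout are NOT the kernel's. */
#define MODEL_CONN_F_ONE_PACKET	0x1u
#define MODEL_NF_DROP		0

struct model_conn {
	uint32_t flags;
	int refcnt;
	bool timer_pending;
	bool expire_requested;
};

static int model_freed;			/* connections freed by model_conn_put() */
static int model_expire_now_calls;	/* calls to model_expire_now() */

static struct model_conn *model_conn_new(uint32_t flags, int refcnt)
{
	struct model_conn *cp = calloc(1, sizeof(*cp));

	assert(cp != NULL);
	cp->flags = flags;
	cp->refcnt = refcnt;
	return cp;
}

/* Models __ip_vs_conn_put(): drop the reference, leave the timer alone. */
static void model_conn_put_notimer(struct model_conn *cp)
{
	cp->refcnt--;
}

/*
 * Models ip_vs_conn_put(): a one-packet connection holding the last
 * reference is expired and freed at once; otherwise the timer is
 * started and the reference is dropped.
 */
static void model_conn_put(struct model_conn *cp)
{
	if ((cp->flags & MODEL_CONN_F_ONE_PACKET) && cp->refcnt == 1 &&
	    !cp->timer_pending) {
		free(cp);
		model_freed++;
		return;
	}
	cp->timer_pending = true;
	cp->refcnt--;
}

/* Models ip_vs_conn_expire_now(): just records the request. */
static void model_expire_now(struct model_conn *cp)
{
	cp->expire_requested = true;
	model_expire_now_calls++;
}

/* Drop path for an unavailable destination, with both suggestions applied. */
static int model_drop_no_dest(struct model_conn *cp, bool expire_nodest_conn)
{
	uint32_t flags;

	assert(cp != NULL);
	flags = cp->flags;	/* read early: the put below may free cp */

	if (cp->timer_pending)
		model_conn_put_notimer(cp);
	else
		model_conn_put(cp);

	/* cp is only touched again if it cannot have been freed above */
	if (expire_nodest_conn && !(flags & MODEL_CONN_F_ONE_PACKET))
		model_expire_now(cp);

	return MODEL_NF_DROP;
}
```

Calling model_drop_no_dest() on a one-packet connection with a single
reference frees it and never reaches model_expire_now(); a normal
connection gets its timer started and is then asked to expire.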