From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Florian Westphal, Pablo Neira Ayuso, Sasha Levin
Subject: [PATCH 4.9 150/310] netfilter: conntrack: dont call iter for non-confirmed conntracks
Date: Wed, 11 Apr 2018 20:34:49 +0200
Message-Id: <20180411183628.890879702@linuxfoundation.org>
In-Reply-To: <20180411183622.305902791@linuxfoundation.org>
References: <20180411183622.305902791@linuxfoundation.org>
X-Mailer: git-send-email 2.17.0
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal

[ Upstream commit b0feacaad13a0aa9657c37ed80991575981e2e3b ]

nf_ct_iterate_cleanup_net currently calls the iter() callback also for
conntracks on the unconfirmed list, but this is unsafe.

Accesses to nf_conn are fine, but some users access the extension area
in the iter() callback, and that only works reliably for confirmed
conntracks (ct->ext can be reallocated at any time for an unconfirmed
conntrack).

The second issue is that there is a short window where a conntrack entry
is neither on the list nor in the table: to confirm an entry, it is first
removed from the unconfirmed list, then inserted into the table.

Fix this by iterating the unconfirmed list first and marking all entries
as dying, then waiting for an RCU grace period.

This makes sure all entries that were about to be confirmed either are
in the main table, or will be dropped soon.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 net/netfilter/nf_conntrack_core.c |   39 ++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1542,7 +1542,6 @@ get_next_corpse(struct net *net, int (*i
 	struct nf_conntrack_tuple_hash *h;
 	struct nf_conn *ct;
 	struct hlist_nulls_node *n;
-	int cpu;
 	spinlock_t *lockp;
 
 	for (; *bucket < nf_conntrack_htable_size; (*bucket)++) {
@@ -1564,24 +1563,40 @@ get_next_corpse(struct net *net, int (*i
 		cond_resched();
 	}
 
+	return NULL;
+found:
+	atomic_inc(&ct->ct_general.use);
+	spin_unlock(lockp);
+	local_bh_enable();
+	return ct;
+}
+
+static void
+__nf_ct_unconfirmed_destroy(struct net *net)
+{
+	int cpu;
+
 	for_each_possible_cpu(cpu) {
-		struct ct_pcpu *pcpu = per_cpu_ptr(net->ct.pcpu_lists, cpu);
+		struct nf_conntrack_tuple_hash *h;
+		struct hlist_nulls_node *n;
+		struct ct_pcpu *pcpu;
+
+		pcpu = per_cpu_ptr(net->ct.pcpu_lists, cpu);
 
 		spin_lock_bh(&pcpu->lock);
 		hlist_nulls_for_each_entry(h, n, &pcpu->unconfirmed, hnnode) {
+			struct nf_conn *ct;
+
 			ct = nf_ct_tuplehash_to_ctrack(h);
-			if (iter(ct, data))
-				set_bit(IPS_DYING_BIT, &ct->status);
+
+			/* we cannot call iter() on unconfirmed list, the
+			 * owning cpu can reallocate ct->ext at any time.
+			 */
+			set_bit(IPS_DYING_BIT, &ct->status);
 		}
 		spin_unlock_bh(&pcpu->lock);
 		cond_resched();
 	}
-	return NULL;
-found:
-	atomic_inc(&ct->ct_general.use);
-	spin_unlock(lockp);
-	local_bh_enable();
-	return ct;
 }
 
 void nf_ct_iterate_cleanup(struct net *net,
@@ -1596,6 +1611,10 @@ void nf_ct_iterate_cleanup(struct net *n
 	if (atomic_read(&net->ct.count) == 0)
 		return;
 
+	__nf_ct_unconfirmed_destroy(net);
+
+	synchronize_net();
+
 	while ((ct = get_next_corpse(net, iter, data, &bucket)) != NULL) {
 		/* Time to push up daises... */
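
The ordering the changelog describes (mark every unconfirmed entry as dying, wait, then run iter() only over the main table) may be easier to follow in a small userspace model. The sketch below is not kernel code: every model_* name, the fixed-size arrays and the single-threaded "grace period" are assumptions made for illustration, and it assumes that confirmation refuses an entry that is already marked dying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS     2
#define MODEL_MAX_ENTRIES 8

/* Hypothetical stand-in for struct nf_conn; only what the model needs. */
struct model_ct {
	bool dying;	/* models IPS_DYING_BIT */
	bool confirmed;	/* true once the entry is in the main table */
	int mark;	/* arbitrary payload inspected by an iter() callback */
};

/* Hypothetical stand-in for the per-netns conntrack state. */
struct model_net {
	struct model_ct *unconfirmed[MODEL_NR_CPUS][MODEL_MAX_ENTRIES];
	size_t nr_unconfirmed[MODEL_NR_CPUS];
	struct model_ct *table[MODEL_MAX_ENTRIES];
	size_t nr_table;
};

static void model_add_unconfirmed(struct model_net *net, int cpu,
				  struct model_ct *ct)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	assert(net->nr_unconfirmed[cpu] < MODEL_MAX_ENTRIES);
	net->unconfirmed[cpu][net->nr_unconfirmed[cpu]++] = ct;
}

/* Take ct off the unconfirmed list, then insert it into the table.
 * Returns false if ct was already marked dying and is dropped instead.
 */
static bool model_confirm(struct model_net *net, int cpu, struct model_ct *ct)
{
	size_t i, n = net->nr_unconfirmed[cpu];

	for (i = 0; i < n; i++) {
		if (net->unconfirmed[cpu][i] == ct) {
			net->unconfirmed[cpu][i] = net->unconfirmed[cpu][n - 1];
			net->nr_unconfirmed[cpu] = n - 1;
			break;
		}
	}
	if (ct->dying)
		return false;

	assert(net->nr_table < MODEL_MAX_ENTRIES);
	ct->confirmed = true;
	net->table[net->nr_table++] = ct;
	return true;
}

/* Mark every unconfirmed entry dying without ever calling iter() on it. */
static void model_unconfirmed_destroy(struct model_net *net)
{
	int cpu;
	size_t i;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		for (i = 0; i < net->nr_unconfirmed[cpu]; i++)
			net->unconfirmed[cpu][i]->dying = true;
}

/* Returns the number of confirmed entries that iter() selected. */
static int model_iterate_cleanup(struct model_net *net,
				 int (*iter)(struct model_ct *ct, void *data),
				 void *data)
{
	size_t i = 0;
	int removed = 0;

	model_unconfirmed_destroy(net);
	/* The patch calls synchronize_net() at this point; the model is
	 * single-threaded, so there is nothing to wait for.
	 */
	while (i < net->nr_table) {
		struct model_ct *ct = net->table[i];

		if (iter(ct, data)) {
			ct->dying = true;
			net->nr_table--;
			net->table[i] = net->table[net->nr_table];
			removed++;
		} else {
			i++;
		}
	}
	return removed;
}

/* Example callback: select entries whose mark equals *(int *)data. */
static int model_match_mark(struct model_ct *ct, void *data)
{
	return ct->mark == *(int *)data;
}
```

In this model an entry that is still unconfirmed when model_iterate_cleanup() runs is never passed to the callback, and a later model_confirm() on it returns false, which corresponds to the "will be dropped soon" case in the changelog. Entries that were confirmed beforehand are the only ones the callback ever sees.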