From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Dmitry Andrianov, Justin Pettit, Yi-Hung Wei, Florian Westphal, Pablo Neira Ayuso, Mauricio Faria de Oliveira, Sasha Levin
Subject: [PATCH 4.14 083/101] netfilter: nf_conncount: fix garbage collection confirm race
Date: Mon, 7 Jan 2019 13:33:11 +0100
Message-Id: <20190107105337.481551812@linuxfoundation.org>
In-Reply-To: <20190107105330.372621917@linuxfoundation.org>
References: <20190107105330.372621917@linuxfoundation.org>

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit b36e4523d4d56e2595e28f16f6ccf1cd6a9fc452 upstream.

Yi-Hung Wei and Justin Pettit found a race in the garbage collection scheme
used by nf_conncount.

When doing a list walk, we look up the tuple in the conntrack table.
If the lookup fails, we remove this tuple from our list because
the conntrack entry is gone.

This is the common case, but it turns out it's not the only one.
The list entry could have been created just before by another cpu, i.e. the
conntrack entry might not yet have been inserted into the global hash.

To avoid this, we introduce a timestamp and the owning cpu.
If the entry appears to be stale, evict only if:
 1. The current cpu is the one that added the entry, or,
 2. The timestamp is older than two jiffies

The second constraint allows GC to be taken over by another cpu too
(e.g. because a cpu was offlined or napi got moved to another cpu).

We can't pretend the 'doubtful' entry wasn't in our list.
Instead, when we don't find an entry, indicate via IS_ERR whether the entry
was removed ('did not exist') or withheld ('might-be-unconfirmed').

This most likely also fixes a xt_connlimit imbalance earlier reported by
Dmitry Andrianov.

Cc: Dmitry Andrianov
Reported-by: Justin Pettit
Reported-by: Yi-Hung Wei
Signed-off-by: Florian Westphal
Acked-by: Yi-Hung Wei
Signed-off-by: Pablo Neira Ayuso
[mfo: backport: refresh context lines and use older symbol/file names:
 - nf_conncount.c -> xt_connlimit.c.
 - nf_conncount_rb -> xt_connlimit_rb
 - nf_conncount_tuple -> xt_connlimit_conn
 - conncount_conn_cachep -> connlimit_conn_cachep]
Signed-off-by: Mauricio Faria de Oliveira
Signed-off-by: Sasha Levin
---
 net/netfilter/xt_connlimit.c | 52 ++++++++++++++++++++++++++++++++----
 1 file changed, 47 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index ab1f849464fa..913b86ef3a8d 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -47,6 +47,8 @@ struct xt_connlimit_conn {
 	struct hlist_node		node;
 	struct nf_conntrack_tuple	tuple;
 	struct nf_conntrack_zone	zone;
+	int				cpu;
+	u32				jiffies32;
 };
 
 struct xt_connlimit_rb {
@@ -126,11 +128,42 @@ bool nf_conncount_add(struct hlist_head *head,
 		return false;
 	conn->tuple = *tuple;
 	conn->zone = *zone;
+	conn->cpu = raw_smp_processor_id();
+	conn->jiffies32 = (u32)jiffies;
 	hlist_add_head(&conn->node, head);
 	return true;
 }
 EXPORT_SYMBOL_GPL(nf_conncount_add);
 
+static const struct nf_conntrack_tuple_hash *
+find_or_evict(struct net *net, struct xt_connlimit_conn *conn)
+{
+	const struct nf_conntrack_tuple_hash *found;
+	unsigned long a, b;
+	int cpu = raw_smp_processor_id();
+	__s32 age;
+
+	found = nf_conntrack_find_get(net, &conn->zone, &conn->tuple);
+	if (found)
+		return found;
+	b = conn->jiffies32;
+	a = (u32)jiffies;
+
+	/* conn might have been added just before by another cpu and
+	 * might still be unconfirmed.  In this case, nf_conntrack_find()
+	 * returns no result.  Thus only evict if this cpu added the
+	 * stale entry or if the entry is older than two jiffies.
+	 */
+	age = a - b;
+	if (conn->cpu == cpu || age >= 2) {
+		hlist_del(&conn->node);
+		kmem_cache_free(connlimit_conn_cachep, conn);
+		return ERR_PTR(-ENOENT);
+	}
+
+	return ERR_PTR(-EAGAIN);
+}
+
 unsigned int nf_conncount_lookup(struct net *net, struct hlist_head *head,
 				 const struct nf_conntrack_tuple *tuple,
 				 const struct nf_conntrack_zone *zone,
@@ -138,18 +171,27 @@ unsigned int nf_conncount_lookup(struct net *net, struct hlist_head *head,
 {
 	const struct nf_conntrack_tuple_hash *found;
 	struct xt_connlimit_conn *conn;
-	struct hlist_node *n;
 	struct nf_conn *found_ct;
+	struct hlist_node *n;
 	unsigned int length = 0;
 
 	*addit = true;
 
 	/* check the saved connections */
 	hlist_for_each_entry_safe(conn, n, head, node) {
-		found = nf_conntrack_find_get(net, &conn->zone, &conn->tuple);
-		if (found == NULL) {
-			hlist_del(&conn->node);
-			kmem_cache_free(connlimit_conn_cachep, conn);
+		found = find_or_evict(net, conn);
+		if (IS_ERR(found)) {
+			/* Not found, but might be about to be confirmed */
+			if (PTR_ERR(found) == -EAGAIN) {
+				length++;
+				if (!tuple)
+					continue;
+
+				if (nf_ct_tuple_equal(&conn->tuple, tuple) &&
+				    nf_ct_zone_id(&conn->zone, conn->zone.dir) ==
+				    nf_ct_zone_id(zone, zone->dir))
+					*addit = false;
+			}
 			continue;
 		}
 
-- 
2.19.1
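
For readers who want to poke at the eviction rule outside the kernel, here is
a minimal userspace sketch of the decision made in find_or_evict() once the
conntrack lookup has already failed. It is not part of the patch. The names
entry_stamp, gc_verdict and gc_decide are hypothetical, and the current cpu
and jiffies value are passed in as plain arguments instead of being read from
raw_smp_processor_id() and jiffies. The point it illustrates is that the age
is a signed 32-bit difference of two 32-bit timestamps, so the "age >= 2"
test keeps working when the counter wraps.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two fields the patch adds to
 * struct xt_connlimit_conn.
 */
struct entry_stamp {
	int cpu;		/* cpu that added the list entry */
	uint32_t jiffies32;	/* low 32 bits of jiffies at insertion */
};

/* GC_EVICT mirrors the ERR_PTR(-ENOENT) path, GC_KEEP the
 * ERR_PTR(-EAGAIN) path.
 */
enum gc_verdict { GC_EVICT, GC_KEEP };

/* Decide what to do with an entry whose conntrack lookup failed.
 * this_cpu and now are assumptions of this sketch; the kernel code
 * reads them itself.
 */
enum gc_verdict gc_decide(const struct entry_stamp *e, int this_cpu,
			  uint32_t now)
{
	/* Unsigned subtraction is modulo 2^32; reinterpreting the result
	 * as signed gives a small positive age even across a wrap
	 * (assumes the usual two's complement conversion).
	 */
	int32_t age = (int32_t)(now - e->jiffies32);

	if (e->cpu == this_cpu || age >= 2)
		return GC_EVICT;

	return GC_KEEP;
}
```

With an entry stamped at 0xFFFFFFFF by cpu 1, a walk on cpu 0 keeps the entry
when now is 0 (age 1) and evicts it when now is 1 (age 2), while a walk on
cpu 1 evicts it immediately regardless of age.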