Received: by 2002:ac0:950c:0:0:0:0:0 with SMTP id f12csp2316707imc; Tue, 12 Mar 2019 11:13:58 -0700 (PDT) X-Google-Smtp-Source: APXvYqxbAiSywWE10X8JCR7w/X01gWb+0Gu+Efjg+Hv+pNWWb+n4eow/TCLlsG3fFIoZeI4y4Id8 X-Received: by 2002:a65:5286:: with SMTP id y6mr19071757pgp.310.1552414438387; Tue, 12 Mar 2019 11:13:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1552414438; cv=none; d=google.com; s=arc-20160816; b=taU7hjOzkkfDyL1DOzIwIJEYDZlLfm7ka1PyIJK6i4+dBdqcUmRwOW9jJ453QenjjG upl9RghIZ/dAfq2eryE0jRVmQs19g+RC6WjPbx8BuWVXy2UcMawst8/R3YkZneku0IN3 JJnotFoANxYb/4EZ5m8iVG1Z8/cna3wx4W8Qez/H3BlUxAQtn1VIOd78CIVzRt9+6+eu 9pI5dypNi7+Jv/NB/TVmh+nKzN2pnoy3jNfZbV/r1uK74BnRWu0WEJj9rHfU+hsEsEyu 4+Fp6m+pNyEPy8FMAKjZZVma1oDZZQQAjjLBHwP/vVnFpmRof06Y6vlu9XsKqAtTi/cJ z+OQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=bdtQ/fU4rxfgKB12sQvm4nrQrAWUijEirCPrBSmKhJM=; b=FtdWH3YfduMH6M+yFU3BaMuUTfTX7nbWLIVGbE9yINYj4BJ8nxBdbBOaz0tFQEnyXq jSgpE+ktFNsnCcz2buSmq61ruUnOKMbwKpkRFhBqEmovF8rftxkKV6nu8oatOSdvJqKX ljmGoVp8mJ7pL64fOwqbCO0EsWkmkDdqpwyRHmtklz7QhzhEQ8yMLMOJAo1ft2n6wBTd M36ZVfaM+CEeakKpT6UQblxq31mWndGaR75O65AwrH94pWtww/G6nggfNQouq0pYhUBB WANOczqX1GlNvhWiNcJHU3YdkQy8tRkabJ6L+CRDJeIZrMQKmHfI5Ztuf749V72jeXpu I8IA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=xQHWvFi4; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j7si8899514plb.349.2019.03.12.11.13.43; Tue, 12 Mar 2019 11:13:58 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=xQHWvFi4; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729598AbfCLSLB (ORCPT + 99 others); Tue, 12 Mar 2019 14:11:01 -0400 Received: from mail.kernel.org ([198.145.29.99]:45696 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727116AbfCLRLr (ORCPT ); Tue, 12 Mar 2019 13:11:47 -0400 Received: from localhost (unknown [104.133.8.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 6AEA921741; Tue, 12 Mar 2019 17:11:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1552410706; bh=JsLIY69GrEdcUSrQnDKBxU0Tsh/3MHCrjrDKSIbLuco=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=xQHWvFi4bidkagMeP4T8F9Hw71bhhHUEpQLR/YTqoaJzOGBVJ+B66fB0JcQKxTWjM QgYGkl8svD2XkBfxiL6vrcU+pWDbxHG8O0GMT6KSE70OD5T3zOWcyx3kfUQdhVHVOX gjMNiHaZVwA/FteTLvkhtw8Euh0Z7pyyAc7emfqA= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Taehee Yoo , Florian Westphal , Pablo Neira Ayuso , Sasha Levin Subject: [PATCH 4.20 005/171] netfilter: nft_compat: make lists per netns Date: Tue, 12 Mar 2019 10:06:25 -0700 Message-Id: <20190312170348.325739242@linuxfoundation.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190312170347.868927101@linuxfoundation.org> References: <20190312170347.868927101@linuxfoundation.org> User-Agent: quilt/0.65 X-stable: review X-Patchwork-Hint: ignore MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 4.20-stable review patch. If anyone has any objections, please let me know. ------------------ [ Upstream commit cf52572ebbd7189a1966c2b5fc34b97078cd1dce ] There are two problems with nft_compat since the netlink config plane uses a per-netns mutex: 1. Concurrent add/del accesses to the same list 2. accesses to a list element after it has been free'd already. This patch fixes the first problem. Freeing occurs from a work queue, after transaction mutexes have been released, i.e., it still possible for a new transaction (even from same net ns) to find the to-be-deleted expression in the list. The ->destroy functions are not allowed to have any such side effects, i.e. the list_del() in the destroy function is not allowed. This part of the problem is solved in the next patch. I tried to make this work by serializing list access via mutex and by moving list_del() to a deactivate callback, but Taehee spotted following race on this approach: NET #0 NET #1 >select_ops() ->init() ->select_ops() ->deactivate() ->destroy() nft_xt_put() kfree_rcu(xt, rcu_head); ->init() <-- use-after-free occurred. Unfortunately, we can't increment reference count in select_ops(), because we can't undo the refcount increase in case a different expression fails in the same batch. (The destroy hook will only be called in case the expression was initialized successfully). Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions") Reported-by: Taehee Yoo Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/nft_compat.c | 129 +++++++++++++++++++++++++------------ 1 file changed, 89 insertions(+), 40 deletions(-) diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c index 791f3e693e3d..d3412138e000 100644 --- a/net/netfilter/nft_compat.c +++ b/net/netfilter/nft_compat.c @@ -22,6 +22,7 @@ #include #include #include +#include struct nft_xt { struct list_head head; @@ -43,6 +44,20 @@ struct nft_xt_match_priv { void *info; }; +struct nft_compat_net { + struct list_head nft_target_list; + struct list_head nft_match_list; +}; + +static unsigned int nft_compat_net_id __read_mostly; +static struct nft_expr_type nft_match_type; +static struct nft_expr_type nft_target_type; + +static struct nft_compat_net *nft_compat_pernet(struct net *net) +{ + return net_generic(net, nft_compat_net_id); +} + static bool nft_xt_put(struct nft_xt *xt) { if (refcount_dec_and_test(&xt->refcnt)) { @@ -735,10 +750,6 @@ static const struct nfnetlink_subsystem nfnl_compat_subsys = { .cb = nfnl_nft_compat_cb, }; -static LIST_HEAD(nft_match_list); - -static struct nft_expr_type nft_match_type; - static bool nft_match_cmp(const struct xt_match *match, const char *name, u32 rev, u32 family) { @@ -750,6 +761,7 @@ static const struct nft_expr_ops * nft_match_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[]) { + struct nft_compat_net *cn; struct nft_xt *nft_match; struct xt_match *match; unsigned int matchsize; @@ -766,8 +778,10 @@ nft_match_select_ops(const struct nft_ctx *ctx, rev = ntohl(nla_get_be32(tb[NFTA_MATCH_REV])); family = ctx->family; + cn = nft_compat_pernet(ctx->net); + /* Re-use the existing match if it's already loaded. */ - list_for_each_entry(nft_match, &nft_match_list, head) { + list_for_each_entry(nft_match, &cn->nft_match_list, head) { struct xt_match *match = nft_match->ops.data; if (nft_match_cmp(match, mt_name, rev, family)) @@ -811,7 +825,7 @@ nft_match_select_ops(const struct nft_ctx *ctx, nft_match->ops.size = matchsize; - list_add(&nft_match->head, &nft_match_list); + list_add(&nft_match->head, &cn->nft_match_list); return &nft_match->ops; err: @@ -827,10 +841,6 @@ static struct nft_expr_type nft_match_type __read_mostly = { .owner = THIS_MODULE, }; -static LIST_HEAD(nft_target_list); - -static struct nft_expr_type nft_target_type; - static bool nft_target_cmp(const struct xt_target *tg, const char *name, u32 rev, u32 family) { @@ -842,6 +852,7 @@ static const struct nft_expr_ops * nft_target_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[]) { + struct nft_compat_net *cn; struct nft_xt *nft_target; struct xt_target *target; char *tg_name; @@ -862,8 +873,9 @@ nft_target_select_ops(const struct nft_ctx *ctx, strcmp(tg_name, "standard") == 0) return ERR_PTR(-EINVAL); + cn = nft_compat_pernet(ctx->net); /* Re-use the existing target if it's already loaded. */ - list_for_each_entry(nft_target, &nft_target_list, head) { + list_for_each_entry(nft_target, &cn->nft_target_list, head) { struct xt_target *target = nft_target->ops.data; if (!target->target) @@ -908,7 +920,7 @@ nft_target_select_ops(const struct nft_ctx *ctx, else nft_target->ops.eval = nft_target_eval_xt; - list_add(&nft_target->head, &nft_target_list); + list_add(&nft_target->head, &cn->nft_target_list); return &nft_target->ops; err: @@ -924,13 +936,74 @@ static struct nft_expr_type nft_target_type __read_mostly = { .owner = THIS_MODULE, }; +static int __net_init nft_compat_init_net(struct net *net) +{ + struct nft_compat_net *cn = nft_compat_pernet(net); + + INIT_LIST_HEAD(&cn->nft_target_list); + INIT_LIST_HEAD(&cn->nft_match_list); + + return 0; +} + +static void __net_exit nft_compat_exit_net(struct net *net) +{ + struct nft_compat_net *cn = nft_compat_pernet(net); + struct nft_xt *xt, *next; + + if (list_empty(&cn->nft_match_list) && + list_empty(&cn->nft_target_list)) + return; + + /* If there was an error that caused nft_xt expr to not be initialized + * fully and noone else requested the same expression later, the lists + * contain 0-refcount entries that still hold module reference. + * + * Clean them here. + */ + mutex_lock(&net->nft.commit_mutex); + list_for_each_entry_safe(xt, next, &cn->nft_target_list, head) { + struct xt_target *target = xt->ops.data; + + list_del_init(&xt->head); + + if (refcount_read(&xt->refcnt)) + continue; + module_put(target->me); + kfree(xt); + } + + list_for_each_entry_safe(xt, next, &cn->nft_match_list, head) { + struct xt_match *match = xt->ops.data; + + list_del_init(&xt->head); + + if (refcount_read(&xt->refcnt)) + continue; + module_put(match->me); + kfree(xt); + } + mutex_unlock(&net->nft.commit_mutex); +} + +static struct pernet_operations nft_compat_net_ops = { + .init = nft_compat_init_net, + .exit = nft_compat_exit_net, + .id = &nft_compat_net_id, + .size = sizeof(struct nft_compat_net), +}; + static int __init nft_compat_module_init(void) { int ret; + ret = register_pernet_subsys(&nft_compat_net_ops); + if (ret < 0) + goto err_target; + ret = nft_register_expr(&nft_match_type); if (ret < 0) - return ret; + goto err_pernet; ret = nft_register_expr(&nft_target_type); if (ret < 0) @@ -943,45 +1016,21 @@ static int __init nft_compat_module_init(void) } return ret; - err_target: nft_unregister_expr(&nft_target_type); err_match: nft_unregister_expr(&nft_match_type); +err_pernet: + unregister_pernet_subsys(&nft_compat_net_ops); return ret; } static void __exit nft_compat_module_exit(void) { - struct nft_xt *xt, *next; - - /* list should be empty here, it can be non-empty only in case there - * was an error that caused nft_xt expr to not be initialized fully - * and noone else requested the same expression later. - * - * In this case, the lists contain 0-refcount entries that still - * hold module reference. - */ - list_for_each_entry_safe(xt, next, &nft_target_list, head) { - struct xt_target *target = xt->ops.data; - - if (WARN_ON_ONCE(refcount_read(&xt->refcnt))) - continue; - module_put(target->me); - kfree(xt); - } - - list_for_each_entry_safe(xt, next, &nft_match_list, head) { - struct xt_match *match = xt->ops.data; - - if (WARN_ON_ONCE(refcount_read(&xt->refcnt))) - continue; - module_put(match->me); - kfree(xt); - } nfnetlink_subsys_unregister(&nfnl_compat_subsys); nft_unregister_expr(&nft_target_type); nft_unregister_expr(&nft_match_type); + unregister_pernet_subsys(&nft_compat_net_ops); } MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_NFT_COMPAT); -- 2.19.1