From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Chieh-Min Wang, Pablo Neira Ayuso, Sasha Levin, netfilter-devel@vger.kernel.org, coreteam@netfilter.org, netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 5.0 153/262] netfilter: conntrack: fix cloned unconfirmed skb->_nfct race in __nf_conntrack_confirm
Date: Wed, 27
 Mar 2019 14:00:08 -0400
Message-Id: <20190327180158.10245-153-sashal@kernel.org>
In-Reply-To: <20190327180158.10245-1-sashal@kernel.org>
References: <20190327180158.10245-1-sashal@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Chieh-Min Wang

[ Upstream commit 13f5251fd17088170c18844534682d9cab5ff5aa ]

For bridge (br_flood) or broadcast/multicast packets, an skb can be
cloned while it still carries an unconfirmed conntrack, which breaks the
rule that an unconfirmed skb->_nfct is never shared. With nfqueue
running on my system, the race can be easily reproduced with the
following warning call trace:

[13257.707525] CPU: 0 PID: 12132 Comm: main Tainted: P W 4.4.60 #7744
[13257.707568] Hardware name: Qualcomm (Flattened Device Tree)
[13257.714700] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[13257.720253] [] (show_stack) from [] (dump_stack+0x94/0xa8)
[13257.728240] [] (dump_stack) from [] (warn_slowpath_common+0x94/0xb0)
[13257.735268] [] (warn_slowpath_common) from [] (warn_slowpath_null+0x1c/0x24)
[13257.743519] [] (warn_slowpath_null) from [] (__nf_conntrack_confirm+0xa8/0x618)
[13257.752284] [] (__nf_conntrack_confirm) from [] (ipv4_confirm+0xb8/0xfc)
[13257.761049] [] (ipv4_confirm) from [] (nf_iterate+0x48/0xa8)
[13257.769725] [] (nf_iterate) from [] (nf_hook_slow+0x30/0xb0)
[13257.777108] [] (nf_hook_slow) from [] (br_nf_post_routing+0x274/0x31c)
[13257.784486] [] (br_nf_post_routing) from [] (nf_iterate+0x48/0xa8)
[13257.792556] [] (nf_iterate) from [] (nf_hook_slow+0x30/0xb0)
[13257.800458] [] (nf_hook_slow) from [] (br_forward_finish+0x94/0xa4)
[13257.808010] [] (br_forward_finish) from [] (br_nf_forward_finish+0x150/0x1ac)
[13257.815736] [] (br_nf_forward_finish) from [] (nf_reinject+0x108/0x170)
[13257.824762] [] (nf_reinject) from [] (nfqnl_recv_verdict+0x3d8/0x420)
[13257.832924] [] (nfqnl_recv_verdict) from [] (nfnetlink_rcv_msg+0x158/0x248)
[13257.841256] [] (nfnetlink_rcv_msg) from [] (netlink_rcv_skb+0x54/0xb0)
[13257.849762] [] (netlink_rcv_skb) from [] (netlink_unicast+0x148/0x23c)
[13257.858093] [] (netlink_unicast) from [] (netlink_sendmsg+0x2ec/0x368)
[13257.866348] [] (netlink_sendmsg) from [] (sock_sendmsg+0x34/0x44)
[13257.874590] [] (sock_sendmsg) from [] (___sys_sendmsg+0x1ec/0x200)
[13257.882489] [] (___sys_sendmsg) from [] (__sys_sendmsg+0x3c/0x64)
[13257.890300] [] (__sys_sendmsg) from [] (ret_fast_syscall+0x0/0x34)

The original code only triggered the warning and did nothing else. As a
result, the shared conntrack was moved to the dying list and the packet
was dropped (nf_ct_resolve_clash returns NF_DROP for a dying conntrack).

- Reproduce steps:

+----------------------------+
|          br0(bridge)       |
|                            |
+-+---------+---------+------+
  | eth0|   | eth1|   | eth2|
  |     |   |     |   |     |
  +--+--+   +--+--+   +---+-+
     |         |          |
     |         |          |
  +--+-+     +-+--+    +--+-+
  | PC1|     | PC2|    | PC3|
  +----+     +----+    +----+

iptables -A FORWARD -m mark --mark 0x1000000/0x1000000 -j NFQUEUE --queue-num 100 --queue-bypass

ps: Our nfq userspace program sets the mark on packets whose connection
has already been processed.

PC1 sends broadcast packets simulated by hping3:

hping3 --rand-source --udp 192.168.1.255 -i u100

- The broadcast racing flow chart is as follows:

br_handle_frame
  BR_HOOK(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, br_handle_frame_finish)
  // skb->_nfct (unconfirmed conntrack) is constructed at PRE_ROUTING stage
  br_handle_frame_finish
    // check if this packet is broadcast
    br_flood_forward
      br_flood
        list_for_each_entry_rcu(p, &br->port_list, list) // iterate through each port
          maybe_deliver
            deliver_clone
              skb = skb_clone(skb)
              __br_forward
                BR_HOOK(NFPROTO_BRIDGE, NF_BR_FORWARD,...)
                // queue in our nfq and received by our userspace program
                // goto __nf_conntrack_confirm with process context on CPU 1
    br_pass_frame_up
      BR_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_IN,...)
      // goto __nf_conntrack_confirm with softirq context on CPU 0

Conntrack confirmation can happen at both the INPUT and POSTROUTING
stages. So with NFQUEUE running, skbs whose skb->_nfct points to the
same unconfirmed conntrack can race on different cores.

This patch fixes a repeating kernel splat; it is now only displayed
once.

Signed-off-by: Chieh-Min Wang
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Sasha Levin
---
 net/netfilter/nf_conntrack_core.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index db4d46332e86..9dd4c2048a2b 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -901,10 +901,18 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	 * REJECT will give spurious warnings here.
 	 */
 
-	/* No external references means no one else could have
-	 * confirmed us.
+	/* Another skb with the same unconfirmed conntrack may
+	 * win the race. This may happen for bridge(br_flood)
+	 * or broadcast/multicast packets do skb_clone with
+	 * unconfirmed conntrack.
 	 */
-	WARN_ON(nf_ct_is_confirmed(ct));
+	if (unlikely(nf_ct_is_confirmed(ct))) {
+		WARN_ON_ONCE(1);
+		nf_conntrack_double_unlock(hash, reply_hash);
+		local_bh_enable();
+		return NF_DROP;
+	}
+
 	pr_debug("Confirming conntrack %p\n", ct);
 	/* We have to check the DYING flag after unlink to prevent
 	 * a race against nf_ct_get_next_corpse() possibly called from
-- 
2.19.1
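
For readers who want to see the shape of the fix outside the kernel, the
following is a minimal user-space sketch of the pattern the hunk above
introduces: check the confirmed state while holding the lock, and if a
clone sharing the same entry already won, release the lock on the early
exit path and drop the packet. Every name here (sketch_conn, sketch_skb,
sketch_confirm, SKETCH_DROP, SKETCH_ACCEPT) is hypothetical, and the C11
atomic_flag spinlock is only a stand-in for the kernel's hash bucket
locks and bottom-half disabling; this is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical verdicts for this sketch; not the kernel's NF_* values. */
enum sketch_verdict { SKETCH_DROP, SKETCH_ACCEPT };

/* Simplified conntrack entry: a confirmed flag and the lock guarding it. */
struct sketch_conn {
	atomic_flag lock;	/* stand-in for the kernel's bucket locks */
	bool confirmed;
};

/* A packet buffer pointing at a (possibly shared) conntrack entry. */
struct sketch_skb {
	struct sketch_conn *ct;
};

static void sketch_conn_init(struct sketch_conn *ct)
{
	atomic_flag_clear(&ct->lock);
	ct->confirmed = false;
}

static void sketch_lock(struct sketch_conn *ct)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&ct->lock,
						 memory_order_acquire))
		;
}

static void sketch_unlock(struct sketch_conn *ct)
{
	atomic_flag_clear_explicit(&ct->lock, memory_order_release);
}

/* Like skb_clone() in the flow chart: the clone shares the same entry. */
static struct sketch_skb sketch_skb_clone(const struct sketch_skb *skb)
{
	struct sketch_skb clone = { .ct = skb->ct };
	return clone;
}

static enum sketch_verdict sketch_confirm(struct sketch_skb *skb)
{
	struct sketch_conn *ct = skb->ct;

	assert(ct != NULL);
	sketch_lock(ct);
	if (ct->confirmed) {
		/* A clone sharing this entry won the race: release the
		 * lock before returning and drop this packet.
		 */
		sketch_unlock(ct);
		return SKETCH_DROP;
	}
	ct->confirmed = true;
	sketch_unlock(ct);
	return SKETCH_ACCEPT;
}
```

Calling sketch_confirm() on an skb and then on its clone accepts the
first and drops the second, whichever context gets there first. The
point of the sketch is the early return: the loser must undo the locking
it did on entry, which is what the nf_conntrack_double_unlock() and
local_bh_enable() calls in the patch are there for.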