From: gregkh@linuxfoundation.org
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Florian Westphal, Pablo Neira Ayuso
Subject: [PATCH 5.11 025/306] netfilter: nf_nat: undo erroneous tcp edemux lookup
Date: Mon, 15 Mar 2021 14:51:28 +0100
Message-Id: <20210315135508.478257023@linuxfoundation.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210315135507.611436477@linuxfoundation.org>
References: <20210315135507.611436477@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain;
 charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Greg Kroah-Hartman

From: Florian Westphal

commit 03a3ca37e4c6478e3a84f04c8429dd5889e107fd upstream.

Under extremely rare conditions TCP early demux will retrieve the wrong
socket.

1. local machine establishes a connection to a remote server, S, on port
   p.

   This gives:
   laddr:lport -> S:p
   ... both in tcp and conntrack.

2. local machine establishes a connection to host H, on port p2.
2a. TCP stack chooses the same laddr:lport, so we have
    laddr:lport -> H:p2 from TCP point of view.
2b. There is a destination NAT rewrite in place, translating
    H:p2 to S:p.  This results in the following conntrack entries:

    I)  laddr:lport -> S:p  (origin)  S:p -> laddr:lport (reply)
    II) laddr:lport -> H:p2 (origin)  S:p -> laddr:lport2 (reply)

    NAT engine has rewritten laddr:lport to laddr:lport2 to map
    the reply packet to the correct origin.

    When server sends SYN/ACK to laddr:lport2, the PREROUTING hook
    will undo the SNAT transformation, rewriting IP header to
    S:p -> laddr:lport

    This causes TCP early demux to associate the skb with the TCP socket
    of the first connection.

    The INPUT hook will then reverse the DNAT transformation, rewriting
    the IP header to H:p2 -> laddr:lport.

Because the packet ends up with the wrong socket, the new connection
never completes: originator stays in SYN_SENT and conntrack entry
remains in SYN_RECV until timeout, and responder retransmits SYN/ACK
until it gives up.

To resolve this, orphan the skb after the input rewrite:
because the source IP address changed, the socket must be incorrect.
We can't move the DNAT undo to prerouting due to backwards
compatibility, because doing so would make iptables/nftables rules no
longer match the way they did.

After orphan, the packet will be handed to the next protocol layer
(tcp, udp, ...) and that will repeat the socket lookup just as if
early demux was disabled.

Fixes: 41063e9dd1195 ("ipv4: Early TCP socket demux.")
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1427
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman
---
 net/netfilter/nf_nat_proto.c |   25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

--- a/net/netfilter/nf_nat_proto.c
+++ b/net/netfilter/nf_nat_proto.c
@@ -646,8 +646,8 @@ nf_nat_ipv4_fn(void *priv, struct sk_buf
 }
 
 static unsigned int
-nf_nat_ipv4_in(void *priv, struct sk_buff *skb,
-	       const struct nf_hook_state *state)
+nf_nat_ipv4_pre_routing(void *priv, struct sk_buff *skb,
+			const struct nf_hook_state *state)
 {
 	unsigned int ret;
 	__be32 daddr = ip_hdr(skb)->daddr;
@@ -660,6 +660,23 @@ nf_nat_ipv4_in(void *priv, struct sk_buf
 }
 
 static unsigned int
+nf_nat_ipv4_local_in(void *priv, struct sk_buff *skb,
+		     const struct nf_hook_state *state)
+{
+	__be32 saddr = ip_hdr(skb)->saddr;
+	struct sock *sk = skb->sk;
+	unsigned int ret;
+
+	ret = nf_nat_ipv4_fn(priv, skb, state);
+
+	if (ret == NF_ACCEPT && sk && saddr != ip_hdr(skb)->saddr &&
+	    !inet_sk_transparent(sk))
+		skb_orphan(skb); /* TCP edemux obtained wrong socket */
+
+	return ret;
+}
+
+static unsigned int
 nf_nat_ipv4_out(void *priv, struct sk_buff *skb,
 		const struct nf_hook_state *state)
 {
@@ -736,7 +753,7 @@ nf_nat_ipv4_local_fn(void *priv, struct
 static const struct nf_hook_ops nf_nat_ipv4_ops[] = {
 	/* Before packet filtering, change destination */
 	{
-		.hook		= nf_nat_ipv4_in,
+		.hook		= nf_nat_ipv4_pre_routing,
 		.pf		= NFPROTO_IPV4,
 		.hooknum	= NF_INET_PRE_ROUTING,
 		.priority	= NF_IP_PRI_NAT_DST,
@@ -757,7 +774,7 @@ static const struct nf_hook_ops nf_nat_i
 	},
 	/* After packet filtering, change source */
 	{
-		.hook		= nf_nat_ipv4_fn,
+		.hook		= nf_nat_ipv4_local_in,
 		.pf		= NFPROTO_IPV4,
 		.hooknum	= NF_INET_LOCAL_IN,
 		.priority	= NF_IP_PRI_NAT_SRC,
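
For readers who want to trace the scenario from the changelog by hand, here
is a rough userspace sketch of it. It is not kernel code: the struct names,
the constants and the deliver_synack() helper are all made up for
illustration, and it leaves out the NF_ACCEPT and inet_sk_transparent()
checks that the real hook performs. It only models "rewrite, early lookup,
rewrite again, optionally orphan, protocol-layer lookup".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct tuple {                  /* addressing of an incoming packet */
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct sock {
	int id;
	struct tuple rx;        /* tuple this socket expects to receive */
};

struct pkt {
	struct tuple t;
	const struct sock *sk;  /* socket attached by early demux, if any */
};

enum { LADDR = 1, S = 2, H = 3 };                          /* made-up addresses */
enum { LPORT = 40000, LPORT2 = 40001, P = 80, P2 = 8080 }; /* made-up ports */

static const struct sock socks[] = {
	{ 1, { S, LADDR, P,  LPORT } },  /* 1st connection: laddr:lport -> S:p  */
	{ 2, { H, LADDR, P2, LPORT } },  /* 2nd connection: laddr:lport -> H:p2 */
};

/* Exact 4-tuple match, standing in for the established-socket lookup. */
static const struct sock *lookup(const struct tuple *t)
{
	for (size_t i = 0; i < sizeof(socks) / sizeof(socks[0]); i++) {
		const struct tuple *r = &socks[i].rx;

		if (r->saddr == t->saddr && r->daddr == t->daddr &&
		    r->sport == t->sport && r->dport == t->dport)
			return &socks[i];
	}
	return NULL;
}

/*
 * Walk the SYN/ACK of the second connection through the hooks.
 * Returns the id of the socket that ends up with the packet (0 if none).
 */
int deliver_synack(int orphan_fix)
{
	/* Server replies S:p -> laddr:lport2. */
	struct pkt p = { { S, LADDR, P, LPORT2 }, NULL };
	uint32_t saddr;

	/* PREROUTING: undo SNAT, lport2 -> lport. */
	p.t.dport = LPORT;

	/* Early demux sees S:p -> laddr:lport and picks the 1st socket. */
	p.sk = lookup(&p.t);

	/* LOCAL_IN: undo DNAT, S:p -> H:p2. */
	saddr = p.t.saddr;
	p.t.saddr = H;
	p.t.sport = P2;

	/* The fix: the source address changed, so drop the attached socket. */
	if (orphan_fix && p.sk && saddr != p.t.saddr)
		p.sk = NULL;

	/* The protocol layer repeats the lookup only if no socket is attached. */
	if (!p.sk)
		p.sk = lookup(&p.t);

	return p.sk ? p.sk->id : 0;
}
```

In this model deliver_synack(0) hands the packet to socket 1, the wrong
connection, while deliver_synack(1) lets the second lookup find socket 2.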