From: gregkh@linuxfoundation.org
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Florian Westphal, Pablo Neira Ayuso
Subject: [PATCH 5.4 015/168] netfilter: nf_nat: undo erroneous tcp edemux lookup
Date: Mon, 15 Mar 2021 14:54:07 +0100
Message-Id: <20210315135550.840152163@linuxfoundation.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210315135550.333963635@linuxfoundation.org>
References: <20210315135550.333963635@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Greg Kroah-Hartman

From: Florian Westphal

commit 03a3ca37e4c6478e3a84f04c8429dd5889e107fd upstream.

Under extremely rare conditions TCP early demux will retrieve the wrong
socket.

1. local machine establishes a connection to a remote server, S, on port
   p.

   This gives:
   laddr:lport -> S:p
   ... both in tcp and conntrack.

2. local machine establishes a connection to host H, on port p2.
   2a. TCP stack chooses the same laddr:lport, so we have
       laddr:lport -> H:p2 from the TCP point of view.
   2b. There is a destination NAT rewrite in place, translating
       H:p2 to S:p.  This results in the following conntrack entries:

   I)  laddr:lport -> S:p  (origin)  S:p -> laddr:lport (reply)
   II) laddr:lport -> H:p2 (origin)  S:p -> laddr:lport2 (reply)

   NAT engine has rewritten laddr:lport to laddr:lport2 to map
   the reply packet to the correct origin.

   When the server sends SYN/ACK to laddr:lport2, the PREROUTING hook
   will undo the SNAT transformation, rewriting the IP header to
   S:p -> laddr:lport

   This causes TCP early demux to associate the skb with the TCP socket
   of the first connection.

   The INPUT hook will then reverse the DNAT transformation, rewriting
   the IP header to H:p2 -> laddr:lport.

Because the packet ends up with the wrong socket, the new connection
never completes: the originator stays in SYN_SENT and the conntrack entry
remains in SYN_RECV until timeout, and the responder retransmits SYN/ACK
until it gives up.

To resolve this, orphan the skb after the input rewrite:
Because the source IP address changed, the socket must be incorrect.
We can't move the DNAT undo to prerouting due to backwards
compatibility; doing so would make iptables/nftables rules no longer
match the way they did.

After the orphan, the packet will be handed to the next protocol layer
(tcp, udp, ...) and that will repeat the socket lookup just as if
early demux was disabled.
Fixes: 41063e9dd1195 ("ipv4: Early TCP socket demux.")
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1427
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman
---
 net/netfilter/nf_nat_proto.c |   25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

--- a/net/netfilter/nf_nat_proto.c
+++ b/net/netfilter/nf_nat_proto.c
@@ -646,8 +646,8 @@ nf_nat_ipv4_fn(void *priv, struct sk_buf
 }
 
 static unsigned int
-nf_nat_ipv4_in(void *priv, struct sk_buff *skb,
-	       const struct nf_hook_state *state)
+nf_nat_ipv4_pre_routing(void *priv, struct sk_buff *skb,
+			const struct nf_hook_state *state)
 {
 	unsigned int ret;
 	__be32 daddr = ip_hdr(skb)->daddr;
@@ -660,6 +660,23 @@ nf_nat_ipv4_in(void *priv, struct sk_buf
 }
 
 static unsigned int
+nf_nat_ipv4_local_in(void *priv, struct sk_buff *skb,
+		     const struct nf_hook_state *state)
+{
+	__be32 saddr = ip_hdr(skb)->saddr;
+	struct sock *sk = skb->sk;
+	unsigned int ret;
+
+	ret = nf_nat_ipv4_fn(priv, skb, state);
+
+	if (ret == NF_ACCEPT && sk && saddr != ip_hdr(skb)->saddr &&
+	    !inet_sk_transparent(sk))
+		skb_orphan(skb); /* TCP edemux obtained wrong socket */
+
+	return ret;
+}
+
+static unsigned int
 nf_nat_ipv4_out(void *priv, struct sk_buff *skb,
 		const struct nf_hook_state *state)
 {
@@ -736,7 +753,7 @@ nf_nat_ipv4_local_fn(void *priv, struct
 static const struct nf_hook_ops nf_nat_ipv4_ops[] = {
 	/* Before packet filtering, change destination */
 	{
-		.hook		= nf_nat_ipv4_in,
+		.hook		= nf_nat_ipv4_pre_routing,
 		.pf		= NFPROTO_IPV4,
 		.hooknum	= NF_INET_PRE_ROUTING,
 		.priority	= NF_IP_PRI_NAT_DST,
@@ -757,7 +774,7 @@ static const struct nf_hook_ops nf_nat_i
 	},
 	/* After packet filtering, change source */
 	{
-		.hook		= nf_nat_ipv4_fn,
+		.hook		= nf_nat_ipv4_local_in,
 		.pf		= NFPROTO_IPV4,
 		.hooknum	= NF_INET_LOCAL_IN,
 		.priority	= NF_IP_PRI_NAT_SRC,
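
The scenario from the commit message can be sketched as a small userspace
model. This is not kernel code and is not part of the patch: every name
below (struct tuple, struct toy_sock, lookup(), deliver_synack(), the host
and port constants) is hypothetical and exists only for illustration. The
model also leaves out the NF_ACCEPT and inet_sk_transparent() checks that
the real hook performs, and it stands in for skb_orphan() by simply
clearing the cached socket pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: all identifiers here are hypothetical, not kernel APIs. */

struct tuple {                  /* addressing carried in the packet header */
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct toy_sock {               /* an established socket, local/remote view */
	int id;
	uint32_t laddr, raddr;
	uint16_t lport, rport;
};

enum { LADDR = 1, S = 2, H = 3 };                          /* hosts */
enum { LPORT = 40000, LPORT2 = 40001, P = 80, P2 = 8080 }; /* ports */

static const struct toy_sock socks[] = {
	{ 1, LADDR, S, LPORT, P  },   /* step 1:  laddr:lport -> S:p  */
	{ 2, LADDR, H, LPORT, P2 },   /* step 2a: laddr:lport -> H:p2 */
};

/* Exact 4-tuple match of an incoming packet against the socket table. */
static const struct toy_sock *lookup(const struct tuple *t)
{
	for (size_t i = 0; i < sizeof(socks) / sizeof(socks[0]); i++) {
		const struct toy_sock *s = &socks[i];

		if (s->laddr == t->daddr && s->lport == t->dport &&
		    s->raddr == t->saddr && s->rport == t->sport)
			return s;
	}
	return NULL;
}

/*
 * Walk the SYN/ACK of the second connection through the two rewrites.
 * Returns the id of the socket that finally receives it, or -1.
 */
static int deliver_synack(int orphan_on_saddr_change)
{
	/* On the wire: S:p -> laddr:lport2 */
	struct tuple pkt = { S, LADDR, P, LPORT2 };
	const struct toy_sock *sk;
	uint32_t saddr;

	/* PREROUTING undoes the SNAT port rewrite: S:p -> laddr:lport */
	pkt.dport = LPORT;

	/* Early demux runs now and caches the first connection's socket. */
	sk = lookup(&pkt);
	assert(sk != NULL);

	/* INPUT undoes the DNAT: H:p2 -> laddr:lport */
	saddr = pkt.saddr;
	pkt.saddr = H;
	pkt.sport = P2;

	/* The fix: a changed source address means the cached socket is stale. */
	if (orphan_on_saddr_change && sk && saddr != pkt.saddr)
		sk = NULL;

	/* With no cached socket, the protocol layer repeats the lookup. */
	if (!sk)
		sk = lookup(&pkt);

	return sk ? sk->id : -1;
}
```

In this model, deliver_synack(0) hands the packet to the socket of the
first connection, which mirrors the stuck SYN_SENT case described above,
while deliver_synack(1) falls back to the regular lookup and reaches the
socket of the second connection.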