Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp1025791yba; Thu, 4 Apr 2019 02:42:58 -0700 (PDT) X-Google-Smtp-Source: APXvYqzCTaREFky8bM/sDSR2PCHFKOq0+9NDKJ4n6D4VMhxHxs1VsW0sFypIDV7fqgFabY0pBd+v X-Received: by 2002:a17:902:2888:: with SMTP id f8mr5416864plb.244.1554370978176; Thu, 04 Apr 2019 02:42:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1554370978; cv=none; d=google.com; s=arc-20160816; b=PU2+/pPduCoz5Tup9ibfuwaNhxlzAbIljudSG8weXOoY8hAfSLc0KKHP9MMXxN+/hn uSEPuQpn/kOAPQdYRSMvSA3TSP1tEgDI8v3H0uMlZ+IzB8mKt6jv4QAv4ekqKFuglr1K 27u8QP4Fm3ihl4z/2NK/sdR13jXoEzk/JWzyTs9DsdWTNirc4I6QRs2afMBKk//BGaei sDKMtfIc6WZXume1P4UKTsK1fGCBOZVVTlqeskBnCNRSRUqP/T9IllW10Xmqdvy/7ZDb eRs2vf+tl4SenazaUJUgFWDtWSfazsfp6E2TSsWGxuIiPqcpC4fZvB70+IuJeJkHsBuV +k+Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=HYH5yTF7FXH38aA7YFyma549RK3sHPUtcoBjQoy1szM=; b=EbShQ7YyayF9D8v9Dz0PnX+0pgGIu7sHOn68xmEFgkDou4I9mqMtQ4gzQz7cTjeZ/s 9iql3E7woZLRHxhRNoIrAl+ip/BCMOsv++UJZMtNDTEKMNwTRiUPnykVE3lef3DC3Llm wsSe0aE/johMQGOEKl0lF/UCryqvLiAsowGtC7Sa6KfTiWizUxw25oqbCjxhMydrKOEc 2pBcR6bPcprbQm7mOCz7Clgjrbn7tiD1iUZRi1/P4mOIaoxeANVYlbOilNQGRY8AN9rG TlymFrGyr7HhN3FNzyFmAW7sbTMbskMz+Hyi3AE8Vgk6497ijycRC7TzobtiFVGIO2yU iNCQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=s6s18r0u; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id h14si15905539pgg.377.2019.04.04.02.42.43; Thu, 04 Apr 2019 02:42:58 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=s6s18r0u; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731457AbfDDJBj (ORCPT + 99 others); Thu, 4 Apr 2019 05:01:39 -0400 Received: from mail.kernel.org ([198.145.29.99]:38692 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731189AbfDDJBc (ORCPT ); Thu, 4 Apr 2019 05:01:32 -0400 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 7704721850; Thu, 4 Apr 2019 09:01:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1554368490; bh=KatdPcVPVaw/F9jlGqkRWTBB7/g78SuYAF7QTsHnEuI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=s6s18r0uhh68j2zogvETGGa4/SdjMB0eYekcIfRwUMwPKlaZUKHaHtVWq0nYCX1Mg Sne8kKPUHXh8qwfybNWlZ+UdsYu3MKWr+msF/SZdSVld/nJi3lUPhLjw1x88Qi0gf4 z0OL5pB0sTBsWV+hp3D14LzLtm0Y9NTefuPQk50Y= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Jozsef Kadlecsik , Florian Westphal , Pablo Neira Ayuso , Sasha Levin Subject: [PATCH 4.19 043/187] netfilter: conntrack: tcp: only close if RST matches exact sequence Date: Thu, 4 Apr 2019 10:46:20 +0200 Message-Id: <20190404084605.112395270@linuxfoundation.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404084603.119654039@linuxfoundation.org> References: <20190404084603.119654039@linuxfoundation.org> User-Agent: quilt/0.65 X-stable: review X-Patchwork-Hint: ignore MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 4.19-stable review patch. If anyone has any objections, please let me know. ------------------ [ Upstream commit be0502a3f2e94211a8809a09ecbc3a017189b8fb ] TCP resets cause instant transition from established to closed state provided the reset is in-window. Endpoints that implement RFC 5961 require resets to match the next expected sequence number. RST segments that are in-window (but that do not match RCV.NXT) are ignored, and a "challenge ACK" is sent back. Main problem for conntrack is that its a middlebox, i.e. whereas an end host might have ACK'd SEQ (and would thus accept an RST with this sequence number), conntrack might not have seen this ACK (yet). Therefore we can't simply flag RSTs with non-exact match as invalid. This updates RST processing as follows: 1. If the connection is in a state other than ESTABLISHED, nothing is changed, RST is subject to normal in-window check. 2. If the RSTs sequence number either matches exactly RCV.NXT, connection state moves to CLOSE. 3. The same applies if the RST sequence number aligns with a previous packet in the same direction. In all other cases, the connection remains in ESTABLISHED state. If the normal-in-window check passes, the timeout will be lowered to that of CLOSE. If the peer sends a challenge ack, connection timeout will be reset. If the challenge ACK triggers another RST (RST was valid after all), this 2nd RST will match expected sequence and conntrack state changes to CLOSE. If no challenge ACK is received, the connection will time out after CLOSE seconds (10 seconds by default), just like without this patch. Packetdrill test case: 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 0.000 bind(3, ..., ...) = 0 0.000 listen(3, 1) = 0 0.100 < S 0:0(0) win 32792 0.100 > S. 0:0(0) ack 1 win 64240 0.200 < . 1:1(0) ack 1 win 257 0.200 accept(3, ..., ...) = 4 // Receive a segment. 0.210 < P. 1:1001(1000) ack 1 win 46 0.210 > . 1:1(0) ack 1001 // Application writes 1000 bytes. 0.250 write(4, ..., 1000) = 1000 0.250 > P. 1:1001(1000) ack 1001 // First reset, old sequence. Conntrack (correctly) considers this // invalid due to failed window validation (regardless of this patch). 0.260 < R 2:2(0) ack 1001 win 260 // 2nd reset, but too far ahead sequence. Same: correctly handled // as invalid. 0.270 < R 99990001:99990001(0) ack 1001 win 260 // in-window, but not exact sequence. // Current Linux kernels might reply with a challenge ack, and do not // remove connection. // Without this patch, conntrack state moves to CLOSE. // With patch, timeout is lowered like CLOSE, but connection stays // in ESTABLISHED state. 0.280 < R 1010:1010(0) ack 1001 win 260 // Expect challenge ACK 0.281 > . 1001:1001(0) ack 1001 win 501 // With or without this patch, RST will cause connection // to move to CLOSE (sequence number matches) // 0.282 < R 1001:1001(0) ack 1001 win 260 // ACK 0.300 < . 1001:1001(0) ack 1001 win 257 // more data could be exchanged here, connection // is still established // Client closes the connection. 0.610 < F. 1001:1001(0) ack 1001 win 260 0.650 > . 1001:1001(0) ack 1002 // Close the connection without reading outstanding data 0.700 close(4) = 0 // so one more reset. Will be deemed acceptable with patch as well: // connection is already closing. 0.701 > R. 1001:1001(0) ack 1002 win 501 // End packetdrill test case. With patch, this generates following conntrack events: [NEW] 120 SYN_SENT src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [UNREPLIED] [UPDATE] 60 SYN_RECV src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [UPDATE] 432000 ESTABLISHED src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [ASSURED] [UPDATE] 120 FIN_WAIT src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [ASSURED] [UPDATE] 60 CLOSE_WAIT src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [ASSURED] [UPDATE] 10 CLOSE src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [ASSURED] Without patch, first RST moves connection to close, whereas socket state does not change until FIN is received. [NEW] 120 SYN_SENT src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [UNREPLIED] [UPDATE] 60 SYN_RECV src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [UPDATE] 432000 ESTABLISHED src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [ASSURED] [UPDATE] 10 CLOSE src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [ASSURED] Cc: Jozsef Kadlecsik Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/nf_conntrack_proto_tcp.c | 50 ++++++++++++++++++++------ 1 file changed, 40 insertions(+), 10 deletions(-) diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 247b89784a6f..842f3f86fb2e 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -769,6 +769,12 @@ static int tcp_error(struct net *net, struct nf_conn *tmpl, return NF_ACCEPT; } +static bool nf_conntrack_tcp_established(const struct nf_conn *ct) +{ + return ct->proto.tcp.state == TCP_CONNTRACK_ESTABLISHED && + test_bit(IPS_ASSURED_BIT, &ct->status); +} + /* Returns verdict for packet, or -1 for invalid. */ static int tcp_packet(struct nf_conn *ct, const struct sk_buff *skb, @@ -963,16 +969,38 @@ static int tcp_packet(struct nf_conn *ct, new_state = TCP_CONNTRACK_ESTABLISHED; break; case TCP_CONNTRACK_CLOSE: - if (index == TCP_RST_SET - && (ct->proto.tcp.seen[!dir].flags & IP_CT_TCP_FLAG_MAXACK_SET) - && before(ntohl(th->seq), ct->proto.tcp.seen[!dir].td_maxack)) { - /* Invalid RST */ - spin_unlock_bh(&ct->lock); - nf_ct_l4proto_log_invalid(skb, ct, "invalid rst"); - return -NF_ACCEPT; + if (index != TCP_RST_SET) + break; + + if (ct->proto.tcp.seen[!dir].flags & IP_CT_TCP_FLAG_MAXACK_SET) { + u32 seq = ntohl(th->seq); + + if (before(seq, ct->proto.tcp.seen[!dir].td_maxack)) { + /* Invalid RST */ + spin_unlock_bh(&ct->lock); + nf_ct_l4proto_log_invalid(skb, ct, "invalid rst"); + return -NF_ACCEPT; + } + + if (!nf_conntrack_tcp_established(ct) || + seq == ct->proto.tcp.seen[!dir].td_maxack) + break; + + /* Check if rst is part of train, such as + * foo:80 > bar:4379: P, 235946583:235946602(19) ack 42 + * foo:80 > bar:4379: R, 235946602:235946602(0) ack 42 + */ + if (ct->proto.tcp.last_index == TCP_ACK_SET && + ct->proto.tcp.last_dir == dir && + seq == ct->proto.tcp.last_end) + break; + + /* ... RST sequence number doesn't match exactly, keep + * established state to allow a possible challenge ACK. + */ + new_state = old_state; } - if (index == TCP_RST_SET - && ((test_bit(IPS_SEEN_REPLY_BIT, &ct->status) + if (((test_bit(IPS_SEEN_REPLY_BIT, &ct->status) && ct->proto.tcp.last_index == TCP_SYN_SET) || (!test_bit(IPS_ASSURED_BIT, &ct->status) && ct->proto.tcp.last_index == TCP_ACK_SET)) @@ -988,7 +1016,7 @@ static int tcp_packet(struct nf_conn *ct, * segments we ignored. */ goto in_window; } - /* Just fall through */ + break; default: /* Keep compilers happy. */ break; @@ -1023,6 +1051,8 @@ static int tcp_packet(struct nf_conn *ct, if (ct->proto.tcp.retrans >= tn->tcp_max_retrans && timeouts[new_state] > timeouts[TCP_CONNTRACK_RETRANS]) timeout = timeouts[TCP_CONNTRACK_RETRANS]; + else if (unlikely(index == TCP_RST_SET)) + timeout = timeouts[TCP_CONNTRACK_CLOSE]; else if ((ct->proto.tcp.seen[0].flags | ct->proto.tcp.seen[1].flags) & IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED && timeouts[new_state] > timeouts[TCP_CONNTRACK_UNACK]) -- 2.19.1