Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757409Ab1BPB14 (ORCPT ); Tue, 15 Feb 2011 20:27:56 -0500
Received: from kroah.org ([198.145.64.141]:39463 "EHLO coco.kroah.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932162Ab1BPAWG (ORCPT ); Tue, 15 Feb 2011 19:22:06 -0500
X-Mailbox-Line: From gregkh@clark.kroah.org Tue Feb 15 16:14:39 2011
Message-Id: <20110216001439.137611606@clark.kroah.org>
User-Agent: quilt/0.48-11.2
Date: Tue, 15 Feb 2011 16:13:18 -0800
From: Greg KH
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: stable-review@kernel.org, torvalds@linux-foundation.org,
	akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk,
	Eric Dumazet, "David S. Miller"
Subject: [141/272] ipv4: IP defragmentation must be ECN aware
In-Reply-To: <20110216001559.GA31413@kroah.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4256
Lines: 146

2.6.37-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Eric Dumazet

[ Upstream commit 6623e3b24a5ebb07e81648c478d286a1329ab891 ]

RFC3168 (The Addition of Explicit Congestion Notification to IP) states:

5.3.  Fragmentation

   ECN-capable packets MAY have the DF (Don't Fragment) bit set.
   Reassembly of a fragmented packet MUST NOT lose indications of
   congestion.  In other words, if any fragment of an IP packet to be
   reassembled has the CE codepoint set, then one of two actions MUST be
   taken:

      * Set the CE codepoint on the reassembled packet.  However, this
        MUST NOT occur if any of the other fragments contributing to
        this reassembly carries the Not-ECT codepoint.

      * The packet is dropped, instead of being reassembled, for any
        other reason.

This patch implements this requirement for IPv4, choosing the first
action:

If one fragment had NO-ECT codepoint
	reassembled frame has NO-ECT
ElIf one fragment had CE codepoint
	reassembled frame has CE

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

---
 net/ipv4/ip_fragment.c |   34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -45,6 +45,7 @@
 #include
 #include
 #include
+#include
 
 /* NOTE. Logic of IP defragmentation is parallel to corresponding IPv6
  * code now. If you change something here, _PLEASE_ update ipv6/reassembly.c
@@ -70,11 +71,28 @@ struct ipq {
 	__be32		daddr;
 	__be16		id;
 	u8		protocol;
+	u8		ecn; /* RFC3168 support */
 	int		iif;
 	unsigned int	rid;
 	struct inet_peer *peer;
 };
 
+#define IPFRAG_ECN_CLEAR	0x01 /* one frag had INET_ECN_NOT_ECT */
+#define IPFRAG_ECN_SET_CE	0x04 /* one frag had INET_ECN_CE */
+
+static inline u8 ip4_frag_ecn(u8 tos)
+{
+	tos = (tos & INET_ECN_MASK) + 1;
+	/*
+	 * After the last operation we have (in binary):
+	 * INET_ECN_NOT_ECT => 001
+	 * INET_ECN_ECT_1   => 010
+	 * INET_ECN_ECT_0   => 011
+	 * INET_ECN_CE      => 100
+	 */
+	return (tos & 2) ? 0 : tos;
+}
+
 static struct inet_frags ip4_frags;
 
 int ip_frag_nqueues(struct net *net)
@@ -137,6 +155,7 @@ static void ip4_frag_init(struct inet_fr
 
 	qp->protocol = arg->iph->protocol;
 	qp->id = arg->iph->id;
+	qp->ecn = ip4_frag_ecn(arg->iph->tos);
 	qp->saddr = arg->iph->saddr;
 	qp->daddr = arg->iph->daddr;
 	qp->user = arg->user;
@@ -316,6 +335,7 @@ static int ip_frag_reinit(struct ipq *qp
 	qp->q.fragments = NULL;
 	qp->q.fragments_tail = NULL;
 	qp->iif = 0;
+	qp->ecn = 0;
 
 	return 0;
 }
@@ -328,6 +348,7 @@ static int ip_frag_queue(struct ipq *qp,
 	int flags, offset;
 	int ihl, end;
 	int err = -ENOENT;
+	u8 ecn;
 
 	if (qp->q.last_in & INET_FRAG_COMPLETE)
 		goto err;
@@ -339,6 +360,7 @@ static int ip_frag_queue(struct ipq *qp,
 		goto err;
 	}
 
+	ecn = ip4_frag_ecn(ip_hdr(skb)->tos);
 	offset = ntohs(ip_hdr(skb)->frag_off);
 	flags = offset & ~IP_OFFSET;
 	offset &= IP_OFFSET;
@@ -472,6 +494,7 @@ found:
 	}
 	qp->q.stamp = skb->tstamp;
 	qp->q.meat += skb->len;
+	qp->ecn |= ecn;
 	atomic_add(skb->truesize, &qp->q.net->mem);
 	if (offset == 0)
 		qp->q.last_in |= INET_FRAG_FIRST_IN;
@@ -583,6 +606,17 @@ static int ip_frag_reasm(struct ipq *qp,
 	iph = ip_hdr(head);
 	iph->frag_off = 0;
 	iph->tot_len = htons(len);
+	/* RFC3168 5.3 Fragmentation support
+	 * If one fragment had INET_ECN_NOT_ECT,
+	 *	reassembled frame also has INET_ECN_NOT_ECT
+	 * Elif one fragment had INET_ECN_CE
+	 *	reassembled frame also has INET_ECN_CE
+	 */
+	if (qp->ecn & IPFRAG_ECN_CLEAR)
+		iph->tos &= ~INET_ECN_MASK;
+	else if (qp->ecn & IPFRAG_ECN_SET_CE)
+		iph->tos |= INET_ECN_CE;
+
 	IP_INC_STATS_BH(net, IPSTATS_MIB_REASMOKS);
 	qp->q.fragments = NULL;
 	qp->q.fragments_tail = NULL;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
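
For reviewers who want to check the bit trick outside the kernel, here is
a minimal userspace sketch of the accumulation logic the patch adds. It is
not kernel code: the ECN_* and FRAG_ECN_* names, frag_ecn() and
reasm_tos() are hypothetical stand-ins for INET_ECN_*, IPFRAG_ECN_*,
ip4_frag_ecn() and the tos fixup in ip_frag_reasm(). It also assumes that
frag_tos[0] is the TOS byte of the head fragment, whose header is reused
for the reassembled datagram.

```c
#include <stddef.h>
#include <stdint.h>

/* ECN codepoints in the low two bits of the IPv4 TOS byte (RFC 3168). */
enum {
	ECN_NOT_ECT = 0,
	ECN_ECT_1   = 1,
	ECN_ECT_0   = 2,
	ECN_CE      = 3,
	ECN_MASK    = 3,
};

#define FRAG_ECN_CLEAR	0x01	/* some fragment carried Not-ECT */
#define FRAG_ECN_SET_CE	0x04	/* some fragment carried CE */

/*
 * Same mapping as ip4_frag_ecn() in the patch:
 * Not-ECT -> 0x01, CE -> 0x04, ECT(0) and ECT(1) -> 0.
 */
static uint8_t frag_ecn(uint8_t tos)
{
	tos = (tos & ECN_MASK) + 1;
	return (tos & 2) ? 0 : tos;
}

/*
 * Hypothetical helper: OR together the per-fragment flags, then apply
 * the patch's rule to the head fragment's TOS byte.
 */
static uint8_t reasm_tos(const uint8_t *frag_tos, size_t n)
{
	uint8_t acc = 0;
	uint8_t tos;
	size_t i;

	if (n == 0)
		return 0;

	for (i = 0; i < n; i++)
		acc |= frag_ecn(frag_tos[i]);

	tos = frag_tos[0];
	if (acc & FRAG_ECN_CLEAR)
		tos &= (uint8_t)~ECN_MASK;	/* any Not-ECT wins */
	else if (acc & FRAG_ECN_SET_CE)
		tos |= ECN_CE;			/* otherwise keep CE */
	return tos;
}
```

Because the two flags occupy distinct bits, a single OR per fragment is
enough to remember both conditions, and the Not-ECT check is done first
so that CE is never set on a datagram that had a Not-ECT fragment.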