Date: Tue, 25 Oct 2016 08:31:26 -0200
From: Marcelo Ricardo Leitner
To: Jon Maxwell
Cc: tlfalcon@linux.vnet.ibm.com, benh@kernel.crashing.org, paulus@samba.org,
	mpe@ellerman.id.au, davem@davemloft.net, tom@herbertland.com,
	jarod@redhat.com, hofrat@osadl.org, netdev@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	jmaxwell@redhat.com
Subject: Re: [PATCH net-next] ibmveth: calculate correct gso_size and set gso_type
Message-ID: <20161025103126.GW2948@localhost.localdomain>
References: <1477372421-11656-1-git-send-email-jmaxwell37@gmail.com>
In-Reply-To: <1477372421-11656-1-git-send-email-jmaxwell37@gmail.com>

On Tue, Oct 25, 2016 at 04:13:41PM +1100, Jon Maxwell wrote:
> We recently encountered a bug where a few customers using ibmveth on the
> same LPAR hit an issue where a TCP session hung when large receive was
> enabled. Closer analysis revealed that the session was stuck because the
> one side was advertising a zero window repeatedly.
>
> We narrowed this down to the fact the ibmveth driver did not set gso_size
> which is translated by TCP into the MSS later up the stack.
> The MSS is
> used to calculate the TCP window size and as that was abnormally large,
> it was calculating a zero window, even although the sockets receive buffer
> was completely empty.
>
> We were able to reproduce this and worked with IBM to fix this. Thanks Tom
> and Marcelo for all your help and review on this.
>
> The patch fixes both our internal reproduction tests and our customers tests.
>
> Signed-off-by: Jon Maxwell
> ---
>  drivers/net/ethernet/ibm/ibmveth.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
> index 29c05d0..3028c33 100644
> --- a/drivers/net/ethernet/ibm/ibmveth.c
> +++ b/drivers/net/ethernet/ibm/ibmveth.c
> @@ -1182,6 +1182,8 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
>  	int frames_processed = 0;
>  	unsigned long lpar_rc;
>  	struct iphdr *iph;
> +	bool large_packet = 0;
> +	u16 hdr_len = ETH_HLEN + sizeof(struct tcphdr);

Compiler may optimize this, but maybe move hdr_len to [*]?

>
>  restart_poll:
>  	while (frames_processed < budget) {
> @@ -1236,10 +1238,27 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
>  						iph->check = 0;
>  						iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
>  						adapter->rx_large_packets++;
> +						large_packet = 1;
>  					}
>  				}
>  			}
>
> +			if (skb->len > netdev->mtu) {

[*]

> +				iph = (struct iphdr *)skb->data;
> +				if (be16_to_cpu(skb->protocol) == ETH_P_IP && iph->protocol == IPPROTO_TCP) {

The if line above is too long; it should be broken in two.

> +					hdr_len += sizeof(struct iphdr);
> +					skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
> +					skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
> +				} else if (be16_to_cpu(skb->protocol) == ETH_P_IPV6 &&
> +					iph->protocol == IPPROTO_TCP) {
  					^
And this one should start 3 spaces later, right below be16_....

  Marcelo

> +					hdr_len += sizeof(struct ipv6hdr);
> +					skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> +					skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
> +				}
> +				if (!large_packet)
> +					adapter->rx_large_packets++;
> +			}
> +
>  			napi_gro_receive(napi, skb);	/* send it up */
>
>  			netdev->stats.rx_packets++;
> --
> 1.8.3.1
>
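
For illustration only, here is a minimal user-space sketch of the
gso_size arithmetic from the quoted hunk, with hdr_len declared per
packet as suggested at [*]. As far as the quoted hunk shows, hdr_len is
initialised once at function scope and then incremented with += inside
the receive loop, so a per-packet declaration would also give every
frame a fresh starting value. The helper name sketch_gso_size(), the
SKETCH_* constants and the enum are hypothetical and are not part of
ibmveth or the kernel; the header sizes assume no IP or TCP options.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed fixed header sizes (no options). They mirror ETH_HLEN and the
 * sizeof() of struct iphdr, struct ipv6hdr and struct tcphdr.
 */
enum {
	SKETCH_ETH_HLEN  = 14,
	SKETCH_IPV4_HLEN = 20,
	SKETCH_IPV6_HLEN = 40,
	SKETCH_TCP_HLEN  = 20,
};

/* Hypothetical stand-in for the skb->protocol checks in the patch. */
enum sketch_l3 { SKETCH_L3_IPV4, SKETCH_L3_IPV6, SKETCH_L3_OTHER };

/*
 * Hypothetical helper: returns the value the patch would store in
 * gso_size for an aggregated TCP frame, or 0 when it would leave
 * gso_size untouched.
 */
static uint16_t sketch_gso_size(uint16_t mtu, size_t frame_len,
				enum sketch_l3 l3)
{
	/* Declared per call, so nothing carries over between packets. */
	uint16_t hdr_len = SKETCH_ETH_HLEN + SKETCH_TCP_HLEN;

	/* Only frames larger than the MTU are treated as aggregated. */
	if (frame_len <= mtu)
		return 0;

	switch (l3) {
	case SKETCH_L3_IPV4:
		hdr_len += SKETCH_IPV4_HLEN;
		break;
	case SKETCH_L3_IPV6:
		hdr_len += SKETCH_IPV6_HLEN;
		break;
	default:
		return 0;
	}

	/* Guard added for the sketch only: avoid unsigned wrap-around. */
	if (mtu <= hdr_len)
		return 0;

	/* Same arithmetic as the patch: netdev->mtu - hdr_len. */
	return (uint16_t)(mtu - hdr_len);
}
```

With an MTU of 1500 and a frame longer than the MTU, this gives 1446
for IPv4 (1500 - 54) and 1426 for IPv6 (1500 - 74), and calling it
repeatedly returns the same values each time.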