Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1947140AbdDYLoF (ORCPT ); Tue, 25 Apr 2017 07:44:05 -0400
Received: from mail-yb0-f196.google.com ([209.85.213.196]:35918 "EHLO
        mail-yb0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1947089AbdDYLoA (ORCPT ); Tue, 25 Apr 2017 07:44:00 -0400
MIME-Version: 1.0
In-Reply-To:
References: <1493114155-12101-1-git-send-email-honli@redhat.com>
From: Erez Shitrit
Date: Tue, 25 Apr 2017 14:43:59 +0300
Message-ID:
Subject: Re: [PATCH] IB/IPoIB: Check the headroom size
To: Or Gerlitz
Cc: Honggang LI, Erez Shitrit, Doug Ledford, "Hefty, Sean",
        Hal Rosenstock, Paolo Abeni, "linux-rdma@vger.kernel.org",
        Linux Kernel, Linux Netdev List
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2573
Lines: 56

On Tue, Apr 25, 2017 at 2:14 PM, Or Gerlitz wrote:
> On Tue, Apr 25, 2017 at 2:11 PM, Erez Shitrit wrote:
>> On Tue, Apr 25, 2017 at 1:32 PM, Or Gerlitz wrote:
>>> On Tue, Apr 25, 2017 at 12:55 PM, Honggang LI wrote:
>>>> From: Honggang Li
>>>>
>>>> Minimal hard_header_len set by bond_compute_features is ETH_HLEN, which
>>>> is smaller than IPOIB_HARD_LEN. ipoib_hard_header should check the
>>>> size of headroom to avoid skb_under_panic.
>>>
>>> sounds terrible, ipoib bonding is supported since ~2007, thanks for
>>> reporting on that.
>>>
>>>> [ 122.871493] ipoib_hard_header: skb->head= ffff8808179d9400, skb->data= ffff8808179d9420, skb_headroom= 0x20
>>>> [ 123.055400] bond0: Releasing backup interface mthca_ib1
>>>> [ 123.560529] bond_compute_features:1112 bond0 bond_dev->hard_header_len = 14
>>>> [ 123.568822] CPU: 0 PID: 12336 Comm: ifdown-ib Not tainted 4.9.0-debug #1
>>>
>>> did you generate this trace by calling dump_stack or this is existing
>>> kernel code.
>>>
>>>> Fixes: fc791b633515 ('IB/ipoib: move back IB LL address into the hard header')
>>>
>>> this is more of WA to avoid some crash or failure but not fixing the
>>> actual problem
>>>
>>> Erez, can you comment?
>>
>> We saw that after commit fc791b633515, it happened while removing bond
>> interface after its slaves (ipoib interface) removed.
>> At that point the bond interface sets its dev_harheader_len to be as
>> eth interfaces (14 instead of 24), and if a process which doesn't
>> aware of the slaves removal or was at the middle of the sending tries
>> to send (igmp) packet it goes to ipoib with no space in the skb for
>> it, and here comes the panic.
>
> thanks for the info. Is this bug there since ipoib/bonding day one
> (and hence my bug...)
> or was indeed introduced later? if later, can you explain how
> fc791b633515 introduced
> that or you only know it by bisection?

Commit fc791b633515 changed the device's hard_header_len to 24, so ipoib
now requires 24 bytes of headroom in the skb. Before that commit it
needed only 4. An skb laid out for an eth-style device already has 14
bytes reserved for the hard header, which was enough for the old 4 bytes
but is not enough for 24.
So we have the issue only since that commit.

>
>> I agree with you that this fix is w/a, and it is a fix in the data
>> path for all the packets while the panic is in a control flow. It
>> probably should be fixed in the bonding driver.
>
> so what's your suggestion? fc791b633515 is 6m old, and it means the bug
> is in stable kernels and probably also in inbox drivers
>
> Or.
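
To make the headroom arithmetic above concrete (4 bytes needed before
fc791b633515, 24 after, 14 reserved once the bond falls back to ETH_HLEN),
here is a minimal userspace sketch. It is not kernel code: struct toy_skb,
toy_headroom() and toy_can_push() are hypothetical stand-ins for the skb and
for the kind of headroom check the patch proposes, and the constants are
simply the sizes quoted in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Header sizes as quoted in the thread (assumed here, not read from a kernel tree). */
#define ETH_HLEN        14  /* hard_header_len the bond falls back to */
#define IPOIB_ENCAP_LEN  4  /* bytes ipoib needed before fc791b633515 */
#define IPOIB_HARD_LEN  24  /* bytes ipoib needs since fc791b633515 */

/* Hypothetical stand-in for an skb: only the two pointers that matter here. */
struct toy_skb {
    unsigned char *head;  /* start of the buffer */
    unsigned char *data;  /* start of the packet data */
};

/* Bytes available in front of data; models what skb_headroom() reports. */
static size_t toy_headroom(const struct toy_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

/* True if hdr_len bytes can be pushed in front of data without going below head. */
static bool toy_can_push(const struct toy_skb *skb, size_t hdr_len)
{
    return toy_headroom(skb) >= hdr_len;
}
```

With data placed ETH_HLEN bytes past head, toy_can_push() accepts the old
4-byte header and rejects the 24-byte one, which is the case that ends in
skb_under_panic.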