Date: Sun, 26 May 2019 13:29:20 -0700 (PDT)
Message-Id: <20190526.132920.535955459085533409.davem@davemloft.net>
To: jarod@redhat.com
Cc: linux-kernel@vger.kernel.org, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, netdev@vger.kernel.org, Heesoon.Kim@stratus.com
Subject: Re: [PATCH net] bonding/802.3ad: fix slave link initialization transition states
From: David Miller
In-Reply-To: <20190524134928.16834-1-jarod@redhat.com>
References: <20190524134928.16834-1-jarod@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Jarod Wilson
Date: Fri, 24 May 2019 09:49:28 -0400

> Once in a while, with just the right timing, 802.3ad slaves will fail to
> properly initialize, winding up in a weird state, with a partner system
> mac address of 00:00:00:00:00:00. This started happening after a fix to
> properly track link_failure_count tracking, where an 802.3ad slave that
> reported itself as link up in the miimon code, but wasn't able to get a
> valid speed/duplex, started getting set to BOND_LINK_FAIL instead of
> BOND_LINK_DOWN. That was the proper thing to do for the general "my link
> went down" case, but has created a link initialization race that can put
> the interface in this odd state.
>
> The simple fix is to instead set the slave link to BOND_LINK_DOWN again,
> if the link has never been up (last_link_up == 0), so the link state
> doesn't bounce from BOND_LINK_DOWN to BOND_LINK_FAIL -- it hasn't failed
> in this case, it simply hasn't been up yet, and this prevents the
> unnecessary state change from DOWN to FAIL and getting stuck in an init
> failure w/o a partner mac.
>
> Fixes: ea53abfab960 ("bonding/802.3ad: fix link_failure_count tracking")
> Tested-by: Heesoon Kim
> Signed-off-by: Jarod Wilson

Applied and queued up for -stable.
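
For readers following along, the decision described in the quoted commit message can be sketched in a few lines of standalone C. This is only an illustrative model, not the patch itself: the struct, the enum, and the function name on_speed_duplex_failure() are hypothetical stand-ins, and only the idea (pick DOWN rather than FAIL when last_link_up == 0) comes from the description above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bonding link states named in the
 * commit message; the values here are not the kernel's definitions. */
enum model_link_state {
	MODEL_LINK_UP,
	MODEL_LINK_FAIL,
	MODEL_LINK_DOWN,
};

/* Hypothetical, stripped-down model of a bonding slave. */
struct slave_model {
	enum model_link_state link;
	unsigned long last_link_up;	/* 0 means the link has never been up */
};

/* Called when the slave reports carrier but a valid speed/duplex
 * could not be read.  A slave that has been up before is treated as
 * failing; a slave that has never been up stays (or returns to) DOWN,
 * so it does not bounce from DOWN to FAIL during initialization. */
static void on_speed_duplex_failure(struct slave_model *slave)
{
	if (slave->last_link_up)
		slave->link = MODEL_LINK_FAIL;
	else
		slave->link = MODEL_LINK_DOWN;
}

/* Convenience check used to illustrate the initialization case. */
static bool never_been_up(const struct slave_model *slave)
{
	return slave->last_link_up == 0;
}
```

In this model, a freshly enslaved interface (last_link_up == 0) that fails the speed/duplex read remains in the DOWN state, while one with a non-zero last_link_up moves to FAIL, which matches the general "my link went down" case described in the message.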