From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Nithin Nayak Sujir, Mahesh Bandewar, Jay Vosburgh, "David S. Miller", Sasha Levin
Subject: [PATCH 4.9 112/310] bonding: Dont update slave->link until ready to commit
Date: Wed, 11 Apr 2018 20:34:11 +0200
Message-Id: <20180411183627.007204341@linuxfoundation.org>
In-Reply-To: <20180411183622.305902791@linuxfoundation.org>
References: <20180411183622.305902791@linuxfoundation.org>
X-Mailer: git-send-email 2.17.0
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nithin Sujir

[ Upstream commit 797a93647a48d6cb8a20641a86a71713a947f786 ]

In the loadbalance arp monitoring scheme, when a slave link change is
detected, the slave->link is immediately updated and slave_state_changed
is set. Later down the function, the rtnl_lock is acquired and the
changes are committed, updating the bond link state.

However, the acquisition of the rtnl_lock can fail. The next time the
monitor runs, since slave->link is already updated, it determines that
link is unchanged. This results in the bond link state permanently out
of sync with the slave link.

This patch modifies bond_loadbalance_arp_mon() to handle link changes
identical to bond_ab_arp_{inspect/commit}(). The new link state is
maintained in slave->new_link until we're ready to commit at which point
it's copied into slave->link.

NOTE: miimon_{inspect/commit}() has a more complex state machine
requiring the use of the bond_{propose,commit}_link_state() functions
which maintains the intermediate state in slave->link_new_state. The arp
monitors don't require that.

Testing: This bug is very easy to reproduce with the following steps.
1. In a loop, toggle a slave link of a bond slave interface.
2. In a separate loop, do ifconfig up/down of an unrelated interface to
   create contention for rtnl_lock.
Within a few iterations, the bond link goes out of sync with the slave
link.

Signed-off-by: Nithin Nayak Sujir
Cc: Mahesh Bandewar
Cc: Jay Vosburgh
Acked-by: Mahesh Bandewar
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 drivers/net/bonding/bond_main.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2605,11 +2605,13 @@ static void bond_loadbalance_arp_mon(str
 	bond_for_each_slave_rcu(bond, slave, iter) {
 		unsigned long trans_start = dev_trans_start(slave->dev);
 
+		slave->new_link = BOND_LINK_NOCHANGE;
+
 		if (slave->link != BOND_LINK_UP) {
 			if (bond_time_in_interval(bond, trans_start, 1) &&
 			    bond_time_in_interval(bond, slave->last_rx, 1)) {
 
-				slave->link = BOND_LINK_UP;
+				slave->new_link = BOND_LINK_UP;
 				slave_state_changed = 1;
 
 				/* primary_slave has no meaning in round-robin
@@ -2636,7 +2638,7 @@ static void bond_loadbalance_arp_mon(str
 			if (!bond_time_in_interval(bond, trans_start, 2) ||
 			    !bond_time_in_interval(bond, slave->last_rx, 2)) {
 
-				slave->link = BOND_LINK_DOWN;
+				slave->new_link = BOND_LINK_DOWN;
 				slave_state_changed = 1;
 
 				if (slave->link_failure_count < UINT_MAX)
@@ -2667,6 +2669,11 @@ static void bond_loadbalance_arp_mon(str
 		if (!rtnl_trylock())
 			goto re_arm;
 
+		bond_for_each_slave(bond, slave, iter) {
+			if (slave->new_link != BOND_LINK_NOCHANGE)
+				slave->link = slave->new_link;
+		}
+
 		if (slave_state_changed) {
 			bond_slave_state_change(bond);
 			if (BOND_MODE(bond) == BOND_MODE_XOR)
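
For readers who want to see the inspect/commit split in isolation, here is a
minimal user-space sketch of the idea described in the commit message. It is
not kernel code: struct toy_slave, the LINK_* constants and the toy_*()
helpers are hypothetical stand-ins, and the lock_acquired flag merely
simulates whether rtnl_trylock() succeeded on a given monitor pass.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the BOND_LINK_* values; not the kernel's. */
enum link_state { LINK_NOCHANGE = -1, LINK_UP = 0, LINK_DOWN = 1 };

/* Toy model of a slave: committed state plus a pending proposal. */
struct toy_slave {
	enum link_state link;      /* committed link state */
	enum link_state new_link;  /* proposed state, not yet committed */
};

/*
 * Inspect phase: compare the observed state with the committed one and
 * record a proposal. The committed state is deliberately left untouched,
 * so a failed commit leaves the change detectable on the next pass.
 */
static bool toy_inspect(struct toy_slave *s, enum link_state observed)
{
	s->new_link = LINK_NOCHANGE;
	if (s->link != observed) {
		s->new_link = observed;
		return true;
	}
	return false;
}

/*
 * Commit phase: copy the proposal into the committed state, but only if
 * the (simulated) lock was obtained. Returns false when the lock was not
 * available and nothing was committed.
 */
static bool toy_commit(struct toy_slave *s, bool lock_acquired)
{
	if (!lock_acquired)
		return false;
	if (s->new_link != LINK_NOCHANGE)
		s->link = s->new_link;
	return true;
}

/*
 * One monitor pass. Returns true only when a change was both detected
 * and committed, i.e. when the bond-level state would be updated.
 */
static bool toy_monitor_pass(struct toy_slave *s, enum link_state observed,
			     bool lock_acquired)
{
	bool changed = toy_inspect(s, observed);

	if (!toy_commit(s, lock_acquired))
		return false;
	return changed;
}
```

In this model, a pass that observes LINK_DOWN while the lock is unavailable
leaves link at its old value, so the following pass detects the same change
again and can commit it. Writing link directly in the inspect phase, as the
pre-patch code did, would make that second pass see no change at all.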