Message-ID: <55D22E64.6020807@redknee.com>
Date: Mon, 17 Aug 2015 20:56:36 +0200
From: Uwe Koziolek
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.8.0
To: Jarod Wilson, Veaceslav Falico
CC: Jay Vosburgh, Andy Gospodarek
Subject: Re: [PATCH] net/bonding: send arp in interval if no active slave
References: <1439828583-27325-1-git-send-email-jarod@redhat.com> <20150817165500.GA21512@vps.falico.eu> <55D215F7.3080905@redhat.com>
In-Reply-To: <55D215F7.3080905@redhat.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 2015-08-17 07:12 PM, Jarod Wilson wrote:
> On 2015-08-17 12:55 PM, Veaceslav Falico wrote:
>> On Mon, Aug 17, 2015 at 12:23:03PM -0400, Jarod Wilson wrote:
>>> From: Uwe Koziolek
>>>
>>> With some very finicky switch hardware, active backup bonding can get into
>>> a situation where we play ping-pong between interfaces, trying to get one
>>> to come up as the active slave. There seems to be an issue with the
>>> switch's arp replies either taking too long, or simply getting lost, so we
>>> wind up unable to get any interface up and active. Sometimes, the issue
>>> sorts itself out after a while, sometimes it doesn't.
>>>
>>> Testing with num_grat_arp has proven fruitless, but sending an additional
>>> arp on curr_arp_slave if we're still in the arp_interval timeslice in
>>> bond_ab_arp_probe(), has shown to produce 100% reliability in testing with
>>> this hardware combination.
>>
>> Sorry, I don't understand the logic of why it works, and what exactly are
>> we fixiing here.
>>
>> It also breaks completely the logic for link state management in case of no
>> current active slave for 2*arp_interval.
>>
>> Could you please elaborate what exactly is fixed here, and how it works? :)
>
> I can either duplicate some information from the bug, or Uwe can, to
> illustrate the exact nature of the problem.
>
>> p.s. num_grat_arp maybe could help?
>
> That was my thought as well, but as I understand it, that route was
> explored, and it didn't help any. I don't actually have a reproducer
> setup of my own, unfortunately, so I'm kind of caught in the middle
> here...
>
> Uwe, can you perhaps further enlighten us as to what num_grat_arp
> settings were tried that didn't help? I'm still of the mind that if
> num_grat_arp *didn't* help, we probably need to do something keyed off
> num_grat_arp.

The bonding slaves are connected to highly available switches; each slave
is connected to a different switch. When the bond starts, only the
selected slave sends a single arp request. If a matching arp reply is
received, this slave and the bond go into state up and the gratuitous
arps are sent. If no arp reply arrives, the next slave is selected.

With most newer switches that are not overloaded and have no other
software bugs, or with a single-switch configuration, you get an arp
reply to the first arp request. But in a high-availability configuration
with non-perfect switches such as the HP ProCurve 54xx, and also with
some Cisco models, you may not get a reply to the first arp request.

I have seen network traces in which the switches did not respond to the
first arp request on slave 1; the second arp request was sent on slave 2,
but the reply was received on slave 1, and all following arp requests
were answered on the wrong slave for a longer time.

The proposed change sends up to 3 arp requests on a down bond using the
same slave, spaced arp_interval apart. With the problematic switches I
have seen the arp reply on the right slave by the second arp request at
the latest, so the bond goes into state up.
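
To make the effect of the retries concrete, here is a small user-space toy model. It is only a sketch and not kernel code: `probe_down_bond()`, `slow_switch()` and `PROBES_PER_SLAVE` are hypothetical names invented for this illustration, and the switch is simplified to "the first request on a slave is lost, the second consecutive request on the same slave is answered". One loop iteration stands for one arp_interval.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name in this block is hypothetical. */

/* First request plus two retries on the same slave. */
#define PROBES_PER_SLAVE 3

/*
 * Callback: did a reply arrive on `slave` for its `attempt`-th
 * consecutive request (0-based)?
 */
typedef bool (*reply_fn)(size_t slave, unsigned attempt);

/*
 * Probe the slaves of a down bond, one request per tick (one tick
 * models one arp_interval). A slave gets `probes_per_slave` requests
 * before the next slave is selected.
 * Returns the index of the slave that came up, or -1 if none did
 * within `max_ticks`. `*ticks_used` receives the ticks consumed.
 */
int probe_down_bond(size_t n_slaves, unsigned probes_per_slave,
                    unsigned max_ticks, reply_fn reply_seen,
                    unsigned *ticks_used)
{
    assert(n_slaves > 0 && probes_per_slave > 0);

    size_t slave = 0;
    unsigned attempt = 0;

    for (unsigned tick = 1; tick <= max_ticks; tick++) {
        if (reply_seen(slave, attempt)) {
            *ticks_used = tick;
            return (int)slave;
        }
        /* No reply: retry on the same slave, or rotate when exhausted. */
        if (++attempt == probes_per_slave) {
            attempt = 0;
            slave = (slave + 1) % n_slaves;
        }
    }
    *ticks_used = max_ticks;
    return -1;
}

/* Simplified "non-perfect" switch: only a repeated request is answered. */
bool slow_switch(size_t slave, unsigned attempt)
{
    (void)slave;
    return attempt >= 1;
}
```

In this model, with two slaves and `slow_switch`, one probe per slave (the behaviour without the change) never brings a slave up because every tick rotates to the other slave, while `PROBES_PER_SLAVE` brings slave 0 up on the second tick.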

How it works:
Bonds in state up are handled at the beginning of bond_ab_arp_probe();
the rest of the function handles the slave change. The proposed change
bypasses the slave change for 2 additional calls of bond_ab_arp_probe().
The retries are therefore no longer available only for an up bond; they
are also implemented for a down bond.

num_grat_arp has no chance of solving the problem. num_grat_arp is only
used when a different slave becomes active. But in our case, the bonding
slaves do not go into state active for a longer time.

>
>>> [jarod: manufacturing of changelog]
>>> CC: Jay Vosburgh
>>> CC: Veaceslav Falico
>>> CC: Andy Gospodarek
>>> CC: netdev@vger.kernel.org
>>> Signed-off-by: Uwe Koziolek
>>> Signed-off-by: Jarod Wilson
>>> ---
>>>  drivers/net/bonding/bond_main.c | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index 0c627b4..60b9483 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -2794,6 +2794,11 @@ static bool bond_ab_arp_probe(struct bonding *bond)
>>>  		return should_notify_rtnl;
>>>  	}
>>>
>>> +	if (bond_time_in_interval(bond, curr_arp_slave->last_link_up, 2)) {
>>> +		bond_arp_send_all(bond, curr_arp_slave);
>>> +		return should_notify_rtnl;
>>> +	}
>>> +
>>>  	bond_set_slave_inactive_flags(curr_arp_slave, BOND_SLAVE_NOTIFY_LATER);
>>>
>>>  	bond_for_each_slave_rcu(bond, slave, iter) {
>>> --
>>> 1.8.3.1
>>>