From: Vitaly Kuznetsov
To: KY Srinivasan
Cc: "devel@linuxdriverproject.org", "Van De Ven, Arjan", "linux-kernel@vger.kernel.org", "Haiyang Zhang"
Subject: Re: [PATCH] Drivers: hv: vmbus: Raise retry/wait limits in vmbus_post_msg()
References: <1477480307-5546-1-git-send-email-vkuznets@redhat.com>
Date: Mon, 31 Oct 2016 11:04:52 +0100
In-Reply-To: (KY Srinivasan's message of "Wed, 26 Oct 2016 19:52:16 +0000")
Message-ID: <87shrcbzqz.fsf@vitty.brq.redhat.com>

KY Srinivasan writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
>> Sent: Wednesday, October 26, 2016 4:12 AM
>> To: devel@linuxdriverproject.org
>> Cc: linux-kernel@vger.kernel.org; KY Srinivasan; Haiyang Zhang
>> Subject: [PATCH] Drivers: hv: vmbus: Raise retry/wait limits in
>> vmbus_post_msg()
>>
>> DoS protection conditions were altered in WS2016 and now it's easy to get
>> -EAGAIN returned from vmbus_post_msg() (e.g. when we try changing MTU on a
>> netvsc device in a loop). None of the vmbus_post_msg() callers retry the
>> operation, and we usually end up with a non-functional device or a crash.
>>
>> While the host's DoS protection conditions are unknown to me, my tests show
>> that it can take up to 46 attempts to send a message after changing udelay()
>> to mdelay() and capping msec at '256'. This means we can wait up to 10
>> seconds before the message is sent, so we need to use msleep() instead.
>> Almost all vmbus_post_msg() callers are ready to sleep, but there is one
>> special case: vmbus_initiate_unload(), which can be called from
>> interrupt/NMI context, where we can't sleep. I'm also not sure about the
>> lonely vmbus_send_tl_connect_request(), which has no in-tree users, but its
>> external users are most likely waiting for the host to reply, so sleeping
>> there is also appropriate.
>
> Vitaly,
>
> One of the reasons why the delay was in microseconds was to make sure that
> the boot time was not adversely affected by the delay we had in setting up
> the channel. The change to microsecond delay and other changes in this code
> reduced the time it took to initialize netvsc from 200 milliseconds to about
> 12 milliseconds. This is important for us as we look at achieving sub-second
> boot times.
> The situation you are trying to address is test cases where you are hitting
> the host with requests that trigger the host's DoS prevention code. Perhaps
> we could have a hybrid approach: we retain the microsecond wait until we hit
> a threshold and then we use millisecond delays. This way, the normal boot
> path is still fast while we can handle some of the other cases where the
> host's DoS prevention code kicks in.
>

OK. I actually tested boot time with my patch and didn't see a difference (so
I guess our first attempt to send messages usually succeeds), but if we're
concerned about sub-second boot time we'd rather keep the microsecond delay
for the first several attempts. I'll do v2.

Thanks,

--
  Vitaly
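
For illustration, the hybrid back-off discussed in this thread could look
roughly like the user-space sketch below. It is not the v2 patch. The names
next_delay(), total_wait(), pick_wait_mode() and SLEEP_THRESHOLD_USEC are
hypothetical, and the 1 ms switch-over threshold is an assumption; only the
doubling delay, the cap, and the udelay()/mdelay()/msleep() distinction come
from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold: below 1 ms, keep busy-waiting in microseconds. */
#define SLEEP_THRESHOLD_USEC 1000u

enum wait_mode {
    WAIT_BUSY_USEC,   /* kernel analogue: udelay() */
    WAIT_BUSY_MSEC,   /* kernel analogue: mdelay(), for callers that cannot sleep */
    WAIT_SLEEP_MSEC   /* kernel analogue: msleep() */
};

/* Double the delay after a failed attempt, never exceeding cap. */
static unsigned int next_delay(unsigned int delay, unsigned int cap)
{
    assert(delay > 0 && delay <= cap);  /* doubling from zero would never grow */
    return (delay * 2 < cap) ? delay * 2 : cap;
}

/* Sum of the first `waits` delays, starting at 1 unit and doubling up to cap. */
static unsigned long total_wait(unsigned int waits, unsigned int cap)
{
    unsigned long total = 0;
    unsigned int delay = 1;

    for (unsigned int i = 0; i < waits; i++) {
        total += delay;
        delay = next_delay(delay, cap);
    }
    return total;
}

/*
 * Hybrid policy: short delays are busy-waited so the common (boot) path
 * stays fast; long delays sleep when the caller's context allows it.
 */
static enum wait_mode pick_wait_mode(unsigned int usec, bool can_sleep)
{
    if (usec < SLEEP_THRESHOLD_USEC)
        return WAIT_BUSY_USEC;
    return can_sleep ? WAIT_SLEEP_MSEC : WAIT_BUSY_MSEC;
}
```

Under these assumptions, the first ten waits in microseconds (cap 256000) add
up to total_wait(10, 256000) = 1023 us, so a message that succeeds within the
first few attempts barely affects boot time. With millisecond units and a cap
of 256, total_wait(46, 256) = 9983, which is consistent with the "up to 10
seconds" figure quoted above if the delay starts at 1 ms and doubles. A caller
in interrupt/NMI context, such as vmbus_initiate_unload(), would pass
can_sleep = false and never get WAIT_SLEEP_MSEC.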