From: Vitaly Kuznetsov
To: Olaf Hering
Cc: kys@microsoft.com, gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org, devel@linuxdriverproject.org
Subject: Re: move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel
Date: Thu, 15 Dec 2016 11:54:05 +0100
In-Reply-To: <20161215103402.GA6336@aepfle.de> (Olaf Hering's message of "Thu, 15 Dec 2016 11:34:03 +0100")
Message-ID: <87mvfx4g4y.fsf@vitty.brq.redhat.com>
References: <20161207085110.GC1618@aepfle.de> <87r3594hef.fsf@vitty.brq.redhat.com> <20161215103402.GA6336@aepfle.de>

Olaf Hering writes:

> On Thu, Dec 15, Vitaly Kuznetsov wrote:
>
>> I see a number of minor but at least one major issue against such move:
>> At least for some Hyper-V versions (2012R2 for example)
>> CHANNELMSG_UNLOAD_RESPONSE is delivered to the CPU which initially sent
>> CHANNELMSG_REQUESTOFFERS and on kdump we may not have this CPU up as
>> we usually do kdump with nr_cpus=1 (and on the CPU which crashed).
>
> Since the kdump or kexec kernel will send the unload during boot I would
> expect the response to arrive where it was sent, independent from the
> number of cpus.

We actually need to read the reply and empty the message slot to make
unload happen.
And reading on a different CPU may not work, see:
http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2016-December/097330.html

>> Minor issue is the necessity to preserve the information about
>> message/events pages across kexec.
>
> I guess this info is stored somewhere, and the relevant gfns can be
> preserved across kernels, if we try really hard.
>
> But after looking further at the involved code paths it seems that the
> implemented polling might be good enough to snatch the response. Was the
> mdelay(10) just an arbitrary decision?

I observed delays of up to several seconds (!) before
CHANNELMSG_UNLOAD_RESPONSE is delivered.

> I interpret the comments in vmbus_signal_eom such that the host may
> overwrite the response. Perhaps such a thing may happen during the mdelay?

No, (at least in theory) the host is never supposed to overwrite
messages: it waits for the guest to clean the slot and do the wrmsr.

--
Vitaly
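For readers following along: the slot discipline being discussed (the host posts a message into a per-CPU slot, never overwrites it, and only delivers a queued follow-up after the guest clears the slot and signals end-of-message via the wrmsr) can be sketched as a small userspace simulation. This is a hypothetical model, not the actual drivers/hv code; the names `msg_slot`, `host_post` and `guest_read_and_eom` are made up for illustration, and the EOM wrmsr is modeled as clearing a `pending` flag.

```c
#include <stdbool.h>
#include <stdio.h>

#define HV_MESSAGE_NONE 0

/* Hypothetical model of one SynIC message-page slot. */
struct msg_slot {
	int msgtype;	/* HV_MESSAGE_NONE means the slot is free */
	bool pending;	/* host has a follow-up message queued */
};

/* Host side: deliver a message only if the guest has freed the slot;
 * otherwise mark it pending -- the slot is never overwritten. */
bool host_post(struct msg_slot *slot, int msgtype)
{
	if (slot->msgtype != HV_MESSAGE_NONE) {
		slot->pending = true;
		return false;
	}
	slot->msgtype = msgtype;
	return true;
}

/* Guest side: consume the message, clear the slot, then signal EOM.
 * In a real guest the EOM is a wrmsr to HV_X64_MSR_EOM, which lets
 * the host deliver any pending message; here it just drops the flag. */
int guest_read_and_eom(struct msg_slot *slot)
{
	int msgtype = slot->msgtype;

	slot->msgtype = HV_MESSAGE_NONE;
	if (slot->pending)
		slot->pending = false;	/* host may now post again */
	return msgtype;
}
```

The point of the model is the one under discussion above: a second `host_post()` while the slot is occupied queues the message instead of clobbering it, so nothing is lost during the mdelay as long as the guest eventually reads the slot and signals EOM.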