Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp1611216imu; Wed, 12 Dec 2018 00:52:24 -0800 (PST) X-Google-Smtp-Source: AFSGD/Xzr5Uo6IyDR47NBnuJ3zNj/Jn9JTy0amIeGnhx/4SU9fdRasvyDiuyRVk/fy/b63hODlUB X-Received: by 2002:a17:902:d891:: with SMTP id b17mr19439523plz.80.1544604743923; Wed, 12 Dec 2018 00:52:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1544604743; cv=none; d=google.com; s=arc-20160816; b=HvfQRnCfU9/9lKTdcF1iCiFbWox8YvqOZrerSISwnYQw64Df++1AKH2AOQTw1Mx3f2 Sv5Ybzm2YDN6zfcOgDtOzRoE+zGawcFieE/9e4XUL5rnmuvqydelpknxsqA7gyJ3B2/u EW7T07+t4TcuxT9v1thPxEXgbZW6YkNbt0rJfBB4oUZGY4bwFo2N/K2JkoUx/WhgQWvd 2MpBX5jmM26bPu3sNk7C8m0HhxUo5DuU8vorynsppk8t5Wl12oN84NghLf5p4HJsdiGp CSAfODobXZn1LG5y8MAiT6IsOqewcgurt5/Jc13z4MYVGfEIRvV1uhnm3nvBhCjJ9tZ+ EnHg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-disposition :content-transfer-encoding:mime-version:in-reply-to:references :subject:cc:to:from:date:message-id; bh=Zb8sQBYptDUQ/cwFUyijzj/NqtP/8LKj4GM1Do1BnWs=; b=k2p/wsiQ15l+9d6Is+hfvDgCGgXaOMl5b2+x4sRjV94umDZ2Bg4+n5rE6ttJ53JUQJ AzjVaYPeM3Y461KQl+jGBd2guZqYyKl62OANzNIi8+Mm2Hr48/T8F3yHe0X8GgR2w6U0 sNAIv8FOx65neOLYFcXUqQzyY0mdQT4PT0+YwFj8iVx13DfN4y6Nj6prQz8KegkVJO74 dPEtNIMxFpt5FAXcWbIeqGDINk9Z7Ryy+0f0GjpPFghECD1lCukiMvCHkaPpfU41Lbc1 RdHuL0BNWo4YDX1SSWJwtuLkvun0IGCDxrYVnQaA3dRgEcuDz16q2NfyeTwMZEqeBuT2 jVvA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id go15si14814446plb.219.2018.12.12.00.52.08; Wed, 12 Dec 2018 00:52:23 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726746AbeLLIvG (ORCPT + 99 others); Wed, 12 Dec 2018 03:51:06 -0500 Received: from prv1-mh.provo.novell.com ([137.65.248.33]:47188 "EHLO prv1-mh.provo.novell.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726242AbeLLIvF (ORCPT ); Wed, 12 Dec 2018 03:51:05 -0500 Received: from INET-PRV1-MTA by prv1-mh.provo.novell.com with Novell_GroupWise; Wed, 12 Dec 2018 01:51:04 -0700 Message-Id: <5C10CBF50200007800205596@prv1-mh.provo.novell.com> X-Mailer: Novell GroupWise Internet Agent 18.1.0 Date: Wed, 12 Dec 2018 01:51:01 -0700 From: "Jan Beulich" To: "Chao Gao" Cc: "Roger Pau Monne" , "Jia-Ju Bai" , "Stefano Stabellini" , "xen-devel" , "Boris Ostrovsky" , "Juergen Gross" , Subject: Re: [PATCH] xen: xen-pciback: Reset MSI-X state when exposing a device References: <1543976357-1053-1-git-send-email-chao.gao@intel.com> <20181205093223.dncg4nq4dh6xmrhk@mac> <20181212070654.GA13411@gao-cwp> In-Reply-To: <20181212070654.GA13411@gao-cwp> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org >>> On 12.12.18 at 08:06, wrote: > On Wed, Dec 05, 2018 at 09:01:33AM -0500, Boris Ostrovsky wrote: >>On 12/5/18 4:32 AM, Roger Pau Monné wrote: >>> On Wed, Dec 05, 2018 at 10:19:17AM +0800, Chao Gao wrote: >>>> I find some pass-thru devices don't work any more across guest reboot. >>>> Assigning it to another guest also meets the same issue. And the only >>>> way to make it work again is un-binding and binding it to pciback. >>>> Someone reported this issue one year ago [1]. More detail also can be >>>> found in [2]. >>>> >>>> The root-cause is Xen's internal MSI-X state isn't reset properly >>>> during reboot or re-assignment. In the above case, Xen set maskall bit >>>> to mask all MSI interrupts after it detected a potential security >>>> issue. Even after device reset, Xen didn't reset its internal maskall >>>> bit. As a result, maskall bit would be set again in next write to >>>> MSI-X message control register. >>>> >>>> Given that PHYSDEVOPS_prepare_msix() also triggers Xen resetting MSI-X >>>> internal state of a device, we employ it to fix this issue rather than >>>> introducing another dedicated sub-hypercall. >>>> >>>> Note that PHYSDEVOPS_release_msix() will fail if the mapping between >>>> the device's msix and pirq has been created. This limitation prevents >>>> us calling this function when detaching a device from a guest during >>>> guest shutdown. Thus it is called right before calling >>>> PHYSDEVOPS_prepare_msix(). >>> s/PHYSDEVOPS/PHYSDEVOP/ (no final S). And then I would also drop the >>> () at the end of the hypercall name since it's not a function. >>> >>> I'm also wondering why the release can't be done when the device is >>> detached from the guest (or the guest has been shut down). This makes >>> me worry about the raciness of the attach/detach procedure: if there's >>> a state where pciback assumes the device has been detached from the >>> guest, but there are still pirqs bound, an attempt to attach to >>> another guest in such state will fail. >> >>I wonder whether this additional reset functionality could be done out >>of xen_pcibk_xenbus_remove(). We first do a (best effort) device reset >>and then do the extra things that are not properly done there. > > No. It cannot be done in xen_pcibk_xenbus_remove() without modifying > the handler of PHYSDEVOP_release_msix. To do a successful Xen internal > MSI-X state reset, PHYSDEVOP_{release, prepare}_msix should be finished > without error. But ATM, xen expects that no msi is bound to pirq when > doing PHYSDEVOP_release_msix. Otherwise it fails with error code -EBUSY. > However, the expectation isn't guaranteed in xen_pcibk_xenbus_remove(). > In some cases, if qemu fails to unmap MSIs, MSIs are unmapped by Xen > at last minute, which happens after device reset in > xen_pcibk_xenbus_remove(). But that may need taking care of: I don't think it is a good idea to have anything left from the prior owning domain when the device gets reset. I.e. left over IRQ bindings should perhaps be forcibly cleared before invoking the reset; in fact I'd expect this to happen in the course of domain destruction, and I'd expect the device reset to come after the domain was cleaned up. Perhaps simply an ordering issue in the tool stack? Jan