Received: by 2002:a05:6a10:9e8c:0:0:0:0 with SMTP id y12csp1147600pxx; Fri, 30 Oct 2020 03:22:59 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxKJITTe0ENwZdVo0yiLMdOyl6Jz5j01hwv0rABDlw1uA1Wg8ZtPHAMSTWOzSJHIutWrSX5 X-Received: by 2002:a17:906:4b0f:: with SMTP id y15mr1620387eju.198.1604053379570; Fri, 30 Oct 2020 03:22:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1604053379; cv=none; d=google.com; s=arc-20160816; b=HZHiS9maAIHfLNbg5RQ0k21VSD4u+YXh/quLdaxS+4yk1LC8liktSqM8F6EwdjEeJn qI/ctRgmd2fk5khIX6L5TI+kXslxV8SrzywW4JyZBEfCic/o1RMwKysLzBVLLMqfV7P5 D7CR4r0BPti0xrNC7BA7byMlx/MnvCeyrp7yG6PyU607UuxWq03iZ6wj71jfhlHcVnM2 EquzTibY46ad78+yZYOWtHzV0Whug3QJbqellfwb0FipiGd5BluqJmPkSdlYF5W9WDL2 A/RoQY1fJX/epT4NiMJQ2kUiJr8riPZnY/f/XSt8VrdENn+br+1PI5R2bRDezPhnHt9Y Ejjw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=9QnPAe/9TEZoI6Lq3d/QkgDd9m9jvfSBzhcmPvWbSw4=; b=uaUiIS29PovkuBNbo4LbJ9c89/3EVfIzVIXzPTL67DZ0IyLGx/Gu/kK0hQcC2rWY7B XLxWXUP727qM7A2nq3zXlscShqxIULSLwr0EVZVH9LzEvDlAJvTDb+LhsOX/Gg9d2WOA uTDLEvoPBaaATCLHgpQfMWY+rZPt7q3BFjrjIE3yYvoXzLqEk47uwAX21LDEMeUxA+I/ GzUEZlzgi0IGAnfiBJOlSD6pvj+KiKRYAyRGR7K2BR8DaCwaqYW7Utdzp6Tb8NHOJr6z 5PI023ystqfVGS70AKpFYBYPAuWzKapjypVfpmfC69lZS7eY2uIKQFwNzLlfnBy/LWJi 2z9A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=xNIZ8cVB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id di8si1297764edb.533.2020.10.30.03.22.36; Fri, 30 Oct 2020 03:22:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=xNIZ8cVB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726343AbgJ3KS5 (ORCPT + 99 others); Fri, 30 Oct 2020 06:18:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52086 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726199AbgJ3KS4 (ORCPT ); Fri, 30 Oct 2020 06:18:56 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6548AC0613D9 for ; Fri, 30 Oct 2020 03:18:54 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id k18so2418008wmj.5 for ; Fri, 30 Oct 2020 03:18:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=9QnPAe/9TEZoI6Lq3d/QkgDd9m9jvfSBzhcmPvWbSw4=; b=xNIZ8cVBFbQ6bdjtq0Hmby0tUGqrogQ55WzB1NVugGBAxs6/uT7n7bTbHZG26skP/6 vlBkhqQR3eH1Jh1mHIOKSojs8X8EfyS4S5Ck3WuvfBx1L89a7KWXB1WgMOpYcxRRC805 I8j5s11o2aY3ulzx31MvrsaxIp5/9ov5SiZtTelhDB/pbex1AhTA5GLWYI5Z0McJChPO 6MSvuxHNMhRWEGuowKnaPUgqPITviOE1D8Qtsg/3XDpv4nDGSkZ1i1Rw2Eg5Ycwrc1IC nZKvLRX+YDV7cPfrsoulWOqh6TJHwmx7YwkkL5kkvarpIlary1jzpnkRuyFGG4A0wPnn 7PRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=9QnPAe/9TEZoI6Lq3d/QkgDd9m9jvfSBzhcmPvWbSw4=; b=o1DfXZJ3XePdxZpYJCgbAR9IKH8gGN6lJ8rwBNqKEqr05DUDZZklZFvvW6mMm4G+iZ 9zHpW2kXHeL+wB51rW1gQB38yo21qiRF5PS1SD7igQGdPm+H9iMjAGtetXTkfECUA1Un 5UUCJfmcQlb1M/LPjL5cblHElWjfqgrFgsuBlcbdFZBNd1P5WIJY485o4xEyzkSaZ1f6 ssBt/WmoOebpkeUq9W4hpoMCrMfqBJKsFCvDnUrqLgxLC5beXGuyiXzAYJMbzUECnGqD dJa/n379nkRwriA/UWEf0CHQo0QvYq6lyovJF3gIzEPRBMqU15Srqe1fR8F2i7IhCROj IiNg== X-Gm-Message-State: AOAM531MjsmBY58peH2Jy7grPtO13YGurgF5wjhklmMG7rcep0loaWao 8TwjU4nLy95R88eXRjSdR6kq5g== X-Received: by 2002:a1c:1d51:: with SMTP id d78mr1755124wmd.60.1604053128250; Fri, 30 Oct 2020 03:18:48 -0700 (PDT) Received: from myrica ([2001:1715:4e26:a7e0:116c:c27a:3e7f:5eaf]) by smtp.gmail.com with ESMTPSA id r28sm10482662wrr.81.2020.10.30.03.18.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 30 Oct 2020 03:18:47 -0700 (PDT) Date: Fri, 30 Oct 2020 11:18:27 +0100 From: Jean-Philippe Brucker To: Jacob Pan Cc: Jacob Pan , iommu@lists.linux-foundation.org, LKML , Joerg Roedel , Alex Williamson , Lu Baolu , David Woodhouse , Jonathan Corbet , linux-api@vger.kernel.org, Jean-Philippe Brucker , Eric Auger , Yi Liu , "Tian, Kevin" , Raj Ashok , Wu Hao , Yi Sun , Dave Jiang , Randy Dunlap , linux-doc@vger.kernel.org Subject: Re: [PATCH v3 01/14] docs: Document IO Address Space ID (IOASID) APIs Message-ID: <20201030101827.GB122147@myrica> References: <1601329121-36979-1-git-send-email-jacob.jun.pan@linux.intel.com> <1601329121-36979-2-git-send-email-jacob.jun.pan@linux.intel.com> <20201020135809.GA1515830@myrica> <20201026140506.1349dbb5@jacob-builder> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20201026140506.1349dbb5@jacob-builder> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Oct 26, 2020 at 02:05:06PM -0700, Jacob Pan wrote: > > This looks good to me, with small comments below. > > > Can I add your Reviewed-by tag after addressing the comments? Yes sure, this took forever to review so I'm happy not to do another pass :) > > > +Each IOASID set is created with a token, which can be one of the > > > +following token types: > > > + > > > + - IOASID_SET_TYPE_NULL (Arbitrary u64 value) > > > > Maybe NULL isn't the best name then. NONE? > > > Agreed, 'NONE' makes more sense. Although patch 5 only allows a NULL token for this type. So the name seems fine, you could just fix this description. > > > +IOASID core has the notion of "custom allocator" such that guest can > > > +register virtual command allocator that precedes the default one. > > > > "Supersedes", rather than "precedes"? > > > My understanding is that 'supersede' means replace something but 'precede' > means get in front of something. I do want to emphasis that the custom > allocator takes precedence over the default allocator. Right it's ambiguous. The custom allocator does entirely replace the allocation action, but the default one is still used for storage. Anyway, you can leave this. > > > +Let's examine the IOASID life cycle again when free happens *before* > > > +unbind. This could be a result of misbehaving guests or crash. Assuming > > > +VFIO cannot enforce unbind->free order. Notice that the setup part up > > > +until step #12 is identical to the normal case, the flow below starts > > > +with step 13. > > > + > > > +:: > > > + > > > + VFIO IOMMU KVM VDCM IOASID Ref > > > + .................................................................. > > > + 13 -------- GUEST STARTS DMA -------------------------- > > > + 14 -------- *GUEST MISBEHAVES!!!* ---------------- > > > + 15 ioasid_free() > > > + 16 ioasid_notify(FREE) > > > + 17 mark_free_pending > > > (1) > > > > Could we use superscript ¹²³⁴ for footnotes? These look like function > > parameters > > > yes, much better > > > > + 18 kvm_nb_handler(FREE) > > > + 19 vmcs_update_atomic() > > > + 20 ioasid_put_locked() -> 3 > > > + 21 vdcm_nb_handler(FREE) > > > + 22 iomm_nb_handler(FREE) > > > > iommu_nb_handler > > > got it > > > > + 23 ioasid_free() returns(2) schedule_work() 2 > > > > I completely lost track here, couldn't figure out in which direction to > > read the diagram. What work is scheduled? > The time line goes downward but we only control the notification order in > terms of when the events are received. Some completions are async thus out > of order done by work items. The only in-order completion is the KVM update > of its PASID translation table. > > After #23, the async works are scheduled to complete clean up work outside > the spinlock(held by the caller of the atomic notifier). > > Any suggestions to improve the readability of the time line? Maybe explain what happens from line 23: ioasid_free() schedules... a FREE notification? Which happens on line 24 (corresponding to the second schedule_work()?) and is handled by (a) VDCM to clear the device context and (b) IOMMU to clear the PASID context, both ending up dropping their ref. > > > Why does the IOMMU driver drop > > its reference to the IOASID before unbdind_gpasid()? > > > This is the exception case where userspace issues IOASID free before > unbind_gpasid(). The equivalent of unbind is performed in the IOASID_FREE > notification handler. In IOASID_FREE handler, reference is dropped and > private data deleted. After that, if unbind comes to IOMMU driver, it will > not find IOASID private data therefore just return. Right ok. As you noted below the damage is caused by and limited to the guest, so I think it's fine. > > > > + 24 schedule_work() vdev_clear_wk(hpasid) > > > + 25 teardown_pasid_wk() > > > + 26 ioasid_put() -> 1 > > > + 27 ioasid_put() 0 > > > + 28 Reclaimed > > > + 29 unbind_gpasid() > > > + 30 iommu_unbind()->ioasid_find() Fails(3) > > > + -------------- New Life Cycle Begin ---------------------------- > > > + > > > +Note: > > > + > > > +1. By marking IOASID FREE_PENDING at step #17, no new references can be > > > + held. ioasid_get/find() will return -ENOENT; > > > > s/held/taken > > > Got it. > > > Thanks, > > Jean > > > > > +2. After step #23, all events can go out of order. Shall not affect > > > + the outcome. > > > +3. IOMMU driver fails to find private data for unbinding. If unbind is > > > + called after the same IOASID is allocated for the same guest again, > > > + this is a programming error. The damage is limited to the guest > > > + itself since unbind performs permission checking based on the > > > + IOASID set associated with the guest process. "guest process" can be confusing (process run by the guest?), just "guest" might be better. Thanks, Jean