Date: Mon, 25 Apr 2022 21:36:19 -0700
From: Fenghua Yu
To: Zhangfei Gao
Cc: Jean-Philippe Brucker, Jacob Pan, Dave Hansen, Tony Luck, Ashok Raj, Ravi V Shankar, Peter Zijlstra, robin.murphy@arm.com, Dave Hansen, x86, linux-kernel, iommu, Ingo Molnar, Borislav Petkov, Andy Lutomirski, Josh Poimboeuf, Thomas Gleixner, will@kernel.org
Subject: Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

On Tue, Apr 26, 2022 at 12:28:00PM +0800, Zhangfei Gao wrote:
> Hi, Jean
>
> On 2022/4/26 12:13 AM, Jean-Philippe Brucker wrote:
> > Hi Jacob,
> >
> > On Mon, Apr 25, 2022 at 08:34:44AM -0700, Jacob Pan wrote:
> > > Hi Jean-Philippe,
> > >
> > > On Mon, 25 Apr 2022 15:26:40 +0100, Jean-Philippe Brucker
> > > wrote:
> > >
> > > > On Mon, Apr 25, 2022 at 07:18:36AM -0700, Dave Hansen wrote:
> > > > > On 4/25/22 06:53, Jean-Philippe Brucker wrote:
> > > > > > On Sat, Apr 23, 2022 at 07:13:39PM +0800, zhangfei.gao@foxmail.com
> > > > > > wrote:
> > > > > > > > > On 5.17
> > > > > > > > > fops_release is called automatically, as well as
> > > > > > > > > iommu_sva_unbind_device. On 5.18-rc1.
> > > > > > > > > fops_release is not called, have to manually call close(fd)
> > > > > > > > Right that's weird
> > > > > > > Looks it is caused by the fix patch, via mmget, which may add
> > > > > > > refcount of fd.
> > > > > > Yes indirectly I think: when the process mmaps the queue,
> > > > > > mmap_region() takes a reference to the uacce fd. That reference is
> > > > > > released either by explicit close() or munmap(), or by exit_mmap()
> > > > > > (which is triggered by mmput()). Since there is an mm->fd dependency,
> > > > > > we cannot add a fd->mm dependency, so no mmget()/mmput() in
> > > > > > bind()/unbind().
> > > > > >
> > > > > > I guess we should go back to refcounted PASIDs instead, to avoid
> > > > > > freeing them until unbind().
> > > > > Yeah, this is a bit gnarly for -rc4. Let's just make sure there's
> > > > > nothing else simple we can do.
> > > > >
> > > > > How does the IOMMU hardware know that all activity to a given PASID is
> > > > > finished? That activity should, today, be independent of an mm or a
> > > > > fd's lifetime.
> > > > In the case of uacce, it's tied to the fd lifetime: opening an accelerator
> > > > queue calls iommu_sva_bind_device(), which sets up the PASID context in
> > > > the IOMMU. Closing the queue calls iommu_sva_unbind_device() which
> > > > destroys the PASID context (after the device driver stopped all DMA for
> > > > this PASID).
> > > >
> > > For VT-d, it is essentially the same flow except managed by the individual
> > > drivers such as DSA.
> > > If free() happens before unbind(), we deactivate the PASIDs and suppress
> > > faults from the device. When the unbind finally comes, we finalize the
> > > PASID teardown. It seems we have a need for an intermediate state where
> > > PASID is "pending free"?
> > Yes we do have that state, though I'm not sure we need to make it explicit
> > in the ioasid allocator.
> >
> > Could we move mm_pasid_drop() to __mmdrop() instead of __mmput()? For Arm
> > we do need to hold the mm_count until unbind(), and mmgrab()/mmdrop() is
> > also part of Lu's rework [1].
>
> Move mm_pasid_drop to __mmdrop looks workable.
>
> The nginx works since ioasid is not freed when master exit until nginx stop.
>
> The ioasid does not free immediately when fops_release->unbind finished.
> Instead, __mmdrop happens a bit lazy, which has no issue though
> I passed 10000 times exit without unbind test, the pasid allocation is ok.
>
> Thanks
>
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 9796897560ab..60f417f69367 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -792,6 +792,8 @@ void __mmdrop(struct mm_struct *mm)
>         mmu_notifier_subscriptions_destroy(mm);
>         check_mm(mm);
>         put_user_ns(mm->user_ns);
> +       mm_pasid_drop(mm);
>         free_mm(mm);
>  }
>  EXPORT_SYMBOL_GPL(__mmdrop);
> @@ -1190,7 +1192,6 @@ static inline void __mmput(struct mm_struct *mm)
>         }
>         if (mm->binfmt)
>                 module_put(mm->binfmt->module);
> -       mm_pasid_drop(mm);
>         mmdrop(mm);
>  }

Thank you very much, Zhangfei!

I just sent out an identical patch. It works on x86 as well, so it seems
the patch is the right fix.

Either you can send out the patch, or I can add your Signed-off-by.
Either way is OK with me.

Thanks.

-Fenghua
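
For anyone following the thread, here is a rough userspace sketch of why moving mm_pasid_drop() from __mmput() to __mmdrop() helps. It is a toy model, not kernel code: struct toy_mm and every toy_* function are hypothetical names made up for illustration. The "users" field stands in for mm_users and the "count" field for mm_count. The model assumes bind() pins the mm with mmgrab() and unbind() releases it with mmdrop(), as Jean-Philippe describes for Arm.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names here are hypothetical. */
struct toy_mm {
	int users;        /* stands in for mm_users (mmget/mmput) */
	int count;        /* stands in for mm_count (mmgrab/mmdrop) */
	int pasid;        /* PASID assigned to this mm */
	bool pasid_live;  /* true while the PASID is still allocated */
	bool freed;       /* true once the structure itself is released */
};

static void toy_mm_init(struct toy_mm *mm, int pasid)
{
	mm->users = 1;
	mm->count = 1;    /* all address-space users share one structure ref */
	mm->pasid = pasid;
	mm->pasid_live = true;
	mm->freed = false;
}

static void toy_mmgrab(struct toy_mm *mm)
{
	mm->count++;
}

static void toy_mmdrop(struct toy_mm *mm)
{
	assert(mm->count > 0);
	if (--mm->count == 0) {
		/* With the patch, the PASID is dropped here, at final mmdrop. */
		mm->pasid_live = false;
		mm->freed = true;
	}
}

static void toy_mmput(struct toy_mm *mm)
{
	assert(mm->users > 0);
	if (--mm->users == 0) {
		/* Address space teardown; the PASID is no longer dropped here. */
		toy_mmdrop(mm);
	}
}

/* Assumption: bind pins only the structure, never the address space. */
static void toy_bind(struct toy_mm *mm)
{
	toy_mmgrab(mm);
}

static void toy_unbind(struct toy_mm *mm)
{
	toy_mmdrop(mm);
}
```

In this model, a process that exits before unbind (toy_mmput() before toy_unbind()) keeps its PASID allocated until the unbind drops the last structure reference. Because bind never takes a "users" reference, it also does not stand in the way of the exit_mmap() path that releases the mmap reference on the fd.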