Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp1630564pxb; Thu, 4 Mar 2021 16:58:33 -0800 (PST) X-Google-Smtp-Source: ABdhPJxd0P26VOHbpevn3pMnwneprSvoBDMAHjy/X0fwDbudZ5BrvxmH1dJGe1W8Z+DE2zUIl+ME X-Received: by 2002:a6b:d217:: with SMTP id q23mr5966773iob.173.1614905912981; Thu, 04 Mar 2021 16:58:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1614905912; cv=none; d=google.com; s=arc-20160816; b=RhGpq+aGqhwXnZmkLGH9DD0jUlK/Qf0Yrd5ZAbEDZGYxSFqqNIbae+aYieIqwgdcP3 sJBK/DRNHOoJ9/fn34MpbmFFsar5p4YVwIz3RydtgUf7tytoggmBXM3EggIrsvttdX4J m4HeWXeABLCk+ieMi9BvP4JFB5ClKrhKCyM3QEow6iIOJCGpklXiBEJRS3h7FJaluxNa 4WTwslwOKHdK2NvaE8GCnbmKqjSTcz+TpG+3NAhxsjLnZWPPRsdKMeueXR7J7q2CEiSz UAc7Ouv52z3Y+BBZKIoUGSmIfmskHja9mOX+rp5DuAcrWbmOUS5GdpGAE4yHVBWWFRqB +JLw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :organization:references:in-reply-to:message-id:subject:cc:to:from :date:ironport-sdr:ironport-sdr; bh=CE00TWXhfzsNCBz4L1W4Hluk8p/RchztTfDi1VFBgx8=; b=Lt/cCNRrm2ZYjZ1UPKMpDBjwzrjAXrUrgJE3lff5Q4L6e9lDRIeU1VF96fqljX/WfI qoHGUmFSdKfogrE17XJjjso8KjZycO6EkfWmySpVypHinIzzVEb+LRPq4kXTh0Nz0mdP hzI9G5uUiA7kND8YrR28cq3FDCCWAc09X3ZBnGoOWB/oA9/vxDQqlkv85hnsXDGwi0Ji Br7hFWOZ6ME+rR0ee6lz0It4ZIhOErTVvnTe8TyOSM4iRARqs9DCZktXBPRPTU0UTS9H cm6/h2w3qnfRqxWDaEuPdOsMj451QfzuK8PPTy+PFt00AcYWGonekMwHEhLMElXVfoM7 7bYQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id q4si1017041jat.34.2021.03.04.16.58.20; Thu, 04 Mar 2021 16:58:32 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238350AbhCDV1m (ORCPT + 99 others); Thu, 4 Mar 2021 16:27:42 -0500 Received: from mga03.intel.com ([134.134.136.65]:46354 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237839AbhCDV1k (ORCPT ); Thu, 4 Mar 2021 16:27:40 -0500 IronPort-SDR: ay2YA9k2DmLcJTf5v8ZyRlwRmAVn8f+u0uD6c0gEJHOVbXvgXksVF2votuNLeWRRmPL+mbiu20 i8fB+ayvg3jg== X-IronPort-AV: E=McAfee;i="6000,8403,9913"; a="187565543" X-IronPort-AV: E=Sophos;i="5.81,223,1610438400"; d="scan'208";a="187565543" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Mar 2021 13:25:53 -0800 IronPort-SDR: MQ7Ih7H0p2xK1hWg7n4AeoanVjOPZRwTjBXJnjrogCBsf4jtzESf6/D/mAX8KKNBeg8DF+etxU JJCkaD1zIg3A== X-IronPort-AV: E=Sophos;i="5.81,223,1610438400"; d="scan'208";a="384637280" Received: from jacob-builder.jf.intel.com (HELO jacob-builder) ([10.7.199.155]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Mar 2021 13:25:53 -0800 Date: Thu, 4 Mar 2021 13:28:09 -0800 From: Jacob Pan To: Jason Gunthorpe Cc: Jean-Philippe Brucker , Tejun Heo , LKML , Joerg Roedel , Lu Baolu , David Woodhouse , , , Johannes Weiner , Jean-Philippe Brucker , Alex Williamson , Eric Auger , "Jonathan Corbet" , Raj Ashok , "Tian, Kevin" , Yi Liu , Wu Hao , Dave Jiang , jacob.jun.pan@linux.intel.com Subject: Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller Message-ID: <20210304132809.75b3fa55@jacob-builder> In-Reply-To: <20210304190253.GL4247@nvidia.com> References: <1614463286-97618-1-git-send-email-jacob.jun.pan@linux.intel.com> <1614463286-97618-16-git-send-email-jacob.jun.pan@linux.intel.com> <20210303131726.7a8cb169@jacob-builder> <20210303160205.151d114e@jacob-builder> <20210304094603.4ab6c1c4@jacob-builder> <20210304175402.GG4247@nvidia.com> <20210304110144.39ef0941@jacob-builder> <20210304190253.GL4247@nvidia.com> Organization: OTC X-Mailer: Claws Mail 3.17.5 (GTK+ 2.24.32; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Jason, On Thu, 4 Mar 2021 15:02:53 -0400, Jason Gunthorpe wrote: > On Thu, Mar 04, 2021 at 11:01:44AM -0800, Jacob Pan wrote: > > > > For something like qemu I'd expect to put the qemu process in a cgroup > > > with 1 PASID. Who cares what qemu uses the PASID for, or how it was > > > allocated? > > > > For vSVA, we will need one PASID per guest process. But that is up to > > the admin based on whether or how many SVA capable devices are directly > > assigned. > > I hope the virtual IOMMU driver can communicate the PASID limit and > the cgroup machinery in the guest can know what the actual limit is. > For VT-d, emulated vIOMMU can communicate with the guest IOMMU driver on how many PASID bits are supported (extended cap reg PASID size fields). But it cannot communicate how many PASIDs are in the pool(host cgroup capacity). The QEMU process may not be the only one in a cgroup so it cannot give hard guarantees. I don't see a good way to communicate accurately at runtime as the process migrates or limit changes. We were thinking to adopt the "Limits" model as defined in the cgroup-v2 doc. " Limits ------ A child can only consume upto the configured amount of the resource. Limits can be over-committed - the sum of the limits of children can exceed the amount of resource available to the parent. " So the guest cgroup would still think it has full 20 bits of PASID at its disposal. But PASID allocation may fail before reaching the full 20 bits (2M). Similar on the host side, we only enforce the limit set by the cgroup but not guarantee it. > I was thinking of a case where qemu is using a single PASID to setup > the guest kVA or similar > got it. > Jason Thanks, Jacob