Subject: Re: [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin
From: zhukeqian
To: Alex Williamson
Cc: Cornelia Huck, Marc Zyngier, Will Deacon, Robin Murphy,
 Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
 Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
 Andrew Morton, Alexios Zavras
Date: Fri, 11 Dec 2020 14:51:47 +0800
References: <20201210073425.25960-1-zhukeqian1@huawei.com>
 <20201210073425.25960-2-zhukeqian1@huawei.com>
 <20201210121646.24fb3cd8@omen.home>
In-Reply-To: <20201210121646.24fb3cd8@omen.home>
List-ID: linux-kernel@vger.kernel.org
On 2020/12/11 3:16, Alex Williamson wrote:
> On Thu, 10 Dec 2020 15:34:19 +0800
> Keqian Zhu wrote:
>
>> Currently we do not clear the dirty bits added to the bitmap when we
>> unwind a pin, so if pinning fails partway through, unnecessary dirty
>> bits are left set in the bitmap. If we clear the added dirty bits on
>> unwind, userspace sees fewer dirty pages and saves considerable time
>> handling them.
>>
>> Note that we must distinguish the bits added by this pin operation
>> from the bits already set before it, so introduce bitmap_added to
>> record this.
>>
>> Signed-off-by: Keqian Zhu
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 33 ++++++++++++++++++++++-----------
>>  1 file changed, 22 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 67e827638995..f129d24a6ec3 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -637,7 +637,11 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  	struct vfio_iommu *iommu = iommu_data;
>>  	struct vfio_group *group;
>>  	int i, j, ret;
>> +	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
>>  	unsigned long remote_vaddr;
>> +	unsigned long bitmap_offset;
>> +	unsigned long *bitmap_added;
>> +	dma_addr_t iova;
>>  	struct vfio_dma *dma;
>>  	bool do_accounting;
>>
>> @@ -650,6 +654,12 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>
>>  	mutex_lock(&iommu->lock);
>>
>> +	bitmap_added = bitmap_zalloc(npage, GFP_KERNEL);
>> +	if (!bitmap_added) {
>> +		ret = -ENOMEM;
>> +		goto pin_done;
>> +	}
>
> This is allocated regardless of whether dirty tracking is enabled, so
> this adds overhead to the common case in order to reduce user overhead
> (not correctness) in the rare condition that dirty tracking is enabled,
> and the even rarer condition that there's a fault during that case.
> This is not a good trade-off.  Thanks,

Hi Alex,

We can allocate the bitmap only when dirty tracking is active; would
that be acceptable? A minimal sketch is below. Alternatively, we could
set the dirty bitmap only after all pins succeed (see the second sketch
after the quoted hunks), but that costs CPU time to locate each
vfio_dma by its iova.
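Untested sketch of the first option, showing only the dirty-tracking
pieces of vfio_iommu_type1_pin_pages(); the rest of the function body
stays as in the patch above:

	unsigned long *bitmap_added = NULL;

	if (iommu->dirty_page_tracking) {
		bitmap_added = bitmap_zalloc(npage, GFP_KERNEL);
		if (!bitmap_added) {
			ret = -ENOMEM;
			goto pin_done;
		}
	}

	/* pin loop: bitmap_added is non-NULL iff tracking is on */
	if (iommu->dirty_page_tracking) {
		bitmap_offset = (iova - dma->iova) >> pgshift;
		if (!test_and_set_bit(bitmap_offset, dma->bitmap))
			set_bit(i, bitmap_added);
	}

	/* pin_unwind loop: the NULL check is the only new part */
	if (bitmap_added && test_bit(j, bitmap_added)) {
		bitmap_offset = (iova - dma->iova) >> pgshift;
		clear_bit(bitmap_offset, dma->bitmap);
	}

pin_done:
	bitmap_free(bitmap_added);	/* kfree-backed, so NULL is fine */
	mutex_unlock(&iommu->lock);
	return ret;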
Thanks,
Keqian

>
> Alex
>
>> +
>>  	/* Fail if notifier list is empty */
>>  	if (!iommu->notifier.head) {
>>  		ret = -EINVAL;
>> @@ -664,7 +674,6 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
>>
>>  	for (i = 0; i < npage; i++) {
>> -		dma_addr_t iova;
>>  		struct vfio_pfn *vpfn;
>>
>>  		iova = user_pfn[i] << PAGE_SHIFT;
>> @@ -699,14 +708,10 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  		}
>>
>>  		if (iommu->dirty_page_tracking) {
>> -			unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
>> -
>> -			/*
>> -			 * Bitmap populated with the smallest supported page
>> -			 * size
>> -			 */
>> -			bitmap_set(dma->bitmap,
>> -				   (iova - dma->iova) >> pgshift, 1);
>> +			/* Populated with the smallest supported page size */
>> +			bitmap_offset = (iova - dma->iova) >> pgshift;
>> +			if (!test_and_set_bit(bitmap_offset, dma->bitmap))
>> +				set_bit(i, bitmap_added);
>>  		}
>>  	}
>>  	ret = i;
>> @@ -722,14 +727,20 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  pin_unwind:
>>  	phys_pfn[i] = 0;
>>  	for (j = 0; j < i; j++) {
>> -		dma_addr_t iova;
>> -
>>  		iova = user_pfn[j] << PAGE_SHIFT;
>>  		dma = vfio_find_dma(iommu, iova, PAGE_SIZE);
>>  		vfio_unpin_page_external(dma, iova, do_accounting);
>>  		phys_pfn[j] = 0;
>> +
>> +		if (test_bit(j, bitmap_added)) {
>> +			bitmap_offset = (iova - dma->iova) >> pgshift;
>> +			clear_bit(bitmap_offset, dma->bitmap);
>> +		}
>>  	}
>>  pin_done:
>> +	if (bitmap_added)
>> +		bitmap_free(bitmap_added);
>> +
>>  	mutex_unlock(&iommu->lock);
>>  	return ret;
>>  }
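PS: for completeness, an untested sketch of the second alternative:
leave dma->bitmap untouched during pinning and mark the pages dirty in
one pass only after every pin has succeeded. No dirty bits ever need
unwinding then, but each iova pays a second vfio_find_dma() lookup,
which is the CPU cost mentioned above:

	if (iommu->dirty_page_tracking) {
		for (i = 0; i < npage; i++) {
			iova = user_pfn[i] << PAGE_SHIFT;
			dma = vfio_find_dma(iommu, iova, PAGE_SIZE);
			/* Populated with the smallest supported page size */
			bitmap_set(dma->bitmap,
				   (iova - dma->iova) >> pgshift, 1);
		}
	}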