Subject: Re: [RFC PATCH 10/11] vfio/iommu_type1: Optimize dirty bitmap population based on iommu HWDBM
To: Yi Sun
Cc: Will Deacon, Alex Williamson, Marc Zyngier, Catalin Marinas,
 Kirti Wankhede, Cornelia Huck, Mark Rutland, James Morse,
 Robin Murphy, Suzuki K Poulose
From: Keqian Zhu
Message-ID: <811dac11-a530-3218-9819-cea628ccefbc@huawei.com>
Date: Thu, 18 Feb 2021 09:17:06 +0800
In-Reply-To: <20210209115744.GB28580@yi.y.sun>
References: <20210128151742.18840-1-zhukeqian1@huawei.com>
 <20210128151742.18840-11-zhukeqian1@huawei.com>
 <20210207095630.GA28580@yi.y.sun>
 <407d28db-1f86-8d4f-ab15-3c3ac56bbe7f@huawei.com>
 <20210209115744.GB28580@yi.y.sun>
List-ID: linux-kernel@vger.kernel.org
Hi Yi,

On 2021/2/9 19:57, Yi Sun wrote:
> On 21-02-07 18:40:36, Keqian Zhu wrote:
>> Hi Yi,
>>
>> On 2021/2/7 17:56, Yi Sun wrote:
>>> Hi,
>>>
>>> On 21-01-28 23:17:41, Keqian Zhu wrote:
>>>
>>> [...]
>>>
>>>> +static void vfio_dma_dirty_log_start(struct vfio_iommu *iommu,
>>>> +				     struct vfio_dma *dma)
>>>> +{
>>>> +	struct vfio_domain *d;
>>>> +
>>>> +	list_for_each_entry(d, &iommu->domain_list, next) {
>>>> +		/* Go through all domains anyway, even if we fail */
>>>> +		iommu_split_block(d->domain, dma->iova, dma->size);
>>>> +	}
>>>> +}
>>>
>>> This should be a switch to prepare for dirty log start. Per the Intel
>>> VT-d spec, there is an SLADE bit defined in the Scalable-Mode PASID
>>> Table Entry. It enables the Accessed/Dirty flags in second-level
>>> paging entries. So a generic iommu interface here is better: for the
>>> Intel iommu it enables SLADE, and for ARM it splits blocks.
>>
>> Indeed, a generic interface name is better.
>>
>> The vendor iommu driver performs its own vendor-specific actions to
>> start dirty logging, and the Intel iommu and the ARM smmu may differ.
>> Besides, we may add more actions to the ARM smmu driver in the future.
>>
>> One question: though I am not familiar with the Intel iommu, I think
>> it should also split block mappings besides enabling SLADE. Right?
>>
> I am not familiar with the ARM smmu. :) So I want to clarify whether
> the block in smmu is a big page, e.g. a 2M page?

Yes, for ARM, the "block" is a big page :).

> Intel VT-d manages the memory per page, i.e. 4KB/2MB/1GB. There are
> two ways to manage dirty pages:
> 1. Keep the default granularity. Just set SLADE to enable dirty
>    tracking.
> 2. Split big pages into 4KB pages to get finer granularity.

From your description, I see that VT-d's SLADE behaves like smmu HTTU:
they are both based on the page table.

Right, we should give more freedom to the iommu vendor drivers, so a
generic interface is better.
1) As you said, set SLADE when enabling dirty logging.
2) IOMMUs of other architectures may have completely different dirty
   tracking mechanisms.

> But my question about the second solution is whether it can benefit
> user space, e.g. live migration. If my understanding of the smmu block
> (i.e. the big page) is correct, have you collected some performance
> data to prove that the split can improve performance? Thanks!

The purpose of splitting block mappings is to reduce the amount of
dirty bytes reported, which depends on the actual DMA transactions. To
take an extreme example: if DMA writes one byte, under a 1G mapping the
dirty amount reported to userspace is 1G, but under a 4K mapping it is
just 4K.

I will detail this in the commit message in v2.

Thanks,
Keqian