Date: Tue, 12 Apr 2022 14:33:05 +0100
Subject: Re: [PATCH] dma-direct: avoid redundant memory sync for swiotlb
To: Chao Gao,
  linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org
Cc: m.szyprowski@samsung.com, hch@lst.de, Wang Zhaoyang1, Gao Liang,
  Kevin Tian
References: <20220412113805.3210-1-chao.gao@intel.com>
From: Robin Murphy
In-Reply-To: <20220412113805.3210-1-chao.gao@intel.com>

On 12/04/2022 12:38 pm, Chao Gao wrote:
> When we looked into FIO performance with swiotlb enabled in a VM, we found
> that swiotlb_bounce() is always called one more time than expected for each
> DMA read request.
>
> It turns out that the bounce buffer is copied to the original DMA buffer
> twice after the completion of a DMA request (once in
> dma_direct_sync_single_for_cpu(), and again in swiotlb_tbl_unmap_single()).
> But the contents of the bounce buffer do not actually change between the
> two rounds of copying, so one round is redundant.
>
> Pass the DMA_ATTR_SKIP_CPU_SYNC flag to swiotlb_tbl_unmap_single() to
> skip the memory copy in it.

It's still a little suboptimal and non-obvious to call into SWIOTLB twice
though - even better might be for SWIOTLB to call arch_sync_dma_for_cpu()
at the appropriate place internally, then put the dma_direct_sync in an
else path here. I'm really not sure why we have the current disparity
between map and unmap in this regard... :/

Robin.

> This fix increases FIO 64KB sequential read throughput in a guest with
> swiotlb=force by 5.6%.
>
> Reported-by: Wang Zhaoyang1
> Reported-by: Gao Liang
> Signed-off-by: Chao Gao
> Reviewed-by: Kevin Tian
> ---
>  kernel/dma/direct.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
> index 4632b0f4f72e..8a6cd53dbe8c 100644
> --- a/kernel/dma/direct.h
> +++ b/kernel/dma/direct.h
> @@ -114,6 +114,7 @@ static inline void dma_direct_unmap_page(struct device *dev, dma_addr_t addr,
>  		dma_direct_sync_single_for_cpu(dev, addr, size, dir);
>
>  	if (unlikely(is_swiotlb_buffer(dev, phys)))
> -		swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
> +		swiotlb_tbl_unmap_single(dev, phys, size, dir,
> +					 attrs | DMA_ATTR_SKIP_CPU_SYNC);
>  }
> #endif /* _KERNEL_DMA_DIRECT_H */