Received: by 2002:a05:6358:9144:b0:117:f937:c515 with SMTP id r4csp664718rwr; Thu, 4 May 2023 08:07:00 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7lDxaNVYPj9U+eJoT1wrgvUS0Rcf0YMUDk0DjjL4OubgOkrSJXPGANZf8UhoeOYJcH+h4z X-Received: by 2002:a92:dc44:0:b0:331:9a82:33f6 with SMTP id x4-20020a92dc44000000b003319a8233f6mr197544ilq.5.1683212820176; Thu, 04 May 2023 08:07:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683212820; cv=none; d=google.com; s=arc-20160816; b=NEMQ9mOOgsan79w8Imh1+kz8X9Ta8fybqt+GfF1j6TwZhEDMFTkwRqp24etm5jDN7N we2lboH7gDwUTuB83VsfpqwpaY6VwdmENK86DB00bsUgTY9cLstPkvecHfgLs3TgEXa/ eP2JCgmgnKvWEj1j928C/u8uQNev0Dad2ICqW8WrySXjN6cRzaTxdY2iym3b7S+M8rPg q+NdwoSIbVhHAlxR5f8n0abLf7L9U7tMBGWtrdr5vvOt6MSCxXH6bwqOSVBAjf4dQf4A O+TNsm3evBjDMMC0/vLzuusG+qx77tqbaUeCEpQgxJ53Ohuw740F0LTyCvxX96ngIvx+ 9jVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=cmjaIRPyR5OKdf02GACB/fx4eJBLPx7/7rEGOj0i7h0=; b=VJJnjQTBKupBUoEKM7YKr3hpVC4+Ae+b0ax6TbJ2VA2tIrO/58a2paawt4PNAhkwnk +HbrYmnjScbaDVjR8w5BrfDvd+DA1EmNI87e15QhvbUPugl1cyIGYI+kfDB73GxmtF9R dgcRQFWpbllSGmM4FCLkrIsj2noKdMxFhi8JBkH8IjXW6noc11UxjPLYW9jCznfzxoL+ af8R4ZywrYDweYxU+ZR8CU9VIXCbNhvRI5elQNNirMfftlQI4MDi1emf3BnJvg0AXwGh 34T7SrQvoKzk6Trh02F3Wcp0WjnSZ4wjAYAcb52R/OJdWIRUi69w8988u8iON57GxBUb vqoQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=Npa8tJhk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id u9-20020a92da89000000b0033133403881si3819376iln.169.2023.05.04.08.06.38; Thu, 04 May 2023 08:07:00 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=Npa8tJhk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230458AbjEDO73 (ORCPT + 99 others); Thu, 4 May 2023 10:59:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52350 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230427AbjEDO7G (ORCPT ); Thu, 4 May 2023 10:59:06 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 532D72D5D for ; Thu, 4 May 2023 07:58:44 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id 3f1490d57ef6-b9a79db4e7fso769947276.0 for ; Thu, 04 May 2023 07:58:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1683212304; x=1685804304; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=cmjaIRPyR5OKdf02GACB/fx4eJBLPx7/7rEGOj0i7h0=; b=Npa8tJhk/Rl57/4enBWz8lg2nvQDhdQTSnbVDFE2lgk2BLJxivKtPSY6hUFdn3wW6l v1REXZVoP5NsVUr0gD29PV9sRNxZnxP/CQptfuBxi+cjV+7YFDPjVBomfDNpvo1skWzP PtFEt9GOtleAd8qMiZmFL3/BnaGyZInLz7SjaJvEghEyiePkZYORM8TgpbRCru2WZVBB BpLgrQcZXXRuLhuylbktYv7D8qn/V79l2U6AZ4B+8k/XskfQ4zVdhSTblZ7/mNXS3qiu J0jwtX1DfmT6RtM9sTzA8GAO0zMmoXp75diF6r8cD0zh5KIAVC+F0Td4y/Cs3lFVqBFa qQGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683212304; x=1685804304; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=cmjaIRPyR5OKdf02GACB/fx4eJBLPx7/7rEGOj0i7h0=; b=guOjT48mz0iLc4GArE8kSZyv1ZKBC0hJtiBgXkmNNfbueNpg3aOp5cKEHSu81tIO0Q DXodMY3VsKNujAYC20tkXfNipJXryyAkm/AuLLFHC1Y/yG7CMLKwItpQ12vLvwHhAcjw SfgsOaDfVVIBSCev5X8FfPpb0610AsBmPB/YDr9xUp4ZbhSD6ILirCdOYvBEU4N9Jgmb QG/NPiz179iceVFzJQyOISGOwZ9MipcqC1SUtEc8w0DjfTAXr1FszJurdPHmF0V60AMu whL9K7QWa/d1jcp1PnJPq6YlfUqUlO2y/Iwi+yC9ezQZbOFBaWuuGd7Z24WbqCc1Bgm/ 45BA== X-Gm-Message-State: AC+VfDwSx/dksydX73l1VRL/tEU7inkoyVQ27mJ1qIeBY7+BFkgg2sjd 3vWGnkg5SAqj9gyH8ylUJ3Hhu/emCPv4mA== X-Received: from joychakr.c.googlers.com ([fda3:e722:ac3:cc00:4f:4b78:c0a8:6ea]) (user=joychakr job=sendgmr) by 2002:a05:6902:993:b0:b9d:fe66:a424 with SMTP id bv19-20020a056902099300b00b9dfe66a424mr125691ybb.2.1683212304642; Thu, 04 May 2023 07:58:24 -0700 (PDT) Date: Thu, 4 May 2023 14:57:35 +0000 In-Reply-To: <20230504145737.286444-1-joychakr@google.com> Mime-Version: 1.0 References: <20230504145737.286444-1-joychakr@google.com> X-Mailer: git-send-email 2.40.1.495.gc816e09b53d-goog Message-ID: <20230504145737.286444-6-joychakr@google.com> Subject: [PATCH 5/7] dmaengine: pl330: Quirk to optimize AxSize for peripheral usecases From: Joy Chakraborty To: Vinod Koul , Rob Herring , Krzysztof Kozlowski Cc: dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, manugautam@google.com, danielmentz@google.com, sjadavani@google.com, Joy Chakraborty Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Add quirk "arm,pl330-optimize-dev2mem-axsize" to choose maximum possible AxSize for transactions towards memory during usecases which copy data between memory and peripherals. Currently PL330 driver chooses equal AxLen and AxSize for both loads and stores to/from memory and peripherals which is inefficient towards memory as the whole bus width is not used for transfers as a peripheral might be limited to use only a narrow size of the buswidth available. Example scenario: A peripheral might require data byte by byte which would make AxSize = 1 byte and AxLen = 16 for both load from memory and store to Peripheral. This can be optimized for memory by using maximum AxSize (say 16bytes) then load from memory can be done with AxSize = 16byte, AxLen = 1 and store to peripheral with AxSize = 1byte, AxLen = 16 beats. Instruction setup with quirk : 512bytes copy from Memory(16bytes * 4beats) to Peripheral(4bytes * 16 beats) --- DMAMOV CCR 0xbd0239 DMAMOV SAR 0xffffe000 DMAMOV DAR 0xffffc860 DMALP_1 7 DMAFLUSHP 0 DMAWFPB 0 DMALDB DMASTPB 0 DMALPENDA_1 bjmpto_7 DMASEV 3 DMAEND --- Signed-off-by: Joy Chakraborty --- drivers/dma/pl330.c | 105 +++++++++++++++++++++++++++++++++++++------- 1 file changed, 89 insertions(+), 16 deletions(-) diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index e5e610c91f18..b4933fab8a62 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -35,6 +35,7 @@ #define PL330_QUIRK_BROKEN_NO_FLUSHP BIT(0) #define PL330_QUIRK_PERIPH_BURST BIT(1) +#define PL330_QUIRK_OPTIMIZE_DEV2MEM_AXSIZE BIT(2) enum pl330_cachectrl { CCTRL0, /* Noncacheable and nonbufferable */ @@ -519,6 +520,10 @@ static struct pl330_of_quirks { { .quirk = "arm,pl330-periph-burst", .id = PL330_QUIRK_PERIPH_BURST, + }, + { + .quirk = "arm,pl330-optimize-dev2mem-axsize", + .id = PL330_QUIRK_OPTIMIZE_DEV2MEM_AXSIZE, } }; @@ -2677,6 +2682,56 @@ static inline int get_burst_len(struct dma_pl330_desc *desc, unsigned int brst_s return burst_len; } +/* + * Returns burst size to be used to copy data from/to memory during a + * peripheral transfer + */ +static unsigned int get_periph_mem_brst_sz(dma_addr_t addr, size_t len, + struct dma_pl330_chan *pch, int quirks) +{ + unsigned int burst, burst_size = pch->burst_sz; + + if (quirks & PL330_QUIRK_OPTIMIZE_DEV2MEM_AXSIZE) { + /* Select max possible burst size */ + burst = pch->dmac->pcfg.data_bus_width / 8; + + /* + * Make sure we use a burst size that aligns with the memory and length. + */ + while ((addr | len) & (burst - 1)) + burst /= 2; + + burst_size = __ffs(burst); + } + return burst_size; +} + +/* + * Returns burst length to be used to copy data from/to memory during a + * peripheral transfer + */ +static unsigned int get_periph_mem_brst_len(struct dma_pl330_desc *desc, + struct dma_pl330_chan *pch, + unsigned int burst_size, int quirks) +{ + unsigned int burst_len = pch->burst_len; + + if (quirks & PL330_QUIRK_OPTIMIZE_DEV2MEM_AXSIZE && + burst_size != pch->burst_sz) { + /* Select max possible burst len */ + burst_len = get_burst_len(desc, burst_size); + + /* + * Adjust AxLen to keep number of bytes same in Load/Store + */ + if (burst_size > pch->burst_sz) + burst_len = pch->burst_len >> (burst_size - pch->burst_sz); + else + pch->burst_len = burst_len >> (pch->burst_sz - burst_size); + } + return burst_len; +} + static struct dma_async_tx_descriptor *pl330_prep_dma_cyclic( struct dma_chan *chan, dma_addr_t dma_addr, size_t len, size_t period_len, enum dma_transfer_direction direction, @@ -2684,8 +2739,8 @@ static struct dma_async_tx_descriptor *pl330_prep_dma_cyclic( { struct dma_pl330_desc *desc = NULL, *first = NULL; struct dma_pl330_chan *pch = to_pchan(chan); + unsigned int i, burst_size, burst_len; struct pl330_dmac *pl330 = pch->dmac; - unsigned int i; dma_addr_t dst; dma_addr_t src; @@ -2729,28 +2784,35 @@ static struct dma_async_tx_descriptor *pl330_prep_dma_cyclic( return NULL; } + burst_size = get_periph_mem_brst_sz(dma_addr, period_len, pch, pl330->quirks); + burst_len = get_periph_mem_brst_len(desc, pch, burst_size, pl330->quirks); + switch (direction) { case DMA_MEM_TO_DEV: desc->rqcfg.src_inc = 1; desc->rqcfg.dst_inc = 0; src = dma_addr; dst = pch->fifo_dma; + desc->rqcfg.src_brst_size = burst_size; + desc->rqcfg.src_brst_len = burst_len; + desc->rqcfg.dst_brst_size = pch->burst_sz; + desc->rqcfg.dst_brst_len = pch->burst_len; break; case DMA_DEV_TO_MEM: desc->rqcfg.src_inc = 0; desc->rqcfg.dst_inc = 1; src = pch->fifo_dma; dst = dma_addr; + desc->rqcfg.src_brst_size = pch->burst_sz; + desc->rqcfg.src_brst_len = pch->burst_len; + desc->rqcfg.dst_brst_size = burst_size; + desc->rqcfg.dst_brst_len = burst_len; break; default: break; } desc->rqtype = direction; - desc->rqcfg.src_brst_size = pch->burst_sz; - desc->rqcfg.src_brst_len = pch->burst_len; - desc->rqcfg.dst_brst_size = pch->burst_sz; - desc->rqcfg.dst_brst_len = pch->burst_len; desc->bytes_requested = period_len; fill_px(&desc->px, dst, src, period_len); @@ -2850,7 +2912,11 @@ pl330_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, { struct dma_pl330_desc *first, *desc = NULL; struct dma_pl330_chan *pch = to_pchan(chan); + unsigned int burst_size, burst_len; + struct pl330_dmac *pl330; struct scatterlist *sg; + dma_addr_t mem_addr; + size_t len; int i; if (unlikely(!pch || !sgl || !sg_len)) @@ -2862,13 +2928,12 @@ pl330_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, return NULL; first = NULL; + pl330 = pch->dmac; for_each_sg(sgl, sg, sg_len, i) { desc = pl330_get_desc(pch); if (!desc) { - struct pl330_dmac *pl330 = pch->dmac; - dev_err(pch->dmac->ddma.dev, "%s:%d Unable to fetch desc\n", __func__, __LINE__); @@ -2882,29 +2947,37 @@ pl330_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, else list_add_tail(&desc->node, &first->node); + mem_addr = sg_dma_address(sg); + len = sg_dma_len(sg); + + burst_size = get_periph_mem_brst_sz(mem_addr, len, pch, pl330->quirks); + burst_len = get_periph_mem_brst_len(desc, pch, burst_size, pl330->quirks); + switch (direction) { case DMA_MEM_TO_DEV: desc->rqcfg.src_inc = 1; desc->rqcfg.dst_inc = 0; - fill_px(&desc->px, pch->fifo_dma, sg_dma_address(sg), - sg_dma_len(sg)); + desc->rqcfg.src_brst_size = burst_size; + desc->rqcfg.src_brst_len = burst_len; + desc->rqcfg.dst_brst_size = pch->burst_sz; + desc->rqcfg.dst_brst_len = pch->burst_len; + fill_px(&desc->px, pch->fifo_dma, mem_addr, len); break; case DMA_DEV_TO_MEM: desc->rqcfg.src_inc = 0; desc->rqcfg.dst_inc = 1; - fill_px(&desc->px, sg_dma_address(sg), pch->fifo_dma, - sg_dma_len(sg)); + desc->rqcfg.src_brst_size = pch->burst_sz; + desc->rqcfg.src_brst_len = pch->burst_len; + desc->rqcfg.dst_brst_size = burst_size; + desc->rqcfg.dst_brst_len = burst_len; + fill_px(&desc->px, mem_addr, pch->fifo_dma, len); break; default: break; } - desc->rqcfg.src_brst_size = pch->burst_sz; - desc->rqcfg.src_brst_len = pch->burst_len; - desc->rqcfg.dst_brst_size = pch->burst_sz; - desc->rqcfg.dst_brst_len = pch->burst_len; desc->rqtype = direction; - desc->bytes_requested = sg_dma_len(sg); + desc->bytes_requested = len; } /* Return the last desc in the chain */ -- 2.40.1.495.gc816e09b53d-goog