Date: Thu, 27 May 2021 13:43:13 +0100
From: Paul Cercueil
Subject: Re: [PATCH v5 1/3] drm: Add support for GEM buffers backed by non-coherent memory
To: Tomi Valkeinen
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
    Daniel Vetter, linux-kernel@vger.kernel.org,
    dri-devel@lists.freedesktop.org, linux-mips@vger.kernel.org,
    Christoph Hellwig, list@opendingux.net, Lokesh Vutla,
    "Yadav, Pratyush"
Message-Id: <1CNRTQ.12RZKDP9M4XR3@crapouillou.net>
In-Reply-To: <26b5424a-ff4c-090b-a08f-0e3e32278180@ideasonboard.com>
References: <20210523170415.90410-1-paul@crapouillou.net>
    <20210523170415.90410-2-paul@crapouillou.net>
    <26b5424a-ff4c-090b-a08f-0e3e32278180@ideasonboard.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Tomi,

On Thu, May 27 2021 at 13:40:07 +0300, Tomi Valkeinen wrote:
> On 23/05/2021 20:04, Paul Cercueil wrote:
>> Having GEM buffers backed by non-coherent memory is interesting in the
>> particular case where it is faster to render to a non-coherent buffer
>> then sync the data cache, than to render to a write-combine buffer, and
>> (by extension) much faster than using a shadow buffer. This is true for
>> instance on some Ingenic SoCs, where even simple blits (e.g. memcpy)
>> are about three times faster using this method.
>>
>> Add a 'map_noncoherent' flag to the drm_gem_cma_object structure, which
>> can be set by the drivers when they create the dumb buffer.
>>
>> Since this really only applies to software rendering, disable this flag
>> as soon as the CMA objects are exported via PRIME.
>>
>> v3: New patch. Now uses a simple 'map_noncoherent' flag to control how
>> the objects are mapped, and uses the new dma_mmap_pages function.
>>
>> v4: Make sure map_noncoherent is always disabled when creating GEM
>> objects meant to be used with dma-buf.
>>
>> Signed-off-by: Paul Cercueil
>> Acked-by: Thomas Zimmermann
>> ---
>>  drivers/gpu/drm/drm_gem_cma_helper.c | 38 +++++++++++++++++++++-------
>>  include/drm/drm_gem_cma_helper.h     |  3 +++
>>  2 files changed, 32 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c b/drivers/gpu/drm/drm_gem_cma_helper.c
>> index 7942cf05cd93..235c7a63da2b 100644
>> --- a/drivers/gpu/drm/drm_gem_cma_helper.c
>> +++ b/drivers/gpu/drm/drm_gem_cma_helper.c
>> @@ -46,6 +46,7 @@ static const struct drm_gem_object_funcs drm_gem_cma_default_funcs = {
>>   * __drm_gem_cma_create - Create a GEM CMA object without allocating memory
>>   * @drm: DRM device
>>   * @size: size of the object to allocate
>> + * @private: true if used for internal purposes
>>   *
>>   * This function creates and initializes a GEM CMA object of the given size,
>>   * but doesn't allocate any memory to back the object.
>> @@ -55,11 +56,11 @@ static const struct drm_gem_object_funcs drm_gem_cma_default_funcs = {
>>   * error code on failure.
>>   */
>>  static struct drm_gem_cma_object *
>> -__drm_gem_cma_create(struct drm_device *drm, size_t size)
>> +__drm_gem_cma_create(struct drm_device *drm, size_t size, bool private)
>>  {
>>  	struct drm_gem_cma_object *cma_obj;
>>  	struct drm_gem_object *gem_obj;
>> -	int ret;
>> +	int ret = 0;
>>
>>  	if (drm->driver->gem_create_object)
>>  		gem_obj = drm->driver->gem_create_object(drm, size);
>> @@ -73,7 +74,14 @@ __drm_gem_cma_create(struct drm_device *drm, size_t size)
>>
>>  	cma_obj = container_of(gem_obj, struct drm_gem_cma_object, base);
>>
>> -	ret = drm_gem_object_init(drm, gem_obj, size);
>> +	if (private) {
>> +		drm_gem_private_object_init(drm, gem_obj, size);
>> +
>> +		/* Always use writecombine for dma-buf mappings */
>> +		cma_obj->map_noncoherent = false;
>> +	} else {
>> +		ret = drm_gem_object_init(drm, gem_obj, size);
>> +	}
>>  	if (ret)
>>  		goto error;
>>
>> @@ -111,12 +119,19 @@ struct drm_gem_cma_object *drm_gem_cma_create(struct drm_device *drm,
>>
>>  	size = round_up(size, PAGE_SIZE);
>>
>> -	cma_obj = __drm_gem_cma_create(drm, size);
>> +	cma_obj = __drm_gem_cma_create(drm, size, false);
>>  	if (IS_ERR(cma_obj))
>>  		return cma_obj;
>>
>> -	cma_obj->vaddr = dma_alloc_wc(drm->dev, size, &cma_obj->paddr,
>> -				      GFP_KERNEL | __GFP_NOWARN);
>> +	if (cma_obj->map_noncoherent) {
>> +		cma_obj->vaddr = dma_alloc_noncoherent(drm->dev, size,
>> +						       &cma_obj->paddr,
>> +						       DMA_TO_DEVICE,
>> +						       GFP_KERNEL | __GFP_NOWARN);
>> +	} else {
>> +		cma_obj->vaddr = dma_alloc_wc(drm->dev, size, &cma_obj->paddr,
>> +					      GFP_KERNEL | __GFP_NOWARN);
>> +	}
>>  	if (!cma_obj->vaddr) {
>>  		drm_dbg(drm, "failed to allocate buffer with size %zu\n",
>>  			size);
>>
>> @@ -432,7 +447,7 @@ drm_gem_cma_prime_import_sg_table(struct drm_device *dev,
>>  		return ERR_PTR(-EINVAL);
>>
>>  	/* Create a CMA GEM buffer. */
>> -	cma_obj = __drm_gem_cma_create(dev, attach->dmabuf->size);
>> +	cma_obj = __drm_gem_cma_create(dev, attach->dmabuf->size, true);
>>  	if (IS_ERR(cma_obj))
>>  		return ERR_CAST(cma_obj);
>>
>> @@ -499,8 +514,13 @@ int drm_gem_cma_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
>>
>>  	cma_obj = to_drm_gem_cma_obj(obj);
>>
>> -	ret = dma_mmap_wc(cma_obj->base.dev->dev, vma, cma_obj->vaddr,
>> -			  cma_obj->paddr, vma->vm_end - vma->vm_start);
>> +	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>> +	if (!cma_obj->map_noncoherent)
>> +		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
>> +
>> +	ret = dma_mmap_pages(cma_obj->base.dev->dev,
>> +			     vma, vma->vm_end - vma->vm_start,
>> +			     virt_to_page(cma_obj->vaddr));
>
> This breaks mmap on TI's J7 EVM (tidss driver). All DRM apps just die
> when doing mmap. Changing these lines back to dma_mmap_wc() makes it
> work.
>
> Is dma_alloc_wc() even compatible with dma_mmap_pages()?

My bad, dma_mmap_wc() is not just a regular mmap with writecombine page
protection.

The solution, I guess, would be to call dma_mmap_wc() if
(!cma_obj->map_noncoherent).

I can send a patch later this week, unless you already have one?

Cheers,
-Paul
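
The fix suggested above would amount to something like the following in
drm_gem_cma_mmap() — an untested sketch only, keeping the dma_mmap_pages()
path for non-coherent buffers and restoring dma_mmap_wc() for the
write-combine case:

	cma_obj = to_drm_gem_cma_obj(obj);

	if (cma_obj->map_noncoherent) {
		/* Non-coherent: plain cached mapping of the backing pages */
		vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);

		ret = dma_mmap_pages(cma_obj->base.dev->dev,
				     vma, vma->vm_end - vma->vm_start,
				     virt_to_page(cma_obj->vaddr));
	} else {
		/* Write-combine: let dma_mmap_wc() set up the mapping itself */
		ret = dma_mmap_wc(cma_obj->base.dev->dev, vma, cma_obj->vaddr,
				  cma_obj->paddr,
				  vma->vm_end - vma->vm_start);
	}

This keeps the page protection handling entirely inside dma_mmap_wc() for
write-combine buffers, which is the behaviour the tidss driver depended on.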