From: John Stultz
To: lkml
Cc: John Stultz, Sumit Semwal, Liam Mark, Laura Abbott, Brian Starkey, Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil, Daniel Mentz, Chris Goldsworthy, Ørjan Eide, Robin Murphy, Ezequiel Garcia, Simon Ser, James Jones, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org
Subject: [PATCH v3 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation
Date: Sat, 3 Oct 2020 04:02:50 +0000
Message-Id: <20201003040257.62768-1-john.stultz@linaro.org>

Hey All,

So this is another revision of my patch series of performance optimizations for the dma-buf system heap.

Unfortunately, in working these up, I realized the heap-helpers infrastructure we tried to add to minimize code duplication is not as generic as we intended.
For some heaps it makes sense to deal with page lists; for other heaps it makes more sense to track things with sgtables.

So this series reworks the system heap to use sgtables, and then consolidates the pagelist method from the heap-helpers into the CMA heap. After that, the heap-helpers logic is removed (as it is unused). I'd still like to find a better way to avoid some of the logic duplication in implementing the entire dma_buf_ops handlers per heap. But unfortunately that code is tied somewhat to how the buffer's memory is tracked.

After this, the series introduces an optimization that Ørjan Eide implemented for ION that avoids calling sync on attachments that don't have a mapping.

Next, an optimization to use larger order pages for the system heap. This change brings us closer to the current performance of the ION code.

Unfortunately, after submitting the last round, I realized that part of the reason the page-pooling patch I had included was providing such great performance numbers was that the network page-pool implementation doesn't zero pages that it pulls from the cache. This is very inappropriate for buffers we pass to userland and was what gave it an unfair advantage (almost constant time performance) relative to ION's allocation performance numbers. I added some patches to zero the buffers manually, similar to how ION does it, but I found this resulted in basically no performance improvement over the standard page allocator. Thus I've dropped that patch from this series for now.

Unfortunately this means we still have a performance delta from the ION system heap as measured by my microbenchmark, and this delta comes from ION system_heap's use of deferred freeing of pages, so less work is done in the measured interval of the microbenchmark. I'll be looking at adding similar code eventually, but I don't want to hold the rest of the patches up on this, as it is still a good improvement over the current code.
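
To illustrate the larger-order allocation idea mentioned above, here is a minimal userspace sketch, not the code from the series. It assumes 4 KiB pages and a hypothetical candidate order list of {8, 4, 0}; the names sketch_pick_order() and sketch_count_chunks() are made up for this example, and the sketch pretends every allocation succeeds, whereas a real allocator would fall back to a smaller order when a large one is unavailable.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical candidate orders, largest first. */
static const unsigned int sketch_orders[] = { 8, 4, 0 };
#define SKETCH_NUM_ORDERS (sizeof(sketch_orders) / sizeof(sketch_orders[0]))

/*
 * Pick the largest candidate order that is no bigger than max_order and
 * whose chunk (PAGE_SIZE << order) still fits in the remaining size.
 * Returns -1 if not even a single page fits.
 */
static int sketch_pick_order(size_t remaining, unsigned int max_order)
{
	for (size_t i = 0; i < SKETCH_NUM_ORDERS; i++) {
		if (sketch_orders[i] > max_order)
			continue;
		if (remaining < (SKETCH_PAGE_SIZE << sketch_orders[i]))
			continue;
		return (int)sketch_orders[i];
	}
	return -1;
}

/*
 * Count how many chunks a page-aligned buffer splits into when each step
 * takes the largest order that fits. Fewer chunks means fewer allocator
 * calls and fewer scatterlist entries to track.
 */
static size_t sketch_count_chunks(size_t size)
{
	size_t remaining = size;
	size_t chunks = 0;
	unsigned int max_order = sketch_orders[0];

	assert(size % SKETCH_PAGE_SIZE == 0);

	while (remaining > 0) {
		int order = sketch_pick_order(remaining, max_order);

		assert(order >= 0);
		remaining -= SKETCH_PAGE_SIZE << order;
		/* Never go back up to a larger order than the last one used. */
		max_order = (unsigned int)order;
		chunks++;
	}
	return chunks;
}
```

Under these assumptions a 1 MiB buffer is covered by a single order-8 chunk instead of 256 single pages, and a 17-page buffer by one order-4 chunk plus one order-0 page.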

I've updated the chart I shared earlier with current numbers (including with the unsubmitted net pagepool implementation, and with a different unsubmitted pagepool implementation borrowed from ION) here:
https://docs.google.com/spreadsheets/d/1-1C8ZQpmkl_0DISkI6z4xelE08MlNAN7oEu34AnO4Ao/edit?usp=sharing

I did add to this series a reworked version of my uncached system heap implementation I was submitting a few weeks back. Since it duplicated a lot of the now reworked system heap code, I realized it would be much simpler to add the functionality to the system_heap implementation itself.

While not improving the core allocation performance, the uncached heap allocations do result in *much* improved performance on HiKey960 as it avoids a lot of flushing and invalidating buffers that the cpu doesn't touch often.

Feedback on these would be great!

thanks
-john

New in v3:
* Dropped page-pool patches as after correcting the code to zero buffers, they provided no net performance gain.
* Added system-uncached implementation on top of reworked system-heap.
* Use the new sgtable mapping functions in the system and cma code, as suggested by Daniel Mentz
* Cleanup: Use page_size() rather than open-coding it

Cc: Sumit Semwal
Cc: Liam Mark
Cc: Laura Abbott
Cc: Brian Starkey
Cc: Hridya Valsaraju
Cc: Suren Baghdasaryan
Cc: Sandeep Patil
Cc: Daniel Mentz
Cc: Chris Goldsworthy
Cc: Ørjan Eide
Cc: Robin Murphy
Cc: Ezequiel Garcia
Cc: Simon Ser
Cc: James Jones
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (7):
  dma-buf: system_heap: Rework system heap to use sgtables instead of
    pagelists
  dma-buf: heaps: Move heap-helper logic into the cma_heap
    implementation
  dma-buf: heaps: Remove heap-helpers code
  dma-buf: heaps: Skip sync if not mapped
  dma-buf: system_heap: Allocate higher order pages if available
  dma-buf: dma-heap: Keep track of the heap device struct
  dma-buf: system_heap: Add a system-uncached heap re-using the system
    heap

 drivers/dma-buf/dma-heap.c           |  33 +-
 drivers/dma-buf/heaps/Makefile       |   1 -
 drivers/dma-buf/heaps/cma_heap.c     | 327 +++++++++++++++---
 drivers/dma-buf/heaps/heap-helpers.c | 271 ---------------
 drivers/dma-buf/heaps/heap-helpers.h |  53 ---
 drivers/dma-buf/heaps/system_heap.c  | 480 ++++++++++++++++++++++++---
 include/linux/dma-heap.h             |   9 +
 7 files changed, 741 insertions(+), 433 deletions(-)
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

-- 
2.17.1