Received: by 2002:ac2:464d:0:0:0:0:0 with SMTP id s13csp79285lfo; Tue, 17 May 2022 16:35:31 -0700 (PDT) X-Google-Smtp-Source: ABdhPJycRtm0Jw+/oRa5h+AxUP8ekKmFtDnk70JCUAeSp223wNRhgdFBgvIBhcffoTESxI07Mlhy X-Received: by 2002:a63:b45:0:b0:3c1:9a7c:8cb2 with SMTP id a5-20020a630b45000000b003c19a7c8cb2mr21093311pgl.197.1652830530889; Tue, 17 May 2022 16:35:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1652830530; cv=none; d=google.com; s=arc-20160816; b=Cze3JADABL91P383I2R258P3HkjbfkB0gCv0B1ZrDi0pIZ5XxSs4P6ft7WyJE77bB3 A7TaMMMN/Q2lKCugnitXwhCTXUCmyaaLKKsahfsBSvzJWUDnbSeaVG2HutoqqmTman6d nhaoPALbd37xQSAhISxUyxAL9IiAYEmhex2I56ObXVehmQfQ8pdCnhGSKfwmenmZnOcB N+FKd5F9CohCHsixtO6Zs0t0dHHJePNF8bnM2hzJLIaiatgQ6HK3/ZbcqZMLozKhbBLF 19cCZ7SLfJj8b+48Uzwmtnz0lYB48ab1kVqRYBDyrAqsTyg2nC0EFjaBIgfRUosgGFbB uXrg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=YeWHEdiEl4Cq8ce8kfBVpVlo8D7OiDCUHl/qifLiacg=; b=Bny/YPFjhx3xwJghaO+fPuUQpACOucjxa2WDazacWIY1+Ydb72BHi7Q41YSyMxusLV 1Vuy3zCXVyfX8y9fP8ZPNQLalpgbfutMynXhx1W8zXMTSMXO9EsVoN7ZcFbABp5QuUqs t/N5qkULfnTZ56gAqC2Cbue2e5QQwyZ+oJVojiqMMY+8DqrC0kaZfo4EjsAaYecal0Q6 QHRvYhb6wMUHbBBoe58XPq0E2U6DMWwPtXb4D/m8UsooJWJ8nWk6/l+SCWgGHoLDFXnp bno/2aY/jSfXhC/idvoO1ioApVC33a2P1SNY++eJY+aLKTpzCqHabM5pXW7NaguprUGp Po3w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=Ac44BK8O; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q18-20020a17090311d200b0016199633218si620643plh.457.2022.05.17.16.34.57; Tue, 17 May 2022 16:35:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=Ac44BK8O; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231639AbiEQXJw (ORCPT + 99 others); Tue, 17 May 2022 19:09:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56932 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229512AbiEQXJu (ORCPT ); Tue, 17 May 2022 19:09:50 -0400 Received: from mail-ed1-x52e.google.com (mail-ed1-x52e.google.com [IPv6:2a00:1450:4864:20::52e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7694337AA8 for ; Tue, 17 May 2022 16:09:49 -0700 (PDT) Received: by mail-ed1-x52e.google.com with SMTP id eg11so806566edb.11 for ; Tue, 17 May 2022 16:09:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=YeWHEdiEl4Cq8ce8kfBVpVlo8D7OiDCUHl/qifLiacg=; b=Ac44BK8Oxljl1wYTS+T/ipuXIsyXnhAcD2EFhI7JosK6s2AsQoPQngNUvXID4XQU8R mW0BrgZPM9PBNfCplH1dStY4W0uPggK8Wt+6CMBU7mscY1UFX/6hj+LnoR0uGO52NXre +l1DxyMl5Jwi0KNt25psFVNG6+TgblbCYW2TtaI+GwLWcc0rj9ss2lo1oue9eMU5+mcl QXteQBsXYw0S2TIAGZrSAk7sk04efAskF4tiEnJP+nQNSTQe23HVsY46BflkEcbSzII4 b0A+fWJDBtTWwT2EvWpIPUBE45sfCcfqGke+ggcODtNpXcqxB5rOTK9AMM+Nb18pnlM0 rQ/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=YeWHEdiEl4Cq8ce8kfBVpVlo8D7OiDCUHl/qifLiacg=; b=o4tKoS5aQ2u+/bD8W6RWBeSz6V6y/xQB2GASx0a4WBnBNJ47mF0Wq8j8RhnXmUlnMr b2fUngURGQ5MwQBLmKWT+seOz+m29U0B8F9X8PqEGKsBf14q/iorsoFvahYO7BFkR9aO BX3wiDeBGLJKS9msi4IsBZRfBeY7aN7B2ILi+MPOPSukAyPpV1OGogakn61WHu5ir5rV 565lF0i2xzZr/2aQptlAa/HWOfm23HWPeaEaU5bcTo5eLied6AAv/FcvOiDN30khrVTf uG3X0RFYo9hxwSqQSWv6IshzitqOznUizYBypGEL+b0OxslvPp7z2EnIdHQ6R88X6bPQ XXJA== X-Gm-Message-State: AOAM532HOG5nk7pF2ijxDzEIpmQiJHzmS5QmE9/gO8o16qXcdgNjI9an YnoSbDWWmL93AMIQ4rrXgaYYNUCZrZqP8St2JwUpjA== X-Received: by 2002:a50:ed8b:0:b0:42a:a7e0:f889 with SMTP id h11-20020a50ed8b000000b0042aa7e0f889mr17019013edr.79.1652828987857; Tue, 17 May 2022 16:09:47 -0700 (PDT) MIME-Version: 1.0 References: <20220516171315.2400578-1-tjmercier@google.com> <175c5af3-9224-9c8e-0784-349dad9a2954@amd.com> <0875fa95-3a25-a354-1433-201fca81ed3e@amd.com> In-Reply-To: From: "T.J. Mercier" Date: Tue, 17 May 2022 16:09:36 -0700 Message-ID: Subject: Re: [PATCH v2] dma-buf: Move sysfs work out of DMA-BUF export path To: Greg Kroah-Hartman Cc: =?UTF-8?Q?Christian_K=C3=B6nig?= , Suren Baghdasaryan , Kalesh Singh , Minchan Kim , Greg Kroah-Hartman , John Stultz , Sumit Semwal , Daniel Vetter , Hridya Valsaraju , kernel-team@android.com, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, May 16, 2022 at 11:13 PM Greg Kroah-Hartman wrote: > > On Mon, May 16, 2022 at 05:08:05PM -0700, T.J. Mercier wrote: > > On Mon, May 16, 2022 at 12:21 PM Christian K=C3=B6nig > > wrote: > > > > > > Am 16.05.22 um 20:08 schrieb T.J. Mercier: > > > > On Mon, May 16, 2022 at 10:20 AM Christian K=C3=B6nig > > > > wrote: > > > >> Am 16.05.22 um 19:13 schrieb T.J. Mercier: > > > >>> Recently, we noticed an issue where a process went into direct re= claim > > > >>> while holding the kernfs rw semaphore for sysfs in write (exclusi= ve) > > > >>> mode. This caused processes who were doing DMA-BUF exports and re= leases > > > >>> to go into uninterruptible sleep since they needed to acquire the= same > > > >>> semaphore for the DMA-BUF sysfs entry creation/deletion. In order= to avoid > > > >>> blocking DMA-BUF export for an indeterminate amount of time while > > > >>> another process is holding the sysfs rw semaphore in exclusive mo= de, > > > >>> this patch moves the per-buffer sysfs file creation to the defaul= t work > > > >>> queue. Note that this can lead to a short-term inaccuracy in the = dmabuf > > > >>> sysfs statistics, but this is a tradeoff to prevent the hot path = from > > > >>> being blocked. A work_struct is added to dma_buf to achieve this,= but as > > > >>> it is unioned with the kobject in the sysfs_entry, dma_buf does n= ot > > > >>> increase in size. > > > >> I'm still not very keen of this approach as it strongly feels like= we > > > >> are working around shortcoming somewhere else. > > > >> > > > > My read of the thread for the last version is that we're running in= to > > > > a situation where sysfs is getting used for something it wasn't > > > > originally intended for, but we're also stuck with this sysfs > > > > functionality for dmabufs. > > > > > > > >>> Fixes: bdb8d06dfefd ("dmabuf: Add the capability to expose DMA-BU= F stats in sysfs") > > > >>> Originally-by: Hridya Valsaraju > > > >>> Signed-off-by: T.J. Mercier > > > >>> > > > >>> --- > > > >>> See the originally submitted patch by Hridya Valsaraju here: > > > >>> https://nam11.safelinks.protection.outlook.com/?url=3Dhttps%3A%2F= %2Flkml.org%2Flkml%2F2022%2F1%2F4%2F1066&data=3D05%7C01%7Cchristian.koe= nig%40amd.com%7C794614324d114880a25508da37672e4b%7C3dd8961fe4884e608e11a82d= 994e183d%7C0%7C0%7C637883213566903705%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wL= jAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&s= data=3DbGlA2FeubfSeL5XDHYyWMZqUXfScoCphZjjK4jrqQJs%3D&reserved=3D0 > > > >>> > > > >>> v2 changes: > > > >>> - Defer only sysfs creation instead of creation and teardown per > > > >>> Christian K=C3=B6nig > > > >>> > > > >>> - Use a work queue instead of a kthread for deferred work per > > > >>> Christian K=C3=B6nig > > > >>> --- > > > >>> drivers/dma-buf/dma-buf-sysfs-stats.c | 56 +++++++++++++++++++= +------- > > > >>> include/linux/dma-buf.h | 14 ++++++- > > > >>> 2 files changed, 54 insertions(+), 16 deletions(-) > > > >>> > > > >>> diff --git a/drivers/dma-buf/dma-buf-sysfs-stats.c b/drivers/dma-= buf/dma-buf-sysfs-stats.c > > > >>> index 2bba0babcb62..67b0a298291c 100644 > > > >>> --- a/drivers/dma-buf/dma-buf-sysfs-stats.c > > > >>> +++ b/drivers/dma-buf/dma-buf-sysfs-stats.c > > > >>> @@ -11,6 +11,7 @@ > > > >>> #include > > > >>> #include > > > >>> #include > > > >>> +#include > > > >>> > > > >>> #include "dma-buf-sysfs-stats.h" > > > >>> > > > >>> @@ -168,10 +169,46 @@ void dma_buf_uninit_sysfs_statistics(void) > > > >>> kset_unregister(dma_buf_stats_kset); > > > >>> } > > > >>> > > > >>> +static void sysfs_add_workfn(struct work_struct *work) > > > >>> +{ > > > >>> + struct dma_buf_sysfs_entry *sysfs_entry =3D > > > >>> + container_of(work, struct dma_buf_sysfs_entry, sysf= s_add_work); > > > >>> + struct dma_buf *dmabuf =3D sysfs_entry->dmabuf; > > > >>> + > > > >>> + /* > > > >>> + * A dmabuf is ref-counted via its file member. If this han= dler holds the only > > > >>> + * reference to the dmabuf, there is no need for sysfs kobj= ect creation. This is an > > > >>> + * optimization and a race; when the reference count drops = to 1 immediately after > > > >>> + * this check it is not harmful as the sysfs entry will sti= ll get cleaned up in > > > >>> + * dma_buf_stats_teardown, which won't get called until the= final dmabuf reference > > > >>> + * is released, and that can't happen until the end of this= function. > > > >>> + */ > > > >>> + if (file_count(dmabuf->file) > 1) { > > > >> Please completely drop that. I see absolutely no justification for= this > > > >> additional complexity. > > > >> > > > > This case gets hit around 5% of the time in my testing so the else = is > > > > not a completely unused branch. > > > > > > Well I can only repeat myself: This means that your userspace is > > > severely broken! > > > > > > DMA-buf are meant to be long living objects > > This patch addresses export *latency* regardless of how long-lived the > > object is. Even a single, long-lived export will benefit from this > > change if it would otherwise be blocked on adding an object to sysfs. > > I think attempting to improve this latency still has merit. > > Fixing the latency is nice, but as it's just pushing the needed work off > to another code path, it will take longer overall for the sysfs stuff to > be ready for userspace to see. > > Perhaps we need to step back and understand what this code is supposed > to be doing. As I recall, it was created because some systems do not > allow debugfs anymore, and they wanted the debugging information that > the dmabuf code was exposing to debugfs on a "normal" system. Moving > that logic to sysfs made sense, but now I am wondering why we didn't see > these issues in the debugfs code previously? > The debugfs stuff doesn't happen on every export, right? > Perhaps we should go just one step further and make a misc device node > for dmabug debugging information to be in and just have userspace > poll/read on the device node and we spit the info that used to be in > debugfs out through that? That way this only affects systems when they > want to read the information and not normal code paths? Yeah that's a > hack, but this whole thing feels overly complex now. > And deprecate sysfs support? I'm happy to try out anything you think might be a better way. As far as complexity of this patch, this revision is a much simpler version of the one from Hridya you already reviewed. > thanks, > > greg k-h