From: Todd Kjos
Date: Mon, 14 Feb 2022 12:19:28 -0800
Subject: Re: [RFC v2 6/6] android: binder: Add a buffer flag to relinquish ownership of fds
To: Suren Baghdasaryan
Cc: Greg Kroah-Hartman, "T.J. Mercier", Maarten Lankhorst, Maxime Ripard,
    Thomas Zimmermann, David Airlie, Daniel Vetter, Jonathan Corbet,
    Arve Hjønnevåg, Todd Kjos, Martijn Coenen, Joel Fernandes,
    Christian Brauner, Hridya Valsaraju, Sumit Semwal, Christian König,
    Benjamin Gaignard, Liam Mark, Laura Abbott, Brian Starkey, John Stultz,
    Tejun Heo, Zefan Li, Johannes Weiner, Kalesh Singh, Kenny.Ho@amd.com,
    DRI mailing list, "open list:DOCUMENTATION", LKML, linux-media,
    "moderated list:DMA BUFFER SHARING FRAMEWORK", cgroups mailinglist
References: <20220211161831.3493782-1-tjmercier@google.com> <20220211161831.3493782-7-tjmercier@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Feb 14, 2022 at 11:29 AM Suren Baghdasaryan wrote:
>
> On Mon, Feb 14, 2022 at 10:33 AM Todd Kjos wrote:
> >
> > On Fri, Feb 11, 2022 at 11:19 PM Greg Kroah-Hartman wrote:
> > >
> > > On Fri, Feb 11, 2022 at 04:18:29PM +0000, T.J. Mercier wrote:
> >
> > Title: "android: binder: Add a buffer flag to relinquish ownership of fds"
> >
> > Please drop the "android:" from the title.
> >
> > > > This patch introduces a buffer flag BINDER_BUFFER_FLAG_SENDER_NO_NEED
> > > > that a process sending an fd array to another process over binder IPC
> > > > can set to relinquish ownership of the fds being sent for memory
> > > > accounting purposes.
> > > > If the flag is found to be set during the fd array
> > > > translation and the fd is for a DMA-BUF, the buffer is uncharged from
> > > > the sender's cgroup and charged to the receiving process's cgroup
> > > > instead.
> > > >
> > > > It is up to the sending process to ensure that it closes the fds
> > > > regardless of whether the transfer failed or succeeded.
> > > >
> > > > Most graphics shared memory allocations in Android are done by the
> > > > graphics allocator HAL process. On requests from clients, the HAL process
> > > > allocates memory and sends the fds to the clients over binder IPC.
> > > > The graphics allocator HAL will not retain any references to the
> > > > buffers. When the HAL sets the BINDER_BUFFER_FLAG_SENDER_NO_NEED for fd
> > > > arrays holding DMA-BUF fds, the gpu cgroup controller will be able to
> > > > correctly charge the buffers to the client processes instead of the
> > > > graphics allocator HAL.
> > > >
> > > > From: Hridya Valsaraju
> > > > Signed-off-by: Hridya Valsaraju
> > > > Co-developed-by: T.J. Mercier
> > > > Signed-off-by: T.J. Mercier
> > > > ---
> > > > changes in v2
> > > > - Move dma-buf cgroup charge transfer from a dma_buf_op defined by every
> > > > heap to a single dma-buf function for all heaps per Daniel Vetter and
> > > > Christian König.
> > > >
> > > >  drivers/android/binder.c            | 26 ++++++++++++++++++++++++++
> > > >  include/uapi/linux/android/binder.h |  1 +
> > > >  2 files changed, 27 insertions(+)
> > > >
> > > > diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> > > > index 8351c5638880..f50d88ded188 100644
> > > > --- a/drivers/android/binder.c
> > > > +++ b/drivers/android/binder.c
> > > > @@ -42,6 +42,7 @@
> > > >
> > > >  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > > >
> > > > +#include
> > > >  #include
> > > >  #include
> > > >  #include
> > > > @@ -2482,8 +2483,10 @@ static int binder_translate_fd_array(struct list_head *pf_head,
> > > >  {
> > > >         binder_size_t fdi, fd_buf_size;
> > > >         binder_size_t fda_offset;
> > > > +       bool transfer_gpu_charge = false;
> > > >         const void __user *sender_ufda_base;
> > > >         struct binder_proc *proc = thread->proc;
> > > > +       struct binder_proc *target_proc = t->to_proc;
> > > >         int ret;
> > > >
> > > >         fd_buf_size = sizeof(u32) * fda->num_fds;
> > > > @@ -2521,8 +2524,15 @@ static int binder_translate_fd_array(struct list_head *pf_head,
> > > >         if (ret)
> > > >                 return ret;
> > > >
> > > > +       if (IS_ENABLED(CONFIG_CGROUP_GPU) &&
> > > > +               parent->flags & BINDER_BUFFER_FLAG_SENDER_NO_NEED)
> > > > +               transfer_gpu_charge = true;
> > > > +
> > > >         for (fdi = 0; fdi < fda->num_fds; fdi++) {
> > > >                 u32 fd;
> > > > +               struct dma_buf *dmabuf;
> > > > +               struct gpucg *gpucg;
> > > > +
> > > >                 binder_size_t offset = fda_offset + fdi * sizeof(fd);
> > > >                 binder_size_t sender_uoffset = fdi * sizeof(fd);
> > > >
> > > > @@ -2532,6 +2542,22 @@ static int binder_translate_fd_array(struct list_head *pf_head,
> > > >                                                   in_reply_to);
> > > >                 if (ret)
> > > >                         return ret > 0 ? -EINVAL : ret;
> > > > +
> > > > +               if (!transfer_gpu_charge)
> > > > +                       continue;
> > > > +
> > > > +               dmabuf = dma_buf_get(fd);
> > > > +               if (IS_ERR(dmabuf))
> > > > +                       continue;
> > > > +
> > > > +               gpucg = gpucg_get(target_proc->tsk);
> > > > +               ret = dma_buf_charge_transfer(dmabuf, gpucg);
> > > > +               if (ret) {
> > > > +                       pr_warn("%d:%d Unable to transfer DMA-BUF fd charge to %d",
> > > > +                               proc->pid, thread->pid, target_proc->pid);
> > > > +                       gpucg_put(gpucg);
> > > > +               }
> > > > +               dma_buf_put(dmabuf);
> >
> > Since we are creating a new gpu cgroup abstraction, couldn't this
> > "transfer" be done in userspace by the target instead of in the kernel
> > driver? Then this patch would reduce to just a flag on the buffer
> > object.
>
> Are you suggesting to have a userspace accessible cgroup interface for
> transferring buffer charges and the target process to use that
> interface for requesting the buffer to be charged to its cgroup?

Well, I'm asking why we need to do these cgroup-ish actions in the
kernel when it seems more natural to do it in userspace.

> I'm worried about the case when the target process does not request
> the transfer after receiving the buffer with this flag set. The charge
> would stay with the wrong process and accounting will be invalid.

I suspect this would be implemented in libbinder wherever the fd array
object is handled, so it wouldn't require changes to every process.

>
> Technically, since the proposed cgroup supports charge transfer from
> the very beginning, the userspace can check if the cgroup is mounted
> and if so then it knows this feature is supported.

Has some userspace code for this been written? I'd like to be
convinced that these changes need to be in the binder kernel driver
instead of in userspace.

>
> > This also solves the issue that Greg brought up about
> > userspace needing to know whether the kernel implements this feature
> > (older kernel running with newer userspace).
> > I think we could just
> > reserve some flags for userspace to use (and since those flags are
> > "reserved" for older kernels, this would enable this feature even for
> > old kernels)
> >
> > > >         }
> > > >         return 0;
> > > >  }
> > > > diff --git a/include/uapi/linux/android/binder.h b/include/uapi/linux/android/binder.h
> > > > index 3246f2c74696..169fd5069a1a 100644
> > > > --- a/include/uapi/linux/android/binder.h
> > > > +++ b/include/uapi/linux/android/binder.h
> > > > @@ -137,6 +137,7 @@ struct binder_buffer_object {
> > > >
> > > >  enum {
> > > >         BINDER_BUFFER_FLAG_HAS_PARENT = 0x01,
> > > > +       BINDER_BUFFER_FLAG_SENDER_NO_NEED = 0x02,
> > > >  };
> > > >
> > > >  /* struct binder_fd_array_object - object describing an array of fds in a buffer
> > > > --
> > > > 2.35.1.265.g69c8d7142f-goog
> > > >
> > >
> > > How does userspace know that binder supports this new flag? And where
> > > is the userspace test for this new feature? Isn't there a binder test
> > > framework somewhere?
> > >
> > > thanks,
> > >
> > > greg k-h
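
For anyone following the accounting argument, here is a rough userspace model of the behaviour the quoted commit message describes: when the parent buffer carries BINDER_BUFFER_FLAG_SENDER_NO_NEED, the charge for each DMA-BUF in the fd array moves from the sender's cgroup to the receiver's, and fds that are not DMA-BUFs are skipped. This is only a sketch and not kernel code. The two flag values come from the quoted uapi hunk; struct model_cgroup, struct model_fd and model_transfer_charges() are hypothetical names invented for illustration, and the rule that a buffer larger than the sender's current charge is left alone is an assumption of the model, not something the patch states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values copied from the quoted uapi hunk. */
enum {
    BINDER_BUFFER_FLAG_HAS_PARENT = 0x01,
    BINDER_BUFFER_FLAG_SENDER_NO_NEED = 0x02,
};

/* Hypothetical stand-in for the bytes charged to one cgroup. */
struct model_cgroup {
    uint64_t charged_bytes;
};

/* Hypothetical stand-in for one entry of the fd array. */
struct model_fd {
    int is_dmabuf;  /* nonzero when the fd refers to a DMA-BUF */
    uint64_t size;  /* bytes accounted for that buffer */
};

/*
 * Model of the per-fd loop: if the parent buffer has the
 * SENDER_NO_NEED flag, move the charge of every DMA-BUF from
 * sender to receiver. Returns how many buffers were moved.
 */
size_t model_transfer_charges(uint32_t parent_flags,
                              const struct model_fd *fds, size_t num_fds,
                              struct model_cgroup *sender,
                              struct model_cgroup *receiver)
{
    size_t moved = 0;

    assert(sender != NULL && receiver != NULL);

    /* Without the flag the charge stays with the sender. */
    if (!(parent_flags & BINDER_BUFFER_FLAG_SENDER_NO_NEED))
        return 0;

    for (size_t i = 0; i < num_fds; i++) {
        /* Non-DMA-BUF fds are skipped, as in the quoted loop. */
        if (!fds[i].is_dmabuf)
            continue;
        /* Model assumption: never uncharge more than is charged. */
        if (sender->charged_bytes < fds[i].size)
            continue;
        sender->charged_bytes -= fds[i].size;
        receiver->charged_bytes += fds[i].size;
        moved++;
    }
    return moved;
}
```

In this model the same function could just as well run in the receiving process, which is the userspace alternative raised in the reply above; the model says nothing about which side should own the transfer.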