Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S965761AbbLPPgM (ORCPT ); Wed, 16 Dec 2015 10:36:12 -0500
Received: from mail-wm0-f47.google.com ([74.125.82.47]:34619 "EHLO
	mail-wm0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S964795AbbLPPgJ (ORCPT ); Wed, 16 Dec 2015 10:36:09 -0500
Date: Wed, 16 Dec 2015 16:36:06 +0100
From: Daniel Vetter
To: Dmitry Torokhov
Cc: Gustavo Padovan, Greg Kroah-Hartman, devel@driverdev.osuosl.org,
	Andrew Bresticker, Arve Hjønnevåg, dri-devel@lists.freedesktop.org,
	"linux-kernel@vger.kernel.org", Riley Andrews,
	linux-media@vger.kernel.org
Subject: Re: [PATCH] android: fix warning when releasing active sync point
Message-ID: <20151216153606.GR30437@phenom.ffwll.local>
Mail-Followup-To: Dmitry Torokhov, Gustavo Padovan, Greg Kroah-Hartman,
	devel@driverdev.osuosl.org, Andrew Bresticker, Arve Hjønnevåg,
	dri-devel@lists.freedesktop.org, "linux-kernel@vger.kernel.org",
	Riley Andrews, linux-media@vger.kernel.org
References: <20151215012955.GA28277@dtor-ws>
	<20151215092601.GI3189@phenom.ffwll.local>
	<20151215190008.GE883@joana>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
X-Operating-System: Linux phenom 4.2.0-1-amd64
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4714
Lines: 92

On Tue, Dec 15, 2015 at 11:08:01AM -0800, Dmitry Torokhov wrote:
> On Tue, Dec 15, 2015 at 11:00 AM, Gustavo Padovan wrote:
> > 2015-12-15 Daniel Vetter :
> >
> >> On Mon, Dec 14, 2015 at 05:29:55PM -0800, Dmitry Torokhov wrote:
> >> > Userspace can close the sync device while there are still active fence
> >> > points, in which case kernel produces the following warning:
> >> >
> >> > [ 43.853176] ------------[ cut here ]------------
> >> > [ 43.857834] WARNING: CPU: 0 PID: 892 at /mnt/host/source/src/third_party/kernel/v3.18/drivers/staging/android/sync.c:439 android_fence_release+0x88/0x104()
> >> > [ 43.871741] CPU: 0 PID: 892 Comm: Binder_5 Tainted: G U 3.18.0-07661-g0550ce9 #1
> >> > [ 43.880176] Hardware name: Google Tegra210 Smaug Rev 1+ (DT)
> >> > [ 43.885834] Call trace:
> >> > [ 43.888294] [] dump_backtrace+0x0/0x10c
> >> > [ 43.893697] [] show_stack+0x10/0x1c
> >> > [ 43.898756] [] dump_stack+0x74/0xb8
> >> > [ 43.903814] [] warn_slowpath_common+0x84/0xb0
> >> > [ 43.909736] [] warn_slowpath_null+0x14/0x20
> >> > [ 43.915482] [] android_fence_release+0x84/0x104
> >> > [ 43.921582] [] fence_release+0x104/0x134
> >> > [ 43.927066] [] sync_fence_free+0x74/0x9c
> >> > [ 43.932552] [] sync_fence_release+0x34/0x48
> >> > [ 43.938304] [] __fput+0x100/0x1b8
> >> > [ 43.943185] [] ____fput+0x8/0x14
> >> > [ 43.947982] [] task_work_run+0xb0/0xe4
> >> > [ 43.953297] [] do_notify_resume+0x44/0x5c
> >> > [ 43.958867] ---[ end trace 5a2aa4027cc5d171 ]---
> >> >
> >> > Let's fix it by introducing a new optional callback (disable_signaling)
> >> > to fence operations so that drivers can do proper clean ups when we
> >> > remove last callback for given fence.
> >> >
> >> > Reviewed-by: Andrew Bresticker
> >> > Signed-off-by: Dmitry Torokhov
> >> > ---
> >> >  drivers/dma-buf/fence.c        | 6 +++++-
> >> >  drivers/staging/android/sync.c | 8 ++++++++
> >> >  include/linux/fence.h          | 2 ++
> >> >  3 files changed, 15 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c
> >> > index 7b05dbe..0ed73ad 100644
> >> > --- a/drivers/dma-buf/fence.c
> >> > +++ b/drivers/dma-buf/fence.c
> >> > @@ -304,8 +304,12 @@ fence_remove_callback(struct fence *fence, struct fence_cb *cb)
> >> >  	spin_lock_irqsave(fence->lock, flags);
> >> >
> >> >  	ret = !list_empty(&cb->node);
> >> > -	if (ret)
> >> > +	if (ret) {
> >> >  		list_del_init(&cb->node);
> >> > +		if (list_empty(&fence->cb_list))
> >> > +			if (fence->ops->disable_signaling)
> >> > +				fence->ops->disable_signaling(fence);
> >>
> >> What exactly is the bug here? A fence with no callbacks registered any
> >> more shouldn't have any problem. Why exactly does this blow up?
> >
> > The WARN_ON is probably this one:
> > https://android.googlesource.com/kernel/common/+/android-3.18/drivers/staging/android/sync.c#433
> >
> > I've been wondering in the last few days if this warning is really
> > necessary. If the user is closing a sync_timeline that has unsignalled
> > fences it should probably be aware of that already. Then I think it is
> > okay to remove the sync_pt from the active_list at release time.
> > In fact I've already prepared a patch doing that. Thoughts?
> >
>
> Maybe, but you need to make sure that you are only affecting your fences.
>
> My main objection is that it still leaves fence_remove_callback() not
> being a mirror image of fence_add_callback().

That's 100% intentional.

I looked at the sync.c code a bit more and it duplicates a bunch of the
fence stuff still. We need to either merge that code into the mainline
struct fence logic, or remove it.
There shouldn't really be any need for the userspace ABI layer to keep
track of active fences at all. Worse, it means that you must use the
sync_pt struct to be able to export it to userspace, and can't just export
any normal struct fence object. That breaks the abstraction we're aiming
for.

Imo just remove that WARN_ON for now.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
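
For readers following the thread, here is a rough userspace sketch of the shape of the hook the quoted patch proposes: an optional callback in the ops table that fires when the last registered callback is removed from a fence. It is only a model under stated assumptions. Every model_* name, counting_ops, no_hook_ops and the disable_calls counter are made up for illustration, the list is a hand-rolled stand-in for the kernel's list_head, and the fence->lock locking is left out. It is not the kernel implementation, and the reply above argues for dropping the WARN_ON rather than adding such a hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_fence;

/* Simplified stand-in for the fence ops table; only the hook under discussion. */
struct model_fence_ops {
	/* Optional: may be NULL, modelled on the proposed disable_signaling. */
	void (*disable_signaling)(struct model_fence *fence);
};

/* Doubly linked node; a node that points at itself is not on any list. */
struct model_cb {
	struct model_cb *prev, *next;
};

struct model_fence {
	const struct model_fence_ops *ops;
	struct model_cb cb_list;	/* list head */
	int disable_calls;		/* bookkeeping for this demo only */
};

static void model_cb_init(struct model_cb *cb)
{
	cb->prev = cb->next = cb;
}

static bool model_cb_unlinked(const struct model_cb *cb)
{
	return cb->next == cb;
}

static void model_fence_init(struct model_fence *fence,
			     const struct model_fence_ops *ops)
{
	fence->ops = ops;
	model_cb_init(&fence->cb_list);
	fence->disable_calls = 0;
}

static void model_add_callback(struct model_fence *fence, struct model_cb *cb)
{
	assert(model_cb_unlinked(cb));	/* must not already be queued */

	/* Append at the tail of the fence's callback list. */
	cb->prev = fence->cb_list.prev;
	cb->next = &fence->cb_list;
	fence->cb_list.prev->next = cb;
	fence->cb_list.prev = cb;
}

/* Returns true if cb was queued and has now been removed. */
static bool model_remove_callback(struct model_fence *fence, struct model_cb *cb)
{
	bool was_queued = !model_cb_unlinked(cb);

	if (was_queued) {
		cb->prev->next = cb->next;
		cb->next->prev = cb->prev;
		model_cb_init(cb);

		/* Last callback gone: invoke the optional hook, if any. */
		if (model_cb_unlinked(&fence->cb_list) &&
		    fence->ops->disable_signaling)
			fence->ops->disable_signaling(fence);
	}
	return was_queued;
}

/* Demo hook: just counts how often it was invoked. */
static void count_disable(struct model_fence *fence)
{
	fence->disable_calls++;
}

static const struct model_fence_ops counting_ops = {
	.disable_signaling = count_disable,
};

/* Ops table without the hook, showing that it is optional. */
static const struct model_fence_ops no_hook_ops = {
	.disable_signaling = NULL,
};
```

In this model the hook runs only when a removal actually empties the list. Removing a callback that is not queued returns false and leaves the hook alone, which matches the placement of the new code inside the if (ret) branch of the quoted diff.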