Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932186Ab3CDS4w (ORCPT ); Mon, 4 Mar 2013 13:56:52 -0500
Received: from mail-da0-f46.google.com ([209.85.210.46]:57696 "EHLO
	mail-da0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1758570Ab3CDS4t (ORCPT );
	Mon, 4 Mar 2013 13:56:49 -0500
Date: Mon, 4 Mar 2013 10:56:45 -0800
From: Tejun Heo
To: richard -rw- weinberger
Cc: Srinivas Eeda, ocfs2-devel@oss.oracle.com, linux-fsdevel,
	ocfs2-users@oss.oracle.com, LKML, Ingo Molnar,
	akpm@linux-foundation.org
Subject: Re: [Ocfs2-users] [OCFS2] Crash at o2net_shutdown_sc()
Message-ID: <20130304185645.GK30413@htj.dyndns.org>
References: <513120AF.5030501@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2470
Lines: 55

Hello,

On Sat, Mar 02, 2013 at 09:41:54AM +0100, richard -rw- weinberger wrote:
> On Fri, Mar 1, 2013 at 10:42 PM, Srinivas Eeda wrote:
> > Yes that was the crash I was referring to which stopped me from testing my
> > other patch on mainline. I think the crashes started since some workqueue
> > patches introduced by commit 57b30ae77bf00d2318df711ef9a4d2a9be0a3a2a
> > Earlier kernels should be fine.
> >
> > Patch https://lkml.org/lkml/2012/10/18/592 tried to address one fix which
> > helped ramster that uses same ocfs2/o2net code. There still seems to be
> > another problem that crashes ocfs2.
>
> If commit 57b30ae (workqueue: reimplement cancel_delayed_work() using
> try_to_grab_pending())
> introduced that regression, it is time to CC Tejun.

Hmmm.....

> >> [ 1514.840690] BUG: unable to handle kernel NULL pointer dereference
> >> at 0000000000000028
> >> [ 1514.841627] IP: [] kernel_sock_ioctl+0x50/0x50

I suppose it's because either sock->ops or sock->ops->ioctl is NULL?
Can someone teach me what could lead to such conditions here and how
it could be affected by cancel_delayed_work()?

> >> [ 1514.841627] Call Trace:
> >> [ 1514.841627] [] ? o2net_shutdown_sc+0x106/0x1e0

I suppose this is the work function?  Wondering why '?' is there tho.

> >> [ 1514.841627] [] ? __switch_to+0x2a/0x4a0
> >> [ 1514.841627] [] ? _raw_spin_unlock_irq+0x12/0x40
> >> [ 1514.841627] [] ? finish_task_switch+0x56/0xc0
> >> [ 1514.841627] [] process_one_work+0x133/0x510
> >> [ 1514.841627] [] ?
> >> o2net_sc_connect_completed+0xf0/0xf0
> >> [ 1514.841627] [] worker_thread+0x15d/0x450
> >> [ 1514.841627] [] ? busy_worker_rebind_fn+0x100/0x100
> >> [ 1514.841627] [] kthread+0xbb/0xc0
> >> [ 1514.841627] [] ? e1000_regdump+0x262/0x3be
> >> [ 1514.841627] [] ? kthread_create_on_node+0x130/0x130
> >> [ 1514.841627] [] ret_from_fork+0x7c/0xb0
> >> [ 1514.841627] [] ? kthread_create_on_node+0x130/0x130

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/