Return-Path: linux-nfs-owner@vger.kernel.org
Received: from mail-vc0-f176.google.com ([209.85.220.176]:60157 "EHLO mail-vc0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932354Ab3CDUKG (ORCPT ); Mon, 4 Mar 2013 15:10:06 -0500
Received: by mail-vc0-f176.google.com with SMTP id fk10so3614579vcb.21 for ; Mon, 04 Mar 2013 12:10:05 -0800 (PST)
MIME-Version: 1.0
In-Reply-To:
References: <4FA345DA4F4AE44899BD2B03EEEC2FA9286AD113@sacexcmbx05-prd.hq.netapp.com> <20130304100432.5c7ea704@tlielax.poochiereds.net> <4FA345DA4F4AE44899BD2B03EEEC2FA9286AD381@sacexcmbx05-prd.hq.netapp.com>
Date: Mon, 4 Mar 2013 12:10:05 -0800
Message-ID:
Subject: Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!
From: Mandeep Singh Baines
To: "Myklebust, Trond"
Cc: Ming Lei, Linus Torvalds, Al Viro, Jeff Layton, "J. Bruce Fields", Linux Kernel Mailing List, "linux-nfs@vger.kernel.org", "Rafael J. Wysocki", Ben Chan, Oleg Nesterov, Ingo Molnar
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Mon, Mar 4, 2013 at 12:09 PM, Mandeep Singh Baines wrote:
> On Mon, Mar 4, 2013 at 7:53 AM, Myklebust, Trond wrote:
>> On Mon, 2013-03-04 at 23:33 +0800, Ming Lei wrote:
>>> Hi,
>>>
>>> CC guys who introduced the lockdep change.
>>>
>>> On Mon, Mar 4, 2013 at 11:04 PM, Jeff Layton wrote:
>>>
>>> >
>>> > I don't get it -- why is it bad to hold a lock across a freeze event?
>>>
>>> At least this may deadlock another mount.nfs during freezing, :-)
>>>
>>> See detailed explanation in the commit log:
>>>
>>> commit 6aa9707099c4b25700940eb3d016f16c4434360d
>>> Author: Mandeep Singh Baines
>>> Date: Wed Feb 27 17:03:18 2013 -0800
>>>
>>> lockdep: check that no locks held at freeze time
>>>
>>> We shouldn't try_to_freeze if locks are held. Holding a lock can cause a
>>> deadlock if the lock is later acquired in the suspend or hibernate path
>>> (e.g. by dpm). Holding a lock can also cause a deadlock in the case of
>>> cgroup_freezer if a lock is held inside a frozen cgroup that is later
>>> acquired by a process outside that group.
>>>
>>
>> This is bloody ridiculous... If you want to add functionality to
>> implement cgroup or per-process freezing, then do it through some other
>> api instead of trying to push your problems onto others by adding new
>> global locking rules.
>>
>> Filesystems are a shared resource that have _nothing_ to do with process
>> cgroups. They need to be suspended when the network goes down or other
>> resources that they depend on are suspended. At that point, there is no
>> "what if I launch a new mount command?" scenario.
>>
>
> Hi Trond,
>
> My intention was to introduce new rules. My change simply introduces a

D'oh. s/was/was not/

Regards,
Mandeep

> check for a deadlock case that can already happen.
>
> I think a deadlock could happen under the following scenario:
>
> 1) An administrator wants to freeze a container. Perhaps to checkpoint
> it and it migrate it some place else.
> 2) An nfs mount was in progress so we hit this code path and freeze
> with a lock held.
> 3) Another container tries to nfs mount.
> 4) Deadlock.
>
> Regards,
> Mandeep
>
>> Trond
>> --
>> Trond Myklebust
>> Linux NFS client maintainer
>>
>> NetApp
>> Trond.Myklebust@netapp.com
>> www.netapp.com
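
For illustration only, the four-step scenario quoted above can be mimicked in user space. This is a rough sketch, not kernel code: the names `simulate_freeze`, `mount_lock`, `frozen` and `thaw` are hypothetical stand-ins, a Python thread plays the task inside the frozen container, and a timed acquire stands in for the second mount so the script reports the stall instead of hanging.

```python
import threading


def simulate_freeze(release_before_freeze: bool, timeout: float = 0.2) -> bool:
    """Return True if a second 'mounter' gets the lock while the first is frozen.

    Hypothetical model: mount_lock stands in for whatever lock the mount path
    holds, and waiting on `thaw` stands in for the task being frozen.
    """
    mount_lock = threading.Lock()
    frozen = threading.Event()  # first task has reached its freeze point
    thaw = threading.Event()    # administrator thaws the container

    def mounter_in_frozen_container() -> None:
        mount_lock.acquire()            # step 2: a mount is in progress
        if release_before_freeze:
            mount_lock.release()        # what the lockdep check asks for
        frozen.set()
        thaw.wait()                     # step 1: the container is frozen here
        if not release_before_freeze:
            mount_lock.release()        # only released after the thaw

    task = threading.Thread(target=mounter_in_frozen_container)
    task.start()
    frozen.wait()

    # Step 3: a task outside the frozen container tries to mount.
    # A timed acquire is used so that step 4 shows up as False, not a hang.
    acquired = mount_lock.acquire(timeout=timeout)
    if acquired:
        mount_lock.release()

    thaw.set()
    task.join()
    return acquired


if __name__ == "__main__":
    print("lock held across freeze:", simulate_freeze(False))
    print("lock dropped before freeze:", simulate_freeze(True))
```

With the lock held across the freeze, the outside task cannot make progress until the container is thawed; dropping the lock before the freeze point lets it through.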