Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S262814AbVAKQJm (ORCPT ); Tue, 11 Jan 2005 11:09:42 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S262813AbVAKQJm (ORCPT ); Tue, 11 Jan 2005 11:09:42 -0500 Received: from penguin.cohaesio.net ([212.97.129.34]:55739 "EHLO mail.cohaesio.net") by vger.kernel.org with ESMTP id S262816AbVAKQJO convert rfc822-to-8bit (ORCPT ); Tue, 11 Jan 2005 11:09:14 -0500 From: Anders Saaby Organization: Cohaesio A/S To: linux-kernel@vger.kernel.org Subject: Re: panic - Attempting to free lock with active block list Date: Tue, 11 Jan 2005 17:09:42 +0100 User-Agent: KMail/1.7.2 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8BIT Content-Disposition: inline Message-Id: <200501111709.42934.as@cohaesio.com> X-OriginalArrivalTime: 11 Jan 2005 16:09:11.0596 (UTC) FILETIME=[E3515EC0:01C4F7F7] Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2667 Lines: 82 Hi Myklebust(s) :) I have seen the exact same error on one of my webservers which is serving from an NFS export and under heavy load. ~2 hours uptime before panic'ing. I then tried Trond's patch which seems to work. 14 hours of uptime now. :) Anyways, I have a couple of issues you might be able to clear up for me: First issue: New strange message in the kernel log: "nlmclnt_lock: VFS is out of sync with lock manager!" - What does this mean? - Is it bad?, What can i do? Second issue: my fs/nfs/file.c doesn't look like yours (Vanilla 2.6.10): ????????status?=?NFS_PROTO(inode)->lock(filp,?cmd,?fl); ????????/*?If?we?were?signalled?we?still?need?to?ensure?that ?????????*?we?clean?up?any?state?on?the?server.?We?therefore ?????????*?record?the?lock?call?as?having?succeeded?in?order?to ?????????*?ensure?that?locks_remove_posix()?cleans?it?out?when ?????????*?the?process?exits. ?????????*/ ????????if?(status?==?-EINTR?||?status?==?-ERESTARTSYS) ????????????????posix_lock_file_wait(filp,?fl); ????????unlock_kernel(); ????????if?(status?f_mapping); ????????down(&inode->i_sem); ????????nfs_wb_all(inode);??????/*?we?may?have?slept?*/ ????????up(&inode->i_sem); ????????filemap_fdatawait(filp->f_mapping); ????????nfs_zap_caches(inode); ????????return?0; So... Am I missing another patch or something else? Jan-Frode Myklebust wrote: > On Wed, Jan 05, 2005 at 10:54:03PM +0100, Trond Myklebust wrote: >> >> Looking at the NFS code, I can attempt a wild guess about what may be >> happening: there may be a race when pressing ^C in the middle of a >> blocking NFS lock RPC call, and if so, the following patch will fix it. > > > A whopping 9 hours of uptime now :) So the one-liner patch seems to have > fixed it. > > Thanks! > >> - posix_lock_file(filp, fl); >> + posix_lock_file_wait(filp, fl); > > > -jf -- Med venlig hilsen - Best regards - Meilleures salutations Anders Saaby Systems Engineer ------------------------------------------------ Cohaesio A/S - Maglebjergvej 5D - DK-2800 Lyngby Phone: +45 45 880 888 - Fax: +45 45 880 777 Mail: as@cohaesio.com - http://www.cohaesio.com ------------------------------------------------ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/