Date: Tue, 10 Nov 2020 14:46:23 -0500
From: "Theodore Y. Ts'o"
To: Chris Friesen
Cc: Jan Kara, linux-ext4@vger.kernel.org
Subject: Re: looking for assistance with jbd2 (and other processes) hung
 trying to write to disk
Message-ID: <20201110194623.GC2951190@mit.edu>

On Tue, Nov 10, 2020 at 09:57:39AM -0600, Chris Friesen wrote:
> No, there are quite a few of them. I've included them below. I agree, it's
> not clear who's holding the lock. Is there a way to find that out?
>
> Just to be sure, I'm looking for whoever has the BH_Lock bit set on the
> buffer_head "b_state" field, right? I don't see any ownership field the way
> we have for mutexes. Is there some way to find out who would have locked
> the buffer?

It's quite possible that the buffer was locked as part of doing I/O,
and we are just waiting for the I/O to complete. An example of this
is in journal_submit_commit_record(), where we lock the buffer using
lock_buffer(), and then call submit_bh() to submit the buffer for I/O.
When the I/O is completed, the buffer head will be unlocked, and we
can check the buffer_uptodate flag to see if the I/O completed
successfully. (See journal_wait_on_commit_record() for an example of
this.)

So the first thing I'd suggest doing is looking at the console output
or dmesg output from the crashdump to see if there are any clues in
terms of kernel messages from the device driver before things locked
up. This could be as simple as the device falling off the bus, in
which case there might be some kernel error messages from the block
layer or device driver that would give some insight.

Good luck,

					- Ted
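
The lock-then-submit pattern described above can be modelled in userspace
to show why the lock bit has no owner to look up: the submitter sets it,
and the I/O completion path clears it. The following is only a rough
sketch in Python, not kernel code. The class BufferHead, its end_io()
method, and the helpers healthy_device(), dead_device() and
write_and_wait() are hypothetical names made up for this illustration;
only lock_buffer, submit_bh and wait_on_buffer mirror the names of real
kernel functions, and their behaviour here is heavily simplified.

```python
import threading


class BufferHead:
    """Toy userspace stand-in for a kernel buffer_head (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self.locked = False      # models the BH_Lock bit in b_state
        self.uptodate = False    # models the BH_Uptodate bit

    def lock_buffer(self):
        # Sleep until the lock bit is clear, then set it.
        # Note that no owner is recorded anywhere.
        with self._cond:
            while self.locked:
                self._cond.wait()
            self.locked = True

    def end_io(self, ok):
        # Completion handler (hypothetical name): record the result,
        # clear the lock bit, and wake anyone waiting on the buffer.
        with self._cond:
            self.uptodate = ok
            self.locked = False
            self._cond.notify_all()

    def wait_on_buffer(self, timeout):
        # Returns True if the buffer was unlocked within `timeout` seconds.
        # (The real kernel wait has no timeout; one is used here so the
        # hung case can be demonstrated without hanging the script.)
        with self._cond:
            return self._cond.wait_for(lambda: not self.locked, timeout)


def submit_bh(bh, device):
    # Hand the locked buffer to the "device". The submitter does not
    # unlock the buffer itself; that is left to the completion path.
    device(bh)


def healthy_device(bh):
    # Completes the write from another thread, loosely like an
    # interrupt-driven I/O completion.
    threading.Thread(target=bh.end_io, args=(True,)).start()


def dead_device(bh):
    # Models a device that fell off the bus: the completion never arrives.
    pass


def write_and_wait(device, timeout=1.0):
    bh = BufferHead()
    bh.lock_buffer()
    submit_bh(bh, device)
    finished = bh.wait_on_buffer(timeout)
    return finished, bh.locked, bh.uptodate
```

With healthy_device the wait returns, the buffer is unlocked, and the
uptodate flag reports success. With dead_device the buffer stays locked
with nothing pointing at a holder, which is why the kernel messages from
the block layer or device driver are the place to look.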