From: Jan Kara
Subject: Re: process hangs in ext4_sync_file
Date: Wed, 23 Oct 2013 12:20:42 +0200
Message-ID: <20131023102042.GE1275@quack.suse.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-ext4@vger.kernel.org
To: Sandeep Joshi
Received: from cantor2.suse.de ([195.135.220.15]:59442 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752786Ab3JWKUp (ORCPT ); Wed, 23 Oct 2013 06:20:45 -0400
Content-Disposition: inline
Sender: linux-ext4-owner@vger.kernel.org

On Mon 21-10-13 18:09:02, Sandeep Joshi wrote:
> I am seeing a problem reported 4 years earlier
> https://lkml.org/lkml/2009/3/12/226
> (same stack as seen by Alexander)
>
> The problem is reproducible. Let me know if you need any info in
> addition to that seen below.
>
> I have multiple threads in a process doing heavy IO on an ext4
> filesystem mounted with (discard, noatime) on a SSD or HDD.
>
> This is on Linux 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14
> 16:19:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>
> For up to minutes at a time, one of the threads seems to hang in sync to disk.
>
> When I check the thread stack in /proc, I find that the stack is one
> of the following two:
>
> [] sleep_on_page+0xe/0x20
> [] wait_on_page_bit+0x78/0x80
> [] filemap_fdatawait_range+0x10c/0x1a0
> [] filemap_write_and_wait_range+0x68/0x80
> [] ext4_sync_file+0x6f/0x2b0
> [] vfs_fsync+0x2b/0x40
> [] sys_msync+0x143/0x1d0
> [] system_call_fastpath+0x1a/0x1f
> [] 0xffffffffffffffff
>
>
> OR
>
>
> [] jbd2_log_wait_commit+0xb5/0x130
> [] jbd2_complete_transaction+0x53/0x90
> [] ext4_sync_file+0x1ed/0x2b0
> [] vfs_fsync+0x2b/0x40
> [] sys_msync+0x143/0x1d0
> [] system_call_fastpath+0x1a/0x1f
> [] 0xffffffffffffffff
>
> Any clues?
  We are waiting for IO to complete. As a first step, try remounting your
filesystem without the 'discard' mount option. It often causes problems.

								Honza
-- 
Jan Kara
SUSE Labs, CR
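
Before remounting, it may help to confirm which ext4 filesystems are actually mounted with 'discard'. The following is a minimal sketch, not part of the original thread: it assumes the usual /proc/mounts layout (device, mount point, filesystem type, comma-separated options) and that ext4 lists 'discard' among the options when it is active. The function name ext4_mounts_with_discard is hypothetical.

```python
import os


def ext4_mounts_with_discard(mounts_text):
    """Return (device, mountpoint) pairs for ext4 mounts whose options include 'discard'.

    mounts_text is assumed to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        # Compare whole option names so 'nodiscard' does not match 'discard'.
        if fstype == "ext4" and "discard" in options.split(","):
            hits.append((device, mountpoint))
    return hits


if __name__ == "__main__":
    # Only meaningful on Linux, where /proc/mounts exists.
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            for device, mountpoint in ext4_mounts_with_discard(f.read()):
                print(f"{device} on {mountpoint} is mounted with 'discard'")
```

Any filesystem this prints is a candidate for the remount suggested above. If nothing is printed, 'discard' is probably not in effect and the hang likely has another cause.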