Date: Wed, 24 Feb 2010 09:41:11 +1100
From: Dave Chinner
To: Mikael Abrahamsson
Cc: linux-kernel@vger.kernel.org, dm-devel@redhat.com
Subject: Re: disk/crypto performance regression 2.6.31 -> 2.6.32 (mmap problem?)
Message-ID: <20100223224111.GC16175@discord.disaster>
References: <20100223044818.GD22370@discord.disaster> <20100223133219.GB16175@discord.disaster>

On Tue, Feb 23, 2010 at 09:42:40PM +0100, Mikael Abrahamsson wrote:
> On Wed, 24 Feb 2010, Dave Chinner wrote:
>
> Thought I should send the dump only to you, would you like me to post to
> the list as well and include a link to the logfile (think attachments
> shouldn't be done too much to lkml, right?). Feel free to post any
> information from this letter or attachment back onto lkml if you feel
> it's appropriate.

Attachments under roughly 100kb (IIRC) are fine for LKML; they are
allowed specifically so that people can attach log files and configs.

I've re-added the lkml CC to continue the discussion there, and I've
also added the DM mailing list so that the DM developers hear directly
that barriers are causing _significant_ system slowdowns. This matters
because there have been several reports of this problem on the XFS
forums since the start of the year as people upgrade kernels.

>> Barriers are only recently supported across DM and MD, so it would be
>> worth checking your logs for the last mount of the filesystems to
>> confirm.
>
> You're right, it doesn't say that anymore in 2.6.31, so I think I'm
> indeed running with barriers on.

And the stack traces confirm that. Every single set of blocked-process
output has this trace in it:

kdmflush      D 0000000000000000     0  3082      2 0x00000000
 ffff880208553d20 0000000000000046 ffff880210a5f640 0000000000015880
 ffff8802241ac7c0 0000000000015880 0000000000015880 0000000000015880
 0000000000015880 ffff8802241ac7c0 0000000000015880 0000000000015880
Call Trace:
 [] io_schedule+0x28/0x40
 [] dm_wait_for_completion+0x87/0x150
 [] ? default_wake_function+0x0/0x10
 [] ? __split_and_process_bio+0x107/0x170
 [] dm_flush+0x25/0x70
 [] process_barrier+0x59/0xf0
 [] dm_wq_work+0x3b/0x100
 [] ? dm_wq_work+0x0/0x100
 [] run_workqueue+0x95/0x170
 [] worker_thread+0xa4/0x120
 [] ? autoremove_wake_function+0x0/0x40
 [] ? worker_thread+0x0/0x120
 [] kthread+0xa6/0xb0
 [] child_rip+0xa/0x20
 [] ? kthread+0x0/0xb0
 [] ? child_rip+0x0/0x20

That is, DM appears to be blocked waiting for a barrier to complete,
and everything else is backed up behind it waiting for IO completion...

>>> Currently it's "lazy-count=0", so I'll change that setting tonight.
>
> I didn't do this before the test I'm referring to now.
>
>> When IO is really slow so we get a better idea of where things are
>> blocking. Running a few of these 30s apart will give a fair indication
>> of what is blocked and what is making progress....
>
> Attached to this email, logfile from yesterday and today.
>
> Some interesting parts as well, that didn't trigger from the above sysrq
> stuff:
>
> Feb 22 19:56:16 ub kernel: [201245.583915] INFO: task md0_raid5:425 blocked for more than 120 seconds.
> Feb 22 22:36:16 ub kernel: [210846.031875] INFO: task xfssyncd:3167 blocked for more than 120 seconds.
> Feb 23 18:10:18 ub kernel: [281287.492499] INFO: task kdmflush:3082 blocked for more than 120 seconds.
> Feb 23 18:12:18 ub kernel: [281407.491041] INFO: task kdmflush:3082 blocked for more than 120 seconds.
> Feb 23 21:36:18 ub kernel: [293647.665917] INFO: task md0_raid5:425 blocked for more than 120 seconds.
>
> Didn't really think md0_raid5 could be blocked like that...

md0_raid5 is blocked in:

 [] md_super_wait+0xc1/0xe0
 [] ? autoremove_wake_function+0x0/0x40
 [] md_update_sb+0x230/0x3b0
 [] md_check_recovery+0x272/0x590
 [] raid5d+0x21/0x320 [raid456]
 [] md_thread+0x44/0x120

I'm assuming that is just taking a long time because all the other IO
in flight is slowing it down. The other three reports (xfssyncd,
kdmflush) are all waiting for barrier IO to complete...

So it sounds like the immediate workaround for your extremely slow IO
problem is to mount the XFS filesystems with "-o nobarrier", but keep
in mind that this can result in filesystem corruption if a crash occurs.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
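
For readers wondering why a single barrier can stall the whole device:
the kdmflush trace above shows device-mapper sleeping in
dm_wait_for_completion() until every in-flight request has finished
before the flush can be issued, so one slow request holds up the entire
queue. The snippet below is a minimal userspace sketch of that
wait-for-drain pattern, not the actual drivers/md/dm.c code; the names
(inflight, submit_io, wait_for_drain) and the pthread-based
implementation are purely illustrative assumptions.

/*
 * Sketch of the wait-for-drain pattern behind a barrier stall.
 * Build with: cc -pthread drain.c
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  drained = PTHREAD_COND_INITIALIZER;
static int inflight;                    /* requests submitted but not completed */

static void submit_io(void)             /* start one request */
{
    pthread_mutex_lock(&lock);
    inflight++;
    pthread_mutex_unlock(&lock);
}

static void complete_io(void)           /* called when a request finishes */
{
    pthread_mutex_lock(&lock);
    if (--inflight == 0)
        pthread_cond_broadcast(&drained);   /* wake the barrier waiter */
    pthread_mutex_unlock(&lock);
}

static void wait_for_drain(void)        /* the dm_wait_for_completion() analogue */
{
    pthread_mutex_lock(&lock);
    while (inflight > 0)
        pthread_cond_wait(&drained, &lock); /* sleep until all IO completes */
    pthread_mutex_unlock(&lock);
}

static void *slow_request(void *arg)    /* simulate one slow in-flight request */
{
    (void)arg;
    sleep(2);                           /* pretend the disk is busy */
    complete_io();
    return NULL;
}

int main(void)
{
    pthread_t t;

    submit_io();
    pthread_create(&t, NULL, slow_request, NULL);

    printf("barrier issued; draining in-flight IO...\n");
    wait_for_drain();                   /* blocks ~2s: the stall in miniature */
    printf("drained; barrier can now complete\n");

    pthread_join(t, NULL);
    return 0;
}

Once the queue drains, the barrier proceeds, but everything submitted
in the meantime has been held up behind it, which matches the
"everything else is backed up waiting for IO completion" symptom
described above.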