Date: Fri, 19 Feb 2010 00:31:13 -0500
From: Michael Breuer
Subject: Re: Hung task - sync - 2.6.33-rc7 w/md6 multicore rebuild in process
In-reply-to: <20100219040206.GE28392@discord.disaster>
To: Dave Chinner
Cc: Jan Kara, Linux Kernel Mailing List
Message-id: <4B7E2221.4020009@majjas.com>
References: <4B76D87E.3050107@majjas.com> <4B76EC83.5050401@majjas.com>
 <20100218023934.GC8897@atrey.karlin.mff.cuni.cz> <4B7D74BE.6030906@majjas.com>
 <20100219014349.GD28392@discord.disaster> <4B7DF80D.6090309@majjas.com>
 <20100219040206.GE28392@discord.disaster>

On 2/18/2010 11:02 PM, Dave Chinner wrote:
> On Thu, Feb 18, 2010 at 09:31:41PM -0500, Michael Breuer wrote:
>> On 2/18/2010 8:43 PM, Dave Chinner wrote:
>>>
>>> This is probably where the barrier IOs are coming from. With a RAID
>>> resync going on (so all IO is going to be slow to begin with) and
>>> writeback causing barriers to be issued (which are really slow on
>>> software RAID5/6), having sync take so long is not out of the
>>> question if you have lots of dirty inodes to write back. A kernel
>>> compile will generate lots of dirty inodes.
>>>
>>> Even taking the barrier IOs out of the question, I've seen reports
>>> of sync or unmount taking over 10 hours to complete on software
>>> RAID5 because there were hundreds of thousands of dirty inodes to
>>> write back and each inode being written back caused a synchronous
>>> RAID5 RMW cycle to occur. Hence writeback could only clean 50
>>> inodes/sec, because as soon as RAID5/6 devices start doing RMW
>>> cycles they go slower than single spindle devices. This sounds very
>>> similar to what you are seeing here.
>>>
>>> I.e. the reports don't indicate to me that there is a bug in the
>>> writeback code, just that your disk subsystem has very, very low
>>> throughput in these conditions....
>>>
>> Probably true... and the system does recover. The only thing I'd
>> point out is that the subsystem isn't (or perhaps shouldn't be) this
>> sluggish. I hypothesize that the low throughput under these
>> conditions is a result of:
>> 1) multicore raid support (pushing the resync at higher rates)
>>
> Possibly, though barrier support for RAID5/6 is shiny new as well.
>
>> 2) time spent in fs cache reclaim. The sync slowdown only occurs when
>> fs cache is in heavy (10Gb) use.
>>
> Not surprising ;)
>
>> I actually could not recreate the issue until I did a
>> grep -R foo /usr/ > /dev/null to force high fs cache utilization. For
>> what it's worth, two kernel rebuilds (many dirty inodes) and then a
>> sync with about 12Mb dirty (/proc/meminfo) didn't cause an issue. The
>> issue only happens when fs cache is heavily used. I also never saw
>> this before enabling multicore raid.
>>
> "grep -R foo /usr/" will dirty every inode that it touches (atime)
> and they have to be written back out.
> That's almost certainly creating more dirty inodes than a kernel
> build - there are about 400,000 inodes under /usr on my system. That
> would be enough to trigger very long sync times if inode writeback is
> slow.
>
> Cheers,
>
> Dave.
>
My filesystems are mounted relatime. Just confirmed that dirty pages
don't climb all that much with the grep -R foo /usr > /dev/null. The
only apparent impact is to fs cache.
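
For reference, 400,000 inodes at ~50 inodes/sec works out to roughly
8,000 seconds of writeback, so multi-hour sync times are plausible on
those numbers alone. Below is a minimal sketch of how to reproduce
while watching the writeback counters - the grep and sync are the
commands already mentioned in this thread, but the monitoring loop and
the 5-second interval are assumptions for illustration, not something
captured at the time:

  # Confirm the filesystems really are mounted relatime/noatime
  # (assumes they show up in /proc/mounts on this box).
  grep -E 'relatime|noatime' /proc/mounts

  # In one shell, sample the dirty/writeback counters every 5 seconds.
  while true; do
      date
      grep -E '^(Dirty|Writeback):' /proc/meminfo
      sleep 5
  done

  # In another shell, populate the fs cache, then time the sync.
  grep -R foo /usr > /dev/null
  time sync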