Message-ID: <48f7fe350806220818n52cc36e9v28bdc6cf2c4cf841@mail.gmail.com>
Date: Sun, 22 Jun 2008 11:18:12 -0400
From: "Ryan Hope"
To: "Peter Zijlstra"
Subject: Re: [BUG] Lockless patches cause hardlock under heavy IO
Cc: "Nick Piggin", linux-mm@vger.kernel.org, LKML
In-Reply-To: <1214147244.3223.307.camel@lappy.programming.kicks-ass.net>
References: <48f7fe350806181415l4eba61b3i1d206de03147575e@mail.gmail.com> <1213863122.16944.257.camel@twins> <200806191819.31968.nickpiggin@yahoo.com.au> <48f7fe350806220737q7cc48d81g29b0fc85fc59d390@mail.gmail.com> <1214147244.3223.307.camel@lappy.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Well, in the current version of the patchset we are using, one user would start playing some game (disabling "Disable Heap Randomization" fixed the hardlocks for him)...
the other user got hardlocks when copying an ISO from a reiser4 partition to a reiserfs partition (disabling swap fixed the issue for him).

On Sun, Jun 22, 2008 at 11:07 AM, Peter Zijlstra wrote:
> On Sun, 2008-06-22 at 10:37 -0400, Ryan Hope wrote:
>> Well I couldn't stop playing with this... I am pretty sure the cause
>> of the hardlocks is in the second half of the patches (the speculative
>> page ref patches). I reversed all of those patches so that just the
>> GUP patches were included and no more hardlocks... then I applied the
>> concurrent page cache patches from the -rt branch, including 1 OLD
>> speculative page ref patch, and this caused hardlocks for people again.
>> However enabling heap randomization fixed the hardlocks for one of the
>> users and disabling swap fixed the issue for the other user. I hope
>> this helps.
>
> What are people doing to make it hang?
>
>> On Thu, Jun 19, 2008 at 4:19 AM, Nick Piggin wrote:
>> > On Thursday 19 June 2008 18:12, Peter Zijlstra wrote:
>> >> On Wed, 2008-06-18 at 17:15 -0400, Ryan Hope wrote:
>> >> > I applied the following patches from 2.6.26-rc5-mm3 to 2.6.26-rc6 and
>> >> > they caused a hardlock under heavy IO:
>> >>
>> >> What kind of machine, how much memory, how many spindles, what
>> >> filesystem and what is heavy load?
>> >>
>> >> Furthermore, try the NMI watchdog with serial/net-console to capture its
>> >> output.
>> >
>> >
>> > Good suggestions. A trace would be really helpful.
>> >
>> > As Arjan suggested, debug options especially CONFIG_DEBUG_VM would be
>> > a good idea to turn on if you haven't already.
>> >
>> > BTW. what was the reason for applying those patches? Did you hit the
>> > problem with -mm also, and hope to narrow it down?
>> >
>> >
>> >> > x86-implement-pte_special.patch
>> >> > mm-introduce-get_user_pages_fast.patch
>> >> > mm-introduce-get_user_pages_fast-fix.patch
>> >> > mm-introduce-get_user_pages_fast-checkpatch-fixes.patch
>> >> > x86-lockless-get_user_pages_fast.patch
>> >> > x86-lockless-get_user_pages_fast-checkpatch-fixes.patch
>> >> > x86-lockless-get_user_pages_fast-fix.patch
>> >> > x86-lockless-get_user_pages_fast-fix-2.patch
>> >> > x86-lockless-get_user_pages_fast-fix-2-fix-fix.patch
>> >> > x86-lockless-get_user_pages_fast-fix-warning.patch
>> >> > dio-use-get_user_pages_fast.patch
>> >> > splice-use-get_user_pages_fast.patch
>> >> > x86-support-1gb-hugepages-with-get_user_pages_lockless.patch
>> >> > #
>> >> > mm-readahead-scan-lockless.patch
>> >> > radix-tree-add-gang_lookup_slot-gang_lookup_slot_tag.patch
>> >> > #mm-speculative-page-references.patch: clameter saw bustage
>> >> > mm-speculative-page-references.patch
>> >> > mm-speculative-page-references-fix.patch
>> >> > mm-speculative-page-references-fix-fix.patch
>> >> > mm-speculative-page-references-hugh-fix3.patch
>> >> > mm-lockless-pagecache.patch
>> >> > mm-spinlock-tree_lock.patch
>> >> > powerpc-implement-pte_special.patch
>> >> >
>> >> > I am on an x86_64. I don't know what other info you need...
>> >
>> > Can you isolate it to one of the two groups of patches? I suspect it
>> > might be the latter so you might try that first -- this version of
>> > speculative page references is very nice in theory but it is a little
>> > more complex to implement the slowpaths so it could be an error there.
>> >
>
>