Date: Sun, 3 Jun 2012 14:15:48 -0400
From: Dave Jones
To: Hugh Dickins
Cc: Linus Torvalds, Bartlomiej Zolnierkiewicz, Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim, Rik van Riel, Andrew Morton, Cong Wang, Markus Trippelsdorf, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: WARNING: at mm/page-writeback.c:1990 __set_page_dirty_nobuffers+0x13a/0x170()
Message-ID: <20120603181548.GA306@redhat.com>
References: <20120530163317.GA13189@redhat.com> <20120531005739.GA4532@redhat.com> <20120601023107.GA19445@redhat.com> <20120601161205.GA1918@redhat.com> <20120601171606.GA3794@redhat.com>

On Fri, Jun 01, 2012 at 09:40:35PM -0700, Hugh Dickins wrote:

 > In which case, yes, much better to follow your suggestion, and hold
 > the lock (with irqs disabled) for only half the time.
 >
 > Similarly untested patch below.

Things aren't happy with that patch at all.

=============================================
[ INFO: possible recursive locking detected ]
3.5.0-rc1+ #50 Not tainted
---------------------------------------------
trinity-child1/31784 is trying to acquire lock:
 (&(&zone->lock)->rlock){-.-.-.}, at: [] suitable_migration_target.isra.15+0x19d/0x1e0

but task is already holding lock:
 (&(&zone->lock)->rlock){-.-.-.}, at: [] compaction_alloc+0x21b/0x2f0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&zone->lock)->rlock);
  lock(&(&zone->lock)->rlock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by trinity-child1/31784:
 #0:  (&mm->mmap_sem){++++++}, at: [] vm_mmap_pgoff+0x66/0xb0
 #1:  (&(&zone->lock)->rlock){-.-.-.}, at: [] compaction_alloc+0x21b/0x2f0

stack backtrace:
Pid: 31784, comm: trinity-child1 Not tainted 3.5.0-rc1+ #50
Call Trace:
 [] __lock_acquire+0x1584/0x1aa0
 [] ? trace_hardirqs_off_caller+0x28/0xc0
 [] ? local_clock+0x47/0x60
 [] lock_acquire+0x92/0x1f0
 [] ? suitable_migration_target.isra.15+0x19d/0x1e0
 [] ? _raw_spin_lock_irqsave+0x25/0x90
 [] _raw_spin_lock_irqsave+0x52/0x90
 [] ? suitable_migration_target.isra.15+0x19d/0x1e0
 [] suitable_migration_target.isra.15+0x19d/0x1e0
 [] compaction_alloc+0x22e/0x2f0
 [] migrate_pages+0xc7/0x540
 [] ? isolate_freepages_block+0x260/0x260
 [] compact_zone+0x216/0x480
 [] ? trace_hardirqs_off_caller+0x28/0xc0
 [] compact_zone_order+0x8d/0xd0
 [] ? get_page_from_freelist+0x565/0x970
 [] try_to_compact_pages+0xc9/0x140
 [] __alloc_pages_direct_compact+0xaa/0x1d0

Then a bunch of NMI backtraces, and a hard lockup.

	Dave
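
For anyone reading the report without the patch to hand, the pattern lockdep is flagging can be sketched in userspace. This is only an illustration, not the kernel code and not the patch under discussion: struct demo_zone, demo_zone_init(), demo_suitable_target() and demo_alloc_nested() are hypothetical names, and a POSIX error-checking mutex stands in for zone->lock so that the second acquire returns EDEADLK where a real spinlock would simply spin forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for struct zone; NOT the kernel's definition. */
struct demo_zone {
	pthread_mutex_t lock;
};

/*
 * Initialise the lock as an error-checking mutex. Assumption for the
 * demo only: this makes a recursive acquire fail with EDEADLK instead
 * of hanging, which is loosely what lockdep reports before the lockup.
 */
static int demo_zone_init(struct demo_zone *z)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&z->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Helper that takes the lock itself, like the inner frame in the trace. */
static int demo_suitable_target(struct demo_zone *z)
{
	int ret = pthread_mutex_lock(&z->lock);

	if (ret)
		return ret;	/* EDEADLK if this thread already holds it */
	pthread_mutex_unlock(&z->lock);
	return 0;
}

/* Caller that already holds the lock when it calls the helper. */
static int demo_alloc_nested(struct demo_zone *z)
{
	int ret = pthread_mutex_lock(&z->lock);

	assert(ret == 0);
	ret = demo_suitable_target(z);	/* second acquire, same thread */
	pthread_mutex_unlock(&z->lock);
	return ret;
}
```

Calling demo_zone_init() on a struct demo_zone and then demo_alloc_nested() on it returns EDEADLK. With a non-recursive kernel spinlock there is no such error return, which would be consistent with the hard lockup seen after the report.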