Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756661Ab1CWTjN (ORCPT ); Wed, 23 Mar 2011 15:39:13 -0400
Received: from mx5.twosigma.com ([208.77.212.35]:55241 "EHLO mx5.twosigma.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756546Ab1CWTjL convert rfc822-to-8bit (ORCPT ); Wed, 23 Mar 2011 15:39:11 -0400
From: Sean Noonan
To: Sean Noonan, "'linux-kernel@vger.kernel.org'"
CC: Trammell Hudson, Martin Bligh, Stephen Degler, Christos Zoulas, "'linux-xfs@oss.sgi.com'"
Date: Wed, 23 Mar 2011 15:39:05 -0400
Subject: RE: XFS memory allocation deadlock in 2.6.38
Thread-Topic: XFS memory allocation deadlock in 2.6.38
Thread-Index: Acvn48onhwj/45wTTJCiJ9OQbQPjgABrWqTQ
Message-ID: <081DDE43F61F3D43929A181B477DCA95639B5327@MSXAOA6.twosigma.com>
References: <081DDE43F61F3D43929A181B477DCA95639B52FD@MSXAOA6.twosigma.com>
In-Reply-To: <081DDE43F61F3D43929A181B477DCA95639B52FD@MSXAOA6.twosigma.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4828
Lines: 111

I believe this patch fixes the behavior:

diff --git a/mm/memory.c b/mm/memory.c
index e48945a..740d5ab 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3461,7 +3461,9 @@ int make_pages_present(unsigned long addr, unsigned long end)
 	 * to break COW, except for shared mappings because these don't COW
 	 * and we would not want to dirty them for nothing.
 	 */
-	write = (vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE;
+	write = (vma->vm_flags & VM_WRITE) != 0;
+	if (write && ((vma->vm_flags & VM_SHARED) !=0) && (vma->vm_file == NULL))
+		write = 0;
 	BUG_ON(addr >= end);
 	BUG_ON(end > vma->vm_end);
 	len = DIV_ROUND_UP(end, PAGE_SIZE) - addr/PAGE_SIZE;

This was traced to the following commit:

5ecfda041e4b4bd858d25bbf5a16c2a6c06d7272 is the first bad commit
commit 5ecfda041e4b4bd858d25bbf5a16c2a6c06d7272
Author: Michel Lespinasse
Date:   Thu Jan 13 15:46:09 2011 -0800

    mlock: avoid dirtying pages and triggering writeback

    When faulting in pages for mlock(), we want to break COW for anonymous or
    file pages within VM_WRITABLE, non-VM_SHARED vmas.  However, there is no
    need to write-fault into VM_SHARED vmas since shared file pages can be
    mlocked first and dirtied later, when/if they actually get written to.
    Skipping the write fault is desirable, as we don't want to unnecessarily
    cause these pages to be dirtied and queued for writeback.

    Signed-off-by: Michel Lespinasse
    Cc: Hugh Dickins
    Cc: Rik van Riel
    Cc: Kosaki Motohiro
    Cc: Peter Zijlstra
    Cc: Nick Piggin
    Cc: Theodore Tso
    Cc: Michael Rubin
    Cc: Suleiman Souhlal
    Cc: Dave Chinner
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

:040000 040000 604eede2f45b7e5276ce9725b715ed15a868861d 3c175eadf4cf33d4f78d4d455c9a04f3df2c199e M	mm

-----Original Message-----
From: Sean Noonan
Sent: Monday, March 21, 2011 12:20
To: 'linux-kernel@vger.kernel.org'
Cc: Trammell Hudson; Martin Bligh; Stephen Degler; Christos Zoulas
Subject: XFS memory allocation deadlock in 2.6.38

This message was originally posted to the XFS mailing list, but received no responses.  Thus, I am sending it to LKML on the advice of Martin.

Using the attached program, we are able to reproduce this bug reliably.
$ make vmtest
$ ./vmtest /xfs/hugefile.dat $(( 16 * 1024 * 1024 * 1024 ))
# vmtest /xfs/hugefile.dat: mapped 17179869184 bytes in 33822066943 ticks
749660: avg 13339 max 234667 ticks
371945: avg 26885 max 281616 ticks
---

At this point, we see the following on the console:
[593492.694806] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[593506.724367] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[593524.837717] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[593556.742386] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

This is the same message presented in
http://oss.sgi.com/bugzilla/show_bug.cgi?id=410

We started testing with 2.6.38-rc7 and have seen this bug through to the .0 release.  This does not appear to be present in 2.6.33, but we have not done testing in between.  We have tested with ext4 and do not encounter this bug.

CONFIG_XFS_FS=y
CONFIG_XFS_QUOTA=y
CONFIG_XFS_POSIX_ACL=y
CONFIG_XFS_RT=y
# CONFIG_XFS_DEBUG is not set
# CONFIG_VXFS_FS is not set

Here is the stack from the process:
[] call_rwsem_down_write_failed+0x13/0x20
[] xfs_ilock+0x7e/0x110
[] __xfs_get_blocks+0x8f/0x4e0
[] xfs_get_blocks+0x11/0x20
[] __block_write_begin+0x1ee/0x5b0
[] block_page_mkwrite+0x9d/0xf0
[] xfs_vm_page_mkwrite+0x15/0x20
[] do_wp_page+0x54b/0x820
[] handle_pte_fault+0x3cc/0x820
[] handle_mm_fault+0x175/0x2f0
[] do_page_fault+0x159/0x470
[] page_fault+0x1f/0x30
[] 0xffffffffffffffff

# uname -a
Linux testhost 2.6.38 #2 SMP PREEMPT Fri Mar 18 15:00:59 GMT 2011 x86_64 GNU/Linux

Please let me know if additional information is required.

Thanks!

Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/