Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754415AbaAIIuE (ORCPT ); Thu, 9 Jan 2014 03:50:04 -0500 Received: from mail-ea0-f177.google.com ([209.85.215.177]:44998 "EHLO mail-ea0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751201AbaAIIt4 (ORCPT ); Thu, 9 Jan 2014 03:49:56 -0500 Date: Thu, 9 Jan 2014 09:49:52 +0100 From: Simon Baatz To: Mikulas Patocka Cc: Joonsoo Kim , Andi Kleen , Christoph Lameter , Pekka Enberg , linux-kernel@vger.kernel.org, linux-parisc@vger.kernel.org, Russell King - ARM Linux , linux-arm-kernel@lists.infradead.org Subject: Re: [PATCH] fix crash when using XFS on loopback Message-ID: <20140109084951.GA9665@schnuecks.de> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Mikulas, On Sat, Jan 04, 2014 at 12:45:45PM -0500, Mikulas Patocka wrote: > The patch 8456a648cf44f14365f1f44de90a3da2526a4776 causes crash in the > LVM2 testsuite on PA-RISC (the crashing test is fsadm.sh). The testsuite > doesn't crash on 3.12, crashes on 3.13-rc1 and later. > > Bad Address (null pointer deref?): Code=15 regs=000000413edd89a0 (Addr=000006202224647d) > CPU: 3 PID: 24008 Comm: loop0 Not tainted 3.13.0-rc6 #5 > task: 00000001bf3c0048 ti: 000000413edd8000 task.ti: 000000413edd8000 > > YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > PSW: 00001000000001101111100100001110 Not tainted > r00-03 000000ff0806f90e 00000000405c8de0 000000004013e6c0 000000413edd83f0 > r04-07 00000000405a95e0 0000000000000200 00000001414735f0 00000001bf349e40 > r08-11 0000000010fe3d10 0000000000000001 00000040829c7778 000000413efd9000 > r12-15 0000000000000000 000000004060d800 0000000010fe3000 0000000010fe3000 > r16-19 000000413edd82a0 00000041078ddbc0 0000000000000010 0000000000000001 > r20-23 0008f3d0d83a8000 0000000000000000 00000040829c7778 0000000000000080 > r24-27 00000001bf349e40 00000001bf349e40 202d66202224640d 00000000405a95e0 > r28-31 202d662022246465 000000413edd88f0 000000413edd89a0 0000000000000001 > sr00-03 000000000532c000 0000000000000000 0000000000000000 000000000532c000 > sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000401fe42c 00000000401fe430 > IIR: 539c0030 ISR: 00000000202d6000 IOR: 000006202224647d > CPU: 3 CR30: 000000413edd8000 CR31: 0000000000000000 > ORIG_R28: 00000000405a95e0 > IAOQ[0]: vma_interval_tree_iter_first+0x14/0x48 > IAOQ[1]: vma_interval_tree_iter_first+0x18/0x48 > RP(r2): flush_dcache_page+0x128/0x388 > Backtrace: > [<000000004013e6c0>] flush_dcache_page+0x128/0x388 > [<0000000010fe6ca0>] lo_splice_actor+0x90/0x148 [loop] > [<00000000402579b0>] splice_from_pipe_feed+0xc0/0x1d0 > [<00000000402580a4>] __splice_from_pipe+0xac/0xc0 > [<0000000010fe6bbc>] lo_direct_splice_actor+0x1c/0x70 [loop] > [<000000004025854c>] splice_direct_to_actor+0xec/0x228 > [<0000000010fe63ac>] lo_receive+0xe4/0x298 [loop] > [<0000000010fe69d8>] loop_thread+0x478/0x640 [loop] > [<000000004018975c>] kthread+0x134/0x168 > [<000000004012c020>] end_fault_vector+0x20/0x28 > [<00000000115e0098>] xfs_setsize_buftarg+0x0/0x90 [xfs] > > Kernel panic - not syncing: Bad Address (null pointer deref?) > > The patch 8456a648cf44f14365f1f44de90a3da2526a4776 changes the page > structure so that the slab subsystem reuses the page->mapping field. > > The crash happens in the following way: > * XFS allocates some memory from slab and issues a bio to read data into > it. > * the bio is sent to the loopback device. > * lo_receive creates an actor and calls splice_direct_to_actor. > * lo_splice_actor copies data to the target page. > * lo_splice_actor calls flush_dcache_page because the page may be mapped > by userspace. In that case we need to flush the kernel cache. > * flush_dcache_page asks for the list of userspace mappings, however that > page->mapping field is reused by the slab subsystem for a different > purpose. This causes the crash. > > Note that other architectures without coherent caches (sparc, arm, mips) > also call page_mapping from flush_dcache_page, so they may crash in the > same way. > > This patch fixes this bug by testing if the page is a slab page in > page_mapping and returning NULL if it is. > > > The patch also fixes VM_BUG_ON(PageSlab(page)) that could happen in > earlier kernels in the same scenario on architectures without cache > coherence when CONFIG_DEBUG_VM is enabled - so it should be backported to > stable kernels. > > > In the old kernels, the function page_mapping is placed in > include/linux/mm.h, so you should modify the patch accordingly when > backporting it. > > > Signed-off-by: Mikulas Patocka > Cc: stable@vger.kernel.org > > --- > mm/util.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > Index: linux-3.13-rc6/mm/util.c > =================================================================== > --- linux-3.13-rc6.orig/mm/util.c 2014-01-04 00:06:07.000000000 +0100 > +++ linux-3.13-rc6/mm/util.c 2014-01-04 00:24:42.000000000 +0100 > @@ -390,7 +390,10 @@ struct address_space *page_mapping(struc > { > struct address_space *mapping = page->mapping; > > - VM_BUG_ON(PageSlab(page)); > + /* This happens if someone calls flush_dcache_page on slab page */ > + if (unlikely(PageSlab(page))) > + return NULL; > + > if (unlikely(PageSwapCache(page))) { > swp_entry_t entry; I don't think that this is the correct fix. According to cachetlb.txt flush_(kernel_)dcache_page() is not supposed to be called with a slab page in the first place. There is code in the kernel to avoid that (see for example the discussion in [1] and [2]). Also on ARM, page_mapping() == NULL results in flush_(kernel_)dcache_page() assuming that the page is an anon page. Consequently, it would flush the slab page, which make no sense. Thus, I think we either need to add the check to the original caller of flush_dcache_page() or we allow flush_(kernel_)dcache_page() to be called with slab pages and put the check there (this has been proposed by Russell King once [3], but would affect multiple architectures) - Simon [1] https://lkml.org/lkml/2013/10/24/414 [2] https://lkml.org/lkml/2013/10/28/432 [3] https://lkml.org/lkml/2013/10/27/89 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/