Date: Fri, 23 Dec 2016 08:06:07 +1100
From: Dave Chinner
To: Linus Torvalds
Cc: Thomas Gleixner, Ingo Molnar, Peter Anvin,
	Linux Kernel Mailing List, the arch/x86 maintainers
Subject: Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0
Message-ID: <20161222210607.GK4758@dastard>
References: <20161214222411.GH4326@dastard>
	<20161214222953.GI4326@dastard>
	<20161216185906.t2wmrr6wqjdsrduw@straylight.hirudinean.org>
	<20161221221638.GD4758@dastard>
	<20161222001303.nvrtm22szn3hgxar@straylight.hirudinean.org>
	<20161222051322.GF4758@dastard>
	<20161222062858.GG4758@dastard>
	<20161222204240.GJ4758@dastard>
In-Reply-To: <20161222204240.GJ4758@dastard>

On Fri, Dec 23, 2016 at 07:42:40AM +1100, Dave Chinner wrote:
> On Thu, Dec 22, 2016 at 09:24:12AM -0800, Linus Torvalds wrote:
> > On Wed, Dec 21, 2016 at 10:28 PM, Dave Chinner wrote:
> > >
> > > This sort of thing is normally indicative of a memory reclaim or
> > > lock contention problem. Profile showed unusual spinlock contention,
> > > but then I realised there was only one kswapd thread running.
> > > Yup, sure enough, it's caused by a major change in memory reclaim
> > > behaviour:
> > >
> > > [    0.000000] Zone ranges:
> > > [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> > > [    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
> > > [    0.000000]   Normal   [mem 0x0000000100000000-0x000000083fffffff]
> > > [    0.000000] Movable zone start for each node
> > > [    0.000000] Early memory node ranges
> > > [    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009efff]
> > > [    0.000000]   node   0: [mem 0x0000000000100000-0x00000000bffdefff]
> > > [    0.000000]   node   0: [mem 0x0000000100000000-0x00000003bfffffff]
> > > [    0.000000]   node   0: [mem 0x00000005c0000000-0x00000005ffffffff]
> > > [    0.000000]   node   0: [mem 0x0000000800000000-0x000000083fffffff]
> > > [    0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000083fffffff]
> > >
> > > the numa=fake=4 CLI option is broken.
> >
> > Ok, I think that is independent of anything else. Removing block
> > people and adding the x86 people.
> >
> > I'm not seeing anything at all that would change the fake numa stuff,
> > but maybe the cpu hotplug changes?
> >
> > Thomas/Ingo/Peter - Dave is going away for several months, so you
> > won't get feedback from him, but can you look at this? Or maybe point
> > me towards the right people - I'm seeing no possible relevant changes
> > at all for x86 numa since 4.9, so it must be some indirect breakage.
> >
> > Dave is using fake-numa to do performance testing in a VM, and it's a
> > big deal for the node optimizations for writeback etc. Do you have any
> > ideas?
> >
> > Dave, if you're still around, can you send out the kernel config file
> > you used...
>
> Looking at this fresh this morning (i.e. not pissed off by having
> everything I tried to do fail in different ways all afternoon) I
> found this:
>
> $ grep NUMA .config
> CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
> # CONFIG_NUMA is not set
> $
>
> The .config I was using for 4.9 got 'make oldconfig' upgraded, and
> looking at it there's a bunch of stuff that has been turned off that
> I know was set:
>
> # CONFIG_EXPERT is not set
> # CONFIG_PARAVIRT_SPINLOCKS is not set
> # CONFIG_COMPACTION is not set
>
> and stuff I never use so don't set was set, like kernel crash dump,
> a bunch of stuff for AMD CPUs, susp/resume and power management
> debug, every partition type and filesystem under the sun was
> selected, heaps of network devices enabled, etc.
>
> So it looks like the problem has occurred during oldconfig, meaning
> I have no idea exactly WTF I was testing. Rebuilding now with a
> saner config, see what happens.

Better, but still bad. Average files/s is not up to 200k files/s, so
still a good 10-15% off where it should be. xfs_repair is back down to
10-15% off where it should be, too. bulkstat still fires off a bad
page reference count warning, iscsi still panics immediately.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
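
A minimal sketch of how the oldconfig damage described above could be
checked mechanically, assuming the known-good 4.9 .config is still
around to compare against. The function names are hypothetical, the
"old" values in the usage example are illustrative (the mail only says
those options were known to be set), and treating a symbol that is
absent from a file as "n" is a simplification, not how Kconfig itself
defines things.

```python
import re

# Matches "# CONFIG_FOO is not set" and "CONFIG_FOO=value" lines.
_UNSET = re.compile(r"# (CONFIG_\w+) is not set$")
_SET = re.compile(r"(CONFIG_\w+)=(.*)$")


def parse_config(text):
    """Map each CONFIG_ symbol in a .config to its value ('n' if not set)."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
            continue
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
    return opts


def config_changes(old_text, new_text):
    """Return {symbol: (old_value, new_value)} for every symbol that differs.

    Assumption: a symbol missing from one file is treated as 'n'.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name, "n")
        after = new.get(name, "n")
        if before != after:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    # Illustrative fragments only, not the real config files.
    old_cfg = "CONFIG_NUMA=y\nCONFIG_EXPERT=y\nCONFIG_COMPACTION=y\n"
    new_cfg = (
        "# CONFIG_NUMA is not set\n"
        "# CONFIG_EXPERT is not set\n"
        "# CONFIG_COMPACTION is not set\n"
    )
    for name, (before, after) in config_changes(old_cfg, new_cfg).items():
        print(f"{name}: {before} -> {after}")
```

Run against the old and new .config contents, this lists every option
that flipped across the upgrade, which makes a silently dropped
CONFIG_NUMA visible before any benchmark numbers are collected.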