Date: Fri, 18 Sep 2009 16:56:18 +0100
From: Mel Gorman
To: Nick Piggin
Cc: Pekka Enberg, linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	cl@linux-foundation.org, heiko.carstens@de.ibm.com, mingo@elte.hu,
	sachinp@in.ibm.com
Subject: Re: [RFC/PATCH] SLQB: Mark the allocator as broken PowerPC and S390
Message-ID: <20090918155618.GA1225@csn.ul.ie>
In-Reply-To: <20090917182842.GS18404@wotan.suse.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 17, 2009 at 08:28:42PM +0200, Nick Piggin wrote:
> On Thu, Sep 17, 2009 at 07:18:32PM +0100, Mel Gorman wrote:
> > > Ahh... it's pretty lame of me. Sachin has been a willing tester :(
> > > I have spent quite a few hours looking at it but I never found
> > > many good leads. Much appreciated if you can make more progress on
> > > it.
> >
> > Nothing much so far. I've reproduced the problem based on 2.6.31 and
> > slqb-core from Pekka's tree but not a whole pile else.
> > I don't know SLQB at all so the investigation is fuzzy. It appears to
> > initialise SLQB ok but crashes later when setting up SCSI. Not 100% sure
> > what the triggering event is, but it might be userspace starting up and
> > other CPUs getting involved, possibly corrupting lists.
> >
> > This machine has two CPUs (0, 1) and two nodes with actual memory (2, 3).
> > After applying a patch to kmem_cache_create, I see in the console
> >
> > MEL::Creating cache pgd_cache CPU 0 Node 0
> > MEL::Creating cache pmd_cache CPU 0 Node 0
> > MEL::Creating cache pid_namespace CPU 0 Node 0
> > MEL::Creating cache shmem_inode_cache CPU 0 Node 0
> > MEL::Creating cache scsi_data_buffer CPU 1 Node 0
> >
> > It crashes at this point during creation, before the struct kmem_cache
> > has been allocated from kmem_cache_cache. Note it's kmem_cache_cache we
> > are failing to allocate from, not scsi_data_buffer.
>
> Yes, it's crashing in kmem_cache_create, when trying to allocate from
> kmem_cache_cache.
>
> I didn't get much further. I had thought something must be NULL or
> not set up correctly in kmem_cache_cache, but I didn't work out what.
>

Somehow it's getting scrambled, but I couldn't see anything wrong with the
locking. Weirdly, the following patch allows the kernel to boot much
further. Is it possible the DEFINE_PER_CPU() trick for defining per-node
data is being busted by recent per-cpu changes? Sorry that this is pretty
crude; it's my first proper reading of SLQB, so it's slow going.

Although booting gets further with the following patch, it quickly hits an
OOM storm until the machine dies, with the vast majority of pages being
allocated by the slab allocator. There must be some flaw in the node
fallback logic that causes pages to be allocated for every slab allocation.
diff --git a/mm/slqb.c b/mm/slqb.c
index 4ca85e2..4d72be2 100644
--- a/mm/slqb.c
+++ b/mm/slqb.c
@@ -1944,16 +1944,16 @@ static void init_kmem_cache_node(struct kmem_cache *s,
 static DEFINE_PER_CPU(struct kmem_cache_cpu, kmem_cache_cpus);
 #endif
 
 #ifdef CONFIG_NUMA
-/* XXX: really need a DEFINE_PER_NODE for per-node data, but this is better than
- * a static array */
-static DEFINE_PER_CPU(struct kmem_cache_node, kmem_cache_nodes);
+/* XXX: really need a DEFINE_PER_NODE for per-node data because a static
+ * array is wasteful */
+static struct kmem_cache_node kmem_cache_nodes[MAX_NUMNODES];
 #endif
 
 #ifdef CONFIG_SMP
 static struct kmem_cache kmem_cpu_cache;
 static DEFINE_PER_CPU(struct kmem_cache_cpu, kmem_cpu_cpus);
 #ifdef CONFIG_NUMA
-static DEFINE_PER_CPU(struct kmem_cache_node, kmem_cpu_nodes); /* XXX per-nid */
+static struct kmem_cache_node kmem_cpu_nodes[MAX_NUMNODES]; /* XXX per-nid */
 #endif
 #endif
 
@@ -1962,7 +1962,7 @@ static struct kmem_cache kmem_node_cache;
 #ifdef CONFIG_SMP
 static DEFINE_PER_CPU(struct kmem_cache_cpu, kmem_node_cpus);
 #endif
-static DEFINE_PER_CPU(struct kmem_cache_node, kmem_node_nodes); /*XXX per-nid */
+static struct kmem_cache_node kmem_node_nodes[MAX_NUMNODES]; /*XXX per-nid */
 #endif
 
 #ifdef CONFIG_SMP
@@ -2918,15 +2918,15 @@ void __init kmem_cache_init(void)
 	for_each_node_state(i, N_NORMAL_MEMORY) {
 		struct kmem_cache_node *n;
 
-		n = &per_cpu(kmem_cache_nodes, i);
+		n = &kmem_cache_nodes[i];
 		init_kmem_cache_node(&kmem_cache_cache, n);
 		kmem_cache_cache.node_slab[i] = n;
 #ifdef CONFIG_SMP
-		n = &per_cpu(kmem_cpu_nodes, i);
+		n = &kmem_cpu_nodes[i];
 		init_kmem_cache_node(&kmem_cpu_cache, n);
 		kmem_cpu_cache.node_slab[i] = n;
 #endif
-		n = &per_cpu(kmem_node_nodes, i);
+		n = &kmem_node_nodes[i];
 		init_kmem_cache_node(&kmem_node_cache, n);
 		kmem_node_cache.node_slab[i] = n;
 	}