LinuxLists.cc - 2.6.10-bkcurr: major slab corruption preventing booting on ARM

2005-01-04 14:43:58

Subject: 2.6.10-bkcurr: major slab corruption preventing booting on ARM

Hi.

I've had a report from a fellow ARM hacker of their platform not
booting. After they turned on slab debugging, they saw (pieced
together from a report on IRC):

Freeing init memory: 104K
run_init_process(/bin/bash)
Slab corruption: start=c0010934, len=160
Last user: [<c00adc54>](d_alloc+0x28/0x2d8)

I've just run up 2.6.10-bkcurr on a different ARM platform, and
encountered the following output. It looks like there are serious
slab corruption issues in these kernels.

I'll dig a little further into the report below to see if there's
anything obvious.

Starting up networking
eth0: link down
eth0: link up, 10Mbps, half-duplex, lpa 0x0021
Starting network services
slab: Internal list corruption detected in cache 'buffer_head'(63), slabp c7912000(16). Hexdump:

000: 00 01 10 00 00 02 20 00 14 01 00 00 14 21 91 c7
010: 10 00 00 00 10 00 00 00 fe ff ff ff fe ff ff ff
020: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
030: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
040: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
050: fe ff ff ff fe ff ff ff 11 00 00 00 12 00 00 00
060: 13 00 00 00 14 00 00 00 15 00 00 00 16 00 00 00
070: 17 00 00 00 0a 60 6b 6b 19 00 00 00 1a 00 00 00
080: 1b 00 00 00 1c 00 00 00 1d 00 00 00 1e 00 00 00
090: 1f 00 00 00 20 00 00 00 21 00 00 00 22 00 00 00
0a0: 23 00 00 00 24 00 00 00 25 00 00 00 26 00 00 00
0b0: 27 00 00 00 28 00 00 00 29 00 00 00 2a 00 00 00
0c0: 2b 00 00 00 2c 00 00 00 2d 00 00 00 2e 00 00 00
0d0: 2f 00 00 00 30 00 00 00 31 00 00 00 32 00 00 00
0e0: 33 00 00 00 34 00 00 00 35 00 00 00 36 00 00 00
0f0: 37 00 00 00 38 00 00 00 39 00 00 00 3a 00 00 00
kernel BUG at /home/rmk/build/linux-v2.6-local/mm/slab.c:1977!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 817 [#1]
Modules linked in:
CPU: 0
PC is at __bug+0x40/0x54
LR is at 0x1
pc : [<c00263f8>] lr : [<00000001>] Not tainted
sp : c03c5ee4 ip : 60000093 fp : c03c5ef4
r10: 00000007 r9 : 00000000 r8 : c7912018
r7 : 00000000 r6 : c039e8e0 r5 : c7912000 r4 : 00000000
r3 : 00000000 r2 : 00000000 r1 : 000012f3 r0 : 00000001
Flags: nZCv IRQs off FIQs on Mode SVC_32 Segment kernel
Control: 5717F Table: 07A44000 DAC: 00000017
Process events/0 (pid: 3, stack limit = 0xc03c4190)
Stack: (0xc03c5ee4 to 0xc03c6000)
5ee0: c7912114 c03c5f24 c03c5ef8 c005e66c c00263c8 c0399a28 00000007
5f00: c0399a18 c0399a28 c039e8e0 c023ee68 00000000 c023ee78 c03c5f44 c03c5f28
5f20: c005ef64 c005e5e0 c039e8e0 00000000 c039e950 00000001 c03c5f70 c03c5f48
5f40: c005f020 c005eee4 c038fea8 c023ee88 80000013 c038fea0 00000000 c038fe98
5f60: c005ef88 c03c5fc8 c03c5f74 c0048c14 c005ef98 ffffffff ffffffff 00000001
5f80: 00000000 c0035efc 00010000 00000000 00000000 c038d7c0 c0035efc 00100100
5fa0: 00200200 c03c4000 c03b3f34 c038fe98 c0048a4c fffffffc 00000000 c03c5ff4
5fc0: c03c5fcc c004d148 c0048a5c ffffffff ffffffff 00000000 00000000 00000000
5fe0: 00000000 00000000 00000000 c03c5ff8 c003b7b8 c004d0d4 00000000 00000000
Backtrace:
[<c00263b8>] (__bug+0x0/0x54) from [<c005e66c>] (free_block+0x9c/0x18c)
r4 = C7912114
[<c005e5d0>] (free_block+0x0/0x18c) from [<c005ef64>] (drain_array_locked+0x90/0xb4)
[<c005eed4>] (drain_array_locked+0x0/0xb4) from [<c005f020>] (cache_reap+0x98/0x208)
r7 = 00000001 r6 = C039E950 r5 = 00000000 r4 = C039E8E0
[<c005ef88>] (cache_reap+0x0/0x208) from [<c0048c14>] (worker_thread+0x1c8/0x258)
[<c0048a4c>] (worker_thread+0x0/0x258) from [<c004d148>] (kthread+0x84/0xb0)
[<c004d0c4>] (kthread+0x0/0xb0) from [<c003b7b8>] (do_exit+0x0/0x408)
r8 = 00000000 r7 = 00000000 r6 = 00000000 r5 = 00000000
r4 = 00000000
Code: 1b004cba e59f0014 eb004cb8 e3a03000 (e5833000)
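
The hexdump above can be read as little-endian 32-bit words. The sketch
below is one way to do that and to locate the single word that breaks the
otherwise incrementing run in the second half of the dump. It assumes the
dump starts at the struct slab header as laid out in mm/slab.c of that era
(list.next, list.prev, colouroff, s_mem, inuse, free) followed by the
bufctl array. The names slab_dump, le32 and find_broken_link are made up
for this illustration and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 256 bytes of the hexdump, copied in the order they were printed. */
static const uint8_t slab_dump[256] = {
	0x00,0x01,0x10,0x00, 0x00,0x02,0x20,0x00, 0x14,0x01,0x00,0x00, 0x14,0x21,0x91,0xc7,
	0x10,0x00,0x00,0x00, 0x10,0x00,0x00,0x00, 0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff,
	0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff,
	0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff,
	0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff,
	0xfe,0xff,0xff,0xff, 0xfe,0xff,0xff,0xff, 0x11,0x00,0x00,0x00, 0x12,0x00,0x00,0x00,
	0x13,0x00,0x00,0x00, 0x14,0x00,0x00,0x00, 0x15,0x00,0x00,0x00, 0x16,0x00,0x00,0x00,
	0x17,0x00,0x00,0x00, 0x0a,0x60,0x6b,0x6b, 0x19,0x00,0x00,0x00, 0x1a,0x00,0x00,0x00,
	0x1b,0x00,0x00,0x00, 0x1c,0x00,0x00,0x00, 0x1d,0x00,0x00,0x00, 0x1e,0x00,0x00,0x00,
	0x1f,0x00,0x00,0x00, 0x20,0x00,0x00,0x00, 0x21,0x00,0x00,0x00, 0x22,0x00,0x00,0x00,
	0x23,0x00,0x00,0x00, 0x24,0x00,0x00,0x00, 0x25,0x00,0x00,0x00, 0x26,0x00,0x00,0x00,
	0x27,0x00,0x00,0x00, 0x28,0x00,0x00,0x00, 0x29,0x00,0x00,0x00, 0x2a,0x00,0x00,0x00,
	0x2b,0x00,0x00,0x00, 0x2c,0x00,0x00,0x00, 0x2d,0x00,0x00,0x00, 0x2e,0x00,0x00,0x00,
	0x2f,0x00,0x00,0x00, 0x30,0x00,0x00,0x00, 0x31,0x00,0x00,0x00, 0x32,0x00,0x00,0x00,
	0x33,0x00,0x00,0x00, 0x34,0x00,0x00,0x00, 0x35,0x00,0x00,0x00, 0x36,0x00,0x00,0x00,
	0x37,0x00,0x00,0x00, 0x38,0x00,0x00,0x00, 0x39,0x00,0x00,0x00, 0x3a,0x00,0x00,0x00,
};

/* Read one little-endian 32-bit word at byte offset off. */
static uint32_t le32(const uint8_t *buf, size_t off)
{
	return (uint32_t)buf[off] |
	       (uint32_t)buf[off + 1] << 8 |
	       (uint32_t)buf[off + 2] << 16 |
	       (uint32_t)buf[off + 3] << 24;
}

/*
 * Return the byte offset of the first word whose two neighbours differ
 * by exactly 2 (so they belong to a run counting up by one) while the
 * word itself is not "previous + 1". Returns -1 if no such word exists.
 */
static long find_broken_link(const uint8_t *buf, size_t len)
{
	size_t off;

	assert(len % 4 == 0);
	for (off = 4; off + 8 <= len; off += 4) {
		uint32_t prev = le32(buf, off - 4);
		uint32_t cur = le32(buf, off);
		uint32_t next = le32(buf, off + 4);

		if (next == prev + 2 && cur != prev + 1)
			return (long)off;
	}
	return -1;
}
```

Called as find_broken_link(slab_dump, sizeof slab_dump), this returns
offset 0x74, where the word is 0x6b6b600a rather than the 0x18 its
neighbours imply. Two of its bytes are 0x6b, the value slab poisoning
writes into freed objects. Under the assumed layout, the first two words
(0x00100100 and 0x00200200) are the list poison values, and the word at
offset 0x0c (0xc7912114) matches r4 in the backtrace.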

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2005-01-04 16:14:06

by Russell King

Subject: Re: 2.6.10-bkcurr: major slab corruption preventing booting on ARM

On Tue, Jan 04, 2005 at 02:43:50PM +0000, Russell King wrote:
> I've had a report from a fellow ARM hacker of their platform not
> booting. After they turned on slab debugging, they saw (pieced
> together from a report on IRC):
>
> Freeing init memory: 104K
> run_init_process(/bin/bash)
> Slab corruption: start=c0010934, len=160
> Last user: [<c00adc54>](d_alloc+0x28/0x2d8)
>
> I've just run up 2.6.10-bkcurr on a different ARM platform, and
> encountered the following output. It looks like there are serious
> slab corruption issues in these kernels.
>
> I'll dig a little further into the report below to see if there's
> anything obvious.

Ok, reverting the pud_t patch fixes both these problems (the exact
patch can be found at: http://www.home.arm.linux.org.uk/~rmk/misc/bk4-bk5
Note that this is not a plain bk4-bk5 patch, but just the pud_t
changes brought forward to bk6 or thereabouts.)

So, something in the 4 level page table patches is causing random
scribbling in kernel memory.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2005-01-04 17:18:49

by Ben Dooks

Subject: Re: 2.6.10-bkcurr: major slab corruption preventing booting on ARM

2005-01-04 17:23:40

by Russell King

Subject: Re: 2.6.10-bkcurr: major slab corruption preventing booting on ARM

On Tue, Jan 04, 2005 at 04:10:49PM +0000, Russell King wrote:
> On Tue, Jan 04, 2005 at 02:43:50PM +0000, Russell King wrote:
> > I've had a report from a fellow ARM hacker of their platform not
> > booting. After they turned on slab debugging, they saw (pieced
> > together from a report on IRC):
> >
> > Freeing init memory: 104K
> > run_init_process(/bin/bash)
> > Slab corruption: start=c0010934, len=160
> > Last user: [<c00adc54>](d_alloc+0x28/0x2d8)
> >
> > I've just run up 2.6.10-bkcurr on a different ARM platform, and
> > encountered the following output. It looks like there are serious
> > slab corruption issues in these kernels.
> >
> > I'll dig a little further into the report below to see if there's
> > anything obvious.
>
> Ok, reverting the pud_t patch fixes both these problems (the exact
> patch can be found at: http://www.home.arm.linux.org.uk/~rmk/misc/bk4-bk5
> Note that this is not a plain bk4-bk5 patch, but just the pud_t
> changes brought forward to bk6 or thereabouts.)
>
> So, something in the 4 level page table patches is causing random
> scribbling in kernel memory.

Ok, I've narrowed the problem down to something in the following patch.
Andi Kleen suggests that maybe the ARM FIRST_USER_PGD_NR handling got
broken by something here. Nick, any ideas?

diff -urN linux-2.6.10-bk4/include/linux/mm.h linux-2.6.10-bk5/include/linux/mm.h
--- linux-2.6.10-bk4/include/linux/mm.h 2004-12-24 13:33:50.000000000 -0800
+++ linux-2.6.10-bk5/include/linux/mm.h 2005-01-02 04:55:30.285949371 -0800
@@ -566,7 +566,7 @@
 		struct vm_area_struct *start_vma, unsigned long start_addr,
 		unsigned long end_addr, unsigned long *nr_accounted,
 		struct zap_details *);
-void clear_page_tables(struct mmu_gather *tlb, unsigned long first, int nr);
+void clear_page_range(struct mmu_gather *tlb, unsigned long addr, unsigned long end);
 int copy_page_range(struct mm_struct *dst, struct mm_struct *src,
 			struct vm_area_struct *vma);
 int zeromap_page_range(struct vm_area_struct *vma, unsigned long from,
diff -urN linux-2.6.10-bk4/mm/memory.c linux-2.6.10-bk5/mm/memory.c
--- linux-2.6.10-bk4/mm/memory.c 2004-12-24 13:34:44.000000000 -0800
+++ linux-2.6.10-bk5/mm/memory.c 2005-01-02 04:55:31.265995181 -0800
@@ -34,6 +34,8 @@
  *
  * 16.07.99 - Support of BIGMEM added by Gerhard Wichert, Siemens AG
  * ([email protected])
+ *
+ * Aug/Sep 2004 Changed to four level page tables (Andi Kleen)
  */

 #include <linux/kernel_stat.h>
@@ -98,58 +100,107 @@
  * Note: this doesn't free the actual pages themselves. That
  * has been handled earlier when unmapping all the memory regions.
  */
-static inline void free_one_pmd(struct mmu_gather *tlb, pmd_t * dir)
+static inline void clear_pmd_range(struct mmu_gather *tlb, pmd_t *pmd, unsigned long start, unsigned long end)
 {
 	struct page *page;

-	if (pmd_none(*dir))
+	if (pmd_none(*pmd))
 		return;
-	if (unlikely(pmd_bad(*dir))) {
-		pmd_ERROR(*dir);
-		pmd_clear(dir);
+	if (unlikely(pmd_bad(*pmd))) {
+		pmd_ERROR(*pmd);
+		pmd_clear(pmd);
 		return;
 	}
-	page = pmd_page(*dir);
-	pmd_clear(dir);
-	dec_page_state(nr_page_table_pages);
-	tlb->mm->nr_ptes--;
-	pte_free_tlb(tlb, page);
+	if (!(start & ~PMD_MASK) && !(end & ~PMD_MASK)) {
+		page = pmd_page(*pmd);
+		pmd_clear(pmd);
+		dec_page_state(nr_page_table_pages);
+		tlb->mm->nr_ptes--;
+		pte_free_tlb(tlb, page);
+	}
 }

-static inline void free_one_pgd(struct mmu_gather *tlb, pgd_t * dir)
+static inline void clear_pud_range(struct mmu_gather *tlb, pud_t *pud, unsigned long start, unsigned long end)
 {
-	int j;
-	pmd_t * pmd;
+	unsigned long addr = start, next;
+	pmd_t *pmd, *__pmd;

-	if (pgd_none(*dir))
+	if (pud_none(*pud))
 		return;
-	if (unlikely(pgd_bad(*dir))) {
-		pgd_ERROR(*dir);
-		pgd_clear(dir);
+	if (unlikely(pud_bad(*pud))) {
+		pud_ERROR(*pud);
+		pud_clear(pud);
 		return;
 	}
-	pmd = pmd_offset(dir, 0);
-	pgd_clear(dir);
-	for (j = 0; j < PTRS_PER_PMD ; j++)
-		free_one_pmd(tlb, pmd+j);
-	pmd_free_tlb(tlb, pmd);
+
+	pmd = __pmd = pmd_offset(pud, start);
+	do {
+		next = (addr + PMD_SIZE) & PMD_MASK;
+		if (next > end || next <= addr)
+			next = end;
+
+		clear_pmd_range(tlb, pmd, addr, next);
+		pmd++;
+		addr = next;
+	} while (addr && (addr < end));
+
+	if (!(start & ~PUD_MASK) && !(end & ~PUD_MASK)) {
+		pud_clear(pud);
+		pmd_free_tlb(tlb, __pmd);
+	}
+}
+
+
+static inline void clear_pgd_range(struct mmu_gather *tlb, pgd_t *pgd, unsigned long start, unsigned long end)
+{
+	unsigned long addr = start, next;
+	pud_t *pud, *__pud;
+
+	if (pgd_none(*pgd))
+		return;
+	if (unlikely(pgd_bad(*pgd))) {
+		pgd_ERROR(*pgd);
+		pgd_clear(pgd);
+		return;
+	}
+
+	pud = __pud = pud_offset(pgd, start);
+	do {
+		next = (addr + PUD_SIZE) & PUD_MASK;
+		if (next > end || next <= addr)
+			next = end;
+
+		clear_pud_range(tlb, pud, addr, next);
+		pud++;
+		addr = next;
+	} while (addr && (addr < end));
+
+	if (!(start & ~PGDIR_MASK) && !(end & ~PGDIR_MASK)) {
+		pgd_clear(pgd);
+		pud_free_tlb(tlb, __pud);
+	}
 }

 /*
- * This function clears all user-level page tables of a process - this
- * is needed by execve(), so that old pages aren't in the way.
+ * This function clears user-level page tables of a process.
  *
  * Must be called with pagetable lock held.
  */
-void clear_page_tables(struct mmu_gather *tlb, unsigned long first, int nr)
+void clear_page_range(struct mmu_gather *tlb, unsigned long start, unsigned long end)
 {
-	pgd_t * page_dir = tlb->mm->pgd;
-
-	page_dir += first;
-	do {
-		free_one_pgd(tlb, page_dir);
-		page_dir++;
-	} while (--nr);
+	unsigned long addr = start, next;
+	unsigned long i, nr = pgd_index(end + PGDIR_SIZE-1) - pgd_index(start);
+	pgd_t * pgd = pgd_offset(tlb->mm, start);
+
+	for (i = 0; i < nr; i++) {
+		next = (addr + PGDIR_SIZE) & PGDIR_MASK;
+		if (next > end || next <= addr)
+			next = end;
+
+		clear_pgd_range(tlb, pgd, addr, next);
+		pgd++;
+		addr = next;
+	}
 }

 pte_t fastcall * pte_alloc_map(struct mm_struct *mm, pmd_t *pmd, unsigned long address)
diff -urN linux-2.6.10-bk4/mm/mmap.c linux-2.6.10-bk5/mm/mmap.c
--- linux-2.6.10-bk4/mm/mmap.c 2004-12-24 13:35:00.000000000 -0800
+++ linux-2.6.10-bk5/mm/mmap.c 2005-01-02 04:55:31.385000743 -0800
@@ -1474,7 +1474,6 @@
 {
 	unsigned long first = start & PGDIR_MASK;
 	unsigned long last = end + PGDIR_SIZE - 1;
-	unsigned long start_index, end_index;
 	struct mm_struct *mm = tlb->mm;

 	if (!prev) {
@@ -1499,23 +1498,18 @@
 				last = next->vm_start;
 		}
 		if (prev->vm_end > first)
-			first = prev->vm_end + PGDIR_SIZE - 1;
+			first = prev->vm_end;
 		break;
 	}
 no_mmaps:
 	if (last < first) /* for arches with discontiguous pgd indices */
 		return;
-	/*
-	 * If the PGD bits are not consecutive in the virtual address, the
-	 * old method of shifting the VA >> by PGDIR_SHIFT doesn't work.
-	 */
-	start_index = pgd_index(first);
-	if (start_index < FIRST_USER_PGD_NR)
-		start_index = FIRST_USER_PGD_NR;
-	end_index = pgd_index(last);
-	if (end_index > start_index) {
-		clear_page_tables(tlb, start_index, end_index - start_index);
-		flush_tlb_pgtables(mm, first & PGDIR_MASK, last & PGDIR_MASK);
+	if (first < FIRST_USER_PGD_NR * PGDIR_SIZE)
+		first = FIRST_USER_PGD_NR * PGDIR_SIZE;
+	/* No point trying to free anything if we're in the same pte page */
+	if ((first & PMD_MASK) < (last & PMD_MASK)) {
+		clear_page_range(tlb, first, last);
+		flush_tlb_pgtables(mm, first, last);
 	}
 }

@@ -1844,7 +1838,9 @@
 					~0UL, &nr_accounted, NULL);
 	vm_unacct_memory(nr_accounted);
 	BUG_ON(mm->map_count); /* This is just debugging */
-	clear_page_tables(tlb, FIRST_USER_PGD_NR, USER_PTRS_PER_PGD);
+	clear_page_range(tlb, FIRST_USER_PGD_NR * PGDIR_SIZE,
+			(TASK_SIZE + PGDIR_SIZE - 1) & PGDIR_MASK);
+
 	tlb_finish_mmu(tlb, 0, MM_VM_SIZE(mm));

 	vma = mm->mmap;
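
The address arithmetic in the new clear_page_range() can be tried out in
user space. The sketch below replays only its top-level loop (the nr
computation and the addr/next stepping) and counts how many of the
resulting ranges pass the both-ends-aligned test that clear_pmd_range()
applies before freeing a pte page. It is a model, not kernel code, and it
does not by itself identify the fault. The following values are
assumptions rather than something stated in this thread: a PGDIR_SHIFT of
21 with the pmd level folded into the pgd (so PMD_MASK equals
PGDIR_MASK), a FIRST_USER_PGD_NR of 1, a TASK_SIZE of 0xbf000000, and
32-bit addresses. walk_stats and walk_page_range are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ARM-like constants; see the note above. */
#define PGDIR_SHIFT		21
#define PGDIR_SIZE		((uint32_t)1 << PGDIR_SHIFT)
#define PGDIR_MASK		(~(PGDIR_SIZE - 1))
#define FIRST_USER_PGD_NR	1u
#define TASK_SIZE		0xbf000000u
#define pgd_index(addr)		((addr) >> PGDIR_SHIFT)

struct walk_stats {
	unsigned long nr;	/* iterations of the top-level loop */
	unsigned long aligned;	/* ranges aligned at both ends */
	uint32_t last_addr;	/* start of the final range */
	uint32_t last_next;	/* end of the final range */
};

/*
 * Replay the loop from the patched clear_page_range() without touching
 * any page tables. uint32_t stands in for a 32-bit unsigned long so
 * that the "next <= addr" wrap check behaves as it would on ARM.
 */
static struct walk_stats walk_page_range(uint32_t start, uint32_t end)
{
	struct walk_stats st = { 0, 0, 0, 0 };
	uint32_t addr = start, next;
	unsigned long i;

	assert(start <= end);
	st.nr = pgd_index(end + PGDIR_SIZE - 1) - pgd_index(start);

	for (i = 0; i < st.nr; i++) {
		next = (addr + PGDIR_SIZE) & PGDIR_MASK;
		if (next > end || next <= addr)
			next = end;

		/* Same shape as the PMD_MASK test in clear_pmd_range(). */
		if (!(addr & ~PGDIR_MASK) && !(next & ~PGDIR_MASK))
			st.aligned++;

		st.last_addr = addr;
		st.last_next = next;
		addr = next;
	}
	return st;
}
```

With those assumptions, the exit_mmap() arguments, that is
walk_page_range(FIRST_USER_PGD_NR * PGDIR_SIZE,
(TASK_SIZE + PGDIR_SIZE - 1) & PGDIR_MASK), give nr = 1527 with every
range aligned and the final range ending at 0xbf000000. An unaligned
pair such as walk_page_range(0x250000, 0x7fffff) gives nr = 3, and only
the middle range, 0x400000 to 0x600000, is aligned at both ends.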

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2005-01-05 02:02:29

by Nick Piggin

Subject: Re: 2.6.10-bkcurr: major slab corruption preventing booting on ARM

Russell King wrote:
> On Tue, Jan 04, 2005 at 04:10:49PM +0000, Russell King wrote:
>
>>On Tue, Jan 04, 2005 at 02:43:50PM +0000, Russell King wrote:
>>
>>>I've had a report from a fellow ARM hacker of their platform not
>>>booting. After they turned on slab debugging, they saw (pieced
>>>together from a report on IRC):
>>>
>>>Freeing init memory: 104K
>>>run_init_process(/bin/bash)
>>>Slab corruption: start=c0010934, len=160
>>>Last user: [<c00adc54>](d_alloc+0x28/0x2d8)
>>>
>>>I've just run up 2.6.10-bkcurr on a different ARM platform, and
>>>encountered the following output. It looks like there are serious
>>>slab corruption issues in these kernels.
>>>
>>>I'll dig a little further into the report below to see if there's
>>>anything obvious.
>>
>>Ok, reverting the pud_t patch fixes both these problems (the exact
>>patch can be found at: http://www.home.arm.linux.org.uk/~rmk/misc/bk4-bk5
>>Note that this is not a plain bk4-bk5 patch, but just the pud_t
>>changes brought forward to bk6 or thereabouts.)
>>
>>So, something in the 4 level page table patches is causing random
>>scribbling in kernel memory.
>
>
> Ok, I've narrowed the problem down to something in the following patch.
> Andi Kleen suggests that maybe the ARM FIRST_USER_PGD_NR handling got
> broken by something here. Nick, any ideas?
>

I see you've had a fix committed to -bk? Yes, that looks like it would
cause the problems you are seeing.

Thanks,
Nick