LinuxLists.cc - s390 hugetlb oops with libhugetlbfs test-suite

2008-07-25 21:11:31

Subject: s390 hugetlb oops with libhugetlbfs test-suite

Hi Gerald,

Having noticed that the hugetlbfs code recently got an update to work on s390,
I decided to port the libhugetlbfs test-suite to work on s390. The current
development snapshot is available at
http://libhugetlbfs.ozlabs.org/snapshots/libhugetlbfs-dev-20080724.tar.gz. You
will need a patch [1] to build on s390, but it should roughly work otherwise.
The specific testcase that causes the oops is counters. The bit set in the
flags is PG_reserved, I think.
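
(Archive note: the flags value in the dump below can be decoded bit by bit. The following is a minimal sketch, assuming the low page-flag bit numbers as recalled from the 2.6.26 include/linux/page-flags.h; the table and the helper name decode_page_flags are illustrative, so check the numbers against the tree you are actually running. On 64-bit kernels the upper bits of the word may carry zone/node information, which this sketch would simply report as unknown bits.)

```python
# Assumed bit numbers, recalled from 2.6.26 include/linux/page-flags.h.
# Verify against your own kernel source before relying on them.
PAGE_FLAG_BITS = {
    0: "PG_locked",
    1: "PG_error",
    2: "PG_referenced",
    3: "PG_uptodate",
    4: "PG_dirty",
    5: "PG_lru",
    6: "PG_active",
    7: "PG_slab",
    8: "PG_owner_priv_1",
    9: "PG_arch_1",
    10: "PG_reserved",
    11: "PG_private",
    12: "PG_writeback",
}


def decode_page_flags(flags):
    """Return the names of all set bits, lowest bit first."""
    names = []
    bit = 0
    while flags >> bit:
        if (flags >> bit) & 1:
            names.append(PAGE_FLAG_BITS.get(bit, "bit %d (unknown)" % bit))
        bit += 1
    return names


# flags:0x0000000000000400 from the "Bad page state" report
print(decode_page_flags(0x0000000000000400))  # ['PG_reserved']
```

0x400 is 1 << 10, so under the assumed numbering only bit 10 is set, which matches the PG_reserved guess.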

Bad page state in process 'counters'
page:000003e040000000 flags:0x0000000000000400 mapping:0000000000000000 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
CPU: 3 Not tainted 2.6.26-autokern1 #1
Process counters (pid: 5578, task: 000000003f3ca958, ksp: 000000003f1bbc08)
0000000000000005 000000003f1bb920 0000000000000002 0000000000000000
000000003f1bb9c0 000000003f1bb938 000000003f1bb938 000000000003cfe2
0000000000000000 0000000000000400 0000000000000000 000000000000000b
0000000000000008 0000000000000000 000000003f1bb920 000000003f1bb998
000000000024a780 0000000000016b0a 000000003f1bb920 000000003f1bb970
Call Trace:
([<0000000000016a6c>] show_trace+0xdc/0xec)
[<000000000007a38a>] bad_page+0x96/0xc8
[<000000000007b11c>] free_hot_cold_page+0x9c/0x1e4
[<0000000000099e4a>] __unmap_hugepage_range+0x38a/0x3cc
[<0000000000099edc>] unmap_hugepage_range+0x50/0x78
[<00000000000877d6>] unmap_vmas+0x14e/0xb34
[<000000000008c6f2>] exit_mmap+0x14a/0x2f8
[<00000000000397dc>] mmput+0x60/0x110
[<00000000000401b4>] do_exit+0x264/0x7f4
[<00000000000407d6>] do_group_exit+0x92/0xc0
[<000000000004b558>] get_signal_to_deliver+0x344/0x36c
[<000000000001d746>] do_signal+0xe2/0x878
[<0000000000023b7a>] sysc_sigpending+0xe/0x22
[<000000000040123a>] 0x40123a

[1]

libhugetlbfs: Add basic s390x support

The tests/Makefile change is needed until the signal check can be added
for s390.

diff --git a/Makefile b/Makefile
index f1f83fa..59fdb47 100644
--- a/Makefile
+++ b/Makefile
@@ -63,6 +63,13 @@ CC64 = gcc -m64
 LIB64 = lib64
 CFLAGS += -DNO_ELFLINK
 else
+ifeq ($(ARCH),s390x)
+CC64 = gcc -m64
+CC32 = gcc -m31
+LIB64 = lib64
+LIB32 = lib
+CFLAGS += -DNO_ELFLINK
+else
 $(error "Unrecognized architecture ($(ARCH))")
 endif
 endif
@@ -70,6 +77,7 @@ endif
 endif
 endif
 endif
+endif

 ifdef CC32
 OBJDIRS += obj32
diff --git a/tests/Makefile b/tests/Makefile
index b723db7..ebd7fb8 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -2,7 +2,7 @@ PREFIX = /usr/local

 LIB_TESTS = gethugepagesize test_root find_path unlinked_fd misalign \
  readback truncate shared private fork-cow empty_mounts large_mounts \
- meminfo_nohuge ptrace-write-hugepage icache-hygiene slbpacaflush \
+ meminfo_nohuge ptrace-write-hugepage slbpacaflush \
  chunk-overcommit mprotect alloc-instantiate-race mlock \
  truncate_reserve_wraparound truncate_sigbus_versus_oom \
  map_high_truncate_2 truncate_above_4GB direct \

--
Nishanth Aravamudan <[email protected]>
IBM Linux Technology Center

2008-07-26 03:33:14

by Nishanth Aravamudan

Subject: Re: s390 hugetlb oops with libhugetlbfs test-suite

On 25.07.2008 [14:10:35 -0700], Nishanth Aravamudan wrote:
> Hi Gerald,
>
> Having noticed that the hugetlbfs code recently got an update to work
> on s390, I decided to port the libhugetlbfs test-suite to work on
> s390. The current development snapshot is available at
> http://libhugetlbfs.ozlabs.org/snapshots/libhugetlbfs-dev-20080724.tar.gz.
> You will need a patch [1] to build on s390, but it should roughly work
> otherwise. The specific testcase that causes the oops is counters.
> The bit set in the flags is PG_Reserved, I think.

After enabling some debugging options, I got a slightly more precise
stack trace. The last one I posted was for the 32-bit counters test and
this one is for the 64-bit one, but appears to be the same problem
(PG_reserved still being set).

Bad page state in process 'counters'
page:000003e040000000 flags:0x0000000000000400 mapping:0000000000000000 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
CPU: 1 Tainted: G W 2.6.26-autokern1 #1
Process counters (pid: 5365, task: 000000003855ab50, ksp: 000000002667bbc0)
0000000000000047 000000002667b7d0 0000000000000002 0000000000000000
000000002667b870 000000002667b7e8 000000002667b7e8 0000000000016f22
0000000000000000 0000000000000000 000002000000000a 000000000000000a
0000000000000000 000000002667b7d0 000000002667b7d0 000000002667b848
000000000025cb28 0000000000016f22 000000002667b7d0 000000002667b828
Call Trace:
([<0000000000016e80>] show_trace+0xd8/0xe8)
[<0000000000016f40>] show_stack+0xb0/0xc0
[<000000000001766c>] dump_stack+0xa8/0xb8
[<000000000007f3da>] bad_page+0x9a/0xd4
[<00000000000802dc>] free_hot_cold_page+0xa0/0x1f8
[<00000000000804f6>] free_hot_page+0x22/0x30
[<000000000008503a>] put_page+0x136/0x144
[<00000000000a0a68>] __unmap_hugepage_range+0x398/0x3dc
[<00000000000a0b00>] unmap_hugepage_range+0x54/0x7c
[<000000000008d6f2>] unmap_vmas+0x152/0xb5c
[<000000000009284e>] exit_mmap+0x14e/0x2fc
[<000000000003b128>] mmput+0x64/0x110
[<000000000004030c>] exit_mm+0xf0/0x100
[<0000000000041fd8>] do_exit+0x268/0x7f8
[<00000000000425fe>] do_group_exit+0x96/0xc4
[<000000000004ddbc>] get_signal_to_deliver+0x350/0x378
[<000000000001e026>] do_signal+0xe6/0x87c
[<0000000000024a0a>] sysc_sigpending+0xe/0x22
[<000000008000163c>] 0x8000163c
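
(Archive note: when comparing the two traces by hand, it helps to remember that each frame has the form "[<return address>] symbol+offset/size", optionally followed by "[module]". The following is a minimal sketch of a parser for that layout; the regular expression and the helper name parse_frame are illustrative and not part of any kernel tooling.)

```python
import re

# Matches e.g. "[<000000000007f3da>] bad_page+0x9a/0xd4", with an optional
# trailing "[module]". Frames without a symbol (raw user addresses) do not match.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frame(line):
    """Parse one call-trace line into a dict, or return None if it has no symbol."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),
    }


frame = parse_frame("[<000000000007f3da>] bad_page+0x9a/0xd4")
# The function's start address is the frame address minus the offset into it.
print(frame["symbol"], hex(frame["addr"] - frame["offset"]))  # bad_page 0x7f340
```

Comparing only the symbol names makes it easier to see that the second trace adds put_page() and free_hot_page() between __unmap_hugepage_range() and free_hot_cold_page().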

I'm not quite sure yet how the hugetlb destructor
(free_huge_page()/update_and_free_page()) can call
free_hot_page()/free_hot_cold_page(), because the comments for both
indicate they should only be used for order-0 pages. Are we somehow
losing the fact that this is a compound page on s390?

If anyone would like an s390-compatible version of libhugetlbfs, just let
me know and I can send it off-list.

Thanks,
Nish

--
Nishanth Aravamudan <[email protected]>
IBM Linux Technology Center

2008-07-26 03:41:48

by Nishanth Aravamudan

Subject: Re: s390 hugetlb oops with libhugetlbfs test-suite

On 25.07.2008 [20:32:54 -0700], Nishanth Aravamudan wrote:
> On 25.07.2008 [14:10:35 -0700], Nishanth Aravamudan wrote:
> > Hi Gerald,
> >
> > Having noticed that the hugetlbfs code recently got an update to work
> > on s390, I decided to port the libhugetlbfs test-suite to work on
> > s390. The current development snapshot is available at
> > http://libhugetlbfs.ozlabs.org/snapshots/libhugetlbfs-dev-20080724.tar.gz.
> > You will need a patch [1] to build on s390, but it should roughly work
> > otherwise. The specific testcase that causes the oops is counters.
> > The bit set in the flags is PG_Reserved, I think.
>
> After enabling some debugging options, I got a slightly more precise
> stack trace. The last one I posted was for the 32-bit counters test and
> this one is for the 64-bit one, but appears to be the same problem
> (PG_reserved still being set).
>
> Bad page state in process 'counters'
> page:000003e040000000 flags:0x0000000000000400 mapping:0000000000000000 mapcount:0 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
> CPU: 1 Tainted: G W 2.6.26-autokern1 #1

Odd, this is the same kernel as last time, which wasn't tainted. The reason for
the taint is the following:

------------[ cut here ]------------
Badness at drivers/s390/cio/qdio_main.c:1361
Modules linked in: zfcp(+) scsi_transport_fc scsi_mod vmur qeth qdio ccwgroup dm_mod dasd_fba_mod dasd_eckd_mod dasd_mod ext3 jbd
CPU: 2 Not tainted 2.6.26-autokern1 #1
Process modprobe (pid: 1812, task: 000000003855e300, ksp: 00000000375ffb10)
Krnl PSW : 0704100180000000 000003e0000f8cce (qdio_allocate+0x25a/0x354 [qdio])
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0 EA:3
Krnl GPRS: 00000000381c1048 00000000381c1048 00000000381c1048 0000000000000000
0000000036efb000 0000000000000000 000003e00069662a 0000000000000001
00000000370c2140 00000000370c2c60 0000000036efb000 00000000375ff920
000003e0000f5000 000003e0000fbf68 000003e0000f8ca8 00000000375ff920
Krnl Code: 000003e0000f8cc2: a7110fff tmll %r1,4095
000003e0000f8cc6: a7840004 brc 8,3e0000f8cce
000003e0000f8cca: a7f40001 brc 15,3e0000f8ccc
>000003e0000f8cce: e3a0c0200004 lg %r10,32(%r12)
000003e0000f8cd4: e320a0000004 lg %r2,0(%r10)
000003e0000f8cda: b9020022 ltgr %r2,%r2
000003e0000f8cde: a784003f brc 8,3e0000f8d5c
000003e0000f8ce2: bf1f2018 icm %r1,15,24(%r2)
Call Trace:
([<000003e0000f8ca8>] qdio_allocate+0x234/0x354 [qdio])
[<000003e00068b5a2>] zfcp_qdio_allocate+0x106/0x120 [zfcp]
[<000003e000686dbe>] zfcp_adapter_enqueue+0x62/0x4a4 [zfcp]
[<000003e0006872b4>] zfcp_ccw_probe+0x40/0x8c [zfcp]
[<00000000001af398>] ccw_device_probe+0x4c/0x70
[<0000000000190eba>] driver_probe_device+0x102/0x1ac
[<0000000000190fc8>] __driver_attach+0x64/0x98
[<000000000019032a>] bus_for_each_dev+0x62/0xa0
[<0000000000190c7a>] driver_attach+0x32/0x44
[<000000000019091c>] bus_add_driver+0xdc/0x26c
[<00000000001913aa>] driver_register+0xba/0x16c
[<00000000001b0c2a>] ccw_driver_register+0x42/0x54
[<000003e00000f4bc>] zfcp_ccw_register+0x24/0x38 [zfcp]
[<000003e00000f140>] zfcp_module_init+0x140/0x498 [zfcp]
[<0000000000067a24>] sys_init_module+0x1b40/0x1cb8
[<000000000002498a>] sysc_noemu+0x10/0x16
[<0000020000125eb2>] 0x20000125eb2
Last Breaking-Event-Address:
[<0000000000000000>] 0x0

--
Nishanth Aravamudan <[email protected]>
IBM Linux Technology Center

2008-07-29 17:23:32

by Gerald Schaefer

Subject: Re: s390 hugetlb oops with libhugetlbfs test-suite

On Fri, 2008-07-25 at 20:41 -0700, Nishanth Aravamudan wrote:
> Odd, this is the same kernel as last time, which wasn't tainted. The reason for
> the taint is the following:
>
> ------------[ cut here ]------------
> Badness at drivers/s390/cio/qdio_main.c:1361

Hi,

This was fixed by our qdio developer recently, but the fix is not yet
upstream. However, I doubt that this is related to the hugetlbfs problem.

I was able to reproduce the "Bad page state" with the current libhugetlbfs
development snapshot and your patch for s390. It only happened on machines
with software large page emulation, but we certainly have a problem there.
Thanks for reporting this bug and adding s390 support to libhugetlbfs. I
will look into this problem.

Thanks,
Gerald

2008-07-29 21:47:17

by Nishanth Aravamudan

Subject: Re: s390 hugetlb oops with libhugetlbfs test-suite

On 29.07.2008 [19:22:43 +0200], Gerald Schaefer wrote:
> On Fri, 2008-07-25 at 20:41 -0700, Nishanth Aravamudan wrote:
> > Odd, this is the same kernel as last time, which wasn't tainted. The reason for
> > the taint is the following:
> >
> > ------------[ cut here ]------------
> > Badness at drivers/s390/cio/qdio_main.c:1361
>
> Hi,
>
> This was fixed by our qdio developer recently, but the fix is not yet
> upstream. However, I doubt that this is related to the hugetlbfs problem.

Yeah, it didn't seem related, and the first dump I posted didn't trigger
that badness.

> I was able to reproduce the "Bad page state", with the current libhugetlbfs
> development snapshot and your patch for s390. It only happened on machines
> with software large page emulation, but we certainly have a problem there.
> Thanks for reporting this bug and adding s390 support to libhugetlbfs, I
> will look into this problem.

Just FYI, I've merged up the patch that adds support and it is in the
latest development snapshot of libhugetlbfs:
http://libhugetlbfs.ozlabs.org/snapshots/libhugetlbfs-dev-20080729.tar.gz.
Beyond the counters case, I saw one other issue on s390 (I haven't
confirmed whether it happens anywhere else): the icache-hygiene test, when
run manually a few times, will fail every so often, claiming that one of
the mmap()s returned ENOMEM. I haven't had time to track that down yet,
but it might be because of the address space layout and the size of the
hugepage on s390.

Thanks,
Nish

--
Nishanth Aravamudan <[email protected]>
IBM Linux Technology Center

2008-08-05 15:50:17

by Gerald Schaefer

Subject: Re: s390 hugetlb oops with libhugetlbfs test-suite

On Tue, 2008-07-29 at 14:46 -0700, Nishanth Aravamudan wrote:
> Beyond the counters case, I saw one other issue on s390 (haven't
> confirmed if it happens anywhere else), the icache-hygiene test, when
> run manually a few times, will fail every so often claiming that one of
> the mmap()s returned ENOMEM. I haven't had time to track that down yet,
> but it might be because of the address space layout and the size of the
> hugepage on s390.

I have posted a patch that will fix the counters oops. So far, I could
not reproduce the icache-hygiene problem. Did you use hardware large page
support or software emulation? But I probably noticed a similar one:

map_high_truncate_2 (32): FAIL mmap() 1: Cannot allocate memory

That one should be because of the 31-bit address space layout on s390,
as it tries to mmap too much (1.5GB).
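
(Archive note: the arithmetic behind the 31-bit explanation can be sketched as follows. This assumes that "1.5GB" means 1.5 GiB and that the full 31-bit range is available to user space; the real layout, with the binary, libraries, stack, and mmap base, leaves less and also fragments the free space, so a contiguous hole of that size is unlikely.)

```python
GIB = 1 << 30
MIB = 1 << 20

address_space = 1 << 31   # 31-bit addressing: 2 GiB of virtual addresses
request = 3 * GIB // 2    # the 1.5GB mapping, read here as 1.5 GiB

# Whatever is left must hold the program text, heap, stack, and libraries.
remaining = address_space - request
print(address_space // GIB, "GiB total,", remaining // MIB, "MiB left for everything else")
# prints: 2 GiB total, 512 MiB left for everything else
```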

I also noticed that task-size-overrun (64) seems to hang on my system;
I can only continue after ctrl+c. Can you verify that?

Thanks,
Gerald

2008-08-05 16:15:17

by Nishanth Aravamudan

Subject: Re: s390 hugetlb oops with libhugetlbfs test-suite

On 05.08.2008 [17:49:32 +0200], Gerald Schaefer wrote:
> On Tue, 2008-07-29 at 14:46 -0700, Nishanth Aravamudan wrote:
> > Beyond the counters case, I saw one other issue on s390 (haven't
> > confirmed if it happens anywhere else), the icache-hygiene test, when
> > run manually a few times, will fail every so often claiming that one of
> > the mmap()s returned ENOMEM. I haven't had time to track that down yet,
> > but it might be because of the address space layout and the size of the
> > hugepage on s390.
>
> I have posted a patch that will fix the counters oops.

I'm guessing you've tested it, but I will go ahead and do so and
confirm.

> So far, I could not reproduce the icache-hygiene problem, did you use
> hardware large page support or software emulation?

It was the same box that produced the counters oops, so I'm guessing
software emulation?

> But I probably noticed a similar one:
>
> map_high_truncate_2 (32): FAIL mmap() 1: Cannot allocate memory
>
> That one should be because of the 31-bit address space layout on s390,
> as it tries to mmap too much (1.5GB).

Ah, that could easily be.

> I also noticed that task-size-overrun (64) seems to hang up on my
> system, I can only continue after ctrl+c, can you verify that?

I think that happens, yes, because of how many mmap()s it takes to run
into the topmost mapping. I'll check it out.

Thanks,
Nish

--
Nishanth Aravamudan <[email protected]>
IBM Linux Technology Center