2013-04-04 17:04:45

by Toralf Förster

[permalink] [raw]
Subject: user mode linux: trinity syscall "madvise" triggers quickly BUG: Bad page map

if SLUB is used (SLAB works fine till now with this syscall).

What I get with the this trinity command line

$> trinity --children 2 -c madvise

for a 32 bit Gentoo Linux (both as host and as guest)
and host kernel 3.8.5 and guest kernel 3.9-rc5 is the following :


2013-04-04T19:00:29.220+02:00 trinity kernel: BUG: Bad page map in process trinity-child1 pte:0032d045 pmd:3932e1e1
2013-04-04T19:00:29.220+02:00 trinity kernel: page:0a73f5a0 count:1 mapcount:-1 mapping: (null) index:0x0
2013-04-04T19:00:29.220+02:00 trinity kernel: page flags: 0x404(referenced|reserved)
2013-04-04T19:00:29.220+02:00 trinity kernel: addr:00100000 vm_flags:00040055 anon_vma: (null) mapping: (null) index:100
2013-04-04T19:00:29.220+02:00 trinity kernel: vma->vm_ops->fault: special_mapping_fault+0x0/0x80
2013-04-04T19:00:29.220+02:00 trinity kernel: 413c7d20: [<0836efd8>] dump_stack+0x22/0x24
2013-04-04T19:00:29.220+02:00 trinity kernel: 413c7d38: [<0837039b>] print_bad_pte+0x17b/0x197
2013-04-04T19:00:29.220+02:00 trinity kernel: 413c7d7c: [<080e2c68>] unmap_single_vma+0x268/0x430
2013-04-04T19:00:29.220+02:00 trinity kernel: 413c7ddc: [<080e33f4>] zap_page_range+0x74/0xb0
2013-04-04T19:00:29.220+02:00 trinity kernel: 413c7e10: [<080e172d>] sys_madvise+0x3bd/0x720
2013-04-04T19:00:29.225+02:00 trinity kernel: 413c7eac: [<08064a92>] handle_syscall+0x82/0xb0
2013-04-04T19:00:29.225+02:00 trinity kernel: 413c7ef4: [<08076f0d>] userspace+0x46d/0x590
2013-04-04T19:00:29.225+02:00 trinity kernel: 413c7fec: [<080617cc>] fork_handler+0x6c/0x70
2013-04-04T19:00:29.225+02:00 trinity kernel: 413c7ffc: [<00000002>] 0x2
2013-04-04T19:00:29.225+02:00 trinity kernel:
2013-04-04T19:00:29.225+02:00 trinity kernel: Disabling lock debugging due to kernel taint
2013-04-04T19:00:29.225+02:00 trinity kernel: BUG: Bad page state in process trinity-child1 pfn:0032d
2013-04-04T19:00:29.225+02:00 trinity kernel: page:0a73f5a0 count:0 mapcount:-1 mapping: (null) index:0x0
2013-04-04T19:00:29.225+02:00 trinity kernel: page flags: 0x404(referenced|reserved)
2013-04-04T19:00:29.225+02:00 trinity kernel: 413c7cd8: [<0836efd8>] dump_stack+0x22/0x24
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7cf0: [<080cf185>] bad_page+0xb5/0xe0
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7d0c: [<080cf223>] free_pages_prepare+0x73/0xb0
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7d28: [<080d064d>] free_hot_cold_page+0x1d/0x100
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7d50: [<080d2fde>] __put_single_page+0x1e/0x30
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7d64: [<080d3107>] put_page+0x27/0x30
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7d6c: [<080f026c>] free_page_and_swap_cache+0x3c/0x50
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7d7c: [<080e2c85>] unmap_single_vma+0x285/0x430
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7ddc: [<080e33f4>] zap_page_range+0x74/0xb0
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7e10: [<080e172d>] sys_madvise+0x3bd/0x720
2013-04-04T19:00:29.230+02:00 trinity kernel: 413c7eac: [<08064a92>] handle_syscall+0x82/0xb0
2013-04-04T19:00:29.234+02:00 trinity kernel: 413c7ef4: [<08076f0d>] userspace+0x46d/0x590
2013-04-04T19:00:29.234+02:00 trinity kernel: 413c7fec: [<080617cc>] fork_handler+0x6c/0x70
2013-04-04T19:00:29.234+02:00 trinity kernel: 413c7ffc: [<00000002>] 0x2
2013-04-04T19:00:29.234+02:00 trinity kernel:
2013-04-04T19:00:30.174+02:00 trinity kernel: BUG: Bad rss-counter state mm:40623600 idx:0 val:-1

--
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


2013-04-06 12:47:24

by Toralf Förster

[permalink] [raw]
Subject: Re: user mode linux: trinity syscall "madvise" triggers quickly BUG: Bad page map

On 04/06/2013 01:43 PM, richard -rw- weinberger wrote:
>> What I get with the this trinity command line
>>
>> gt; trinity --children 2 -c madvise
>
> Does the attached patch fix the problem?

yes, the issue for this syscall is fixed now.

--
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

2013-04-06 12:55:07

by Toralf Förster

[permalink] [raw]
Subject: Re: user mode linux: trinity syscall "madvise" triggers quickly BUG: Bad page map

On 04/06/2013 02:47 PM, Toralf Förster wrote:
> On 04/06/2013 01:43 PM, richard -rw- weinberger wrote:
>>> What I get with the this trinity command line
>>>
>>> gt; trinity --children 2 -c madvise
>>
>> Does the attached patch fix the problem?
>
> yes, the issue for this syscall is fixed now.
>

But now I get - after running this command for a longer time :


2013-04-06T14:49:55.755+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f2ff008
2013-04-06T14:49:55.755+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e5000
2013-04-06T14:49:55.755+02:00 trinity kernel: syscall parameters: 0x280e5000 0x1000 0x7 0x11 0x3 0x11cba
2013-04-06T14:49:55.755+02:00 trinity kernel: Failed to flush page for address 0x280e5000
2013-04-06T14:49:57.600+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f397008
2013-04-06T14:49:57.600+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e5000
2013-04-06T14:49:57.600+02:00 trinity kernel: syscall parameters: 0x280e5000 0x1000 0x7 0x11 0x3 0x1ad80
2013-04-06T14:49:57.600+02:00 trinity kernel: Failed to flush page for address 0x280e5000
2013-04-06T14:50:01.000+02:00 trinity cron[1071]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons)
2013-04-06T14:51:15.336+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f2fe008
2013-04-06T14:51:15.336+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e3000
2013-04-06T14:51:15.336+02:00 trinity kernel: syscall parameters: 0x280e3000 0x1000 0x7 0x11 0x3 0xd561
2013-04-06T14:51:15.336+02:00 trinity kernel: Failed to flush page for address 0x280e3000
2013-04-06T14:51:20.224+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f394008
2013-04-06T14:51:20.224+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e5000
2013-04-06T14:51:20.224+02:00 trinity kernel: syscall parameters: 0x280e5000 0x1000 0x7 0x11 0x3 0x29571
2013-04-06T14:51:20.224+02:00 trinity kernel: Failed to flush page for address 0x280e5000
2013-04-06T14:52:34.379+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f22a008
2013-04-06T14:52:34.379+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e3000
2013-04-06T14:52:34.379+02:00 trinity kernel: syscall parameters: 0x280e3000 0x1000 0x7 0x11 0x3 0x9b50
2013-04-06T14:52:34.379+02:00 trinity kernel: Failed to flush page for address 0x280e3000
2013-04-06T14:52:40.422+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f397008
2013-04-06T14:52:40.422+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e3000
2013-04-06T14:52:40.422+02:00 trinity kernel: syscall parameters: 0x280e3000 0x1000 0x7 0x11 0x3 0xde37
2013-04-06T14:52:40.422+02:00 trinity kernel: Failed to flush page for address 0x280e3000

and the appropriate linux - process at the host process runs at 100%

--
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

2013-04-06 16:45:06

by Richard Weinberger

[permalink] [raw]
Subject: Re: user mode linux: trinity syscall "madvise" triggers quickly BUG: Bad page map

On Sat, Apr 6, 2013 at 2:55 PM, Toralf F?rster <[email protected]> wrote:
> On 04/06/2013 02:47 PM, Toralf F?rster wrote:
>> On 04/06/2013 01:43 PM, richard -rw- weinberger wrote:
>>>> What I get with the this trinity command line
>>>>
>>>> gt; trinity --children 2 -c madvise
>>>
>>> Does the attached patch fix the problem?
>>
>> yes, the issue for this syscall is fixed now.
>>
>
> But now I get - after running this command for a longer time :
>
>
> 2013-04-06T14:49:55.755+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f2ff008
> 2013-04-06T14:49:55.755+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e5000
> 2013-04-06T14:49:55.755+02:00 trinity kernel: syscall parameters: 0x280e5000 0x1000 0x7 0x11 0x3 0x11cba
> 2013-04-06T14:49:55.755+02:00 trinity kernel: Failed to flush page for address 0x280e5000
> 2013-04-06T14:49:57.600+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f397008
> 2013-04-06T14:49:57.600+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e5000
> 2013-04-06T14:49:57.600+02:00 trinity kernel: syscall parameters: 0x280e5000 0x1000 0x7 0x11 0x3 0x1ad80
> 2013-04-06T14:49:57.600+02:00 trinity kernel: Failed to flush page for address 0x280e5000
> 2013-04-06T14:50:01.000+02:00 trinity cron[1071]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons)
> 2013-04-06T14:51:15.336+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f2fe008
> 2013-04-06T14:51:15.336+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e3000
> 2013-04-06T14:51:15.336+02:00 trinity kernel: syscall parameters: 0x280e3000 0x1000 0x7 0x11 0x3 0xd561
> 2013-04-06T14:51:15.336+02:00 trinity kernel: Failed to flush page for address 0x280e3000
> 2013-04-06T14:51:20.224+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f394008
> 2013-04-06T14:51:20.224+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e5000
> 2013-04-06T14:51:20.224+02:00 trinity kernel: syscall parameters: 0x280e5000 0x1000 0x7 0x11 0x3 0x29571
> 2013-04-06T14:51:20.224+02:00 trinity kernel: Failed to flush page for address 0x280e5000
> 2013-04-06T14:52:34.379+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f22a008
> 2013-04-06T14:52:34.379+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e3000
> 2013-04-06T14:52:34.379+02:00 trinity kernel: syscall parameters: 0x280e3000 0x1000 0x7 0x11 0x3 0x9b50
> 2013-04-06T14:52:34.379+02:00 trinity kernel: Failed to flush page for address 0x280e3000
> 2013-04-06T14:52:40.422+02:00 trinity kernel: do_syscall_stub : ret = -12, offset = 1052680, data = 3f397008
> 2013-04-06T14:52:40.422+02:00 trinity kernel: do_syscall_stub: syscall 192 failed, return value = 0xfffffff4, expected return value = 0x280e3000
> 2013-04-06T14:52:40.422+02:00 trinity kernel: syscall parameters: 0x280e3000 0x1000 0x7 0x11 0x3 0xde37
> 2013-04-06T14:52:40.422+02:00 trinity kernel: Failed to flush page for address 0x280e3000
>
> and the appropriate linux - process at the host process runs at 100%

You are running out of available memory mappings on the host side.
mmap() fails because you have too much mappings installed.
See vm.max_map_count

--
Thanks,
//richard