2013-11-22 06:02:23

by Yuanhan Liu

Subject: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

Greetings,

I got the below dmesg and the first bad commit is

commit 20545536cd8ea949c61527b6395ec8c0d2c237b1
Author: Boaz Harrosh <[email protected]>
Date: Thu Jul 19 15:22:37 2012 +0300

RFC: do_xor_speed Broken on UML do to jiffies

Remember that hang I reported a while back on UML? Well,
I'm at it again, it still hangs, and I found out why.

I have dprinted jiffies and it never advances during the
loop in do_xor_speed. Therefore it is stuck in an endless
loop. I have also dprinted current_kernel_time() and it
returns the same constant value as well.

Note that it does usually work on UML; it fails only during
the modprobe of xor.ko, while that test is running. It looks
like some locking is preventing the clock from ticking.

However, ktime_get_ts does work for me, so I changed the code
as below so that I can work. See how I put in several safety
guards, to never get hangs again.
I also think my time-based approach is more accurate than
the previous system.

UML guys, please investigate the jiffies issue. What is
xor.ko not doing right?

Signed-off-by: Boaz Harrosh <[email protected]>
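
The commit message above describes replacing a loop that waits on jiffies with a time-based loop that has safety guards. The patch itself is not quoted in this thread, so the following is only a minimal user-space sketch of that idea in Python, not the kernel code. The names timed_rounds, stuck_clock and xor_pass are hypothetical, and time.monotonic_ns merely stands in for a clock such as ktime_get_ts.

```python
import time


def timed_rounds(work, budget_ns=10_000_000, max_rounds=100_000,
                 clock=time.monotonic_ns):
    """Call work() until budget_ns has elapsed or max_rounds is reached.

    Returns (rounds, elapsed_ns). max_rounds is the safety guard: the loop
    ends even if the clock never advances.
    """
    start = clock()
    rounds = 0
    elapsed = 0
    while rounds < max_rounds:
        work()
        rounds += 1
        elapsed = clock() - start
        if elapsed >= budget_ns:
            break
    return rounds, elapsed


def stuck_clock():
    """Hypothetical clock that never ticks, like the jiffies value described above."""
    return 0


if __name__ == "__main__":
    buf = bytearray(4096)

    def xor_pass():
        # Stand-in workload; this is not one of the kernel's XOR templates.
        for i in range(len(buf)):
            buf[i] ^= 0xFF

    print(timed_rounds(xor_pass))                                   # normal clock
    print(timed_rounds(xor_pass, max_rounds=50, clock=stuck_clock)) # clock stalled
```

With the stalled clock the loop still returns after max_rounds iterations, which is the behaviour a loop keyed only on the clock value cannot guarantee.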

+------------------------------------------------------------------+----+
| | |
+------------------------------------------------------------------+----+
| boot_successes | 0 |
| boot_failures | 29 |
| WARNING:CPU:PID:at_init/main.c:do_one_initcall() | 29 |
| initcall_calibrate_xor_blocks_returned_with_preemption_imbalance | 29 |
+------------------------------------------------------------------+----+

[ 0.127025] generic_sse: 148.363 MB/sec
[ 0.127478] xor: using function: prefetch64-sse (152.727 MB/sec)
[ 0.128017] ------------[ cut here ]------------
[ 0.128531] WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall+0x105/0x115()
[ 0.129018] initcall calibrate_xor_blocks+0x0/0x144 returned with preemption imbalance
[ 0.130013] Modules linked in:
[ 0.130357] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.12.0-11285-gb242bff #91
[ 0.131013] 0000000000000000 ffff88000d0dde00 ffffffff8161acc5 ffff88000d0dde48
[ 0.132554] ffff88000d0dde38 ffffffff81052de9 ffffffff81000316 ffffffff81a77cfd
[ 0.133380] 0000000000000000 0000000000000000 0000000000000000 ffff88000d0dde98
[ 0.134213] Call Trace:
[ 0.134493] [<ffffffff8161acc5>] dump_stack+0x4e/0x7a
[ 0.135017] [<ffffffff81052de9>] warn_slowpath_common+0x75/0x8e
[ 0.135654] [<ffffffff81000316>] ? do_one_initcall+0x105/0x115
[ 0.136015] [<ffffffff81a77cfd>] ? do_xor_speed+0xdd/0xdd
[ 0.137016] [<ffffffff81052e49>] warn_slowpath_fmt+0x47/0x49
[ 0.137628] [<ffffffff810c8382>] ? free_pages+0x51/0x53
[ 0.138015] [<ffffffff81a77cfd>] ? do_xor_speed+0xdd/0xdd
[ 0.138623] [<ffffffff81000316>] do_one_initcall+0x105/0x115
[ 0.139017] [<ffffffff81a59ed6>] kernel_init_freeable+0x115/0x19b
[ 0.140016] [<ffffffff81a59707>] ? do_early_param+0x88/0x88
[ 0.140630] [<ffffffff81610ff9>] ? rest_init+0xbd/0xbd
[ 0.141016] [<ffffffff81611002>] kernel_init+0x9/0xfa
[ 0.141567] [<ffffffff8162a98c>] ret_from_fork+0x7c/0xb0
[ 0.142016] [<ffffffff81610ff9>] ? rest_init+0xbd/0xbd
[ 0.143028] ---[ end trace 19b4eab334350767 ]---
[ 0.143530] atomic64 test passed for x86-64 platform with CX8 and with SSE
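
For readers unfamiliar with the warning in the trace above: do_one_initcall() records the preemption count before calling an initcall and warns if the count differs when the initcall returns. The following Python sketch only simulates that bookkeeping under simplified assumptions. PreemptState, run_initcall, balanced and leaky are hypothetical names, not kernel symbols.

```python
class PreemptState:
    """Toy stand-in for the per-task preemption counter."""

    def __init__(self):
        self.count = 0

    def disable(self):
        self.count += 1

    def enable(self):
        self.count -= 1


def run_initcall(fn, state):
    """Run fn and report a warning string if it left the counter unbalanced."""
    before = state.count
    ret = fn(state)
    warning = None
    if state.count != before:
        warning = f"initcall {fn.__name__} returned with preemption imbalance"
        state.count = before  # restore the saved count so later code is unaffected
    return ret, warning


def balanced(state):
    state.disable()
    state.enable()
    return 0


def leaky(state):
    state.disable()  # missing the matching enable()
    return 0


if __name__ == "__main__":
    print(run_initcall(balanced, PreemptState()))
    print(run_initcall(leaky, PreemptState()))
```

In this model, an initcall that disables preemption on some path and returns without re-enabling it produces the same kind of message as the one reported for calibrate_xor_blocks.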

git bisect start b242bff548c34510fd9b7f0e29b885263dfb8903 5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52 --
git bisect good 5cbb3d216e2041700231bcfc383ee5f8b7fc8b74 # 09:25 20+ 0 Merge branch 'akpm' (patches from Andrew Morton)
git bisect good 7e1a1e9378018aeea2c7e8a3dd2ceb1db1523b0b # 09:42 20+ 0 Merge tag 'xfs-for-linus-v3.13-rc1' of git://oss.sgi.com/xfs/xfs
git bisect good 4937e2a6f939a41bf811378e80d71f68aa0950c6 # 10:08 20+ 0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git bisect good 210e812f036736aeda097d9a6ef84b1f2b334bae # 10:31 20+ 0 perf header: Fix bogus group name
git bisect good d5bdaf4f68f0590fc481bca54bcaffeb27b75fca # 10:54 20+ 0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
git bisect good e630a6bcf18079b2ab6b03d55c9757e8ef6656b6 # 11:03 20+ 0 staging: lustre: fix checkpatch issue regarding pointer coding style
git bisect good 6449a5811e62ab9587b54feca45c06cfee0e37cd # 11:10 20+ 0 Merge 'btrfs/for-linus' into devel-cairo-x86_64-201311220159
git bisect good 78103b692e7aa6a8e2ef678c9a3465d6bfe44559 # 11:14 20+ 0 Merge 'staging/opw-next' into devel-cairo-x86_64-201311220159
git bisect good 7acd71879ce408af2d2ca3cd3ec3a86d0667ceae # 11:24 20+ 0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
git bisect good 2ea4606fd707f05cddce72219a5f90ca471c09d6 # 11:32 20+ 0 drm/msm: add atomic support
git bisect bad cd7efef070cc5420858c271a9908df3f86cef83b # 11:40 0- 3 {SQUASHME} exofs_ioctl: Fix for deadlock when close of root node
git bisect bad 13cf7003526891bfb7ad12fc5cff01cf9e734dc2 # 11:43 0- 14 {SPLITME} exofs_ioctl: All the new and external files
git bisect bad 20545536cd8ea949c61527b6395ec8c0d2c237b1 # 11:46 0- 17 RFC: do_xor_speed Broken on UML do to jiffies
git bisect good 4a9a4b3528afce48d3f4b1c07b988040e78112e2 # 11:50 20+ 0 pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done
# first bad commit: [20545536cd8ea949c61527b6395ec8c0d2c237b1] RFC: do_xor_speed Broken on UML do to jiffies
git bisect good 4a9a4b3528afce48d3f4b1c07b988040e78112e2 # 11:52 60+ 0 pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done
git bisect bad b242bff548c34510fd9b7f0e29b885263dfb8903 # 11:52 0- 29 Merge 'open-osd/exofs_ioctl' into devel-cairo-x86_64-201311220159
git bisect good 727fb2e90de9b05224b1801b4c21e7fe18506b43 # 12:07 60+ 0 Revert "RFC: do_xor_speed Broken on UML do to jiffies"
git bisect good 527d1511310a89650000081869260394e20c7013 # 12:26 60+ 0 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
git bisect good f3fa585afa93230883dc4c259dc03df6234a5e5f # 12:42 60+ 0 Add linux-next specific files for 20131122
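
The annotated bisect log above is regular enough to be processed by a script. Below is a minimal Python sketch that extracts the verdict, commit, and the two counters from each line. Reading the counters as "boots passed" and "boots failed" is an assumption based on how they line up with the boot_failures figure in the report; the thread does not document the format. parse_bisect_log and SAMPLE are hypothetical names.

```python
import re

ENTRY = re.compile(
    r"^git bisect (?P<verdict>good|bad) (?P<sha>[0-9a-f]{40})"
    r" # (?P<time>\d\d:\d\d) +(?P<passed>\d+)[+-] +(?P<failed>\d+) +(?P<subject>.*)$"
)
FIRST_BAD = re.compile(r"^# first bad commit: \[(?P<sha>[0-9a-f]{40})\]")


def parse_bisect_log(text):
    """Return (entries, first_bad_sha) parsed from an annotated bisect log."""
    entries, first_bad = [], None
    for line in text.splitlines():
        line = line.strip()
        match = ENTRY.match(line)
        if match:
            entry = match.groupdict()
            entry["passed"] = int(entry["passed"])  # assumed meaning: good boots
            entry["failed"] = int(entry["failed"])  # assumed meaning: failed boots
            entries.append(entry)
            continue
        match = FIRST_BAD.match(line)
        if match:
            first_bad = match.group("sha")
    return entries, first_bad


# Three lines copied from the log above.
SAMPLE = """\
git bisect bad 20545536cd8ea949c61527b6395ec8c0d2c237b1 # 11:46 0- 17 RFC: do_xor_speed Broken on UML do to jiffies
git bisect good 4a9a4b3528afce48d3f4b1c07b988040e78112e2 # 11:50 20+ 0 pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done
# first bad commit: [20545536cd8ea949c61527b6395ec8c0d2c237b1] RFC: do_xor_speed Broken on UML do to jiffies
"""

if __name__ == "__main__":
    entries, first_bad = parse_bisect_log(SAMPLE)
    for entry in entries:
        print(entry["verdict"], entry["sha"][:12], entry["passed"], entry["failed"])
    print("first bad:", first_bad)
```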


Thanks.

--yliu


Attachments:
(No filename) (6.35 kB)
dmesg-quantal-roam-13:20131122022047:x86_64-randconfig-c1-1122:3.12.0-11285-gb242bff:91 (232.01 kB)
x86_64-randconfig-c1-1122-b242bff548c34510fd9b7f0e29b885263dfb8903-initcall-calibrate_xor_blocks-18887.log (29.98 kB)
config-3.12.0-11285-gb242bff (85.01 kB)

2013-11-25 12:35:42

by Boaz Harrosh

Subject: Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

On 11/22/2013 08:02 AM, Yuanhan Liu wrote:
> Greetings,
>
> I got the below dmesg and the first bad commit is
>
> commit 20545536cd8ea949c61527b6395ec8c0d2c237b1
> Author: Boaz Harrosh <[email protected]>
> Date: Thu Jul 19 15:22:37 2012 +0300
>
> RFC: do_xor_speed Broken on UML do to jiffies
>

Hi Sir Yuanhan,

I understand that you are running the exofs_ioctl branch of linux-open-osd.git.
Please tell me more about why you chose to run this branch. It is an experimental
pNFS+Ganesha+exofs branch that we are working on around here, and it might have
problems.

Yes, this patch has problems, I know. I have it in my tree because I need
it if I want to use the XOR engine on a UML system. If you do need to run
the *exofs_ioctl* branch on your system, then it is best that you revert this
patch.

Thanks for the report. I think I'll just remove that patch from the branch and
carry it locally.

Cheers
Boaz


2013-11-25 13:25:42

by Yuanhan Liu

Subject: Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

On Mon, Nov 25, 2013 at 12:43:42PM +0200, Boaz Harrosh wrote:
> On 11/22/2013 08:02 AM, Yuanhan Liu wrote:
> > Greetings,
> >
> > I got the below dmesg and the first bad commit is
> >
> > commit 20545536cd8ea949c61527b6395ec8c0d2c237b1
> > Author: Boaz Harrosh <[email protected]>
> > Date: Thu Jul 19 15:22:37 2012 +0300
> >
> > RFC: do_xor_speed Broken on UML do to jiffies
> >
>
> Hi Sir Yuanhan,
>
> I understand that you are running the exofs_ioctl branch of linux-open-osd.git.
> Please tell me more about why you chose to run this branch. It is an experimental

Hi Boaz,

We are running the 0day kernel testing system. It automatically tests every
developer tree that we track, and linux-open-osd
is on that list.

This system can't tell whether a branch is experimental unless
- you put one extra line of "Dont-Auto-Build" in the head commit log, or

- the branch name contains "experimental", say exofs_ioctl-experimental.

If neither option is convenient for you, you can ask us to remove your
tree from that list. Then you will never get a report like this from us.
However, you would lose the chance to have build, boot and performance bugs
found for you automatically ;)
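
The two opt-out rules described above can be sketched as a small predicate. This is only an illustration in Python, not the 0day system's code. The function name should_skip_branch is hypothetical, and the exact matching (case-sensitive, whole-line match for the marker, substring match for the branch name) is an assumption.

```python
def should_skip_branch(branch_name, head_commit_message):
    """Return True if a branch is marked as not to be auto-tested."""
    # Rule 2 from the message: the branch name contains "experimental".
    if "experimental" in branch_name:
        return True
    # Rule 1 from the message: the head commit log has a "Dont-Auto-Build" line.
    return any(line.strip() == "Dont-Auto-Build"
               for line in head_commit_message.splitlines())


if __name__ == "__main__":
    print(should_skip_branch("exofs_ioctl", "pnfs: some change"))
    print(should_skip_branch("exofs_ioctl-experimental", "pnfs: some change"))
    print(should_skip_branch("exofs_ioctl", "pnfs: some change\n\nDont-Auto-Build\n"))
```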

--yliu

> pNFS+Ganesha+exofs branch that we are working on around here, and it might have
> problems.
>
> Yes, this patch has problems, I know. I have it in my tree because I need
> it if I want to use the XOR engine on a UML system. If you do need to run
> the *exofs_ioctl* branch on your system, then it is best that you revert this
> patch.
>
> Thanks for the report. I think I'll just remove that patch from the branch and
> carry it locally.
>
> Cheers
> Boaz

2013-11-25 13:58:00

by Boaz Harrosh

Subject: Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

On 11/25/2013 03:25 PM, Yuanhan Liu wrote:
>
> Hi Boaz,
>
> We are running the 0day kernel testing system. It automatically tests every
> developer tree that we track, and linux-open-osd
> is on that list.
>
> This system can't tell whether a branch is experimental unless
> - you put one extra line of "Dont-Auto-Build" in the head commit log, or
>
> - the branch name contains "experimental", say exofs_ioctl-experimental.
>
> If neither option is convenient for you, you can ask us to remove your
> tree from that list. Then you will never get a report like this from us.
> However, you would lose the chance to have build, boot and performance bugs
> found for you automatically ;)
>
> --yliu
>


Ha, OK, very cool. I will remember to put -experimental in the branch name.
This is fine; I will do it ASAP.

Thanks so much. Do you have some web-based info on the build system?
Do you have a place where I can see test results and test summaries?

Cheers
Boaz

2013-11-25 14:15:47

by Richard Weinberger

Subject: Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

On Fri, Nov 22, 2013 at 7:02 AM, Yuanhan Liu
<[email protected]> wrote:
> Greetings,
>
> I got the below dmesg and the first bad commit is
>
> commit 20545536cd8ea949c61527b6395ec8c0d2c237b1
> Author: Boaz Harrosh <[email protected]>
> Date: Thu Jul 19 15:22:37 2012 +0300
>
> RFC: do_xor_speed Broken on UML do to jiffies
>
> Remember that hang I reported a while back on UML? Well,
> I'm at it again, it still hangs, and I found out why.
>
> I have dprinted jiffies and it never advances during the
> loop in do_xor_speed. Therefore it is stuck in an endless
> loop. I have also dprinted current_kernel_time() and it
> returns the same constant value as well.
>
> Note that it does usually work on UML; it fails only during
> the modprobe of xor.ko, while that test is running. It looks
> like some locking is preventing the clock from ticking.
>
> However, ktime_get_ts does work for me, so I changed the code
> as below so that I can work. See how I put in several safety
> guards, to never get hangs again.
> I also think my time-based approach is more accurate than
> the previous system.
>
> UML guys, please investigate the jiffies issue. What is
> xor.ko not doing right?

This patch never hit my mailbox...




--
Thanks,
//richard

2013-11-25 14:32:19

by Boaz Harrosh

Subject: Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

On 11/25/2013 04:15 PM, Richard Weinberger wrote:
> On Fri, Nov 22, 2013 at 7:02 AM, Yuanhan Liu
> <[email protected]> wrote:
>> Greetings,
>>
>> I got the below dmesg and the first bad commit is
>>
>> commit 20545536cd8ea949c61527b6395ec8c0d2c237b1
>> Author: Boaz Harrosh <[email protected]>
>> Date: Thu Jul 19 15:22:37 2012 +0300
>>
>> RFC: do_xor_speed Broken on UML do to jiffies
>>
>> Remember that hang I reported a while back on UML? Well,
>> I'm at it again, it still hangs, and I found out why.
>>
>> I have dprinted jiffies and it never advances during the
>> loop in do_xor_speed. Therefore it is stuck in an endless
>> loop. I have also dprinted current_kernel_time() and it
>> returns the same constant value as well.
>>
>> Note that it does usually work on UML; it fails only during
>> the modprobe of xor.ko, while that test is running. It looks
>> like some locking is preventing the clock from ticking.
>>
>> However, ktime_get_ts does work for me, so I changed the code
>> as below so that I can work. See how I put in several safety
>> guards, to never get hangs again.
>> I also think my time-based approach is more accurate than
>> the previous system.
>>
>> UML guys, please investigate the jiffies issue. What is
>> xor.ko not doing right?
>
> This patch never hit my mailbox...
>
>> Signed-off-by: Boaz Harrosh <[email protected]>
>>

Sir Richard,

I never followed up on this patch. Sorry. I do think it is in
the right direction, but it has a div-by-zero problem and
a 32-bitness problem. (So I never sent it, beyond the initial
query.)
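
The message does not say where in the patch the division by zero or the 32-bit issue occurs, so the following Python sketch only illustrates the two general hazards in a time-based throughput calculation: dividing by an elapsed time of zero, and an intermediate product that no longer fits in 32 bits. The names mb_per_sec and wrap_u32 are hypothetical.

```python
NSEC_PER_SEC = 1_000_000_000


def mb_per_sec(total_bytes, elapsed_ns):
    """Return throughput in MB/s (1 MB = 10**6 bytes), or None if no time elapsed."""
    if elapsed_ns <= 0:
        # Guard: a clock that did not advance would otherwise divide by zero.
        return None
    bytes_per_sec = total_bytes * NSEC_PER_SEC // elapsed_ns
    return bytes_per_sec / 1_000_000


def wrap_u32(value):
    """Truncate value to 32 bits, as an unsigned 32-bit variable would."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    print(mb_per_sec(4096, 0))                    # clock did not advance
    print(mb_per_sec(1_000_000, NSEC_PER_SEC))    # one million bytes in one second
    product = 4096 * NSEC_PER_SEC                 # one page times the ns scale factor
    print(product, wrap_u32(product))             # the product does not fit in 32 bits
```

Python integers do not overflow, which is why wrap_u32 is needed here to show what a fixed-width 32-bit intermediate would lose.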

[It stopped being very important for me since I stopped using
UML very much. Ever since FC17 I'm unable to produce a running
image. An image that boots under KVM just will not boot under UML. So
very sad me, but no UML for me anymore.]

If you want to investigate, try doing a modprobe of xor.ko on
a UML and see. Note that xor.ko is only built on demand; for example,
enabling exofs, the object-layout-driver, or MD-raid5 will select it.

Thanks
Boaz

2013-11-25 14:43:47

by Richard Weinberger

Subject: Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

On Monday, 25 November 2013, 16:32:19, Boaz Harrosh wrote:
> On 11/25/2013 04:15 PM, Richard Weinberger wrote:
> > On Fri, Nov 22, 2013 at 7:02 AM, Yuanhan Liu
> >
> > <[email protected]> wrote:
> >> Greetings,
> >>
> >> I got the below dmesg and the first bad commit is
> >>
> >> commit 20545536cd8ea949c61527b6395ec8c0d2c237b1
> >> Author: Boaz Harrosh <[email protected]>
> >> Date: Thu Jul 19 15:22:37 2012 +0300
> >>
> >> RFC: do_xor_speed Broken on UML do to jiffies
> >>
> >> Remember that hang I reported a while back on UML? Well,
> >> I'm at it again, it still hangs, and I found out why.
> >>
> >> I have dprinted jiffies and it never advances during the
> >> loop in do_xor_speed. Therefore it is stuck in an endless
> >> loop. I have also dprinted current_kernel_time() and it
> >> returns the same constant value as well.
> >>
> >> Note that it does usually work on UML; it fails only during
> >> the modprobe of xor.ko, while that test is running. It looks
> >> like some locking is preventing the clock from ticking.
> >>
> >> However, ktime_get_ts does work for me, so I changed the code
> >> as below so that I can work. See how I put in several safety
> >> guards, to never get hangs again.
> >> I also think my time-based approach is more accurate than
> >> the previous system.
> >>
> >> UML guys, please investigate the jiffies issue. What is
> >> xor.ko not doing right?
> >
> > This patch never hit my mailbox...
> >
> >> Signed-off-by: Boaz Harrosh <[email protected]>
>
> Sir Richard
>
> I never followed up on this patch. Sorry. I do think it is in
> the right direction, but it has a div-by-zero problem and
> a 32-bitness problem. (So I never sent it, beyond the initial
> query)
>
> [It stopped being very important to me since I stopped using
> UML very much. Ever since FC17 I have been unable to produce a running
> image. An image that boots under KVM just will not boot under UML. So,
> very sad for me, but no UML for me anymore.]

Sad to hear.
Did you try one from http://fs.devloop.org.uk/?
I just booted an FC18 image on UML. It works fine.

In the past we've had some issues with systemd-enabled distros.
systemd uncovered a problem in the UML tty driver.

> If you want to investigate, try doing a modprobe of xor.ko on
> UML and see. xor.ko is only built on demand; for example,
> enabling exofs, the object-layout driver, or MD-raid5 will select it.

Ok.

Thanks,
//richard

2013-11-26 10:36:51

by Yuanhan Liu

Subject: Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

On Mon, Nov 25, 2013 at 03:57:40PM +0200, Boaz Harrosh wrote:
> On 11/25/2013 03:25 PM, Yuanhan Liu wrote:
> >
> > Hi Boaz,
> >
> > We are running a 0day kernel testing system. It automatically tests all
> > the developer trees that we track, and linux-open-osd is on that list.
> >
> > This system can't tell whether a branch is experimental unless
> > - you put one extra line of "Dont-Auto-Build" in the head commit log, or
> >
> > - the branch name contains "experimental", say exofs_ioctl-experimental
> >
> > If neither option is convenient for you, you can ask us to remove your
> > tree from that list. Then you will never get reports like this from us.
> > However, you may lose the chance to have build, boot and performance bugs
> > found automatically for you ;)
> >
> > --yliu
> >
>
>
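
The two opt-out rules quoted above can be sketched as a small check. This is only an illustration of the rules as described in this thread, not the 0day system's actual code: the function name `is_experimental` and the exact matching (a whole line equal to "Dont-Auto-Build", a plain substring test on the branch name) are assumptions.

```python
def is_experimental(branch_name: str, head_commit_log: str) -> bool:
    """Return True if a branch would be skipped under the two quoted rules.

    Hypothetical helper: the matching details are assumptions, only the
    two rules themselves come from the thread.
    """
    # Rule 1: the head commit log carries an extra "Dont-Auto-Build" line.
    has_marker = any(
        line.strip() == "Dont-Auto-Build"
        for line in head_commit_log.splitlines()
    )
    # Rule 2: the branch name contains "experimental".
    has_experimental_name = "experimental" in branch_name
    return has_marker or has_experimental_name
```

Under these assumptions, a branch named exofs_ioctl-experimental is skipped regardless of its commit log, and any other branch is skipped only when its head commit carries the marker line.
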
> Ha OK, very cool. I will remember to put -experimental on the branch name;
> this is fine, I will do it ASAP.
>
> Thanks so much. Do you have some web-based info on the build system?

Sorry, nope.

> Do you have a place where I can see test results and a test summary?

If you like, I can add you to the build-notify list. Once the build
finishes, you might get an email like the following:

--yliu

----
From: kbuild test robot <[email protected]>
To: Yuanhan Liu <[email protected]>
Subject: [yuanhan:branch] a828a375f9fcc422c8d2613d774d031fa8a02a97 BUILD SUCCESS

git://bee.sh.intel.com/git/yliu/linux.git branch
a828a375f9fcc422c8d2613d774d031fa8a02a97 branch: commit summary

elapsed time: 262m

configs tested: 189

arm allnoconfig
arm allmodconfig
arm at91_dt_defconfig
arm imx_v6_v7_defconfig
arm marzen_defconfig
arm omap2plus_defconfig
arm prima2_defconfig
arm s3c2410_defconfig
arm spear13xx_defconfig
arm tegra_defconfig
avr32 atngw100_defconfig
avr32 atstk1006_defconfig
frv defconfig
m68k amiga_defconfig
m68k m5475evb_defconfig
m68k multi_defconfig
microblaze mmu_defconfig
microblaze nommu_defconfig
mn10300 asb2364_defconfig
openrisc or1ksim_defconfig
tile tilegx_defconfig
um defconfig
x86_64 acpi-redef
x86_64 randconfig-a0-1105
x86_64 randconfig-a1-1105
x86_64 randconfig-a2-1105
x86_64 randconfig-a3-1105
x86_64 randconfig-a4-1105
x86_64 randconfig-a5-1105
i386 randconfig-c0-1105
i386 randconfig-c1-1105
i386 randconfig-c2-1105
i386 randconfig-c3-1105
i386 randconfig-c4-1105
i386 randconfig-c5-1105
i386 randconfig-c6-1105
i386 randconfig-c7-1105
i386 randconfig-c8-1105
i386 randconfig-c9-1105
x86_64 randconfig-c0-1105
x86_64 randconfig-c1-1105
x86_64 randconfig-c2-1105
x86_64 randconfig-c3-1105
x86_64 randconfig-c4-1105
x86_64 randconfig-c5-1105
x86_64 randconfig-c6-1105
x86_64 randconfig-c7-1105
x86_64 randconfig-c8-1105
x86_64 randconfig-c9-1105
ia64 alldefconfig
ia64 allmodconfig
ia64 allnoconfig
ia64 defconfig
mips allmodconfig
mips allnoconfig
mips fuloong2e_defconfig
i386 randconfig-i000-1105
i386 randconfig-i001-1105
i386 randconfig-i002-1105
i386 randconfig-i003-1105
i386 randconfig-i004-1105
i386 randconfig-i005-1105
i386 randconfig-i006-1105
i386 randconfig-i007-1105
i386 randconfig-i008-1105
i386 randconfig-i009-1105
powerpc chroma_defconfig
powerpc corenet64_smp_defconfig
powerpc gamecube_defconfig
powerpc linkstation_defconfig
powerpc wii_defconfig
sparc defconfig
sparc64 allmodconfig
sparc64 allnoconfig
sparc64 defconfig
x86_64 allyesconfig
x86_64 randconfig-x000-1104
x86_64 randconfig-x001-1104
x86_64 randconfig-x002-1104
x86_64 randconfig-x003-1104
x86_64 randconfig-x004-1104
x86_64 randconfig-x005-1104
x86_64 randconfig-x006-1104
x86_64 randconfig-x007-1104
x86_64 randconfig-x008-1104
x86_64 randconfig-x009-1104
i386 randconfig-x000-1104
i386 randconfig-x001-1104
i386 randconfig-x002-1104
i386 randconfig-x003-1104
i386 randconfig-x004-1104
i386 randconfig-x005-1104
i386 randconfig-x006-1104
i386 randconfig-x007-1104
i386 randconfig-x008-1104
i386 randconfig-x009-1104
alpha allyesconfig
avr32 allyesconfig
blackfin allyesconfig
cris allyesconfig
ia64 allyesconfig
m68k allyesconfig
mips allyesconfig
parisc allyesconfig
powerpc allyesconfig
s390 allyesconfig
sh allyesconfig
sparc allyesconfig
sparc64 allyesconfig
tile allyesconfig
um allyesconfig
xtensa allyesconfig
x86_64 allyesdebian
x86_64 lkp
x86_64 lkp-CONFIG_DEBUG_MUTEXES
x86_64 lkp-CONFIG_DEBUG_RODATA
x86_64 lkp-CONFIG_SCHED_DEBUG
x86_64 lkp-CONFIG_SCSI_DEBUG
x86_64 nfsroot
x86_64 randconfig-j0-1105
x86_64 randconfig-j1-1105
x86_64 randconfig-j2-1105
x86_64 randconfig-j3-1105
x86_64 randconfig-j4-1105
x86_64 randconfig-j5-1105
x86_64 randconfig-j6-1105
x86_64 randconfig-j7-1105
x86_64 randconfig-j8-1105
x86_64 randconfig-j9-1105
blackfin BF526-EZBRD_defconfig
blackfin BF533-EZKIT_defconfig
blackfin BF561-EZKIT-SMP_defconfig
blackfin TCM-BF537_defconfig
cris etrax-100lx_v2_defconfig
m32r m32104ut_defconfig
m32r mappi3.smp_defconfig
m32r opsput_defconfig
m32r usrv_defconfig
sh allnoconfig
sh rsk7269_defconfig
sh sh7785lcr_32bit_defconfig
sh titan_defconfig
xtensa common_defconfig
xtensa iss_defconfig
i386 randconfig-r0-1105
i386 randconfig-r1-1105
i386 randconfig-r2-1105
i386 randconfig-r3-1105
i386 randconfig-r4-1105
i386 randconfig-r5-1105
i386 randconfig-r6-1105
i386 randconfig-r7-1105
i386 randconfig-r8-1105
i386 randconfig-r9-1105
alpha defconfig
parisc allnoconfig
parisc b180_defconfig
parisc c3000_defconfig
parisc defconfig
s390 allmodconfig
s390 allnoconfig
s390 defconfig
x86_64 allnoconfig
i386 allyesconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
powerpc ppc64_defconfig
x86_64 allmodconfig
i386 randconfig-x0-1105
i386 randconfig-x1-1105
i386 randconfig-x2-1105
i386 randconfig-x3-1105
i386 randconfig-x4-1105
i386 randconfig-x5-1105
i386 randconfig-x6-1105
i386 randconfig-x7-1105
i386 randconfig-x8-1105
i386 randconfig-x9-1105
x86_64 randconfig-x0-1105
x86_64 randconfig-x1-1105
x86_64 randconfig-x2-1105
x86_64 randconfig-x3-1105
x86_64 randconfig-x4-1105
x86_64 randconfig-x5-1105
x86_64 randconfig-x6-1105
x86_64 randconfig-x7-1105
x86_64 randconfig-x8-1105
x86_64 randconfig-x9-1105

Thanks,
Fengguang

2013-11-26 10:54:28

by Boaz Harrosh

Subject: Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

On 11/26/2013 12:37 PM, Yuanhan Liu wrote:
>
> If you like, I can add you to the build-notify list. Once the build
> finishes, you might get an email like the following:
>
> --yliu

Na, it's fine. Thanks though.

Cheers
Boaz