This commit makes the ltp cpuctl latency test #2 hang indefinitely:
commit b5d9d734a53e0204aab0089079cbde2a1285a38f
Author: Mike Galbraith <[email protected]>
Date: Tue Sep 8 11:12:28 2009 +0200
sched: Ensure that a child can't gain time over it's parent after fork()
When I revert this commit the test progresses as it did in 2.6.31. I
have seen this issue on 2.6.32 and 2.6.32.19. The hang goes away in
2.6.33 starting with this commit:
commit 88ec22d3edb72b261f8628226cd543589a6d5e1b
Author: Peter Zijlstra <[email protected]>
Date: Wed Dec 16 18:04:41 2009 +0100
sched: Remove the cfs_rq dependency from set_task_cpu()
Even though this appears to be resolved in 2.6.33, I am reporting it
because 2.6.32 is the "long-term stable release".
My test system is a single socket dual core amd -
model name : Dual Core AMD Opteron(tm) Processor 180
with 4GB of RAM.
Kernel config file attached.
The issue is easily reproducible for me by downloading and building ltp,
then running
testcases/kernel/controllers/cpuctl/run_cpuctl_latency_test.sh 2
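Because the failure mode is a hang rather than an error, it can help to run the test under a deadline. The following is a minimal sketch, not part of the original report: the function name run_with_deadline is hypothetical, and it assumes the coreutils timeout(1) utility is installed. A task stuck inside the kernel may not react to the signal, in which case the wrapper itself can block.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit and flag a likely hang.
# Assumes coreutils timeout(1), which exits with status 124 when the limit expires.
run_with_deadline() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # The command was still running when the limit expired.
        echo "no result after ${limit}s: possible hang" >&2
    fi
    return "$status"
}
```

For example, run_with_deadline 600 testcases/kernel/controllers/cpuctl/run_cpuctl_latency_test.sh 2 would report a possible hang after ten minutes; the 600-second limit is an arbitrary choice.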
Please let me know if you need any other information to help reproduce
this issue.
Thanks
Josh
On Tue, 2010-08-24 at 13:10 -0700, Josh Hunt wrote:
> This commit makes the ltp cpuctl latency test #2 hang indefinitely:
>
> commit b5d9d734a53e0204aab0089079cbde2a1285a38f
> Author: Mike Galbraith <[email protected]>
> Date: Tue Sep 8 11:12:28 2009 +0200
>
> sched: Ensure that a child can't gain time over it's parent after fork()
Ouch. Yeah, that commit is buggy, and never got fixed up in stable.
Reverting it will restore a slightly less buggy, but not very good
situation. Getting the fork problems all fixed up took a while.
(quick fix vs revert didn't help your testcase)
> When I revert this commit the test progresses as it did in 2.6.31. I
> have seen this issue on 2.6.32 and 2.6.32.19. The hang goes away in
> 2.6.33 starting with this commit:
>
> commit 88ec22d3edb72b261f8628226cd543589a6d5e1b
> Author: Peter Zijlstra <[email protected]>
> Date: Wed Dec 16 18:04:41 2009 +0100
>
> sched: Remove the cfs_rq dependency from set_task_cpu()
Excellent timing you have. I have a tree of backports, but I wasn't
counting this commit as a must have, merely highly desirable. This
testcase showed that it's a needed fix.
> Even though this appears to be resolved in 2.6.33, I am reporting it
> because 2.6.32 is the "long-term stable release".
Yeah, there are a _lot_ of fixes that should wander back to 32-stable.
> My test system is a single socket dual core amd -
> model name : Dual Core AMD Opteron(tm) Processor 180
> with 4GB of RAM.
> Kernel config file attached.
>
> The issue is easily reproducible for me by downloading and building ltp,
> then running
> testcases/kernel/controllers/cpuctl/run_cpuctl_latency_test.sh 2
>
> Please let me know if you need any other information to help reproduce
> this issue.
No, the testcase works well. Thanks.
-Mike
On 08/24/2010 10:56 PM, Mike Galbraith wrote:
>
> Excellent timing you have. I have a tree of backports, but I wasn't
> counting this commit as a must have, merely highly desirable. This
> testcase showed that it's a needed fix.
>
I'd be interested in looking at this tree when it's available.
Thanks
Josh
On Wed, 2010-08-25 at 13:19 -0700, Josh Hunt wrote:
> On 08/24/2010 10:56 PM, Mike Galbraith wrote:
> >
> > Excellent timing you have. I have a tree of backports, but I wasn't
> > counting this commit as a must have, merely highly desirable. This
> > testcase showed that it's a needed fix.
> >
>
> I'd be interested in looking at this tree when it's available.
(sent quilt stack offline, anybody else wants it, holler, and you'll
receive one [absolutely free!] 50k tarball)
Hi, Mike
On Wed, 25 Aug 2010 07:56:01 +0200
Mike Galbraith <[email protected]> wrote:
> On Tue, 2010-08-24 at 13:10 -0700, Josh Hunt wrote:
> > This commit makes the ltp cpuctl latency test #2 hang indefinitely:
> >
> > commit b5d9d734a53e0204aab0089079cbde2a1285a38f
> > Author: Mike Galbraith <[email protected]>
> > Date: Tue Sep 8 11:12:28 2009 +0200
> >
> > sched: Ensure that a child can't gain time over it's parent after fork()
>
> Ouch. Yeah, that commit is buggy, and never got fixed up in stable.
> Reverting it will restore a slightly less buggy, but not very good
> situation. Getting the fork problems all fixed up took a while.
> (quick fix vs revert didn't help your testcase)
I'm interested in this problem because I hit the same one in RHEL6 beta2
(which is based on 2.6.32).
Are you writing a patch to solve it?
If so, I can test it on RHEL6 beta2 (or the latest).
Appendix.
I can reproduce this problem without ltp; see case 1 below.
If the CPUs are not completely busy, it does not occur (case 2).
[case1]
1) Run as many busy-loop processes as there are CPUs, all in the same cpu cgroup.
2) Attach another process to that cpu cgroup.
-> The attach never completes.
Example:
# mkdir /cgroup/cpu/test
# echo $$ > /cgroup/cpu/test/tasks
# ./loop 8 &
[1] 27202
# mpstat -P ALL 1
Linux 2.6.32-37.el6.x86_64 (StingerG.localdomain) 08/31/2010 _x86_64_ (8 CPU)
03:08:45 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:08:46 PM all 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 0 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 1 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 2 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 3 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 4 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 5 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 6 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 7 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
# echo $$ > /cgroup/cpu/tasks
# time echo $$ > /cgroup/cpu/test/tasks <- this operation never finishes
[case2]
# echo $$ > /cgroup/cpu/test/tasks
# ./loop 7 &
[1] 27259
# mpstat -P ALL 1
Linux 2.6.32-37.el6.x86_64 (StingerG.localdomain) 08/31/2010 _x86_64_ (8 CPU)
03:12:00 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:12:01 PM all 83.42 0.00 0.00 0.12 0.00 0.00 0.00 0.00 16.46
03:12:01 PM 0 72.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 28.00
03:12:01 PM 1 60.75 0.00 0.00 0.00 0.00 0.00 0.00 0.00 39.25
03:12:01 PM 2 98.99 0.00 0.00 1.01 0.00 0.00 0.00 0.00 0.00
03:12:01 PM 3 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:12:01 PM 4 67.29 0.00 0.00 0.00 0.00 0.00 0.00 0.00 32.71
03:12:01 PM 5 72.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 28.00
03:12:01 PM 6 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:12:01 PM 7 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
# echo $$ > /cgroup/cpu/tasks
# time echo $$ > /cgroup/cpu/test/tasks
real 0m0.006s
user 0m0.000s
sys 0m0.000s
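The ./loop helper used in both cases is not shown in this thread. A minimal sketch of what such a helper might look like, assuming it simply starts N CPU-bound children that inherit the caller's cgroup (the function name start_busy_loops is hypothetical):

```shell
#!/bin/sh
# Hypothetical stand-in for "./loop N": start N busy-loop children and print
# their PIDs, one per line. Children inherit the cgroup of the calling shell,
# so the shell should be moved into the test cgroup first.
start_busy_loops() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        # Redirect output so the children do not hold the caller's stdout open.
        ( while :; do :; done ) >/dev/null 2>&1 &
        printf '%s\n' "$!"
        i=$((i + 1))
    done
}
```

With the shell already in /cgroup/cpu/test, start_busy_loops 8 would correspond to "./loop 8" in case 1 and start_busy_loops 7 to case 2. The printed PIDs can be passed to kill afterwards.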
--
Minoru Usui <[email protected]>