From: Wanpeng Li
Date: Sat, 15 Jul 2017 13:26:45 +0800
Subject: Re: [PATCH 5/5] sched: Accumulate vtime on top of nsec clocksource
To: "Levin, Alexander (Sasha Levin)"
Cc: Frederic Weisbecker, LKML, Peter Zijlstra, Thomas Gleixner, Luiz Capitulino, Ingo Molnar, Rik van Riel
In-Reply-To: <20170715033725.acs67xgf4z6utyxn@sasha-lappy>
References: <1498756511-11714-1-git-send-email-fweisbec@gmail.com>
 <1498756511-11714-6-git-send-email-fweisbec@gmail.com>
 <20170715033725.acs67xgf4z6utyxn@sasha-lappy>

2017-07-15 11:37 GMT+08:00 Levin, Alexander (Sasha Levin):
> On Thu, Jun 29, 2017 at 07:15:11PM +0200, Frederic Weisbecker wrote:
>>From: Wanpeng Li
>>
>>Currently the cputime source used by vtime is jiffies. When we cross
>>a context boundary and jiffies have changed since the last snapshot, the
>>pending cputime is accounted to the switching out context.
>>
>>This system works ok if the ticks are not aligned across CPUs. If they
>>instead are aligned (ie: all fire at the same time) and the CPUs run in
>>userspace, the jiffies change is only observed on tick exit and therefore
>>the user cputime is accounted as system cputime. This is because the
>>CPU that maintains timekeeping fires its tick at the same time as the
>>others. It updates jiffies in the middle of the tick and the other CPUs
>>see that update on IRQ exit:
>>
>>    CPU 0 (timekeeper)                  CPU 1
>>    -------------------              -------------
>>                      jiffies = N
>>    ...                              run in userspace for a jiffy
>>    tick entry                       tick entry (sees jiffies = N)
>>    set jiffies = N + 1
>>    tick exit                        tick exit (sees jiffies = N + 1)
>>                                     account 1 jiffy as stime
>>
>>Fix this with using a nanosec clock source instead of jiffies. The
>>cputime is then accumulated and flushed everytime the pending delta
>>reaches a jiffy in order to mitigate the accounting overhead.
>>
>>[fweisbec: changelog, rebase on struct vtime, field renames, add delta
>>on cputime readers, keep idle vtime as-is (low overhead accounting),
>>harmonize clock sources]
>>
>>Reported-by: Luiz Capitulino
>>Suggested-by: Thomas Gleixner
>>Not-Yet-Signed-off-by: Wanpeng Li
>>Cc: Rik van Riel
>>Cc: Peter Zijlstra
>>Cc: Thomas Gleixner
>>Cc: Wanpeng Li
>>Cc: Ingo Molnar
>>Cc: Luiz Capitulino
>>Signed-off-by: Frederic Weisbecker
>
> Hi all,
>
> This patch seems to be causing this:

Yeah, there is a patch to fix it.
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=0e4097c3354e2f5a5ad8affd9dc7f7f7d00bb6b9

Regards,
Wanpeng Li

>
> BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u9:0/6
> caller is debug_smp_processor_id+0x1c/0x20 lib/smp_processor_id.c:56
> CPU: 1 PID: 6 Comm: kworker/u9:0 Not tainted 4.12.0-next-20170714 #187
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
> Workqueue: events_unbound call_usermodehelper_exec_work
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x11d/0x1ef lib/dump_stack.c:52
>  check_preemption_disabled+0x1f4/0x200 lib/smp_processor_id.c:46
>  debug_smp_processor_id+0x1c/0x20 lib/smp_processor_id.c:56
>  vtime_delta.isra.6+0x11/0x60 kernel/sched/cputime.c:686
>  task_cputime+0x3ca/0x790 kernel/sched/cputime.c:882
>  thread_group_cputime+0x51a/0xaa0 kernel/sched/cputime.c:327
>  thread_group_cputime_adjusted+0x73/0xf0 kernel/sched/cputime.c:676
>  wait_task_zombie kernel/exit.c:1114 [inline]
>  wait_consider_task+0x1c82/0x37f0 kernel/exit.c:1389
>  do_wait_thread kernel/exit.c:1452 [inline]
>  do_wait+0x457/0xb00 kernel/exit.c:1523
>  kernel_wait4+0x1fd/0x380 kernel/exit.c:1665
>  SYSC_wait4+0x145/0x160 kernel/exit.c:1677
>  SyS_wait4+0x2c/0x40 kernel/exit.c:1673
>  call_usermodehelper_exec_sync kernel/kmod.c:286 [inline]
>  call_usermodehelper_exec_work+0x1fc/0x2c0 kernel/kmod.c:323
>  process_one_work+0xae7/0x1a00 kernel/workqueue.c:2097
>  worker_thread+0x221/0x1860 kernel/workqueue.c:2231
>  kthread+0x35f/0x430 kernel/kthread.c:231
>  ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:425
> capability: warning: `syz-executor5' uses 32-bit capabilities (legacy support in use)
> BUG: using smp_processor_id() in preemptible [00000000] code: syz-executor6/7013
> caller is debug_smp_processor_id+0x1c/0x20 lib/smp_processor_id.c:56
> CPU: 3 PID: 7013 Comm: syz-executor6 Not tainted 4.12.0-next-20170714 #187
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x11d/0x1ef lib/dump_stack.c:52
>  check_preemption_disabled+0x1f4/0x200 lib/smp_processor_id.c:46
>  debug_smp_processor_id+0x1c/0x20 lib/smp_processor_id.c:56
>  vtime_delta.isra.6+0x11/0x60 kernel/sched/cputime.c:686
>  task_cputime+0x3ca/0x790 kernel/sched/cputime.c:882
>  thread_group_cputime+0x51a/0xaa0 kernel/sched/cputime.c:327
>  thread_group_cputime_adjusted+0x73/0xf0 kernel/sched/cputime.c:676
>  wait_task_zombie kernel/exit.c:1114 [inline]
>  wait_consider_task+0x1c82/0x37f0 kernel/exit.c:1389
>  do_wait_thread kernel/exit.c:1452 [inline]
>  do_wait+0x457/0xb00 kernel/exit.c:1523
>  kernel_wait4+0x1fd/0x380 kernel/exit.c:1665
>  SYSC_wait4+0x145/0x160 kernel/exit.c:1677
>  SyS_wait4+0x2c/0x40 kernel/exit.c:1673
>  do_syscall_64+0x267/0x740 arch/x86/entry/common.c:284
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x40bd8a
> RSP: 002b:00007ffdbdf67b08 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
> RAX: ffffffffffffffda RBX: 0000000000b22914 RCX: 000000000040bd8a
> RDX: 0000000040000001 RSI: 00007ffdbdf67b4c RDI: ffffffffffffffff
> RBP: 0000000000002243 R08: 0000000000001b65 R09: 0000000000b22940
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007ffdbdf67b4c R14: 0000000000016ee4 R15: 0000000000000016
> BUG: using smp_processor_id() in preemptible [00000000] code: init/1
> caller is debug_smp_processor_id+0x1c/0x20 lib/smp_processor_id.c:56
> CPU: 3 PID: 1 Comm: init Not tainted 4.12.0-next-20170714 #187
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x11d/0x1ef lib/dump_stack.c:52
>  check_preemption_disabled+0x1f4/0x200 lib/smp_processor_id.c:46
>  debug_smp_processor_id+0x1c/0x20 lib/smp_processor_id.c:56
>  vtime_delta.isra.6+0x11/0x60 kernel/sched/cputime.c:686
>  task_cputime+0x3ca/0x790 kernel/sched/cputime.c:882
>  thread_group_cputime+0x51a/0xaa0 kernel/sched/cputime.c:327
>  thread_group_cputime_adjusted+0x73/0xf0 kernel/sched/cputime.c:676
>  wait_task_zombie kernel/exit.c:1114 [inline]
>  wait_consider_task+0x1c82/0x37f0 kernel/exit.c:1389
>  do_wait_thread kernel/exit.c:1452 [inline]
>  do_wait+0x457/0xb00 kernel/exit.c:1523
>  kernel_wait4+0x1fd/0x380 kernel/exit.c:1665
>  SYSC_wait4+0x145/0x160 kernel/exit.c:1677
>  SyS_wait4+0x2c/0x40 kernel/exit.c:1673
>  do_syscall_64+0x267/0x740 arch/x86/entry/common.c:284
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x7f61952dca3e
> RSP: 002b:00007fff93bafea0 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
> RAX: ffffffffffffffda RBX: 00007f6195c326a0 RCX: 00007f61952dca3e
> RDX: 0000000000000001 RSI: 00007fff93bafedc RDI: ffffffffffffffff
> RBP: 00007fff93bafedc R08: 00007fff93bb0870 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
> R13: 00007fff93bb0bd0 R14: 0000000000000000 R15: 0000000000000000
>
> --
>
> Thanks,
> Sasha
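
For readers following the thread, the accumulate-and-flush idea from the quoted changelog (sample a nanosecond clock, accumulate the delta, and only account it once a jiffy's worth is pending) can be sketched in plain userspace C. This is an illustration only, not the kernel code from the patch: struct vtime_sketch, vtime_sketch_account() and the HZ value of 1000 are all hypothetical names and numbers chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration; the kernel derives its own. */
#define SKETCH_HZ        1000ULL
#define SKETCH_NSEC_SEC  1000000000ULL
#define SKETCH_JIFFY_NS  (SKETCH_NSEC_SEC / SKETCH_HZ)

/* Hypothetical per-task state; not the kernel's struct vtime. */
struct vtime_sketch {
	uint64_t starttime;  /* last clock snapshot, in ns */
	uint64_t pending;    /* accumulated but not yet accounted, in ns */
	uint64_t accounted;  /* total flushed to the accounting side, in ns */
	unsigned int flushes; /* how many times the expensive path ran */
};

/*
 * Called on a context boundary with the current nanosecond clock value.
 * The delta since the last snapshot is always accumulated, but the
 * (notionally expensive) accounting step runs only once at least one
 * jiffy's worth of time is pending.
 */
static void vtime_sketch_account(struct vtime_sketch *vt, uint64_t now)
{
	assert(now >= vt->starttime); /* the clock source must be monotonic */

	vt->pending += now - vt->starttime;
	vt->starttime = now;

	if (vt->pending >= SKETCH_JIFFY_NS) {
		vt->accounted += vt->pending;
		vt->pending = 0;
		vt->flushes++;
	}
}
```

With these assumed values, a boundary at 400000 ns leaves the time pending, and a second boundary at 1200000 ns pushes the pending total past one jiffy and flushes all of it in one step. Because the snapshot is a nanosecond clock rather than jiffies, the split between contexts no longer depends on when a CPU happens to observe the jiffies update.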