Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp5149811ybi; Tue, 28 May 2019 08:20:25 -0700 (PDT) X-Google-Smtp-Source: APXvYqwkHPsMzQCojIW6qdG9irODg89UtMK7UtRKZjxMMazE7/HXn+OEehM27aTolw2VLlJAjWGp X-Received: by 2002:a17:90a:ae02:: with SMTP id t2mr6666411pjq.41.1559056825040; Tue, 28 May 2019 08:20:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1559056825; cv=none; d=google.com; s=arc-20160816; b=EZo1UtH8oUdzQhWMC3QSQoo+r+qOfrQalXobi/GqVHfuwxtGkgKOhOhqAVcgQBQLWP bGinqoTguoMxBWGEMg/ihWNDZqkP2I7hxzuCpN7Ip0Tngtc92bwJIHZ7VAHyFi6O44r1 0KXsjbuY1iHjXmoA3hceEjeqlwI8RFw9ikEasQn2CE3VhEUyflWlz9RRUi4kMHjlMfjN lzqbPoxb7BR0GrgRdbzMHhDtbDvDj75uEtBMsqTagz2n98HW9YWz5OeOBc8b7vyhE8Ft IPBOSSSWLKw2o7b784jpF1Z/erq1XaAoOlf3bzF8YMUkNd8tBo3oENGnvWuKJHXld2ul TkJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=7R4GKAwq7br+2slRDCyMthdXnSm8y5CByJwVQ9oF6KY=; b=xGQJaloOdR6DvSC47za5F3VZCWxfC4IjuWu1ZgRB+W6eZQf6el0gnGYLWsE6MM08Oj vCjKZKNsUiHjK9NRl1qotK7CzhPC3VqSSG/s1bLnki/CJgelz8ugRSug7hVavPV9MciA ZQ9hVm8IA/PtgyaAvmmzqxzPjrE+8JbosLJCE86IjWEq3NI3Rra1GWkOjvFcGcBJFg47 3eLpi6yGqIbViCOpoA2giI67LFwLxmk54xCvU617iYPk9RcQW32CQ6JX8ATafU/BtVOk +6zM552UtsEb0cW2KXCqdLpyCGuEsh2DqvdNAzL1zF1Ej24PiWy5lDfitm5m8ZD5d509 Y35w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 65si9957157ple.240.2019.05.28.08.20.09; Tue, 28 May 2019 08:20:25 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727781AbfE1PRJ (ORCPT + 99 others); Tue, 28 May 2019 11:17:09 -0400 Received: from mx1.redhat.com ([209.132.183.28]:29101 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726362AbfE1PRI (ORCPT ); Tue, 28 May 2019 11:17:08 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6044A9FFCC; Tue, 28 May 2019 15:17:02 +0000 (UTC) Received: from inkernel.default (ovpn-116-60.phx2.redhat.com [10.3.116.60]) by smtp.corp.redhat.com (Postfix) with ESMTP id C70F97856B; Tue, 28 May 2019 15:16:53 +0000 (UTC) From: Daniel Bristot de Oliveira To: linux-kernel@vger.kernel.org Cc: williams@redhat.com, daniel@bristot.me, "Steven Rostedt (VMware)" , Ingo Molnar , Peter Zijlstra , Thomas Gleixner , "Paul E. McKenney" , Matthias Kaehlcke , "Joel Fernandes (Google)" , Frederic Weisbecker , Yangtao Li , Tommaso Cucinotta Subject: [RFC 3/3] preempt_tracer: Use a percpu variable to control traceble calls Date: Tue, 28 May 2019 17:16:24 +0200 Message-Id: <9b0698774be3bb406e2b8b2c12dc1fb91532bff0.1559051152.git.bristot@redhat.com> In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 28 May 2019 15:17:07 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The preempt_disable tracepoint only traces in the disable <-> enable case. Which is correct, but think about this case: --------------------------- %< ---------------------- THREAD IRQ | | preempt_disable_notrace() { __preempt_count_add(1) } /* preemption disabled/IRQs enabled */ -------> smp_apic_timer_interrupt() { preempt_disable() { do not trace (preempt count >= 1) } .... preempt_enable() { do not trace (preempt count >= 1) } <------- } preempt_enable_notrace() { __preempt_count_sub(1) } --------------------------- >% ---------------------- The non-traceble preempt_disable can hide a legit preempt_disable (which is worth tracing). It is possible to observe this problem using this kernel module: http://bristot.me/files/efficient_verification/wip.tar.gz and doing the following trace: # cd /sys/kernel/debug/tracing/ # echo 1 > snapshot # cat available_events | grep preempt_ > set_event # echo irq_vectors >> /sys/kernel/debug/tracing/set_event # insmod wip.ko /* wait for a snapshot creation */ # tail -100 snapshot sshd-1159 [000] d...1.. 2440.866116: preempt_enable: caller=migrate_enable+0x1bb/0x330 parent=migrate_enable+0x1bb/0x330 sshd-1159 [000] d..h1.. 2440.866122: local_timer_entry: vector=236 sshd-1159 [000] d..h1.. 2440.866127: local_timer_exit: vector=236 sshd-1159 [000] d.L.4.. 2440.866129: process_event: event sched_waking not expected in the state preemptive sshd-1159 [000] d.L.4.. 2440.866137: => process_event => __handle_event => ttwu_do_wakeup => try_to_wake_up => irq_exit => smp_apic_timer_interrupt => apic_timer_interrupt => kvm_clock_read => ktime_get_with_offset => posix_get_boottime => __x64_sys_clock_gettime => do_syscall_64 => entry_SYSCALL_64_after_hwframe and kvm_clock_read() disables preemption without tracing: --------------------------- %< ---------------------- static u64 kvm_clock_read(void) { u64 ret; preempt_disable_notrace(); ret = pvclock_clocksource_read(this_cpu_pvti()); preempt_enable_notrace(); return ret; } --------------------------- >% ---------------------- Use a percpu variable for the traced preempt_disable/enable, and use it to decide whether trace or not. Signed-off-by: Daniel Bristot de Oliveira Cc: "Steven Rostedt (VMware)" Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: "Paul E. McKenney" Cc: Matthias Kaehlcke Cc: "Joel Fernandes (Google)" Cc: Frederic Weisbecker Cc: Yangtao Li Cc: Tommaso Cucinotta Cc: linux-kernel@vger.kernel.org --- kernel/sched/core.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index be4117d7384f..2e07d4174778 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -3148,19 +3148,25 @@ static inline void sched_tick_stop(int cpu) { } #if defined(CONFIG_PREEMPT) && (defined(CONFIG_DEBUG_PREEMPT) || \ defined(CONFIG_TRACE_PREEMPT_TOGGLE)) + +DEFINE_PER_CPU(int, __traced_preempt_count) = 0; /* * If the value passed in is equal to the current preempt count * then we just disabled preemption. Start timing the latency. */ void preempt_latency_start(int val) { - if (preempt_count() == val) { + int curr = this_cpu_read(__traced_preempt_count); + + if (!curr) { unsigned long ip = get_lock_parent_ip(); #ifdef CONFIG_DEBUG_PREEMPT current->preempt_disable_ip = ip; #endif trace_preempt_off(CALLER_ADDR0, ip); } + + this_cpu_write(__traced_preempt_count, curr + val); } static inline void preempt_add_start_latency(int val) @@ -3200,8 +3206,12 @@ NOKPROBE_SYMBOL(preempt_count_add); */ void preempt_latency_stop(int val) { - if (preempt_count() == val) + int curr = this_cpu_read(__traced_preempt_count) - val; + + if (!curr) trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip()); + + this_cpu_write(__traced_preempt_count, curr); } static inline void preempt_sub_stop_latency(int val) -- 2.20.1