Date: Thu, 9 Jan 2020 16:23:22 +0100
In-Reply-To: <20200109152322.104466-1-elver@google.com>
Message-Id: <20200109152322.104466-3-elver@google.com>
References: <20200109152322.104466-1-elver@google.com>
X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog
Subject: [PATCH -rcu 2/2] kcsan: Rate-limit reporting per data races
From: Marco Elver
To: elver@google.com
Cc: paulmck@kernel.org, andreyknvl@google.com, glider@google.com, dvyukov@google.com, kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org, Qian Cai
X-Mailing-List: linux-kernel@vger.kernel.org

Adds support for rate limiting reports. This uses a time-based rate
limit that restricts any given data race report to no more than one in a
fixed time window (default is 3 sec). This should prevent the console
from being spammed with data race reports, which would render the system
unusable.

The implementation assumes that the number of unique data races and the
rate at which they occur are bounded, since we cannot store information
about arbitrarily many past data race reports: we use a fixed-size array
to store the required information. We cannot use kmalloc/krealloc and
resize the list when needed, as reporting is triggered by the
instrumentation calls; to permit using KCSAN on the allocators, we
cannot (re-)allocate any memory during report generation (data races in
the allocators lead to deadlock).

Reported-by: Qian Cai
Suggested-by: Paul E. McKenney
Signed-off-by: Marco Elver
---
 kernel/kcsan/report.c | 112 ++++++++++++++++++++++++++++++++++++++----
 lib/Kconfig.kcsan     |  10 ++++
 2 files changed, 112 insertions(+), 10 deletions(-)

diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c
index 9f503ca2ff7a..e324af7d14c9 100644
--- a/kernel/kcsan/report.c
+++ b/kernel/kcsan/report.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
 #include
+#include
 #include
 #include
 #include
@@ -31,12 +32,101 @@ static struct {
 	int num_stack_entries;
 } other_info = { .ptr = NULL };
 
+/*
+ * Information about reported data races; used to rate limit reporting.
+ */
+struct report_time {
+	/*
+	 * The last time the data race was reported.
+	 */
+	ktime_t time;
+
+	/*
+	 * The frames of the 2 threads; if only 1 thread is known, one frame
+	 * will be 0.
+	 */
+	unsigned long frame1;
+	unsigned long frame2;
+};
+
+/*
+ * Since we also want to be able to debug allocators with KCSAN, to avoid
+ * deadlock, report_times cannot be dynamically resized with krealloc in
+ * rate_limit_report.
+ *
+ * Therefore, we use a fixed-size array, which at most will occupy a page. This
+ * still adequately rate limits reports, assuming that a) number of unique data
+ * races is not excessive, and b) occurrence of unique data races within the
+ * same time window is limited.
+ */
+#define REPORT_TIMES_MAX (PAGE_SIZE / sizeof(struct report_time))
+#define REPORT_TIMES_SIZE                                                      \
+	(CONFIG_KCSAN_REPORT_ONCE_IN_MS > REPORT_TIMES_MAX ?                   \
+		 REPORT_TIMES_MAX :                                            \
+		 CONFIG_KCSAN_REPORT_ONCE_IN_MS)
+static struct report_time report_times[REPORT_TIMES_SIZE];
+
 /*
  * This spinlock protects reporting and other_info, since other_info is usually
  * required when reporting.
  */
 static DEFINE_SPINLOCK(report_lock);
 
+/*
+ * Checks if the data race identified by thread frames frame1 and frame2 has
+ * been reported since (now - KCSAN_REPORT_ONCE_IN_MS).
+ */
+static bool rate_limit_report(unsigned long frame1, unsigned long frame2)
+{
+	struct report_time *use_entry = &report_times[0];
+	ktime_t now;
+	ktime_t invalid_before;
+	int i;
+
+	BUILD_BUG_ON(CONFIG_KCSAN_REPORT_ONCE_IN_MS != 0 && REPORT_TIMES_SIZE == 0);
+
+	if (CONFIG_KCSAN_REPORT_ONCE_IN_MS == 0)
+		return false;
+
+	now = ktime_get();
+	invalid_before = ktime_sub_ms(now, CONFIG_KCSAN_REPORT_ONCE_IN_MS);
+
+	/* Check if a matching data race report exists. */
+	for (i = 0; i < REPORT_TIMES_SIZE; ++i) {
+		struct report_time *rt = &report_times[i];
+
+		/*
+		 * Must always select an entry for use to store info as we
+		 * cannot resize report_times; at the end of the scan, use_entry
+		 * will be the oldest entry, which ideally also happened before
+		 * KCSAN_REPORT_ONCE_IN_MS ago.
+		 */
+		if (ktime_before(rt->time, use_entry->time))
+			use_entry = rt;
+
+		/*
+		 * Initially, no need to check any further as this entry as well
+		 * as following entries have never been used.
+		 */
+		if (rt->time == 0)
+			break;
+
+		/* Check if entry expired. */
+		if (ktime_before(rt->time, invalid_before))
+			continue; /* before KCSAN_REPORT_ONCE_IN_MS ago */
+
+		/* Reported recently, check if data race matches. */
+		if ((rt->frame1 == frame1 && rt->frame2 == frame2) ||
+		    (rt->frame1 == frame2 && rt->frame2 == frame1))
+			return true;
+	}
+
+	use_entry->time = now;
+	use_entry->frame1 = frame1;
+	use_entry->frame2 = frame2;
+	return false;
+}
+
 /*
  * Special rules to skip reporting.
  */
@@ -132,7 +222,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type,
 	unsigned long stack_entries[NUM_STACK_ENTRIES] = { 0 };
 	int num_stack_entries = stack_trace_save(stack_entries, NUM_STACK_ENTRIES, 1);
 	int skipnr = get_stack_skipnr(stack_entries, num_stack_entries);
-	int other_skipnr;
+	unsigned long this_frame = stack_entries[skipnr];
+	unsigned long other_frame = 0;
+	int other_skipnr = 0; /* silence uninit warnings */
 
 	/*
 	 * Must check report filter rules before starting to print.
@@ -143,34 +235,34 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type,
 	if (type == KCSAN_REPORT_RACE_SIGNAL) {
 		other_skipnr = get_stack_skipnr(other_info.stack_entries,
 						other_info.num_stack_entries);
+		other_frame = other_info.stack_entries[other_skipnr];
 
 		/* @value_change is only known for the other thread */
-		if (skip_report(other_info.access_type, value_change,
-				other_info.stack_entries[other_skipnr]))
+		if (skip_report(other_info.access_type, value_change, other_frame))
 			return false;
 	}
 
+	if (rate_limit_report(this_frame, other_frame))
+		return false;
+
 	/* Print report header. */
 	pr_err("==================================================================\n");
 	switch (type) {
 	case KCSAN_REPORT_RACE_SIGNAL: {
-		void *this_fn = (void *)stack_entries[skipnr];
-		void *other_fn = (void *)other_info.stack_entries[other_skipnr];
 		int cmp;
 
 		/*
 		 * Order functions lexographically for consistent bug titles.
 		 * Do not print offset of functions to keep title short.
 		 */
-		cmp = sym_strcmp(other_fn, this_fn);
+		cmp = sym_strcmp((void *)other_frame, (void *)this_frame);
 		pr_err("BUG: KCSAN: data-race in %ps / %ps\n",
-		       cmp < 0 ? other_fn : this_fn,
-		       cmp < 0 ? this_fn : other_fn);
+		       (void *)(cmp < 0 ? other_frame : this_frame),
+		       (void *)(cmp < 0 ? this_frame : other_frame));
 	} break;
 
 	case KCSAN_REPORT_RACE_UNKNOWN_ORIGIN:
-		pr_err("BUG: KCSAN: data-race in %pS\n",
-		       (void *)stack_entries[skipnr]);
+		pr_err("BUG: KCSAN: data-race in %pS\n", (void *)this_frame);
 		break;
 
 	default:
diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan
index 3f78b1434375..3552990abcfe 100644
--- a/lib/Kconfig.kcsan
+++ b/lib/Kconfig.kcsan
@@ -81,6 +81,16 @@ config KCSAN_SKIP_WATCH_RANDOMIZE
 	  KCSAN_WATCH_SKIP. If false, the chosen value is always
 	  KCSAN_WATCH_SKIP.
 
+config KCSAN_REPORT_ONCE_IN_MS
+	int "Duration in milliseconds, in which any given data race is only reported once"
+	default 3000
+	help
+	  Any given data race is only reported once in the defined time window.
+	  Different data races may still generate reports within a duration
+	  that is smaller than the duration defined here. This allows rate
+	  limiting reporting to avoid flooding the console with reports.
+	  Setting this to 0 disables rate limiting.
+
 # Note that, while some of the below options could be turned into boot
 # parameters, to optimize for the common use-case, we avoid this because: (a)
 # it would impact performance (and we want to avoid static branch for all
-- 
2.25.0.rc1.283.g88dfdc4193-goog
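
For readers who want to experiment with the scan outside the kernel, the logic of rate_limit_report() can be approximated in user space. The following is only a minimal sketch under stated assumptions, not the kernel code: the names rl_entry, rl_table, rl_check, RL_WINDOW_MS and RL_SLOTS are hypothetical, the caller passes the current time in milliseconds instead of calling ktime_get(), the table has a small fixed size rather than being derived from PAGE_SIZE, and there is no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RL_WINDOW_MS 3000u /* mirrors the Kconfig default of 3000 ms */
#define RL_SLOTS 8         /* hypothetical size; the patch derives its size from PAGE_SIZE */

/* Hypothetical user-space stand-in for struct report_time. */
struct rl_entry {
	uint64_t time_ms;      /* 0 means the slot has never been used */
	unsigned long frame1;
	unsigned long frame2;
};

static struct rl_entry rl_table[RL_SLOTS];

/*
 * Returns true if the (frame1, frame2) pair was recorded within the last
 * RL_WINDOW_MS milliseconds; otherwise records it and returns false.
 * Assumption: now_ms is non-zero and never decreases between calls.
 */
static bool rl_check(uint64_t now_ms, unsigned long frame1, unsigned long frame2)
{
	struct rl_entry *use_entry = &rl_table[0];
	size_t i;

	assert(now_ms != 0); /* 0 is reserved as the "unused" marker */

	for (i = 0; i < RL_SLOTS; ++i) {
		struct rl_entry *e = &rl_table[i];

		/* Track the oldest entry; it is overwritten if nothing matches. */
		if (e->time_ms < use_entry->time_ms)
			use_entry = e;

		/* Unused slot: this and all following slots are empty. */
		if (e->time_ms == 0)
			break;

		/* Entries older than the window cannot suppress a report. */
		if (now_ms >= RL_WINDOW_MS && e->time_ms < now_ms - RL_WINDOW_MS)
			continue;

		/* The order of the two frames does not matter. */
		if ((e->frame1 == frame1 && e->frame2 == frame2) ||
		    (e->frame1 == frame2 && e->frame2 == frame1))
			return true;
	}

	use_entry->time_ms = now_ms;
	use_entry->frame1 = frame1;
	use_entry->frame2 = frame2;
	return false;
}
```

With this sketch, a first call such as rl_check(1000, 0xA, 0xB) records the pair and returns false, and a call with the frames swapped at 1500 ms is suppressed. The same pair at 4500 ms is reported again, because the stored entry is by then older than the 3000 ms window. As in the patch, a full table means the oldest entry is overwritten, so suppression is best-effort once more unique races occur within the window than there are slots.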