From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Juri Lelli,
	"Luis Claudio R. Goncalves", Daniel Bristot de Oliveira,
	"Peter Zijlstra (Intel)", Ingo Molnar, Sasha Levin
Subject: [PATCH 5.10 078/102] sched/features: Fix hrtick reprogramming
Date: Fri, 5 Mar 2021 13:21:37 +0100
Message-Id: <20210305120907.115354115@linuxfoundation.org>
In-Reply-To: <20210305120903.276489876@linuxfoundation.org>
References: <20210305120903.276489876@linuxfoundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Juri Lelli

[ Upstream commit 156ec6f42b8d300dbbf382738ff35c8bad8f4c3a ]

Hung tasks and RCU stalls were reported on systems that were not 100%
busy. Investigation of these unexpected cases (there was no sign of
starvation caused by tasks hogging the system) showed that the periodic
sched tick timer was no longer serviced after a certain point, which
caused all the machinery that depends on it (timers, RCU, etc.) to stop
working as well. The issue was, however, only reproducible with HRTICK
enabled.

Core dumps showed that the rbtree of the hrtimer base that is also used
for the hrtick was corrupted (i.e. next as seen from the base root and
the actual leftmost node obtained by traversing the tree differ). The
same base is also used for the periodic tick hrtimer, which might get
"lost" if the rbtree gets corrupted.

Much like what is described in commit 1f71addd34f4c ("tick/sched: Do not
mess with an enqueued hrtimer"), there is a race window between
hrtimer_set_expires() in hrtick_start() and hrtimer_start_expires() in
__hrtick_restart(), in which the former might operate on an already
queued hrtick hrtimer, which might corrupt the base.

Use hrtimer_start() (which removes the timer before enqueuing it back)
to ensure hrtick hrtimer reprogramming is entirely guarded by the base
lock, so that no race conditions can occur.

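The corruption described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: every name prefixed
with toy_ is hypothetical, a sorted linked list stands in for the kernel's
rbtree, and the model is single-threaded, so it shows the consequence of
rewriting the expiry of a queued timer rather than the race itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an hrtimer; not the kernel's struct hrtimer. */
struct toy_timer {
	int64_t expires;		/* absolute expiry in ns */
	struct toy_timer *next;		/* link in the sorted queue */
	bool queued;
};

/*
 * Toy stand-in for an hrtimer base: a list sorted by expiry plus a
 * cached pointer to the earliest timer (the role of the cached
 * leftmost node in the real rbtree).
 */
struct toy_base {
	struct toy_timer *head;
	struct toy_timer *cached_next;
};

/* Insert in expiry order and refresh the cached earliest timer. */
static void toy_enqueue(struct toy_base *b, struct toy_timer *t)
{
	struct toy_timer **pp = &b->head;

	while (*pp && (*pp)->expires <= t->expires)
		pp = &(*pp)->next;
	t->next = *pp;
	*pp = t;
	t->queued = true;
	b->cached_next = b->head;
}

/* Unlink the timer if present and refresh the cached earliest timer. */
static void toy_remove(struct toy_base *b, struct toy_timer *t)
{
	struct toy_timer **pp = &b->head;

	while (*pp && *pp != t)
		pp = &(*pp)->next;
	if (*pp) {
		*pp = t->next;
		t->next = NULL;
		t->queued = false;
	}
	b->cached_next = b->head;
}

/* Walk every queued timer and return the one that really expires first. */
static struct toy_timer *toy_actual_earliest(const struct toy_base *b)
{
	struct toy_timer *best = NULL;

	for (struct toy_timer *t = b->head; t; t = t->next)
		if (!best || t->expires < best->expires)
			best = t;
	return best;
}

/* True when the cached earliest timer matches a full traversal. */
static bool toy_base_consistent(const struct toy_base *b)
{
	return b->cached_next == toy_actual_earliest(b);
}

/* Unsafe pattern: rewrite the key of a timer that may still be queued. */
static void toy_set_expires_in_place(struct toy_timer *t, int64_t expires)
{
	t->expires = expires;
}

/* Safe pattern: dequeue first, then update the key and enqueue again. */
static void toy_restart(struct toy_base *b, struct toy_timer *t,
			int64_t expires)
{
	if (t->queued)
		toy_remove(b, t);
	t->expires = expires;
	toy_enqueue(b, t);
}
```

With a tick timer queued at 100 and an hrtick timer queued at 200, calling
toy_set_expires_in_place() on the hrtick timer with 50 leaves cached_next
pointing at the tick timer while a traversal finds the hrtick timer, which
is the kind of mismatch seen in the core dumps. Calling toy_restart() with
the same value keeps the two views in agreement.
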
Signed-off-by: Juri Lelli
Signed-off-by: Luis Claudio R. Goncalves
Signed-off-by: Daniel Bristot de Oliveira
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Link: https://lkml.kernel.org/r/20210208073554.14629-2-juri.lelli@redhat.com
Signed-off-by: Sasha Levin
---
 kernel/sched/core.c  | 8 +++-----
 kernel/sched/sched.h | 1 +
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 269165bf440a..3a150445e0cb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -363,8 +363,9 @@ static enum hrtimer_restart hrtick(struct hrtimer *timer)
 static void __hrtick_restart(struct rq *rq)
 {
 	struct hrtimer *timer = &rq->hrtick_timer;
+	ktime_t time = rq->hrtick_time;
 
-	hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED_HARD);
+	hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
 }
 
 /*
@@ -388,7 +389,6 @@ static void __hrtick_start(void *arg)
 void hrtick_start(struct rq *rq, u64 delay)
 {
 	struct hrtimer *timer = &rq->hrtick_timer;
-	ktime_t time;
 	s64 delta;
 
 	/*
@@ -396,9 +396,7 @@ void hrtick_start(struct rq *rq, u64 delay)
 	 * doesn't make sense and can cause timer DoS.
 	 */
 	delta = max_t(s64, delay, 10000LL);
-	time = ktime_add_ns(timer->base->get_time(), delta);
-
-	hrtimer_set_expires(timer, time);
+	rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);
 
 	if (rq == this_rq())
 		__hrtick_restart(rq);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c122176c627e..fac1b121d113 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1018,6 +1018,7 @@ struct rq {
 	call_single_data_t	hrtick_csd;
 #endif
 	struct hrtimer		hrtick_timer;
+	ktime_t			hrtick_time;
 #endif
 
 #ifdef CONFIG_SCHEDSTATS
-- 
2.30.1
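
The diff separates recording the wanted expiry (in rq->hrtick_time) from
arming the timer (in __hrtick_restart()). A rough user-space sketch of
that two-step shape follows. The toy_ names and the armed/armed_expiry
fields are hypothetical stand-ins for the hrtimer itself; only the
10000 ns floor and the idea of a pending-expiry field come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same 10000 ns lower bound that hrtick_start() applies in the diff. */
#define TOY_MIN_DELAY_NS 10000LL

/* Toy run queue: only the fields needed to model the two steps. */
struct toy_rq {
	int64_t hrtick_time;	/* pending absolute expiry, as in the patch */
	int64_t armed_expiry;	/* hypothetical: what the timer is armed for */
	int armed;		/* hypothetical: nonzero once the timer is armed */
};

/* Clamp a requested delay to the minimum, like max_t(s64, delay, 10000LL). */
static int64_t toy_clamp_delay(uint64_t delay_ns)
{
	int64_t d = (int64_t)delay_ns;

	return d > TOY_MIN_DELAY_NS ? d : TOY_MIN_DELAY_NS;
}

/*
 * Step 1: record the wanted expiry without touching the timer, which
 * may still be queued at this point.
 */
static void toy_hrtick_request(struct toy_rq *rq, int64_t now_ns,
			       uint64_t delay_ns)
{
	rq->hrtick_time = now_ns + toy_clamp_delay(delay_ns);
}

/*
 * Step 2: arm the timer from the recorded value. In the kernel this
 * step is hrtimer_start(), which dequeues and re-enqueues the timer
 * while holding the base lock.
 */
static void toy_hrtick_restart(struct toy_rq *rq)
{
	rq->armed_expiry = rq->hrtick_time;
	rq->armed = 1;
}
```

In this model a request never modifies armed_expiry, so the state of the
armed timer only changes inside toy_hrtick_restart(), the step that
corresponds to the locked path in the kernel.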