Date: Wed, 25 Apr 2018 15:22:06 +0200
From: Frederic Weisbecker
To: Thomas Gleixner
Cc: "Wan, Kaike", "Marciniszyn, Mike", "Dalessandro, Dennis", "Weiny, Ira", "Fleck, John", "linux-kernel@vger.kernel.org", "linux-rdma@vger.kernel.org", Peter Zijlstra, Anna-Maria Gleixner, Frederic Weisbecker, Ingo Molnar
Subject: Re: [PATCH] tick/sched: Do not mess with an enqueued hrtimer
Message-ID: <20180425132205.GA12534@lerouge>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 24, 2018 at 09:22:18PM +0200, Thomas Gleixner wrote:
> Kaike reported that in tests rdma hrtimers occasionally stopped working. He
> did great debugging, which provided enough context to decode the problem.
>
> CPU 3                                     CPU 2
>
> idle
> start sched_timer expires = 712171000000
>  queue->next = sched_timer
>                                           start rdmavt timer. expires = 712172915662
>                                           lock(baseof(CPU3))
> tick_nohz_stop_tick()
> tick = 716767000000                       timerqueue_add(tmr)
>
> hrtimer_set_expires(sched_timer, tick);
>   sched_timer->expires = 716767000000  <---- FAIL
>                                            if (tmr->expires < queue->next->expires)
> hrtimer_start(sched_timer)                      queue->next = tmr;
> lock(baseof(CPU3))
>                                           unlock(baseof(CPU3))
> timerqueue_remove()
> timerqueue_add()
>
> ts->sched_timer is queued and queue->next is pointing to it, but then
> ts->sched_timer.expires is modified.
>
> This not only corrupts the ordering of the timerqueue RB tree, it also
> makes CPU2 see the new expiry time of timerqueue->next->expires when
> checking whether timerqueue->next needs to be updated. So CPU2 sees that
> the rdma timer is earlier than timerqueue->next and sets the rdma timer as
> new next.
>
> Depending on whether it had also seen the new time at RB tree enqueue, it
> might have queued the rdma timer at the wrong place and then after removing
> the sched_timer the RB tree is completely hosed.
>
> The problem was introduced with a commit which tried to solve inconsistency
> between the hrtimer in the tick_sched data and the underlying hardware
> clockevent. It split out hrtimer_set_expires() to store the new tick time
> in both the NOHZ and the NOHZ + HIGHRES case, but missed the fact that in
> the NOHZ + HIGHRES case the hrtimer might still be queued.
>
> Use hrtimer_start(timer, tick...) for the NOHZ + HIGHRES case which sets
> timer->expires after canceling the timer and move the hrtimer_set_expires()
> invocation into the NOHZ only code path which is not affected as it merely
> uses the hrtimer as next event storage so code paths can be shared with
> the NOHZ + HIGHRES case.
>
> Fixes: d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get out of sync")
> Reported-by: "Wan Kaike"
> Signed-off-by: Thomas Gleixner

Acked-by: Frederic Weisbecker

Thanks!