From: Sudeep Holla
Date: Fri, 16 Oct 2020 14:52:18 +0100
Subject: Re: [PATCH] mailbox: cancel timer before starting it
To: Jerome Brunet, Sudeep Holla
Cc: Ionela Voinescu, Jassi Brar, Kevin Hilman, "open list:ARM/Amlogic Meson...", Da Xue, Linux Kernel Mailing List

On Fri, Oct 16, 2020 at 1:15 PM Jerome Brunet wrote:
>
>
> On Fri 16 Oct 2020 at 10:52, Ionela Voinescu wrote:
>
> > On Thursday 15 Oct 2020 at 13:45:54 (-0500), Jassi Brar wrote:
> > [..]
> >> > >> --- a/drivers/mailbox/mailbox.c
> >> > >> +++ b/drivers/mailbox/mailbox.c
> >> > >> @@ -82,9 +82,13 @@ static void msg_submit(struct mbox_chan *chan)
> >> > >>  exit:
> >> > >>         spin_unlock_irqrestore(&chan->lock, flags);
> >> > >>
> >> > >> -       if (!err && (chan->txdone_method & TXDONE_BY_POLL))
> >> > >> -               /* kick start the timer immediately to avoid delays */
> >> > >> +       if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
> >> > >> +               /* Disable the timer if already active ... */
> >> > >> +               hrtimer_cancel(&chan->mbox->poll_hrt);
> >> > >> +
> >> > >> +               /* ... and kick start it immediately to avoid delays */
> >> > >>                 hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
> >> > >> +       }
> >> > >>  }
> >> > >>
> >> > >>  static void tx_tick(struct mbox_chan *chan, int r)
> >> > >
> >> > > I've tracked a regression back to this commit. Details to reproduce:
> >> >
> >> > Hi Ionela,
> >> >
> >> > I don't have access to your platform and I don't get what is going on
> >> > from the log below.
> >> >
> >> > Could you please give us a bit more details about what is going on ?
> >> >
> >> > All this patch does is add hrtimer_cancel().
> >> > * It is needed if the timer had already been started, which is
> >> >   appropriate AFAIU
> >> > * It is a NO-OP if the timer is not active.
> >> >
> >> Can you please try using hrtimer_try_to_cancel() instead ?
> >>
> >
> > Yes, using hrtimer_try_to_cancel() instead works for me. But doesn't
> > this limit how effective this change is? AFAIU, this will possibly only
> > reduce the chances for the race condition, but not solve it.
> >
>
> It is also my understanding, hrtimer_try_to_cancel() would remove a
> timer which has not already started but would return without doing
> anything if the callback is already running ... which is the original
> problem
>

There seem to be some races. It always hangs in hrtimer_cancel().
Logging some extra messages makes it progress for a while, but it
finally gets stuck in the loop. I wonder if there is a race between
cancel and handler execution.

--
Regards,
Sudeep
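
A minimal user-space sketch of the cancel semantics discussed above, assuming only the documented return values of hrtimer_try_to_cancel() (0: timer not active, 1: timer was active and has been removed, -1: the callback is currently running). This is a model, not kernel code: every model_* name is hypothetical, and the max_spins bound exists only so the model terminates. The real hrtimer_cancel() retries without such a bound, so it cannot return while the callback stays in the running state, for example if the caller is itself executing inside that callback.

```c
/* User-space model of the two cancel calls; NOT kernel code.
 * All model_* identifiers are hypothetical and exist only for illustration. */
#include <assert.h>

enum model_state { MODEL_INACTIVE, MODEL_ENQUEUED, MODEL_RUNNING };

struct model_timer {
	enum model_state state;
};

/* Mirrors the documented return values of hrtimer_try_to_cancel():
 *   0  timer was not active
 *   1  timer was pending and has been removed
 *  -1  callback is running, timer cannot be stopped right now */
static int model_try_to_cancel(struct model_timer *t)
{
	switch (t->state) {
	case MODEL_ENQUEUED:
		t->state = MODEL_INACTIVE;
		return 1;
	case MODEL_RUNNING:
		return -1;
	default:
		return 0;
	}
}

/* Models hrtimer_cancel(): retry while the callback is running.
 * max_spins is an assumption of this model so that it terminates;
 * it returns -1 when the bound is hit, where the real call would keep waiting. */
static int model_cancel(struct model_timer *t, int max_spins)
{
	for (int i = 0; i < max_spins; i++) {
		int ret = model_try_to_cancel(t);

		if (ret >= 0)
			return ret;
		/* the real implementation waits here for the callback to finish */
	}
	return -1;
}

/* Walks through the three cases raised in the thread; returns 0 on success. */
static int model_demo(void)
{
	struct model_timer t = { MODEL_ENQUEUED };

	assert(model_cancel(&t, 1000) == 1);   /* pending timer: removed */
	assert(model_cancel(&t, 1000) == 0);   /* inactive timer: no-op */

	t.state = MODEL_RUNNING;               /* callback in progress, never finishes */
	assert(model_try_to_cancel(&t) == -1); /* returns without cancelling */
	assert(model_cancel(&t, 1000) == -1);  /* unbounded version would spin here */
	return 0;
}
```

In this model the try-to-cancel variant always returns, which matches the observation that it avoids the hang, while leaving the running-callback case unhandled, which is the concern raised by Ionela and Jerome.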