Received: by 10.223.185.116 with SMTP id b49csp5193222wrg; Wed, 7 Mar 2018 07:46:57 -0800 (PST) X-Google-Smtp-Source: AG47ELsipqkmsaBVRzSjpEiRIPr3T28B1omTy28QFLIur0Oqw8DT0jRtq8lII/wuNLIOhlyXfHyc X-Received: by 10.167.131.135 with SMTP id u7mr16495293pfm.50.1520437617889; Wed, 07 Mar 2018 07:46:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1520437617; cv=none; d=google.com; s=arc-20160816; b=uoR4KEFEuB79gBIZ+A/IU3d6JinXroopME/ES6jLVK5rxQe59Wg/vjg3h4fuBCIFal pnbfr/+GoFFiwKEWc+XwPppOv1wxX/XRPlf+1vDgfypoSP/Zs6mHyqiXahV3iNoMCXZ9 Mw9TuY6KL8BzDQIUwNUZbbWdsdBTzUG5bh04BFtGrWpJCTKVqtL4fJkmkcnxGaqT1Sfs Coc1j92VeQgrcrwk165481gjaD+eLGcFk7UTltxgz0xaR4WGxm8sDxneOM03uvKL4sR4 Xb2K0YDPZsMhcm3hV4hflwikWUTKWducZSJak3rHkJCht7Nugh5JHzQe1OX+zVhLRXDP dQGQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature :arc-authentication-results; bh=LLQ7Tp3hdGogGrRDrfEzHry61Qw6nhNMlJdmwGlgzuI=; b=kTo0AWz9msMyJYDrFuKf9SCcGt0CJPQMi65ue7t/GvibVoB6xYC59K2KRvDWzPdVuO K7Nuo5OGE4NINuSUey7xo93WFjgPimF5f/moRHFyTLVlBdTC1tu0u3512+syIag7ZfS+ VdRVmukJUa83lcHJq/BlyjAKNJS4igFCXes8hVjp9EUBrDx11geDqEzi7B0vTUlLHiqC B8d6q8iXPWdmlpmBb76OvH/Xm0YZ2KonqoLVRr9N7MmuLdzuZITqexZUsVx1cqYB07wc R9NL1iOQXgB14+S18B70mpbzTFihEb4ZZN/C4XC67EdvvgIK7nwEGziMis/z/FXgs+1y OMfA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@mvista-com.20150623.gappssmtp.com header.s=20150623 header.b=1d8eFXjs; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id m83si14173122pfa.367.2018.03.07.07.46.42; Wed, 07 Mar 2018 07:46:57 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@mvista-com.20150623.gappssmtp.com header.s=20150623 header.b=1d8eFXjs; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933968AbeCGPpd (ORCPT + 99 others); Wed, 7 Mar 2018 10:45:33 -0500 Received: from mail-ot0-f193.google.com ([74.125.82.193]:32791 "EHLO mail-ot0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933870AbeCGPpb (ORCPT ); Wed, 7 Mar 2018 10:45:31 -0500 Received: by mail-ot0-f193.google.com with SMTP id y11so2458879otg.0 for ; Wed, 07 Mar 2018 07:45:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mvista-com.20150623.gappssmtp.com; s=20150623; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding:content-language; bh=LLQ7Tp3hdGogGrRDrfEzHry61Qw6nhNMlJdmwGlgzuI=; b=1d8eFXjsLwWuUdbcEM/Esw9dKjxnuwOOCV8W7xYGr+Diot/1jfb2WoElnfTwZAUqXO i7hrVl0cJW/F2WTatb942Ftjll3jnwpTofAlvDUJaIM/m7nz7dNmsExYCBK3MH/h2C6C mIP1b15U2K5WbA5O+Tpmz5xkg8/PXde+marcZIJhj/aHv1SMrxaSTJxVHYugMxjBc/ox 3kPZkvC9tyQuE45inY/74aCRWqSk0RwBeIfyvBFCtPAFjp9ZyH8x7j8Fmu4zoHNx9/Tp /R2P1JK0oJqIsSSDunDTK4zfy6HatwvKQFk9RiC3Tp4WsW6NYOpxlYvlbnLTSKhcHl0v IloQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=LLQ7Tp3hdGogGrRDrfEzHry61Qw6nhNMlJdmwGlgzuI=; b=NvKpjNTETiCsliw+f0iTn+0TF8y8IljdJe4pU3sa3PA//Lp0Oj5Z3nvJo5b9U0c69v JW3ZqzdIG7fBrCwGQ3cYwjob9vX5jsDzs2UOFR1YEScuauJ6N5VJWK6sU1VVXdMK16wV R2F5Vfhx4IN3pmweSE6HOQMpoJyjWGuuhA0SF7QGgM/H73ypYlOplv4fPXPQ762Qiu4y Geg3/rNNaAF/kwnK5yTyH7P4JO1GLEGbIhZ+GScXgU3ih3mSLUvZuw+1e31jIlxY7wjc FcS91yaxLG5YiSVqjAbb2pTCK3nL704C1fR1uK0xIRid7jJnIs1WOmylvPXljwV/CawF 0jQA== X-Gm-Message-State: AElRT7GbWrOSgqDcHUqrcfKMy7UQ+tLClVFP56STe9Y8mp/+QJMIxfqF e2mfHrb02Egq0RYH2spWO//Vdw== X-Received: by 10.157.112.3 with SMTP id k3mr16149823otj.189.1520437530633; Wed, 07 Mar 2018 07:45:30 -0800 (PST) Received: from [192.168.27.3] ([47.184.168.85]) by smtp.gmail.com with ESMTPSA id s129sm338818oia.26.2018.03.07.07.45.29 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Mar 2018 07:45:29 -0800 (PST) Subject: Re: Warning from swake_up_all in 4.14.15-rt13 non-RT To: Sebastian Andrzej Siewior Cc: linux-rt-users , linux-kernel , Tejun Heo , Peter Zijlstra , Thomas Gleixner References: <20180306174604.nta5rcvfvrfdfftz@linutronix.de> From: Corey Minyard Message-ID: <1704d817-8fb9-ce8f-1aa1-fe6e8b0c3919@mvista.com> Date: Wed, 7 Mar 2018 09:45:29 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <20180306174604.nta5rcvfvrfdfftz@linutronix.de> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-GB Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/06/2018 11:46 AM, Sebastian Andrzej Siewior wrote: > On 2018-03-05 09:08:11 [-0600], Corey Minyard wrote: >> Starting with the change >> >> 8a64547a07980f9d25e962a78c2e10ee82bdb742 fs/dcache: use swait_queue instead >> of >> waitqueue > … >> The following change is the obvious reason: >> >> --- a/kernel/sched/swait.c >> +++ b/kernel/sched/swait.c >> @@ -69,6 +69,7 @@ void swake_up_all(struct swait_queue_head *q) >>         struct swait_queue *curr; >>         LIST_HEAD(tmp); >> >> +       WARN_ON(irqs_disabled()); >>         raw_spin_lock_irq(&q->lock); >>         list_splice_init(&q->task_list, &tmp); >>         while (!list_empty(&tmp)) { >> >> I've done a little bit of analysis here, percpu_ref_kill_and_confirm() >> does spin_lock_irqsave() and then does a percpu_ref_put().  If the >> refcount reaches zero, the release function of the refcount is >> called.  In this case, the block code has set this to >> blk_queue_usage_counter_release(), which calls swake_up_all(). >> >> It seems like a bad idea to call percpu_ref_put() with interrupts >> disabled.  This problem actually doesn't appear to be RT-related, >> there's just no warning call if the RT tree isn't used. > yeah but vanilla uses wake_up() which does spin_lock_irqsafe() so it is > not an issue there. > > The odd part here is that percpu_ref_kill_and_confirm() does _irqsave() > which suggests that it might be called from any context and then it does > wait_event_lock_irq() which enables interrupts again while it waits. So > it can't be used from any context. > >> I'm not sure if it's best to just do the put outside the lock, or >> have modified put function that returns a bool to know if a release >> is required, then the release function can be called outside the >> lock.  I can do patches and test, but I'm hoping for a little >> guidance here. > swake_up_all() does raw_spin_lock_irq() because it should be called from > non-IRQ context. And it drops the lock (+IRQ enable) between wake-ups in > case we "need_resched()" because we woke a high-priority waiter. There > is the list_splice because we wanted to drop the locks (and have IRQs > open) during the entire wake up process but finish_swait() may happen > during the wake up and so we must hold the lock while the list-item is > removed for the queue head. > I have no idea what is the wisest thing to do here. The obvious fix > would be to use the irqsafe() variant here and not drop the lock between > wake ups. That is essentially what swake_up_all_locked() does which I > need for the completions (and based on some testing most users have one > waiter except during PM and some crypto code). > It is probably no comparison to wake_up_q() (which does multiple wake > ups without a context switch) but then we did this before like that. > > Preferably we would have a proper list_splice() and some magic in the > "early" dequeue part that works. > Maybe just modify the block code to run the swake_up_all() call in a workqueue or tasklet?  If you think that works, I'll create a patch, test it, and submit it if all goes well. Thanks, -corey