Date: Tue, 1 Sep 2020 18:51:28 -0700
From: Davidlohr Bueso
To: "Paul E. McKenney"
Cc: peterz@infradead.org, mingo@redhat.com, will@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Question on task_blocks_on_rt_mutex()
Message-ID: <20200902015128.wsulcxhbo7dutcjz@linux-p48b>
In-Reply-To: <20200901235821.GA8516@paulmck-ThinkPad-P72>

On Tue, 01 Sep 2020, Paul E. McKenney wrote:

>And it appears that a default-niced CPU-bound SCHED_OTHER process is
>not preempted by a newly awakened MAX_NICE SCHED_OTHER process.
>OK, OK, I never waited for more than 10 minutes, but on my 2.2GHz that is
>close enough to a hang for most people.
>
>Which means that the patch below prevents the hangs. And maybe does
>other things as well, firing rcutorture up on it to check.
>
>But is this indefinite delay expected behavior?
>
>This reproduces for me on current mainline as follows:
>
>tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --torture lock --duration 3 --configs LOCK05
>
>This hangs within a minute of boot on my setup. Here "hangs" is defined
>as stopping the per-15-second console output of:
> Writes: Total: 569906696 Max/Min: 81495031/63736508 Fail: 0

Ok, this doesn't seem to be related to lockless wake_qs then. FYI, there
have been missed wakeups in the past where wake_q_add() fails the cmpxchg
because the task is already pending a wakeup, leading to the actual wakeup
occurring before its corresponding wake_up_q(). This is why we have
wake_q_add_safe(). But for rtmutexes, because there is no lock stealing,
only the top waiter is awoken, and try_to_take_rt_mutex() is done under
the lock->wait_lock, I was not seeing an actual race here.

Thanks,
Davidlohr
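
For readers unfamiliar with the wake_q_add() failure mode mentioned above, here is a minimal userspace sketch of it, assuming C11 atomics in place of the kernel's cmpxchg. Everything prefixed with model_, plus struct task and its fields, is a hypothetical stand-in and not the kernel implementation; the bool return value exists only so the outcome can be observed. It shows that a task already queued on one wake queue cannot be added to a second one, so its wakeup is delivered by whoever flushes the first queue.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* End-of-list sentinel; a non-NULL ->next means "already on some wake queue". */
#define WAKE_Q_TAIL ((struct wake_q_node *)0x01)

struct wake_q_node {
    _Atomic(struct wake_q_node *) next;
};

struct wake_q_head {
    _Atomic(struct wake_q_node *) first;
    _Atomic(struct wake_q_node *) *lastp;
};

/* Hypothetical stand-in for task_struct: queue node, refcount, wakeup counter. */
struct task {
    struct wake_q_node wake_q;
    atomic_int usage;
    int wakeups;
};

static void model_task_init(struct task *t)
{
    atomic_init(&t->wake_q.next, NULL);
    atomic_init(&t->usage, 1);
    t->wakeups = 0;
}

static void model_wake_q_init(struct wake_q_head *head)
{
    atomic_init(&head->first, WAKE_Q_TAIL);
    head->lastp = &head->first;
}

/* Claim the node with a compare-and-swap; fails if the task is already queued. */
static bool model_add_nogetref(struct wake_q_head *head, struct task *t)
{
    struct wake_q_node *expected = NULL;

    if (!atomic_compare_exchange_strong(&t->wake_q.next, &expected, WAKE_Q_TAIL))
        return false;
    atomic_store(head->lastp, &t->wake_q);
    head->lastp = &t->wake_q.next;
    return true;
}

/* Models wake_q_add(): take a reference only if the task was actually queued. */
static bool model_wake_q_add(struct wake_q_head *head, struct task *t)
{
    bool queued = model_add_nogetref(head, t);

    if (queued)
        atomic_fetch_add(&t->usage, 1);
    return queued;
}

/* Models wake_q_add_safe(): the caller already holds a reference, which is
 * handed to the queue on success and dropped on failure. */
static bool model_wake_q_add_safe(struct wake_q_head *head, struct task *t)
{
    bool queued = model_add_nogetref(head, t);

    if (!queued)
        atomic_fetch_sub(&t->usage, 1);
    return queued;
}

/* Models wake_up_q(): "wake" every queued task and drop the queue's reference.
 * Returns the number of tasks woken. */
static int model_wake_up_q(struct wake_q_head *head)
{
    struct wake_q_node *node = atomic_load(&head->first);
    int woken = 0;

    while (node != WAKE_Q_TAIL) {
        struct task *t = (struct task *)((char *)node - offsetof(struct task, wake_q));

        node = atomic_load(&node->next);
        atomic_store(&t->wake_q.next, NULL);   /* task may be queued again */
        assert(atomic_load(&t->usage) > 0);
        t->wakeups++;
        atomic_fetch_sub(&t->usage, 1);
        woken++;
    }
    return woken;
}
```

In this model, adding the same task to a second queue while it is still pending on the first returns false, and flushing the second queue wakes nobody. The only wakeup comes from the first queue's flush, which can run before the second caller has published whatever condition it wanted the task to observe.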