Message-ID: <23a15414-927c-ba0d-eb6a-58f6191ce17b@redhat.com>
Date: Tue, 17 Jan 2023 14:32:28 -0500
Subject: Re: [PATCH] rtmutex: ensure we wake up the top waiter
From: Waiman Long
To: Wander Lairson Costa, Peter Zijlstra, Ingo Molnar, Will Deacon,
    Boqun Feng, "open list:LOCKING PRIMITIVES"
Cc: Thomas Gleixner, Sebastian Andrzej Siewior
References: <20230117172649.52465-1-wander@redhat.com>
In-Reply-To: <20230117172649.52465-1-wander@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/17/23 12:26, Wander Lairson Costa wrote:
> In task_blocked_on_lock() we save the owner, release the wait_lock and
> call rt_mutex_adjust_prio_chain(). Before we acquire the wait_lock
> again, the owner may release the lock and deboost.
Are you referring to task_blocks_on_rt_mutex(), not task_blocked_on_lock()?

> rt_mutex_adjust_prio_chain() acquires the wait_lock. In the requeue
> phase, waiter may initially be at the top of the queue, but after
> being dequeued and requeued it may no longer be true.
>
> This scenario ends up waking the wrong task, which will verify it is
> not the top waiter and go back to sleep. Now we have a situation in
> which no task is holding the lock but no one acquires it.
>
> We can reproduce the bug in PREEMPT_RT with stress-ng:
>
> while true; do
>     stress-ng --sched deadline --sched-period 1000000000 \
>         --sched-runtime 800000000 --sched-deadline \
>         1000000000 --mmapfork 23 -t 20
> done
>
> Signed-off-by: Wander Lairson Costa
> ---
>  kernel/locking/rtmutex.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> index 010cf4e6d0b8..728f434de2bb 100644
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -901,8 +901,9 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
>  		 * then we need to wake the new top waiter up to try
>  		 * to get the lock.
>  		 */
> -		if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
> -			wake_up_state(waiter->task, waiter->wake_state);
> +		top_waiter = rt_mutex_top_waiter(lock);
> +		if (prerequeue_top_waiter != top_waiter)
> +			wake_up_state(top_waiter->task, top_waiter->wake_state);
>  		raw_spin_unlock_irq(&lock->wait_lock);
>  		return 0;
>  	}

I would say that if an rt_mutex has no owner but has waiters, we should
always wake up the top waiter, whether or not it is the current waiter.

So what is the result of the stress-ng run above? Is it a hang? It is
not clear from your patch description.

I am not that familiar with the rt_mutex code, so I am cc'ing Thomas
and Sebastian for their input.

Cheers,
Longman