Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp1437153pxb; Wed, 4 Nov 2020 08:54:02 -0800 (PST) X-Google-Smtp-Source: ABdhPJwOq+82+CGuVye9d/NdEC0L5rHSTsfS/Zx8A1AFzF56oGCn/SqMh5QtTi02La1pKkOG1FMu X-Received: by 2002:a17:906:c1d4:: with SMTP id bw20mr25308679ejb.91.1604508842507; Wed, 04 Nov 2020 08:54:02 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1604508842; cv=none; d=google.com; s=arc-20160816; b=Ik2hCAErpWolrRBXaKcGkMWQ4z0EbpOqu4z/LKdOMiqAxg3U2iuTMTiIouYytbwtEl 6k3Ia2ggtXlHq5txU7dcgS/HKxqT0gtb1yHL7pczyl+K68Nuo8hbQJEqxq7a1CYSfuA/ 9vLl9lp3xd6v0vcuTkHrOBW1MFwjHrx435vxUOyXGl+vg0pVpCXSuKl0zyrZZEtzitG8 MLOBtjuhXQlhO4O66fAfIF765H2Hyl8H5gll50/LJ69z4NYmtFdckWRAfB7RvwGGYuqq 8YJcYR/NeJnKtldxiTcIiM6Rf3my8gokkUxtlGQ59JZTgJPIyx+Rf8R7/o/HCHKBrsBE lg5g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:date:cc:to:from:subject :message-id:dkim-signature; bh=3M3IxUrKrdtseiXupBvvp/S7d3qp+5mQFk5GR58ADZo=; b=bssfOR7Xs7PDyn34cXamP+ZRWL4Gs0YJ3pBcYsJLRBTS54LAyGkM0B4C72i/MAKu+R OBo+seQMOveWjedxtwWNsUHCzOOSvXQ0weDqmG+VaWmihRlnkZBPKsFuUyfF7vig5eQQ d2sFU153z9P03KcK5E2zz7afsDxzUHWJ2cyxPbOPRPVnnqwn0SQlCGGQXttKfCzttHDs bbBDgtUrnADGpMIW12idjVO0qSE/1fyQPIk8kxr/IdEFvQk5ckk3hRU8ps8BxqSsxDTG Xa5pkfK/NKv0uu/lI7RmMPKg0TabZjW+rx1yJocl9uoa8C0Rgx90SURgT2XmvmnKy9SC NHgw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmx.net header.s=badeba3b8450 header.b=JWmfABcV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id x9si2011170ejc.745.2020.11.04.08.53.39; Wed, 04 Nov 2020 08:54:02 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmx.net header.s=badeba3b8450 header.b=JWmfABcV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731805AbgKDQur (ORCPT + 99 others); Wed, 4 Nov 2020 11:50:47 -0500 Received: from mout.gmx.net ([212.227.15.19]:40677 "EHLO mout.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731764AbgKDQuh (ORCPT ); Wed, 4 Nov 2020 11:50:37 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=gmx.net; s=badeba3b8450; t=1604508621; bh=a8xcy7k7OAoCLMxIFLMY6hOiExRZJifNzzJndA+GmOg=; h=X-UI-Sender-Class:Subject:From:To:Cc:Date:In-Reply-To:References; b=JWmfABcV59BaobLbCSH/Ii7sY46VaI7tVGY2o/FtObofBnaRvkzoFKpENp48SAeXG tcmuGuo3Ch/sHxxjozD5zOxe3Wt07KkY86watpLWCrbpdwhfkf51uJBhms5z4fvINH hv3+4M7yhmc8S32LIG6M6mmxjkfXQRejX3xmUWXk= X-UI-Sender-Class: 01bb95c1-4bf8-414a-932a-4f6e2808ef9c Received: from homer.fritz.box ([185.221.148.80]) by mail.gmx.com (mrgmx005 [212.227.17.190]) with ESMTPSA (Nemesis) id 1Mqb1W-1jwdSu3FhY-00mfL7; Wed, 04 Nov 2020 17:50:20 +0100 Message-ID: <3fa8a9a3e447b7216610b01a31310ef1f9f7cd69.camel@gmx.de> Subject: Re: [PATCH] futex: Handle transient "ownerless" rtmutex state correctly From: Mike Galbraith To: Thomas Gleixner , Gratian Crisan , Peter Zijlstra Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org, Brandon Streiff , Ingo Molnar , Darren Hart , James Minor , Sebastian Andrzej Siewior Date: Wed, 04 Nov 2020 17:50:20 +0100 In-Reply-To: <87sg9pkvf7.fsf@nanos.tec.linutronix.de> References: <87a6w6x7bb.fsf@ni.com> <878sbixbk4.fsf@ni.com> <2376f4e71c638aee215a4911e5efed14c5adf56e.camel@gmx.de> <5f536491708682fc3b86cb5b7bc1e05ffa3521e7.camel@gmx.de> <874km5mnbf.fsf@nanos.tec.linutronix.de> <871rh9mkvr.fsf@nanos.tec.linutronix.de> <87v9ell0cn.fsf@nanos.tec.linutronix.de> <87sg9pkvf7.fsf@nanos.tec.linutronix.de> Content-Type: text/plain; charset="ISO-8859-15" User-Agent: Evolution 3.34.4 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Provags-ID: V03:K1:3Be43IoHgozZduYyg8KHi8muOQYyFkuTJRKr0i/RHPxKSKfSoBf m3bTBvVKMjmNhYhf3/3d7S1DDc0rlrHwghYLNkKJDbF8851wFMB1gZPJvbGjycsYhwVJg28 wtRp+ydPeJFbCVp8u0kiEfAQ/srw4AKlBDbRKO6aIdYC6N7c51590qV0EE5L918oYaWEQW1 jnC+a4rG8LiGCz2+JUkSg== X-Spam-Flag: NO X-UI-Out-Filterresults: notjunk:1;V03:K0:1H9IYpapMlk=:q6FZXjlHWnMei6TxINLLPF bpAu6HzneXJvdcb7wW1XXvpwFazNSZOHk0aUFpPJ/IrPyAzLn9vCmxBDtO6KbyMq++xmAoMr8 c3k4ldRo38tTppJJ5685OgVes3wH7IeLLo7NCHQDknDTffJl2gryIqlpvz7ArN8EHVO/fAf7K RQ5nwav0x5/3AeurGEHusPenvdUFDq5jctyTZnKZyzuQ48nXLgZriaeIfvti9vqLvKA0q1JU/ R8OuXWUPvVxsxuLN6278n7ODqDrupuRG7hQ0rKSOLHI3gxsk7zcXChx1aY+4Jm2AhMxVIngq5 7zFsUPhvjd6Zi5J30VcS7ok+3Bmr+vZeSgeqQFx9SiCNKqdFxTmbYFX/FqHk7KmLBR0c2wCDw gyRpJ5FN6pLXPQn02wPW3izgqLl71X7HYDDMvpAkFXbb+QojbMTtHv8eagmx1oc4jlkwvHHbn qczMlMOOVM0F3OYkqxyQKXc7ynzzOB27dAdpNMd11Kh+jGUMt2Sp1oRAq14tnR0sZjK/2MiWm arI647YNbElWmWxtXFSew4uCtoXsdyz8BkD/eBelbQzy36GNJ+BAv4TerQbN2gNVoWxo8L9Gm 5qlAqww4OGKal0Zq8YPwCqXDcMnOO4CRVLjbufJOciyHFtDrxuKv0Bvhurwp/zNlu5d+depjH YdoZDiPAqah4/Dql8L5Zz5VcdSn9ghB8fMSEpx4Zsmr89YlgVIMh4etDavXA8rh2YLQ2iG37B aequYpmWzCNYV5A/AU8a7VraCz2U2aFezD9q2DqSHLBNKlDlgOXsn5cmdt/naH7osQcHOR4Yw jXtSFhp5iYZKBlmrA0OiHODX7hT7HfHK86qAPh2Ialt9zodurU93B0N2I1DRdo1JeDt4RAB6U xfYQXiPLkzOYD+aKL4og== Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2020-11-04 at 16:12 +0100, Thomas Gleixner wrote: > From: Mike Galbraith Hrmph. How about a suggested-by, or just take it. That dinky diag hit the mark (not _entirely_ by accident, but..;) is irrelevant, you did all of the work to make it a patch. -Mike > Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner= (). > This is one possible chain of events leading to this: > > Task Prio Operation > T1 120 lock(F) > T2 120 lock(F) -> blocks (top waiter) > T3 50 (RT) lock(F) -> boosts T3 and blocks (new top waiter) > XX timeout/ -> wakes T2 > signal > T1 50 unlock(F) -> wakes T3 (rtmutex->owner =3D=3D NULL, waiter bit i= s set) > T2 120 cleanup -> try_to_take_mutex() fails because T3 is the top wa= iter > and the lower priority T2 cannot steal the lock. > -> fixup_pi_state_owner() sees newowner =3D=3D NULL -> BUG_ON(= ) > > The comment states that this is invalid and rt_mutex_real_owner() must > return a non NULL owner when the trylock failed, but in case of a queued > and woken up waiter rt_mutex_real_owner() =3D=3D NULL is a valid transie= nt > state. The higher priority waiter has simply not yet managed to take ove= r > the rtmutex. > > The BUG_ON() is therefore wrong and this is just another retry condition= in > fixup_pi_state_owner(). > > Drop the locks, so that T3 can make progress, and then try the fixup aga= in. > > Gratian provided a great analysis, traces and a reproducer. The analysis= is > to the point, but it confused the hell out of that tglx dude who had to > page in all the futex horrors again. Condensed version is above. > > [ tglx: Wrote comment and changelog ] > > Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") > Reported-by: Gratian Crisan > Signed-off-by: Mike Galbraith > Signed-off-by: Thomas Gleixner > Cc: stable@vger.kernel.org > Link: https://lore.kernel.org/r/87a6w6x7bb.fsf@ni.com > --- > kernel/futex.c | 16 ++++++++++++++-- > 1 file changed, 14 insertions(+), 2 deletions(-) > > --- a/kernel/futex.c > +++ b/kernel/futex.c > @@ -2380,10 +2380,22 @@ static int fixup_pi_state_owner(u32 __us > } > > /* > - * Since we just failed the trylock; there must be an owner. > + * The trylock just failed, so either there is an owner or > + * there is a higher priority waiter than this one. > */ > newowner =3D rt_mutex_owner(&pi_state->pi_mutex); > - BUG_ON(!newowner); > + /* > + * If the higher priority waiter has not yet taken over the > + * rtmutex then newowner is NULL. We can't return here with > + * that state because it's inconsistent vs. the user space > + * state. So drop the locks and try again. It's a valid > + * situation and not any different from the other retry > + * conditions. > + */ > + if (unlikely(!newowner)) { > + ret =3D -EAGAIN; > + goto handle_err; > + } > } else { > WARN_ON_ONCE(argowner !=3D current); > if (oldowner =3D=3D current) {