Date: Wed, 30 Jan 2019 22:07:33 +0100
From: Sebastian Sewior
To: Thomas Gleixner
Cc: Heiko Carstens, Peter Zijlstra, Ingo Molnar, Martin Schwidefsky, LKML,
	linux-s390@vger.kernel.org, Stefan Liebler
Subject: Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggered
Message-ID: <20190130210733.mg6aascw2gzl3oqz@linutronix.de>
References: <20190129132303.GE26906@osiris> <20190129151058.GG26906@osiris>
	<20190129171653.ycl64psq2liy5o5c@linutronix.de>
	<20190130094913.GC5299@osiris> <20190130125955.GD5299@osiris>
	<20190130132420.spwrq2d4oxeydk5s@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
User-Agent: NeoMutt/20180716
X-Mailing-List: linux-kernel@vger.kernel.org

On 2019-01-30 18:56:54 [+0100], Thomas Gleixner wrote:
> TBH, no clue. Below are some more traceprintks which hopefully shed some
> light on that mystery.
> See kernel/futex.c line 30 ...

The robust list is somehow buggy. In the last trace, the last action was
handle_futex_death() on uaddr 3ff9e880140. That means it was 56496's
->list_op_pending entry. This makes sense because that task tried to
acquire the lock, failed, and got killed.

According to the futex word at uaddr, pid 56956 is the owner. So 56956
invoked one of pthread_mutex_lock() / pthread_mutex_timedlock() /
pthread_mutex_trylock() and should have obtained the lock in userland.
Depending on where it got killed, that mutex should be recorded either in
->list_op_pending or in the robust_list (or in both if it didn't clear
->list_op_pending yet). But it is not.
The same applies to pthread_mutex_unlock().
We don't have a tracepoint for the case where we abort processing the list.
On the other hand, it didn't trigger on x86 for hours. Could the atomic
ops be the culprit?

> Thanks,
>
> 	tglx

Sebastian
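
For anyone following along, the argument above rests on two checks: reading
the owner TID out of the futex word, and asking whether a given uaddr is
reachable from a task's robust list or its ->list_op_pending slot at exit
time. Below is a minimal userspace sketch of both, not the kernel code. The
bit masks mirror the values in <linux/futex.h>, but the EX_/ex_ names and the
flattened struct are made up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as FUTEX_WAITERS / FUTEX_OWNER_DIED / FUTEX_TID_MASK in
 * <linux/futex.h>; redefined with an EX_ prefix so this builds anywhere. */
#define EX_FUTEX_WAITERS    0x80000000u
#define EX_FUTEX_OWNER_DIED 0x40000000u
#define EX_FUTEX_TID_MASK   0x3fffffffu

/* Hypothetical, flattened model of one task's robust-list state.
 * The real structure is a linked list living in user memory. */
struct ex_robust_state {
	const uint64_t *held;    /* uaddrs linked into the robust list */
	size_t          nheld;
	uint64_t        pending; /* ->list_op_pending, 0 if none */
};

/* Owner TID as encoded in the low 30 bits of the futex word. */
static uint32_t ex_futex_owner(uint32_t futex_word)
{
	return futex_word & EX_FUTEX_TID_MASK;
}

/* Would an exit-time walk of this task's state visit uaddr at all?
 * It does if uaddr is the pending entry or is linked into the list. */
static bool ex_exit_would_see(const struct ex_robust_state *st, uint64_t uaddr)
{
	assert(st != NULL);

	if (st->pending != 0 && st->pending == uaddr)
		return true;

	for (size_t i = 0; i < st->nheld; i++) {
		if (st->held[i] == uaddr)
			return true;
	}
	return false;
}
```

In these terms, the waiter (56496) has the uaddr as its pending entry, so the
walk sees it. The puzzle is that the owner according to the futex word
(56956) apparently has it in neither place.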