Date: Wed, 30 Jan 2019 14:25:20 +0100 (CET)
From: Thomas Gleixner
To: Heiko Carstens
cc: Sebastian Sewior, Peter Zijlstra, Ingo Molnar, Martin Schwidefsky, LKML, linux-s390@vger.kernel.org, Stefan Liebler
Subject: Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggered
In-Reply-To: <20190130125955.GD5299@osiris>

On Wed, 30 Jan 2019, Heiko Carstens wrote:
> On Wed, Jan 30, 2019 at 01:15:18PM +0100, Thomas Gleixner wrote:
> > On Wed, 30 Jan 2019, Heiko Carstens wrote:
> > > On Tue, Jan 29, 2019 at 06:16:53PM +0100, Sebastian Sewior wrote:
> > > >     if (unlikely(p->flags & PF_KTHREAD)) {
> > > >             put_task_struct(p);
> > >
> > > Last lines of the trace with your additional patch (full log attached):
> > >
> > > <...>-50539 [003] ....  2376.398223: sys_futex -> 0x0
> > > <...>-50539 [003] ....  2376.398223: sys_futex(uaddr: 3ffb7700208, op: 6, val: 1, utime: 0, uaddr2: 3, val3: 0)
> > > <...>-50539 [003] ....  2376.398225: attach_to_pi_owner: Missing pid 50734
> > > <...>-50539 [003] ....  2376.398226: handle_exit_race: uval2 vs uval 8000c62e vs 8000c62e (-1)
> >
> > So the user space value is: 8000c62e. FUTEX_WAITER bit is set and the owner
> > of the futex is PID 50734, which exited long time ago:
> >
> > <...>-50734 [000] ....  2376.394936: sched_process_exit: comm=ld64.so.1 pid=50734 prio=120
> >
> > But at least from the kernel view 50734 has released it last:
> >
> > <...>-50734 [000] ....  2376.394930: sys_futex(uaddr: 3ffb7700208, op: 7, val: 3ff00000007, utime: 3ffb3ef8910, uaddr2: 3ffb3ef8910, val3: 3ffc0afe987)
> > <...>-50539 [003] ....  2376.398223: sys_futex(uaddr: 3ffb7700208, op: 6, val: 1, utime: 0, uaddr2: 3, val3: 0)
> >
> > Now, if it would have acquired it in userspace again before exiting, then
> > the robust list exit code should have set the OWNER_DIED bit as well, but
> > that's not set....
> >
> > debug patch for the robust list exit handling below.
>
> Last lines of trace below (full log attached):

SNIP...

It's the same picture as last time and the only occurrence of the futex in
question in the context of the dead task is:

  <...>-56956 [007] ....   658.804018: sys_futex(uaddr: 3ff9e880050, op: 7, val: 3ff00000007, utime: 3ff9b078910, uaddr2: 3ff9b078910, val3: 3ffea67e3f7)

The robust list exit of that task does not contain the user space address
3ff9e880050.

Confused and of course the problem does not reproduce on x86. Sigh.

I'll think about it some more.

Thanks,

        tglx
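
For anyone following the trace: the user space value 8000c62e quoted above can be decoded by hand with the PI futex bit masks. The snippet below is only an illustrative sketch, not part of any patch in this thread. The three mask values are the ones from include/uapi/linux/futex.h; the helper names (futex_owner_tid() and friends) are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit layout of a PI futex word, values as in include/uapi/linux/futex.h. */
#define FUTEX_WAITERS    0x80000000u  /* kernel has queued waiters */
#define FUTEX_OWNER_DIED 0x40000000u  /* set by the robust list exit code */
#define FUTEX_TID_MASK   0x3fffffffu  /* TID of the current owner */

/* The three fields together cover the full 32-bit word. */
static_assert((FUTEX_WAITERS | FUTEX_OWNER_DIED | FUTEX_TID_MASK) == 0xffffffffu,
              "futex masks must cover the whole word");

/* Hypothetical helpers, named for this example only. */
static inline uint32_t futex_owner_tid(uint32_t uval)
{
    return uval & FUTEX_TID_MASK;
}

static inline int futex_has_waiters(uint32_t uval)
{
    return (uval & FUTEX_WAITERS) != 0;
}

static inline int futex_owner_died(uint32_t uval)
{
    return (uval & FUTEX_OWNER_DIED) != 0;
}

/* Print the decoded fields of a futex word in one line. */
static inline void futex_describe(uint32_t uval)
{
    printf("uval=%08x waiters=%d owner_died=%d tid=%u\n",
           (unsigned int)uval, futex_has_waiters(uval),
           futex_owner_died(uval), (unsigned int)futex_owner_tid(uval));
}
```

Calling futex_describe(0x8000c62e) reports waiters=1, owner_died=0 and tid=50734, which matches the reading in the quoted analysis: the waiters bit is set, the owner field still names the exited task 50734, and OWNER_DIED is clear. In the same traces, op: 6 is FUTEX_LOCK_PI and op: 7 is FUTEX_UNLOCK_PI.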