Date: Wed, 29 Nov 2017 11:56:05 -0600
From: Julia Cartwright
To: Thomas Gleixner, Peter Zijlstra
CC: Gratian Crisan, Darren Hart, Ingo Molnar
Subject: PI futexes + lock stealing woes
Message-ID: <20171129175605.GA863@jcartwri.amer.corp.natinst.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hey Thomas, Peter-

Gratian and I have been debugging a nasty and difficult race, with
futexes seemingly the culprit.

The original symptom we were seeing was a seemingly spurious -EDEADLK
from a futex(LOCK_PI) operation.

On further analysis, however, it appears the thread which gets the
spurious -EDEADLK has observed a weird futex state: a prior
futex(WAIT_REQUEUE_PI) operation has returned -ETIMEDOUT, but the uaddr2
futex word owner field indicates that it's the owner.

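For reference, the -EDEADLK half of the symptom can be seen in isolation from userspace: LOCK_PI refuses to block when the TID field of the futex word already names the caller. The following is a minimal sketch of just that owner check, not of the race itself. It assumes Linux with PI futex support; `futex_lock_pi` and `lock_pi_on_self_owned_word` are hypothetical helper names, and the already-expired absolute timeout is only there so the sketch can never block.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Thin wrapper (hypothetical name): glibc does not export futex(). */
static long futex_lock_pi(uint32_t *uaddr, const struct timespec *abs_timeout)
{
	return syscall(SYS_futex, uaddr, FUTEX_LOCK_PI, 0, abs_timeout, NULL, 0);
}

/*
 * Put the caller's own TID into a futex word, as if the caller already
 * owned the lock, then ask the kernel for LOCK_PI on it.
 * Returns 0 on success, otherwise the errno reported by the kernel.
 */
static int lock_pi_on_self_owned_word(void)
{
	uint32_t word = (uint32_t)syscall(SYS_gettid);
	/* LOCK_PI takes an absolute CLOCK_REALTIME deadline; this one is in
	 * the past, so a call that got as far as blocking would time out. */
	struct timespec already_expired = { 0, 0 };

	/* The TID must fit in the owner field of the futex word. */
	assert((word & ~FUTEX_TID_MASK) == 0);

	if (futex_lock_pi(&word, &already_expired) == 0)
		return 0;
	return errno;
}
```

On a kernel with PI futexes the helper is expected to return EDEADLK, which is the same answer the waiter gets once it is left with a futex word that still carries its TID.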
Here's an attempt to boil down this situation into a pseudo trace; I'm
happy to forward along the full traces as well, if that would be
helpful:

   waiter                                  waker                                   stealer (prio > waiter)

   futex(WAIT_REQUEUE_PI, uaddr, uaddr2,
         timeout=[N ms])
      futex_wait_requeue_pi()
         futex_wait_queue_me()
            freezable_schedule()
                                           futex(LOCK_PI, uaddr2)
                                           futex(CMP_REQUEUE_PI, uaddr,
                                                 uaddr2, 1, 0)
                                              /* requeues waiter to uaddr2 */
                                           futex(UNLOCK_PI, uaddr2)
                                              wake_futex_pi()
                                                 cmp_futex_value_locked(uaddr, waiter)
                                                 wake_up_q()
            <... task>
                                                                                   futex(LOCK_PI, uaddr2)
                                                                                      __rt_mutex_start_proxy_lock()
                                                                                         try_to_take_rt_mutex() /* steals lock */
                                                                                            rt_mutex_set_owner(lock, stealer)
         rt_mutex_wait_proxy_lock()
            __rt_mutex_slowlock()
               try_to_take_rt_mutex() /* fails, lock held by stealer */
               if (timeout && !timeout->task)
                  return -ETIMEDOUT;
            fixup_owner()
               /* lock wasn't acquired, so,
                  fixup_pi_state_owner skipped */
   return -ETIMEDOUT;

   /* At this point, we've returned -ETIMEDOUT to userspace, but the
    * futex word shows waiter to be the owner, and the pi_mutex has
    * stealer as the owner */

   futex(LOCK_PI, uaddr2)
     -> bails with EDEADLK, futex word says we're owner.

At some later point in execution, the stealer gets scheduled back in and
will do fixup_owner() which fixes up the futex word, but at that point
it's too late: the waiter has already observed the wonky state.

fixup_owner() used to have additional seemingly relevant checks in place
that were removed in 73d786bd043eb ("futex: Rework inconsistent
rt_mutex/futex_q state").

The actual kernel we've been testing is 4.9.33-rt23, w/ 153fbd1226fb3
("futex: Fix more put_pi_state() vs. exit_pi_state_list() races")
cherry-picked w/ PREEMPT_RT_FULL. However, it appears that this issue
may affect v4.15-rc1?

Thoughts on how to move forward?

Nasty.

   Julia
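The interleaving in the pseudo trace can also be replayed as a toy, single-threaded model that tracks only the two places where ownership is recorded: the TID field of the futex word and the owner of the pi_mutex. This is a sketch under stated assumptions, not kernel code: every `model_*` function, the struct, and the TIDs are hypothetical, and each step is reduced to the one state change the trace attributes to it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical TIDs, for illustration only. */
enum { NOBODY = 0, WAITER = 101, STEALER = 103 };

/* Toy model: the two records of "who owns the lock". */
struct pi_futex_model {
	int futex_word_owner;   /* TID field of the userspace futex word */
	int rt_mutex_owner;     /* owner recorded in the kernel-side pi_mutex */
};

/* UNLOCK_PI with a queued waiter: the futex word is handed to that waiter,
 * while the rt_mutex stays ownerless until a task actually takes it. */
static void model_unlock_pi(struct pi_futex_model *m, int top_waiter)
{
	m->futex_word_owner = top_waiter;
	m->rt_mutex_owner = NOBODY;
}

/* Take the rt_mutex if it is ownerless; the futex word is not touched. */
static int model_try_take(struct pi_futex_model *m, int tid)
{
	if (m->rt_mutex_owner != NOBODY)
		return 0;
	m->rt_mutex_owner = tid;
	return 1;
}

/* Waiter resumes after its timeout expired: one attempt at the lock, and
 * no fixup of the futex word when that attempt fails. */
static int model_timed_out_wait(struct pi_futex_model *m, int tid)
{
	return model_try_take(m, tid) ? 0 : -ETIMEDOUT;
}

/* Owner check on LOCK_PI: a futex word naming the caller is a deadlock. */
static int model_lock_pi(struct pi_futex_model *m, int tid)
{
	if (m->futex_word_owner == tid)
		return -EDEADLK;
	if (!model_try_take(m, tid))
		return -EAGAIN;   /* the real call would block; the model just reports it */
	m->futex_word_owner = tid;
	return 0;
}

/* Replays the trace; returns what the waiter's second LOCK_PI sees. */
static int replay_trace(struct pi_futex_model *m)
{
	model_unlock_pi(m, WAITER);                  /* waker: UNLOCK_PI */
	model_try_take(m, STEALER);                  /* stealer: steals, then is preempted */
	int ret = model_timed_out_wait(m, WAITER);   /* waiter: lock held by stealer */
	assert(ret == -ETIMEDOUT);
	return model_lock_pi(m, WAITER);             /* waiter retries LOCK_PI */
}
```

After `replay_trace()` the model is left with `futex_word_owner == WAITER` and `rt_mutex_owner == STEALER`, and the waiter's retry returns -EDEADLK. In the model, the stealer's deferred fixup would be the only step that brings the two fields back in sync.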