Date: Thu, 7 Nov 2019 16:51:30 +0100
From: Oleg Nesterov
To: Thomas Gleixner
Cc: Florian Weimer, Shawn Landden, libc-alpha@sourceware.org,
    linux-api@vger.kernel.org, LKML, Arnd Bergmann, Deepa Dinamani,
    Andrew Morton, Catalin Marinas, Keith Packard, Peter Zijlstra
Subject: Re: handle_exit_race && PF_EXITING
Message-ID: <20191107155130.GB24042@redhat.com>
References: <20191106085529.GA12575@redhat.com>
    <20191106103509.GB12575@redhat.com>
    <20191106121111.GC12575@redhat.com>

On 11/06, Thomas Gleixner wrote:
>
> On Wed, 6 Nov 2019, Oleg Nesterov wrote:
> >
> > I think that (with or without this fix) handle_exit_race() logic needs
> > cleanups, there is no reason for get_futex_value_locked(), we can drop
> > ->pi_lock right after we see PF_EXITPIDONE. Lets discuss this later.
>
> Which still is in atomic because the hash bucket lock is held, ergo
> get_futex_value_locked() needs to stay for now.

Indeed, you are right.

> Same explanation as before just not prosa this time:
>
> exit()                                  lock_pi(futex2)
>   exit_pi_state_list()
>    lock(tsk->pi_lock)
>    tsk->flags |= PF_EXITPIDONE;          attach_to_pi_owner()
>                                           ...
>   // Loop unrolled for clarity
>   while(!list_empty())                    lock(tsk->pi_lock);
>      cleanup(futex1)
>        unlock(tsk->pi_lock)
         ^^^^^^^^^^^^^^^^^^^^

Ah! Thanks.

Hmm. In particular, exit_pi_state() drops pi_lock if refcount_inc_not_zero()
fails.

Isn't this another potential source of livelock?

Suppose that a realtime lock owner X sleeps somewhere, another task T
calls put_pi_state(), refcount_dec_and_test() succeeds.

What if, say, X is killed right after that and preempts T on the same
CPU?

Oleg.
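
The X/T scenario in the last paragraphs can be modelled in userspace C.
This is only a rough sketch, not kernel code: every model_* name is
hypothetical, the refcount is a plain C11 atomic rather than the kernel's
refcount_t, and the retry after a failed refcount_inc_not_zero() is an
assumption about how the exit loop behaves. The loop is bounded by
max_spins so that the model terminates where the real case might not.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

/* Userspace stand-in for refcount_t; only the two operations named in
 * the mail are modelled (assumption, not the kernel implementation). */
struct model_refcount { atomic_int refs; };

/* Take a reference unless the count has already dropped to zero. */
static bool model_inc_not_zero(struct model_refcount *r)
{
    int old = atomic_load(&r->refs);
    while (old != 0) {
        /* On failure 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&r->refs, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when this was the last one. */
static bool model_dec_and_test(struct model_refcount *r)
{
    return atomic_fetch_sub(&r->refs, 1) == 1;
}

/* Hypothetical pi_state: it stays on the owner's list until the task
 * that dropped the last reference gets to run and unlink it. */
struct model_pi_state {
    struct model_refcount refcount;
    bool on_owner_list;
};

/* Exiting owner X: retry while the entry is still on its list.
 * Returns the number of failed attempts, capped at max_spins. */
static int model_exit_loop(struct model_pi_state *ps, int max_spins)
{
    int spins = 0;
    while (ps->on_owner_list && spins < max_spins) {
        if (model_inc_not_zero(&ps->refcount)) {
            ps->on_owner_list = false;  /* cleanup would happen here */
            model_dec_and_test(&ps->refcount);
            break;
        }
        /* Models: drop pi_lock, relax, retake pi_lock. T never runs
         * in between because X has preempted it on the same CPU. */
        spins++;
    }
    return spins;
}

/* T drops the last reference, then X preempts T before T unlinks. */
static int model_preempted_put(int max_spins)
{
    struct model_pi_state ps;
    atomic_init(&ps.refcount.refs, 1);
    ps.on_owner_list = true;

    (void)model_dec_and_test(&ps.refcount);  /* T: count 1 -> 0 */
    /* T is preempted here, so on_owner_list is still true. */
    return model_exit_loop(&ps, max_spins);
}
```

In this model, model_preempted_put(n) uses up all n attempts: once the
count is zero, nothing X does can make progress until T runs again.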