Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp2335630ybb; Mon, 30 Mar 2020 04:21:05 -0700 (PDT) X-Google-Smtp-Source: ADFU+vvqLAhU0gU5ZcCdgyg9tMl/R6cTywrVCMsfbTN0Qb/90xmoMLIPx3WqHt4upx6l3eGlUQY0 X-Received: by 2002:a05:6830:1e55:: with SMTP id e21mr9113768otj.233.1585567264948; Mon, 30 Mar 2020 04:21:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585567264; cv=none; d=google.com; s=arc-20160816; b=0epiFUlPSdfZC/Rn2uprRjkvo5vJUuaUe2cMQQ1xNgLGLza4ob82CLubfefukpffD4 n12NkPuetot5V/2M33E06Zgmry3odhJRfbLnz+X5bn/h40uMx+XdKZTEYvmjFuH72Gr/ xx63LngLAJJDOiw5IvJPjegfOf5quqk9xGGATc2eugPg9xCd6vfKsb+b7paaJ1AGhgeF +42l33x06hKBjruvTyEyznj52R5jxmkGI9RCKuzNR2XWSKLOJzsxklO89x69EBzDTPPO e3QZHET708dcPsbfBmZtWK92xRzQoim6uGgmetmY8Py2RlR1WtrurRkePix6PMJeFt9j VBjA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=fWVXGX/OziKB4qDcBSld4wYjJ05171GId/cI3GmWg3k=; b=Z9ZhfuWoqscKwEXpSHOWjiDF/Wq29Ob7cCOY7SeolpmgOkfF9lzHWMV9Hgg+yzSWGH 95XRsoTFMCkOCZQvc62HJqWKBW4n3k7gUYiY8A9Zat0jaIis2mlOl/47vyX3oK0s034e O1eII01cxlBakKdTNaAHbVW491oyCCDYMnysjh47FSnZO9lgTssEde79iHq/zVn3lagg uKMbZLcRLZOYRY++0h3D9uo2bBWQJaKyRF7D+K+C2FNsAI/uhVz7QvY9QfF2AizhwB2N osq+2MBJcInLuw/EVCNbYe9z6G4EKx/nzXdn/sFgbcTHdFPlnx6jDzQK69ac3FEUvhhB Xk9g== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=merlin.20170209 header.b=E7RfOGIc; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e11si425975oif.131.2020.03.30.04.20.50; Mon, 30 Mar 2020 04:21:04 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=merlin.20170209 header.b=E7RfOGIc; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729650AbgC3LTA (ORCPT + 99 others); Mon, 30 Mar 2020 07:19:00 -0400 Received: from merlin.infradead.org ([205.233.59.134]:51708 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729267AbgC3LTA (ORCPT ); Mon, 30 Mar 2020 07:19:00 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=merlin.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=fWVXGX/OziKB4qDcBSld4wYjJ05171GId/cI3GmWg3k=; b=E7RfOGIc4QLl2iK/NnCRH9ILsx d0GmPXGxtHuhonmHMZ42yvO89G2aLtbjLDvwMdUCyoSmgTC35VUU6A1U2/suReXfelJkEV0v8Po0P 0jWa5NLWLAYS7fIFPfTU12w/Ezmv9tLYhNzYEgT40jgj0i2KKkqXv8/z85dillHvEFcu0UUjimjAZ A8fgatBjgYeVSO15Kh4UrpRS3Yu8qIVFTqA/Ob4RahTh7BpSjMIqauE7qHSkwfavwVC9hE0nrnTuD Xgbts7vUU67ci3hYLYIo1pmbWy903QwHJIx5aSXlw03+HEF0ikyEjPk2s0z5rW28PQIPeaP/noQ3Y 5kO2IK7w==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by merlin.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1jIsRT-00024i-HO; Mon, 30 Mar 2020 11:18:55 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 12E7C303C41; Mon, 30 Mar 2020 13:18:53 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id F04FD29D04D69; Mon, 30 Mar 2020 13:18:52 +0200 (CEST) Date: Mon, 30 Mar 2020 13:18:52 +0200 From: Peter Zijlstra To: Qian Cai Cc: mingo@redhat.com, will@kernel.org, dbueso@suse.de, juri.lelli@redhat.com, longman@redhat.com, linux-kernel@vger.kernel.org Subject: Re: [PATCH -next] locking/percpu-rwsem: fix a task_struct refcount Message-ID: <20200330111852.GH20696@hirez.programming.kicks-ass.net> References: <20200327093754.GS20713@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Mar 27, 2020 at 06:19:37AM -0400, Qian Cai wrote: > > > > On Mar 27, 2020, at 5:37 AM, Peter Zijlstra wrote: > > > > If the trylock fails, someone else got the lock and we remain on the > > waitqueue. It seems like a very bad idea to put the task while it > > remains on the waitqueue, no? > > Interesting, I thought this was more straightforward to see, It is indeed as straight forward as you explain; but when doing 10 things at once, and having just dug through some low-level arch assembly code for the previous email, even obvious things might sometimes need a little explaining :/ So please, always try and err on the side of a little verbose when writing Changelogs, esp. when concerning locking / concurrency, you really can't be clear enough. > but I may > be wrong as always. At the beginning of percpu_rwsem_wake_function() > it calls get_task_struct(), but if the trylock failed, it will remain > in the waitqueue. However, it will run percpu_rwsem_wake_function() > again with get_task_struct() to increase the refcount. Can you > enlighten me where it will call put_task_struct() in waitqueue or > elsewhere to balance the refcount in this case? See, had that explaination been part of the Changelog, my brain would've probably been able to kick itself in gear and actually spot the problem. Yes, you're right. That said, I wonder if we can just move the get_task_struct() call like below; after all the race we're guarding against is percpu_rwsem_wait() observing !private, terminating the wait and doing a quick exit() while percpu_rwsem_wake_function() then does wake_up_process(p) as a use-after-free. Hmm? diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c index a008a1ba21a7..8bbafe3e5203 100644 --- a/kernel/locking/percpu-rwsem.c +++ b/kernel/locking/percpu-rwsem.c @@ -118,14 +118,15 @@ static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode, int wake_flags, void *key) { - struct task_struct *p = get_task_struct(wq_entry->private); bool reader = wq_entry->flags & WQ_FLAG_CUSTOM; struct percpu_rw_semaphore *sem = key; + struct task_struct *p; /* concurrent against percpu_down_write(), can get stolen */ if (!__percpu_rwsem_trylock(sem, reader)) return 1; + p = get_task_struct(wq_entry->private); list_del_init(&wq_entry->entry); smp_store_release(&wq_entry->private, NULL);