Date: Wed, 27 Dec 2023 10:14:58 +0000
From: Matthew Wilcox
To: "Aiqun Yu (Maria)"
Cc: Hillf Danton, "Eric W. Biederman", linux-kernel@vger.kernel.org
Biederman" , linux-kernel@vger.kernel.org Subject: Re: [PATCH] kernel: Introduce a write lock/unlock wrapper for tasklist_lock Message-ID: References: <20231213101745.4526-1-quic_aiquny@quicinc.com> <20231226104652.1491-1-hdanton@sina.com> <6e762e8e-b031-4e37-97c1-56390c9b8076@quicinc.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <6e762e8e-b031-4e37-97c1-56390c9b8076@quicinc.com> On Wed, Dec 27, 2023 at 09:41:29AM +0800, Aiqun Yu (Maria) wrote: > On 12/26/2023 6:46 PM, Hillf Danton wrote: > > On Wed, 13 Dec 2023 12:27:05 -0600 Eric W. Biederman > > > Matthew Wilcox writes: > > > > On Wed, Dec 13, 2023 at 06:17:45PM +0800, Maria Yu wrote: > > > > > +static inline void write_lock_tasklist_lock(void) > > > > > +{ > > > > > + while (1) { > > > > > + local_irq_disable(); > > > > > + if (write_trylock(&tasklist_lock)) > > > > > + break; > > > > > + local_irq_enable(); > > > > > + cpu_relax(); > > > > > > > > This is a bad implementation though. You don't set the _QW_WAITING flag > > > > so readers don't know that there's a pending writer. Also, I've seen > > > > cpu_relax() pessimise CPU behaviour; putting it into a low-power mode > > > > that takes a while to wake up from. > > > > > > > > I think the right way to fix this is to pass a boolean flag to > > > > queued_write_lock_slowpath() to let it know whether it can re-enable > > > > interrupts while checking whether _QW_WAITING is set. > > > > lock(&lock->wait_lock) > > enable irq > > int > > lock(&lock->wait_lock) > > > > You are adding chance for recursive locking. > > Thx for the comments for discuss of the deadlock possibility. While I think > deadlock can be differentiate with below 2 scenarios: > 1. queued_write_lock_slowpath being triggered in interrupt context. > tasklist_lock don't have write_lock_irq(save) in interrupt context. > while for common rw lock, maybe write_lock_irq(save) usage in interrupt > context is a possible. > so may introduce a state when lock->wait_lock is released and left the > _QW_WAITING flag. > Welcome others to suggest on designs and comments. Hm? I am confused. You're talking about the scenario where: - CPU B holds the lock for read - CPU A attempts to get the lock for write in user context, fails, sets the _QW_WAITING flag - CPU A re-enables interrupts - CPU A executes an interrupt handler which calls queued_write_lock() - If CPU B has dropped the read lock in the meantime, atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED) succeeds - CPU A calls queued_write_unlock() which stores 0 to the lock and we _lose_ the _QW_WAITING flag for the userspace waiter. How do we end up with CPU A leaving the _QW_WAITING flag set?