From: Thomas Gleixner
To: Joel Fernandes
Cc: LKML, Peter Zijlstra, Linus Torvalds, Ingo Molnar, Will Deacon,
    "Paul E . McKenney", Steven Rostedt, Randy Dunlap,
    Sebastian Andrzej Siewior, Logan Gunthorpe, Kurt Schwemmer,
    Bjorn Helgaas, linux-pci@vger.kernel.org, Felipe Balbi,
    Greg Kroah-Hartman, linux-usb@vger.kernel.org, Kalle Valo,
    "David S. Miller", linux-wireless@vger.kernel.org,
    netdev@vger.kernel.org, Oleg Nesterov, Davidlohr Bueso,
    Michael Ellerman, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org
Subject: Re: [patch V2 08/15] Documentation: Add lock ordering and nesting documentation
In-Reply-To: <20200321212144.GA6475@google.com>
References: <20200318204302.693307984@linutronix.de> <20200318204408.211530902@linutronix.de> <20200321212144.GA6475@google.com>
Date: Sat, 21 Mar 2020 22:49:04 +0100
Message-ID: <874kuhqsz3.fsf@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Joel Fernandes writes:
>> +rwlock_t
>> +========
>> +
>> +rwlock_t is a multiple readers and single writer lock mechanism.
>> +
>> +On a non PREEMPT_RT enabled kernel rwlock_t is implemented as a spinning
>> +lock and the suffix rules of spinlock_t apply accordingly. The
>> +implementation is fair and prevents writer starvation.
>>
>
> You mentioned writer starvation, but I think it would be good to also mention
> that rwlock_t on a non-PREEMPT_RT kernel also does not have _reader_
> starvation problem, since it uses queued implementation. This fact is worth
> mentioning here, since further below you explain that an rwlock in PREEMPT_RT
> does have reader starvation problem.

It's worth mentioning. But RT really has only writer starvation, not reader
starvation.

>> +rwlock_t and PREEMPT_RT
>> +-----------------------
>> +
>> +On a PREEMPT_RT enabled kernel rwlock_t is mapped to a separate
>> +implementation based on rt_mutex which changes the semantics:
>> +
>> + - Same changes as for spinlock_t
>> +
>> + - The implementation is not fair and can cause writer starvation under
>> +   certain circumstances. The reason for this is that a writer cannot grant
>> +   its priority to multiple readers. Readers which are blocked on a writer
>> +   fully support the priority inheritance protocol.
>
> Is it hard to give priority to multiple readers because the number of readers
> to give priority to could be unbounded?

Yes, and it's horribly complex and racy. We had an implementation years ago
which taught us not to try it again :)

>> +PREEMPT_RT also offers a local_lock mechanism to substitute the
>> +local_irq_disable/save() constructs in cases where a separation of the
>> +interrupt disabling and the locking is really unavoidable. This should be
>> +restricted to very rare cases.
>
> It would also be nice to mention where else local_lock() can be used, such as
> protecting per-cpu variables without disabling preemption. Could we add a
> section on protecting per-cpu data? (Happy to do that and send a patch if you
> prefer).

The local lock section will come soon when we post the local lock patches
again.

>> +rwsems have grown interfaces which allow non owner release for special
>> +purposes. This usage is problematic on PREEMPT_RT because PREEMPT_RT
>> +substitutes all locking primitives except semaphores with RT-mutex based
>> +implementations to provide priority inheritance for all lock types except
>> +the truly spinning ones. Priority inheritance on ownerless locks is
>> +obviously impossible.
>> +
>> +For now the rwsem non-owner release excludes code which utilizes it from
>> +being used on PREEMPT_RT enabled kernels.
>
> I could not parse the last sentence here, but I think you meant "For now,
> PREEMPT_RT enabled kernels disable code that perform a non-owner release of
> an rwsem". Correct me if I'm wrong.

Right, that's what I wanted to say :)

Care to send a delta patch?

Thanks!

        tglx
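
The asymmetry discussed in this thread can be sketched with a toy userspace
model in C: a reader blocked on a writer can lend its priority to that single
owner, whereas a writer blocked on readers has no single owner to boost. This
is only an illustration under stated assumptions. The names struct task,
struct owned_lock, boost_owner(), release_lock() and
writer_prio_with_blocked_reader() are hypothetical, larger numbers mean higher
priority by assumption, and none of it reflects the actual rt_mutex code.

```c
#include <stddef.h>

/* Toy model only (assumption): a larger value means a more urgent task. */
struct task {
    int base_prio;  /* priority the task was created with */
    int eff_prio;   /* priority after any inheritance boost */
};

/* A lock with exactly one owner, e.g. an rwlock currently held by a writer. */
struct owned_lock {
    struct task *owner;
};

/* A blocked waiter lends its priority to the single owner if it is higher. */
static void boost_owner(struct owned_lock *lock, const struct task *waiter)
{
    if (lock->owner != NULL && waiter->eff_prio > lock->owner->eff_prio)
        lock->owner->eff_prio = waiter->eff_prio;
}

/* On release the owner drops back to its base priority. */
static void release_lock(struct owned_lock *lock)
{
    if (lock->owner != NULL) {
        lock->owner->eff_prio = lock->owner->base_prio;
        lock->owner = NULL;
    }
}

/*
 * Returns the writer's effective priority while a reader with priority
 * reader_prio is blocked on the lock the writer holds.
 */
int writer_prio_with_blocked_reader(int writer_prio, int reader_prio)
{
    struct task writer = { writer_prio, writer_prio };
    struct task reader = { reader_prio, reader_prio };
    struct owned_lock lock = { &writer };
    int boosted;

    boost_owner(&lock, &reader);
    boosted = writer.eff_prio;
    release_lock(&lock);
    return boosted;
}
```

For example, writer_prio_with_blocked_reader(10, 50) returns 50: the writer
runs at the blocked reader's priority until it releases the lock. The reverse
direction would require boosting every current reader, and the number of
readers is not bounded, which is the case the thread describes as complex and
racy.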