Date: Thu, 5 May 2022 01:38:14 +0100
From: Matthew Wilcox
To: Thomas Gleixner
Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	"Paul E. McKenney", "Liam R. Howlett", linux-kernel@vger.kernel.org
Subject: Re: Wait for mutex to become unlocked
References: <87pmksj0ah.ffs@tglx>
In-Reply-To: <87pmksj0ah.ffs@tglx>

On Thu, May 05, 2022 at 02:22:30AM +0200, Thomas Gleixner wrote:
> > So this is Good. For the vast majority of cases, we avoid taking the
> > mmap read lock and the problem will appear much less often. But we can
> > do Better with a new API. You see, for this case, we don't actually
> > want to acquire the mmap_sem; we're happy to spin a bit, but there's no
> > point in spinning waiting for the writer to finish when we can sleep.
> > I'd like to write this code:
> >
> > again:
> > 	rcu_read_lock();
> > 	vma = vma_lookup();
> > 	if (down_read_trylock(&vma->sem)) {
> > 		rcu_read_unlock();
> > 	} else {
> > 		rcu_read_unlock();
> > 		rwsem_wait_read(&mm->mmap_sem);
> > 		goto again;
> > 	}
> >
> > That is, rwsem_wait_read() puts the thread on the rwsem's wait queue,
> > and wakes it up without giving it the lock. Now this thread will never
> > be able to block any thread that tries to acquire mmap_sem for write.
>
> Never?
> > 	if (down_read_trylock(&vma->sem)) {
>
> 	---> preemption by writer

Ah! This is a different semaphore. Yes, it can be preempted while
holding the VMA rwsem and block a thread which is trying to modify the
VMA which will then block all threads from faulting _on that VMA_, but
it won't affect page faults on any other VMA.

It's only Better, not Best (the Best approach was proposed on Monday
afternoon, and the other MM developers asked us to only go as far as
Better and see if that was good enough).

> The information gathered from /proc/pid/smaps is unreliable at the point
> where the lock is dropped already today. So it does not make a
> difference whether the VMAs have a 'read me if you really think it's
> useful' sideband information which gets updated when the VMA changes and
> allows to do:

Mmm. I'm not sure that we want to maintain the smaps information on
the off chance that somebody wants to query it.

> But looking at the stuff which gets recomputed and reevaluated in that
> proc/smaps code this makes a lot of sense, because most if not all of
> this information is already known at the point where the VMA is modified
> while holding mmap_sem for useful reasons, no?

I suspect the only way to know is to try to implement it, and then
benchmark it.