Date: Tue, 17 Jan 2023 16:12:42 +0100
From: Michal Hocko
To: Suren Baghdasaryan
Cc: akpm@linux-foundation.org, michel@lespinasse.org, jglisse@google.com,
	vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net,
	dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com,
	peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com,
	paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com,
	peterx@redhat.com, david@redhat.com, dhowells@redhat.com,
	hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev,
	punit.agrawal@bytedance.com, lstoakes@gmail.com,
	peterjung1337@gmail.com, rientjes@google.com,
	axelrasmussen@google.com, joelaf@google.com, minchan@google.com,
	jannh@google.com, shakeelb@google.com, tatashin@google.com,
	edumazet@google.com, gthelen@google.com, gurua@google.com,
	arjunroy@google.com, soheil@google.com, hughlynch@google.com,
	leewalsh@google.com, posk@google.com, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
	x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to
	control it
References: <20230109205336.3665937-1-surenb@google.com>
	<20230109205336.3665937-13-surenb@google.com>

On Tue 17-01-23 16:04:26, Michal Hocko wrote:
> On Mon 09-01-23 12:53:07, Suren Baghdasaryan wrote:
> > Introduce a per-VMA rw_semaphore to be used during page fault handling
> > instead of mmap_lock. Because there are cases when multiple VMAs need
> > to be exclusively locked during VMA tree modifications, instead of the
> > usual lock/unlock pattern we mark a VMA as locked by taking per-VMA lock
> > exclusively and setting vma->lock_seq to the current mm->lock_seq. When
> > mmap_write_lock holder is done with all modifications and drops mmap_lock,
> > it will increment mm->lock_seq, effectively unlocking all VMAs marked as
> > locked.
>
> I have to say I was struggling a bit with the above and only understood
> what you mean by reading the patch several times. I would phrase it like
> this (feel free to use if you consider this to be an improvement).
>
> Introduce a per-VMA rw_semaphore. The lock implementation relies on
> per-vma and per-mm sequence counters to note exclusive locking:
>         - read lock - (implemented by vma_read_trylock) requires the
>           vma (vm_lock_seq) and mm (mm_lock_seq) sequence counters to
>           differ. If they match then there must be a vma exclusive lock
>           held somewhere.
>         - read unlock - (implemented by vma_read_unlock) is a trivial
>           vma->lock unlock.
>         - write lock - (vma_write_lock) requires the mmap_lock to be
>           held exclusively and the current mm counter is noted to the vma
>           side. This will allow multiple vmas to be locked under a single
>           mmap_lock write lock (e.g. during vma merging). The vma counter
>           is modified under exclusive vma lock.

Didn't realize one more thing. Unlike a standard write lock, this
implementation can be called multiple times on the same vma under a
single mmap_lock. In a sense it is more of a
mark_vma_potentially_modified than a lock.

>         - write unlock - (vma_write_unlock_mm) is a batch release of all
>           vma locks held. It doesn't pair with a specific
>           vma_write_lock! It is done before exclusive mmap_lock is
>           released by incrementing mm sequence counter (mm_lock_seq).
>         - write downgrade - if the mmap_lock is downgraded to the read
>           lock all vma write locks are released as well (effectively
>           same as write unlock).

--
Michal Hocko
SUSE Labs