Date: Tue, 3 Nov 2020 01:17:12 +0100
From: "Ahmed S. Darwish"
To: Jason Gunthorpe
Cc: Peter Xu, linux-kernel@vger.kernel.org, Linus Torvalds,
	Andrea Arcangeli, Andrew Morton, "Aneesh Kumar K.V",
	Christoph Hellwig, Hugh Dickins, Jan Kara, Jann Horn, John Hubbard,
	Kirill Shutemov, Kirill Tkhai, Leon Romanovsky, Linux-MM,
	Michal Hocko, Oleg Nesterov, Peter Zijlstra, Ingo Molnar,
	Will Deacon, Thomas Gleixner, Sebastian Siewior
Subject: Re: [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during fork
Message-ID: <20201103001712.GB52235@lx-t490>
References: <0-v2-dfe9ecdb6c74+2066-gup_fork_jgg@nvidia.com>
	<2-v2-dfe9ecdb6c74+2066-gup_fork_jgg@nvidia.com>
	<20201030225250.GB6357@xz-x1>
	<20201030235121.GQ2620339@nvidia.com>
In-Reply-To: <20201030235121.GQ2620339@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 30, 2020 at 08:51:21PM -0300, Jason Gunthorpe wrote:
> On Fri, Oct 30, 2020 at 06:52:50PM -0400, Peter Xu wrote:
...
>
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index c48f8df6e50268..294c2c3c4fe00d 100644
> > > +++ b/mm/memory.c
> > > @@ -1171,6 +1171,12 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
> > >  		mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE,
> > >  					0, src_vma, src_mm, addr, end);
> > >  		mmu_notifier_invalidate_range_start(&range);
> > > +		/*
> > > +		 * The read side doesn't spin, it goes to the mmap_lock, so the
> > > +		 * raw version is used to avoid disabling preemption here
> > > +		 */
> > > +		mmap_assert_write_locked(src_mm);
> > > +		raw_write_seqcount_t_begin(&src_mm->write_protect_seq);
> >
> > Would raw_write_seqcount_begin() be better here?
>
> Hum..
>
> I felt no because it had the preempt stuff added into it, however it
> would work - __seqcount_lock_preemptible() == false for the seqcount_t
> case (see below)
>
> Looking more closely, maybe the right API to pick is
> write_seqcount_t_begin() and write_seqcount_t_end() ??
>

No, that's not the right API: it is also internal to seqlock.h.

Please stick with the official exported API: raw_write_seqcount_begin().

It should satisfy your needs, and the raw_*() variant is created exactly
for contexts wishing to avoid the lockdep checks (e.g. NMI handlers
cannot invoke lockdep, etc.)

> However, no idea what the intention of the '*_seqcount_t_*' family is
> - it only seems to be used to implement the seqlock..
>

Exactly. '*_seqcount_t_*' is a seqlock.h implementation detail, and it
has _zero_ relevance to what is discussed in this thread actually.

...
> Ahmed explained in commit 8117ab508f the reason the seqcount_t write
> side has preemption disabled is because it can livelock RT kernels if
> the read side is spinning after preempting the write side. eg look at
> how __read_seqcount_begin() is implemented:
>
> 	while ((seq = __seqcount_sequence(s)) & 1)			\
> 		cpu_relax();						\
>
> However, in this patch, we don't spin on the read side.
>
> If the read side collides with a writer it immediately goes to the
> mmap_lock, which is sleeping, and so it will sort itself out properly,
> even if it was preempted.
>

Correct.

Thanks,

--
Ahmed Darwish
Linutronix GmbH
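
The non-spinning read side discussed above can be illustrated with a small
userspace sketch in C11. This is only an analogy under stated assumptions,
not the kernel code: struct demo_mm and every demo_*() function are
hypothetical names, a pthread mutex stands in for the sleeping mmap_lock,
and plain sequentially consistent atomics replace the kernel's seqcount
primitives. The point it shows is that a reader which sees an odd or
changed sequence gives up at once and takes the lock, so a preempted
writer can never be starved by a spinning reader.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; none of these names are kernel API. */
struct demo_mm {
	atomic_uint seq;        /* odd while a writer is mid-update */
	atomic_int value;       /* data the sequence counter protects */
	pthread_mutex_t lock;   /* sleeping lock, playing the mmap_lock role */
};

static void demo_mm_init(struct demo_mm *mm)
{
	atomic_init(&mm->seq, 0);
	atomic_init(&mm->value, 0);
	pthread_mutex_init(&mm->lock, NULL);
}

/* Writer side: the caller is assumed to already hold mm->lock. */
static void demo_write_begin(struct demo_mm *mm)
{
	atomic_fetch_add(&mm->seq, 1);          /* even -> odd */
}

static void demo_write_end(struct demo_mm *mm)
{
	unsigned int prev = atomic_fetch_add(&mm->seq, 1);  /* odd -> even */

	assert(prev & 1);                       /* begin/end must be paired */
	(void)prev;
}

/* Lockless attempt: never spins, reports a collision instead. */
static bool demo_read_fast(struct demo_mm *mm, int *out)
{
	unsigned int start = atomic_load(&mm->seq);
	int v;

	if (start & 1)
		return false;                   /* writer in progress */
	v = atomic_load(&mm->value);
	if (atomic_load(&mm->seq) != start)
		return false;                   /* a writer ran meanwhile */
	*out = v;
	return true;
}

/* Full read: on collision, sleep on the lock rather than spinning. */
static int demo_read(struct demo_mm *mm)
{
	int v;

	if (demo_read_fast(mm, &v))
		return v;
	pthread_mutex_lock(&mm->lock);          /* waits for the writer */
	v = atomic_load(&mm->value);
	pthread_mutex_unlock(&mm->lock);
	return v;
}
```

A writer would take the lock, call demo_write_begin(), update value, call
demo_write_end(), and release the lock. Because demo_read() has no retry
loop, the writer in this sketch has no need for anything like disabling
preemption around its critical section.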