Date: Thu, 25 May 2023 17:36:50 -0400
From: Kent Overstreet
To: Jan Kara
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-bcachefs@vger.kernel.org, Kent Overstreet, "Darrick J . Wong",
    dhowells@redhat.com, Andreas Gruenbacher, cluster-devel@redhat.com,
    Bob Peterson
Subject: Re: [PATCH 06/32] sched: Add task_struct->faults_disabled_mapping

On Thu, May 25, 2023 at 10:47:31AM +0200, Jan Kara wrote:
> If we submit direct IO that uses mapped file F at offset O as a buffer for
> direct IO from file F, offset O, it will currently livelock in an
> indefinite retry loop. It should rather return error or fall back to
> buffered IO. But that should be fixable. Andreas?
>
> But if the buffer and direct IO range does not overlap, it will just
> happily work - iomap_dio_rw() invalidates only the range direct IO is done
> to.

*nod*

Readahead triggered from the page fault path is another consideration.
No idea how that interacts with the gfs2 method; IIRC there's a hack in
the page fault path that says somewhere "we may be getting called via
gup(), don't invoke readahead".

We could potentially kill that hack if we lifted this to the VFS layer.
>
> > What happens if we race with the pages we faulted in being evicted?
>
> We fault them in again and retry.
>
> > > Also good that you've written a fstest for this, that is definitely a useful
> > > addition, although I suspect GFS2 guys added a test for this not so long
> > > ago when testing their stuff. Maybe they have a pointer handy?
> >
> > More tests more good.
> >
> > So if we want to lift this scheme to the VFS layer, we'd start by
> > replacing the lock you added (grepping for it, the name escapes me) with
> > a different type of lock - two_state_shared_lock in my code, it's like a
> > rw lock except writers don't exclude other writers. That way the DIO
> > path can use it without singlethreading writes to a single file.
>
> Yes, I've noticed that you are introducing in bcachefs a lock with very
> similar semantics to mapping->invalidate_lock, just with this special lock
> type. What I'm kind of worried about with two_state_shared_lock as
> implemented in bcachefs is the fairness. AFAICS so far if someone is e.g.
> heavily faulting pages on a file, direct IO to that file can be starved
> indefinitely. That is IMHO not a good thing and I would not like to use
> this type of lock in VFS until this problem is resolved. But it should be
> fixable e.g. by introducing some kind of deadline for a waiter after which
> it will block acquisitions of the other lock state.

Yeah, my two_state_shared_lock is definitely at the quick and dirty
prototype level; the implementation would need work. Lockdep support
would be another hard requirement.

The deadline might be a good idea, OTOH it'd want tuning. Maybe something
like what rwsem does, where we block new read acquirers if there's a
writer waiting, would work.
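
To make the fairness idea concrete, here is a rough userspace sketch of
a two-state shared lock where new acquirers of the held state are blocked
once the other state has a waiter. It is only an illustration under
stated assumptions: it is not the bcachefs two_state_shared_lock code,
every name in it (two_state_lock, tsl_*) is made up for this example, and
it uses pthreads instead of kernel primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical sketch; names and layout are assumptions, not bcachefs code. */
struct two_state_lock {
	pthread_mutex_t	mtx;
	pthread_cond_t	cond;
	int		state;		/* state held now, or last held if holders == 0 */
	long		holders;	/* current holders, all in 'state' */
	long		waiters[2];	/* blocked acquirers, per state */
};

void tsl_init(struct two_state_lock *l)
{
	pthread_mutex_init(&l->mtx, NULL);
	pthread_cond_init(&l->cond, NULL);
	l->state = 0;
	l->holders = 0;
	l->waiters[0] = l->waiters[1] = 0;
}

/* Caller must hold l->mtx. */
static bool tsl_can_take(const struct two_state_lock *l, int s)
{
	/* Lock is free: if the other state has waiters, let the states alternate. */
	if (l->holders == 0)
		return l->waiters[!s] == 0 || l->state != s;

	/*
	 * Lock is held: share it with holders of the same state, but only
	 * while nobody is waiting for the other state (the rwsem-like rule).
	 */
	return l->state == s && l->waiters[!s] == 0;
}

/* Blocking acquire of state s (0 or 1). */
void tsl_lock(struct two_state_lock *l, int s)
{
	assert(s == 0 || s == 1);
	pthread_mutex_lock(&l->mtx);
	l->waiters[s]++;
	while (!tsl_can_take(l, s))
		pthread_cond_wait(&l->cond, &l->mtx);
	l->waiters[s]--;
	l->state = s;
	l->holders++;
	pthread_mutex_unlock(&l->mtx);
}

/* Non-blocking acquire; returns true if the lock was taken. */
bool tsl_trylock(struct two_state_lock *l, int s)
{
	bool ok;

	assert(s == 0 || s == 1);
	pthread_mutex_lock(&l->mtx);
	ok = tsl_can_take(l, s);
	if (ok) {
		l->state = s;
		l->holders++;
	}
	pthread_mutex_unlock(&l->mtx);
	return ok;
}

void tsl_unlock(struct two_state_lock *l)
{
	pthread_mutex_lock(&l->mtx);
	assert(l->holders > 0);
	/* The last holder leaving wakes everyone so the other state can get in. */
	if (--l->holders == 0)
		pthread_cond_broadcast(&l->cond);
	pthread_mutex_unlock(&l->mtx);
}
```

In this sketch, when both states have waiters the lock hands over one
holder at a time, so it trades throughput for fairness under contention.
It has no deadline and no lockdep equivalent, both of which a real
implementation would still need.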