Date: Wed, 25 Oct 2023 11:06:25 +1100
From: Dave Chinner
To: Jens Axboe
Cc: Andres Freund, Theodore Ts'o, Thorsten Leemhuis, Shreeya Patel,
	linux-ext4@vger.kernel.org, Ricardo Cañuelo,
	gustavo.padovan@collabora.com, zsm@google.com, garrick@google.com,
	Linux regressions mailing list, io-uring@vger.kernel.org
Subject: Re: task hung in ext4_fallocate #2
References: <20231017033725.r6pfo5a4ayqisct7@awork3.anarazel.de>
	<20231018004335.GA593012@mit.edu>
	<20231018025009.ulkykpefwdgpfvzf@awork3.anarazel.de>
	<74921cba-6237-4303-bb4c-baa22aaf497b@kernel.dk>
X-Mailing-List: linux-ext4@vger.kernel.org

On Tue, Oct 24, 2023 at 12:35:26PM -0600, Jens Axboe wrote:
> On 10/24/23 8:30 AM, Jens Axboe wrote:
> > I don't think this is related to the io-wq workers doing non-blocking
> > IO.

The io-wq worker that has deadlocked _must_ be doing blocking IO. If
it was doing non-blocking IO (i.e. IOCB_NOWAIT) then it would have
done a trylock and returned -EAGAIN to the worker for it to try
again later.
I'm not sure that would avoid the issue, however - it seems to me
like it might just turn it into a livelock rather than a
deadlock....

> > The callback is eventually executed by the task that originally
> > submitted the IO, which is the owner and not the async workers. But...
> > If that original task is blocked in eg fallocate, then I can see how
> > that would potentially be an issue.
> >
> > I'll take a closer look.
>
> I think the best way to fix this is likely to have inode_dio_wait() be
> interruptible, and return -ERESTARTSYS if it should be restarted. Now
> the below is obviously not a full patch, but I suspect it'll make ext4
> and xfs tick, because they should both be affected.

How does that solve the problem? Nothing will issue a signal to the
process that is waiting in inode_dio_wait() except userspace, so I
can't see how this does anything to solve the problem at hand...

I'm also very leery of adding new error handling complexity to paths
like truncate, extent cloning, fallocate, etc. which expect to block
on locks until they can perform the operation safely.

On further thinking, this could be a self-deadlock with just async
direct IO submission - submit an async DIO with IOCB_CALLER_COMP,
then run an unaligned async DIO that attempts to drain in-flight DIO
before continuing. Then the thread waits in inode_dio_wait() because
it can't run the completion that will drop the i_dio_count to zero.

Hence it appears to me that we've missed some critical constraints
around nesting IO submission and completion when using
IOCB_CALLER_COMP. Further, it really isn't clear to me how deep the
scope of this problem is yet, let alone what the solution might be.

With all this in mind, and how late this is in the 6.6 cycle, can we
just revert the IOCB_CALLER_COMP changes for now?

-Dave.
-- 
Dave Chinner
david@fromorbit.com
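
For illustration only, the self-deadlock scenario described above can be
sketched as a toy user-space model. This is not kernel code: the Python
names below (Inode, Task, submit_async_dio, other_context, deferred) are
hypothetical stand-ins that only borrow the kernel identifiers, and the
model assumes, as the thread describes, that a completion submitted with
IOCB_CALLER_COMP can only be run by the submitting task.

```python
from collections import deque


class Inode:
    """Toy stand-in for an inode; it only tracks in-flight direct IO."""

    def __init__(self):
        self.i_dio_count = 0          # number of in-flight direct IOs
        self.other_context = deque()  # completions another context will run


class Task:
    """Toy stand-in for the submitting task."""

    def __init__(self):
        self.deferred = deque()       # completions only this task can run


def submit_async_dio(task, inode, caller_comp):
    """Start one async DIO and record who is responsible for completing it."""
    inode.i_dio_count += 1

    def complete():
        inode.i_dio_count -= 1

    if caller_comp:
        # Assumption: the completion is deferred to the submitter.
        task.deferred.append(complete)
    else:
        # Assumption: some other context completes the IO.
        inode.other_context.append(complete)


def inode_dio_wait(task, inode):
    """Return True if the wait can finish, False if the task sleeps forever."""
    # While the task sleeps, other contexts still make progress...
    while inode.other_context:
        inode.other_context.popleft()()
    # ...but the sleeping task never runs its own deferred completions,
    # so anything left in task.deferred keeps i_dio_count above zero.
    return inode.i_dio_count == 0


def demo(caller_comp):
    task, inode = Task(), Inode()
    submit_async_dio(task, inode, caller_comp)  # first async DIO
    # A second, unaligned DIO must drain in-flight DIO before it proceeds.
    return inode_dio_wait(task, inode)
```

In this model demo(False) finishes the wait, while demo(True) reports
that the waiter would never wake, because the only completion able to
drop i_dio_count is queued on the task that is waiting.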