Date: Thu, 22 Jun 2023 22:32:33 -0400
From: "Theodore Ts'o"
To: Dave Chinner
Cc: Jeremy Bongio, "Darrick J. Wong", Allison Henderson,
    linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/1] iomap regression for aio dio 4k writes
Message-ID: <20230623023233.GC34229@mit.edu>
References: <20230621174114.1320834-1-bongiojp@gmail.com>

On Thu, Jun 22, 2023 at 09:59:29AM +1000, Dave Chinner wrote:
> Ah, you are testing pure overwrites, which means for ext4 the only
> thing it needs to care about is cached mappings. What happens when
> you add O_DSYNC here?

I think you mean O_SYNC, right? In a pure overwrite case, where all
of the extents are initialized and where the Oracle or DB2 server is
doing writes to preallocated, pre-initialized space in the tablespace
file followed by fdatasync(), there *are* no post-I/O data integrity
operations which are required.

If the file is opened O_SYNC, or if the blocks were not preallocated
using fallocate(2) and not initialized ahead of time, then sure, we
can't use this optimization. However, the cases where database
workloads *are* doing overwrites and using fdatasync(2) most certainly
do exist, and the benefit of this optimization can be a 20% throughput
improvement. Which is nothing to sneeze at.
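
To make the workload concrete, here is a rough user-space sketch of the
pure-overwrite pattern described above. It is an illustration under
stated assumptions, not code from any database or from the patch: the
demo_* names and the 4k block size are hypothetical, posix_fallocate()
stands in for fallocate(2), and a synchronous pwrite() stands in for the
aio submission path. It falls back to buffered I/O if O_DIRECT is not
available.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc exposes O_DIRECT only with this */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* not exposed here: degrade to buffered I/O */
#endif

#define DEMO_BLOCK_SIZE 4096    /* assumption: 4k blocks, as in the subject */

/*
 * Hypothetical helper: preallocate nblocks, then write zeros into every
 * block so that no extent is left uninitialized. After this, later writes
 * to the file are pure overwrites. Returns 0 on success, -1 on error.
 */
int demo_prepare_file(const char *path, int nblocks)
{
    static const char zeros[DEMO_BLOCK_SIZE];
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);

    if (fd < 0)
        return -1;

    int rc = posix_fallocate(fd, 0, (off_t)nblocks * DEMO_BLOCK_SIZE) ? -1 : 0;

    for (int i = 0; rc == 0 && i < nblocks; i++) {
        if (pwrite(fd, zeros, sizeof(zeros), (off_t)i * DEMO_BLOCK_SIZE)
            != (ssize_t)sizeof(zeros))
            rc = -1;
    }
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;
    close(fd);
    return rc;
}

/*
 * Hypothetical helper: overwrite one already-initialized block with direct
 * I/O, then call fdatasync(). The file is deliberately not opened with
 * O_SYNC or O_DSYNC. Returns 0 on success, -1 on error.
 */
int demo_overwrite_block(const char *path, int block, unsigned char fill)
{
    void *buf = NULL;
    int fd = open(path, O_WRONLY | O_DIRECT);

    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_WRONLY);  /* file system rejects O_DIRECT */
    if (fd < 0)
        return -1;
    if (posix_memalign(&buf, DEMO_BLOCK_SIZE, DEMO_BLOCK_SIZE) != 0) {
        close(fd);
        return -1;
    }
    memset(buf, fill, DEMO_BLOCK_SIZE);

    int rc = 0;
    if (pwrite(fd, buf, DEMO_BLOCK_SIZE, (off_t)block * DEMO_BLOCK_SIZE)
        != DEMO_BLOCK_SIZE)
        rc = -1;
    else if (fdatasync(fd) != 0)
        rc = -1;

    free(buf);
    close(fd);
    return rc;
}
```

Calling demo_prepare_file() once and then demo_overwrite_block() in a
loop approximates the case being discussed: the block mapping never
changes, so the only thing the writer asks for after each write is the
fdatasync().
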

What we might want to do is to let the file system tell the iomap
layer via a flag that no post-I/O metadata operations are required,
and then *if* that flag is set, and *if* the inode has no pages in the
page cache (so there are no invalidate operations necessary), it
should be safe to skip using queue_work(). That way, the file system
has to affirmatively state that it is safe to skip the workqueue, so
it shouldn't do any harm to other file systems using the iomap DIO
layer.

What am I missing?

Cheers,

					- Ted
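
A minimal stand-alone model of the check proposed above, for
illustration only. Everything named demo_* or DEMO_* is hypothetical and
is not an existing iomap interface; the struct merely stands in for
whatever state the real DIO completion path would consult.

```c
#include <stdbool.h>

/*
 * Hypothetical flag: set by the file system to state affirmatively that
 * this write needs no post-I/O metadata operations.
 */
#define DEMO_DIO_NO_POST_IO_METADATA (1u << 0)

/* Hypothetical stand-in for the per-I/O state seen at completion time. */
struct demo_dio {
    unsigned int flags;          /* flags supplied by the file system */
    unsigned long cached_pages;  /* pages the inode has in the page cache */
};

/*
 * Returns true only when both conditions hold: the file system opted in
 * via the flag, and the inode has no cached pages that would need to be
 * invalidated. In every other case the caller would defer completion to
 * the workqueue, as it does today.
 */
bool demo_can_skip_workqueue(const struct demo_dio *dio)
{
    if (!(dio->flags & DEMO_DIO_NO_POST_IO_METADATA))
        return false;            /* file system did not opt in */
    if (dio->cached_pages != 0)
        return false;            /* invalidation still required */
    return true;
}
```

The default answer is "use the workqueue", so a file system that never
sets the flag sees no change in behavior.
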