Subject: Re: always fall back to buffered I/O after invalidation failures, was: Re: [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if page invalidation fails
To: Dave Chinner, Matthew Wilcox
Cc: Christoph Hellwig, Goldwyn Rodrigues, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, fdmanana@gmail.com, dsterba@suse.cz, darrick.wong@oracle.com, cluster-devel@redhat.com, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org
References: <20200629192353.20841-1-rgoldwyn@suse.de> <20200629192353.20841-3-rgoldwyn@suse.de> <20200701075310.GB29884@lst.de> <20200707124346.xnr5gtcysuzehejq@fiona> <20200707125705.GK25523@casper.infradead.org> <20200707130030.GA13870@lst.de> <20200708065127.GM2005@dread.disaster.area> <20200708135437.GP25523@casper.infradead.org> <20200709022527.GQ2005@dread.disaster.area>
From: Avi Kivity
Organization: ScyllaDB
Date: Sun, 12 Jul 2020 14:36:28 +0300
In-Reply-To: <20200709022527.GQ2005@dread.disaster.area>

On 09/07/2020 05.25, Dave Chinner wrote:
>
>> Nobody's proposing changing Direct I/O to exclusively work through the
>> pagecache. The proposal is to behave less weirdly when there's already
>> data in the pagecache.
>
> No, the proposal is to make direct IO behave *less
> deterministically* if there is data in the page cache.
>
> e.g. Instead of having a predictable submission CPU overhead and
> read latency of 100us for your data, this proposal makes the claim
> that it is always better to burn 10x the IO submission CPU for a
> single IO to copy the data and give that specific IO 10x lower
> latency than it is to submit 10 async IOs to keep the IO pipeline
> full.
>
> What it fails to take into account is that in spending that CPU time
> to copy the data, we haven't submitted 10 other IOs and so the
> actual in-flight IO for the application has decreased. If
> performance comes from keeping the IO pipeline as close to 100% full
> as possible, then copying the data out of the page cache will cause
> performance regressions.
>
> i.e. Hit 5 page cache pages in 5 IOs in a row, and the IO queue
> depth craters because we've only fulfilled 5 complete IOs instead of
> submitting 50 entire IOs. This is the hidden cost of synchronous IO
> via CPU data copying vs async IO via hardware offload, and if we
> take that into account we must look at future hardware performance
> trends to determine if this cost is going to increase or decrease in
> future.
>
> That is: CPUs are not getting faster anytime soon. IO subsystems are
> still deep in the "performance doubles every 2 years" part of the
> technology curve (PCIe 3.0->4.0 just happened, 4->5 is a year away,
> 5->6 is 3-4 years away, etc). Hence our reality is that we are deep
> within a performance trend curve that tells us synchronous CPU
> operations are not getting faster, but IO bandwidth and IOPS are
> going to increase massively over the next 5-10 years.
> Hence putting
> (already expensive) synchronous CPU operations in the asynchronous
> zero-data-touch IO fast path is -exactly the wrong direction to be
> moving-.
>
> This is simple math. The gap between IO latency and bandwidth and
> CPU addressable memory latency and bandwidth is closing all the
> time, and the closer that gap gets the less sense it makes to use
> CPU addressable memory for buffering syscall based read and write
> IO. We are not quite yet at the cross-over point, but we really
> aren't that far from it.

My use-case supports this. The application uses AIO+DIO, but a backup
process may bring pages into the page cache. For me, it is best to
ignore the page cache (as long as it's clean, which it is for backup)
and serve from disk as usual. The sketches below illustrate both sides
of this: the submission loop whose queue depth is at stake, and a
backup-side workaround.
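
To make the queue-depth argument concrete, here is a minimal sketch of
the AIO+DIO read pattern under discussion, using libaio against an
O_DIRECT file descriptor. The file name, queue depth, block size, and
total read count are illustrative assumptions, not values from this
thread, and EOF/short-read handling is omitted:

/* Build: gcc -O2 aio_pipeline.c -laio (file names are illustrative). */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define QUEUE_DEPTH 32      /* assumed: keep 32 reads in flight */
#define BLOCK_SIZE  4096    /* assumed: matches device block alignment */

int main(void)
{
    int fd = open("data.bin", O_RDONLY | O_DIRECT);  /* placeholder path */
    if (fd < 0) { perror("open"); return 1; }

    io_context_t ctx = 0;
    int ret = io_setup(QUEUE_DEPTH, &ctx);
    if (ret < 0) { fprintf(stderr, "io_setup: %s\n", strerror(-ret)); return 1; }

    struct iocb cbs[QUEUE_DEPTH], *cbp[QUEUE_DEPTH];
    off_t next_off = 0;

    /* Fill the pipeline.  O_DIRECT needs aligned buffers and offsets. */
    for (int i = 0; i < QUEUE_DEPTH; i++) {
        void *buf;
        if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE)) return 1;
        io_prep_pread(&cbs[i], fd, buf, BLOCK_SIZE, next_off);
        cbp[i] = &cbs[i];
        next_off += BLOCK_SIZE;
    }
    if (io_submit(ctx, QUEUE_DEPTH, cbp) != QUEUE_DEPTH) return 1;

    /* Steady state: replace each completion with a fresh submission so
     * the device always sees QUEUE_DEPTH requests in flight. */
    for (int completed = 0; completed < 1024; ) {
        struct io_event evs[QUEUE_DEPTH];
        int n = io_getevents(ctx, 1, QUEUE_DEPTH, evs, NULL);
        if (n < 0) { fprintf(stderr, "io_getevents: %s\n", strerror(-n)); return 1; }
        for (int j = 0; j < n; j++, completed++) {
            struct iocb *cb = evs[j].obj;
            /* ... consume cb->u.c.buf here, then reuse the same iocb ... */
            io_prep_pread(cb, fd, cb->u.c.buf, BLOCK_SIZE, next_off);
            next_off += BLOCK_SIZE;
            if (io_submit(ctx, 1, &cb) != 1) return 1;
        }
    }

    io_destroy(ctx);
    close(fd);
    return 0;
}

Any synchronous page-cache copy inserted into the submission path
stalls exactly this loop: while the CPU is copying, no replacement
IOs are being submitted, and the in-flight count drops.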
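
And for the backup side: independent of how the kernel semantics get
settled, a backup reader can avoid leaving pages behind by dropping
them as it goes. A sketch, with backup_read() as a hypothetical helper;
POSIX_FADV_DONTNEED only discards clean pages, which matches the "as
long as it's clean" condition above:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Read a file sequentially, as a backup would, dropping the clean
 * pages it pulled into the page cache so they never sit in the way
 * of a concurrent AIO+DIO workload. */
int backup_read(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return -1; }

    char buf[1 << 16];               /* 64 KiB chunks: arbitrary choice */
    off_t done = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        /* ... write buf to the backup target here ... */
        done += n;
        /* Clean pages only; dirty pages would survive this call. */
        posix_fadvise(fd, 0, done, POSIX_FADV_DONTNEED);
    }
    close(fd);
    return n < 0 ? -1 : 0;
}

Alternatively, the backup tool could itself open the file with O_DIRECT
and never populate the page cache in the first place; the fadvise
approach just avoids disturbing the tool's own buffering and alignment
assumptions.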