Received: by 2002:a05:6a10:1287:0:0:0:0 with SMTP id d7csp4460667pxv; Tue, 20 Jul 2021 04:25:50 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxGXhzd4fJtY9+QIq3SPB0zXlLHhnVIv+nG2QKiRjud3AtbHsu9JuFmW4wru0wXu8ruyLsG X-Received: by 2002:a05:6402:1601:: with SMTP id f1mr39840409edv.388.1626780349859; Tue, 20 Jul 2021 04:25:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1626780349; cv=none; d=google.com; s=arc-20160816; b=TdKYVN2sMmj8YephitCAuJ2vzl4WcbKXcu+lq64Vmk6JvjCf7xjm+Cr/jDFaD7PEt7 9nUgOO1Vq/RpbFVFa6XXkMI7Wr/rYidT1DXd/Ivh8muAgxuqiiXL/rphKUDWqeq6QZRI tBpZcxt1qRCPU9Q1ckTj3VM3pJ61vQM+C6ipe2xDnh4a3+1BWgYChGwkmyl9dqdlnEhm m6V9nkMbeF1ChCjTXKUFtfdhBVW/MW9x99hiDnsH0Nf9F3fwnDUFc0wXQx9VSQ6tl1DT /m4ud1ObpBnmb8gwTZMw0lZhwE9Fuio117vfqRXC0qWJtk8+mRwLfBZJe9/YBiK/4uvf CPBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=dVnkGAPBzNUrDjo/vYV+V2SBEEyGsdjuUedWnsOzsis=; b=F6cZwF0+rqnYjri0+yEJBR5nU56BkWA3ZaU/JdmTMuxnFT91sKTPWHXM8az++QQrtT nLpn+FaBlioj3SKRlPfq2osP+QrCIh2rlUO3+ThlveUMpNDGoRhPsznzE6elRu44Iv5s EE+rCK2/xvD1Q5emUER/S80v7txG/knx9W/Rc518stkvkboaa1g/1WllBgS/BPTWkR6x oN2RcYDdPKtOLjxjdJBkYw2jq3ZRyFoyyd1t74XL2t/EbLlGJuwzCU05nGGTh/3puFRG aF1WnjdwEP4M75zUyinKgN/CwGNOt/UEXTFnTEkuyeScy3DfC52ecMaM1Y7u4cw+TcA5 wAAA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=HhpzAjl0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id cm16si22656253edb.411.2021.07.20.04.25.26; Tue, 20 Jul 2021 04:25:49 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=HhpzAjl0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236701AbhGTKnR (ORCPT + 99 others); Tue, 20 Jul 2021 06:43:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56988 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231442AbhGTKnP (ORCPT ); Tue, 20 Jul 2021 06:43:15 -0400 Received: from mail-il1-x12e.google.com (mail-il1-x12e.google.com [IPv6:2607:f8b0:4864:20::12e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C3F98C061574; Tue, 20 Jul 2021 04:23:53 -0700 (PDT) Received: by mail-il1-x12e.google.com with SMTP id a11so18840387ilf.2; Tue, 20 Jul 2021 04:23:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=dVnkGAPBzNUrDjo/vYV+V2SBEEyGsdjuUedWnsOzsis=; b=HhpzAjl09BeOsEUIHpVHWdklJxETeRFiANMBco0ll1FiBP/oqIp82Rp0IOuN4qVKf/ qQnawPRP13Mt3ys+cZH7IYKZctRhluI109OvPpL4CRWr6teTA70+/oNb8BsdxyU63WHT hrG0rGLPRP2Jw46212Qk9eKPUTmtXFuZZ4SAmmSOfhJNiT38qdLD2BYccDeS8kZLk/I7 XGrNwkZYwR4sP84muKsPDF6xLTxQ7OdUNrbC0O3LDULWEYeC2bq54/6iTi8y86jmCK6g kaII1t+FF7+XEDeZ8dFoloGAxGDqZxcTXvquESb2qtIbnU94CVRLN5/j5ulZPZU7FwXm DP4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=dVnkGAPBzNUrDjo/vYV+V2SBEEyGsdjuUedWnsOzsis=; b=FpgEEIclCgEtvWrjfbEBemMHzehQl7lb9ynuQFs0AG83n6SRT9JhF4gI3OGul4Kor4 iZtNLb5hssmSNYE9aDed8HzpQl4bwXm1gpLmjpIwxf+3uCx3WQcKpmH9eCcc7vSHT0oz 44S4dQkPW/ii9MsejrTl8921W/rSRmK5Tgjhm/HibDivcX+HG78hB/xgML4TCc41nUUX s9HajtUsAjobCUhieHUtEx7WZd9Xd/DGeJnb0IDjl1EiPZ8+ZYuUybz4a9yj70+zAD0m 5OsIMtrzJW1mSV0YzUT9sRqsP4L1mfFYiWc+cAr+oucd/oj4xUvQsBax1W0D780huiqa 6hrA== X-Gm-Message-State: AOAM533l35Ba5GX0iLyuucIua8Z6NrXTYCSolB38UFX4D7ioCxFQRAv1 NanMwuNDwq6APzvXFlLAZzzWfR3IPtczewMBb1A= X-Received: by 2002:a05:6e02:c73:: with SMTP id f19mr19729417ilj.291.1626780232988; Tue, 20 Jul 2021 04:23:52 -0700 (PDT) MIME-Version: 1.0 References: <20210719144747.189634-1-hsiangkao@linux.alibaba.com> In-Reply-To: <20210719144747.189634-1-hsiangkao@linux.alibaba.com> From: =?UTF-8?Q?Andreas_Gr=C3=BCnbacher?= Date: Tue, 20 Jul 2021 13:23:41 +0200 Message-ID: Subject: Re: [PATCH v3] iomap: support tail packing inline read To: Gao Xiang Cc: linux-erofs@lists.ozlabs.org, Linux FS-devel Mailing List , LKML , Christoph Hellwig , "Darrick J . Wong" , Matthew Wilcox Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Am Mo., 19. Juli 2021 um 16:48 Uhr schrieb Gao Xiang : > This tries to add tail packing inline read to iomap, which can support > several inline tail blocks. Similar to the previous approach, it cleans > post-EOF in one iteration. > > The write path remains untouched since EROFS cannot be used for testing. > It'd be better to be implemented if upcoming real users care rather than > leave untested dead code around. > > Cc: Christoph Hellwig > Cc: Darrick J. Wong > Cc: Matthew Wilcox > Cc: Andreas Gruenbacher > Signed-off-by: Gao Xiang > --- > v2: https://lore.kernel.org/r/YPLdSja%2F4FBsjss%2F@B-P7TQMD6M-0146.local/ > changes since v2: > - update suggestion from Christoph: > https://lore.kernel.org/r/YPVe41YqpfGLNsBS@infradead.org/ > > Hi Andreas, > would you mind test on the gfs2 side? Thanks in advance! > > Thanks, > Gao Xiang > > fs/iomap/buffered-io.c | 50 ++++++++++++++++++++++++++---------------- > fs/iomap/direct-io.c | 11 ++++++---- > 2 files changed, 38 insertions(+), 23 deletions(-) > > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c > index 87ccb3438bec..cac8a88660d8 100644 > --- a/fs/iomap/buffered-io.c > +++ b/fs/iomap/buffered-io.c > @@ -207,23 +207,22 @@ struct iomap_readpage_ctx { > > static void > iomap_read_inline_data(struct inode *inode, struct page *page, > - struct iomap *iomap) > + struct iomap *iomap, loff_t pos) > { > - size_t size = i_size_read(inode); > + unsigned int size, poff = offset_in_page(pos); > void *addr; > > - if (PageUptodate(page)) > - return; > - > - BUG_ON(page_has_private(page)); > - BUG_ON(page->index); > - BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data)); > + /* inline source data must be inside a single page */ > + BUG_ON(iomap->length > PAGE_SIZE - offset_in_page(iomap->inline_data)); > + /* handle tail-packing blocks cross the current page into the next */ > + size = min_t(unsigned int, iomap->length + pos - iomap->offset, > + PAGE_SIZE - poff); Hmm, so EROFS really does multi-page tail packing? This contradicts the comment and code in iomap_dio_inline_actor. > addr = kmap_atomic(page); > - memcpy(addr, iomap->inline_data, size); > - memset(addr + size, 0, PAGE_SIZE - size); > + memcpy(addr + poff, iomap->inline_data - iomap->offset + pos, size); > + memset(addr + poff + size, 0, PAGE_SIZE - poff - size); > kunmap_atomic(addr); > - SetPageUptodate(page); > + iomap_set_range_uptodate(page, poff, PAGE_SIZE - poff); > } It's been said before, but iomap_read_inline_data should return PAGE_SIZE - poff, and no (void) casting when the return value is ignored. > static inline bool iomap_block_needs_zeroing(struct inode *inode, > @@ -246,18 +245,19 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data, > unsigned poff, plen; > sector_t sector; > > - if (iomap->type == IOMAP_INLINE) { > - WARN_ON_ONCE(pos); > - iomap_read_inline_data(inode, page, iomap); > - return PAGE_SIZE; > - } > - > - /* zero post-eof blocks as the page may be mapped */ > iop = iomap_page_create(inode, page); > + /* needs to skip some leading uptodated blocks */ "needs to skip some leading uptodate blocks" > iomap_adjust_read_range(inode, iop, &pos, length, &poff, &plen); > if (plen == 0) > goto done; > > + if (iomap->type == IOMAP_INLINE) { > + iomap_read_inline_data(inode, page, iomap, pos); > + plen = PAGE_SIZE - poff; > + goto done; > + } > + > + /* zero post-eof blocks as the page may be mapped */ > if (iomap_block_needs_zeroing(inode, iomap, pos)) { > zero_user(page, poff, plen); > iomap_set_range_uptodate(page, poff, plen); > @@ -589,6 +589,18 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags, > return 0; > } > > +static int iomap_write_begin_inline(struct inode *inode, loff_t pos, > + struct page *page, struct iomap *srcmap) > +{ > + /* needs more work for the tailpacking case, disable for now */ > + if (WARN_ON_ONCE(pos != 0)) This should be a WARN_ON_ONCE(srcmap->offset != 0). Otherwise, something like: xfs_io -ft -c 'pwrite 1 2' will fail because pos will be 1. > + return -EIO; > + if (PageUptodate(page)) > + return 0; > + iomap_read_inline_data(inode, page, srcmap, pos); The above means that passing pos to iomap_read_inline_data here won't do the right thing, either. > + return 0; > +} > + > static int > iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags, > struct page **pagep, struct iomap *iomap, struct iomap *srcmap) > @@ -618,7 +630,7 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags, > } > > if (srcmap->type == IOMAP_INLINE) > - iomap_read_inline_data(inode, page, srcmap); > + status = iomap_write_begin_inline(inode, pos, page, srcmap); > else if (iomap->flags & IOMAP_F_BUFFER_HEAD) > status = __block_write_begin_int(page, pos, len, NULL, srcmap); > else > diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c > index 9398b8c31323..ee6309967b77 100644 > --- a/fs/iomap/direct-io.c > +++ b/fs/iomap/direct-io.c > @@ -379,22 +379,25 @@ iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length, > { > struct iov_iter *iter = dio->submit.iter; > size_t copied; > + void *dst = iomap->inline_data + pos - iomap->offset; > > - BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data)); > + /* inline data must be inside a single page */ > + BUG_ON(length > PAGE_SIZE - offset_in_page(iomap->inline_data)); > > if (dio->flags & IOMAP_DIO_WRITE) { > loff_t size = inode->i_size; > > if (pos > size) > - memset(iomap->inline_data + size, 0, pos - size); > - copied = copy_from_iter(iomap->inline_data + pos, length, iter); > + memset(iomap->inline_data + size - iomap->offset, > + 0, pos - size); > + copied = copy_from_iter(dst, length, iter); > if (copied) { > if (pos + copied > size) > i_size_write(inode, pos + copied); > mark_inode_dirty(inode); > } > } else { > - copied = copy_to_iter(iomap->inline_data + pos, length, iter); > + copied = copy_to_iter(dst, length, iter); > } > dio->size += copied; > return copied; > -- > 2.24.4 > Thanks, Andreas