From: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Subject: [PATCH v4 1/7] fsdax: Introduce dax_iomap_cow_copy()
Date: Thu, 8 Apr 2021 20:04:26 +0800
Message-ID: <20210408120432.1063608-2-ruansy.fnst@fujitsu.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210408120432.1063608-1-ruansy.fnst@fujitsu.com>
References: <20210408120432.1063608-1-ruansy.fnst@fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
X-Mailing-List: linux-kernel@vger.kernel.org

After iomap_begin(), if the operation is a write and the returned iomap is
not equal to the srcmap, we treat it as a CoW operation.

The extent described by the iomap is the newly allocated destination
extent, so the existing data needs to be copied from the srcmap into it
before the write is applied.

In theory it is enough to copy only the unaligned head and tail ranges
that lie outside the written region, instead of the whole aligned range.
In the DAX page fault path, however, the range is always aligned, so in
that case we have to copy the whole range.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Christoph Hellwig
---
 fs/dax.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 77 insertions(+), 5 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 8d7e4e2cc0fb..b4fd3813457a 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1038,6 +1038,61 @@ static int dax_iomap_direct_access(struct iomap *iomap, loff_t pos, size_t size,
 	return rc;
 }
 
+/**
+ * dax_iomap_cow_copy(): Copy the data from source to destination before write.
+ * @pos:	address to do copy from.
+ * @length:	size of copy operation.
+ * @align_size:	aligned w.r.t align_size (either PMD_SIZE or PAGE_SIZE)
+ * @srcmap:	iomap srcmap
+ * @daddr:	destination address to copy to.
+ *
+ * This can be called from two places. Either during DAX write fault, to copy
+ * the length size data to daddr. Or, while doing normal DAX write operation,
+ * dax_iomap_actor() might call this to do the copy of either start or end
+ * unaligned address. In this case the rest of the copy of aligned ranges is
+ * taken care of by dax_iomap_actor() itself.
+ * Also, note DAX fault will always result in aligned pos and pos + length.
+ */
+static int dax_iomap_cow_copy(loff_t pos, loff_t length, size_t align_size,
+		struct iomap *srcmap, void *daddr)
+{
+	loff_t head_off = pos & (align_size - 1);
+	size_t size = ALIGN(head_off + length, align_size);
+	loff_t end = pos + length;
+	loff_t pg_end = round_up(end, align_size);
+	bool copy_all = head_off == 0 && end == pg_end;
+	void *saddr = 0;
+	int ret = 0;
+
+	ret = dax_iomap_direct_access(srcmap, pos, size, &saddr, NULL);
+	if (ret)
+		return ret;
+
+	if (copy_all) {
+		ret = copy_mc_to_kernel(daddr, saddr, length);
+		return ret ? -EIO : 0;
+	}
+
+	/* Copy the head part of the range. Note: we pass offset as length. */
+	if (head_off) {
+		ret = copy_mc_to_kernel(daddr, saddr, head_off);
+		if (ret)
+			return -EIO;
+	}
+
+	/* Copy the tail part of the range */
+	if (end < pg_end) {
+		loff_t tail_off = head_off + length;
+		loff_t tail_len = pg_end - end;
+
+		ret = copy_mc_to_kernel(daddr + tail_off, saddr + tail_off,
+				tail_len);
+		if (ret)
+			return -EIO;
+	}
+	return 0;
+}
+
 /*
  * The user has performed a load from a hole in the file. Allocating a new
  * page in the file would cause excessive storage usage for workloads with
@@ -1167,11 +1222,12 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	struct dax_device *dax_dev = iomap->dax_dev;
 	struct iov_iter *iter = data;
 	loff_t end = pos + length, done = 0;
+	bool write = iov_iter_rw(iter) == WRITE;
 	ssize_t ret = 0;
 	size_t xfer;
 	int id;
 
-	if (iov_iter_rw(iter) == READ) {
+	if (!write) {
 		end = min(end, i_size_read(inode));
 		if (pos >= end)
 			return 0;
@@ -1180,7 +1236,8 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 			return iov_iter_zero(min(length, end - pos), iter);
 	}
 
-	if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED))
+	if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED &&
+			!(iomap->flags & IOMAP_F_SHARED)))
 		return -EIO;
 
 	/*
@@ -1219,6 +1276,13 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 			break;
 		}
 
+		if (write && srcmap->addr != iomap->addr) {
+			ret = dax_iomap_cow_copy(pos, length, PAGE_SIZE, srcmap,
+						 kaddr);
+			if (ret)
+				break;
+		}
+
 		map_len = PFN_PHYS(map_len);
 		kaddr += offset;
 		map_len -= offset;
@@ -1230,7 +1294,7 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		 * validated via access_ok() in either vfs_read() or
 		 * vfs_write(), depending on which operation we are doing.
 		 */
-		if (iov_iter_rw(iter) == WRITE)
+		if (write)
 			xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,
 					map_len, iter);
 		else
@@ -1382,6 +1446,7 @@ static vm_fault_t dax_fault_actor(struct vm_fault *vmf, pfn_t *pfnp,
 	unsigned long entry_flags = pmd ? DAX_PMD : 0;
 	int err = 0;
 	pfn_t pfn;
+	void *kaddr;
 
 	/* if we are reading UNWRITTEN and HOLE, return a hole. */
 	if (!write &&
@@ -1392,18 +1457,25 @@ static vm_fault_t dax_fault_actor(struct vm_fault *vmf, pfn_t *pfnp,
 		return dax_pmd_load_hole(xas, vmf, iomap, entry);
 	}
 
-	if (iomap->type != IOMAP_MAPPED) {
+	if (iomap->type != IOMAP_MAPPED && !(iomap->flags & IOMAP_F_SHARED)) {
 		WARN_ON_ONCE(1);
 		return pmd ? VM_FAULT_FALLBACK : VM_FAULT_SIGBUS;
 	}
 
-	err = dax_iomap_direct_access(iomap, pos, size, NULL, &pfn);
+	err = dax_iomap_direct_access(iomap, pos, size, &kaddr, &pfn);
 	if (err)
 		return pmd ? VM_FAULT_FALLBACK : dax_fault_return(err);
 
 	*entry = dax_insert_entry(xas, mapping, vmf, *entry, pfn, entry_flags,
 				  write && !sync);
 
+	if (write &&
+	    srcmap->addr != IOMAP_HOLE && srcmap->addr != iomap->addr) {
+		err = dax_iomap_cow_copy(pos, size, size, srcmap, kaddr);
+		if (err)
+			return dax_fault_return(err);
+	}
+
 	if (sync)
 		return dax_fault_synchronous_pfnp(pfnp, pfn);

-- 
2.31.0
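
[Editorial note, not part of the patch] For readers following the head/tail
logic in dax_iomap_cow_copy(), the small userspace C sketch below walks
through the same offset arithmetic for an unaligned write. The sample
pos/length/align values and the round_up_pow2() helper are made up for
illustration only; the kernel's copy_mc_to_kernel() calls are replaced by
printf so the program is self-contained and runnable.

#include <stdio.h>
#include <stdint.h>

/* Round x up to the next multiple of a (a must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

int main(void)
{
	/* Hypothetical unaligned write of 0x400 bytes at offset 0x1200,
	 * against a 4 KiB (PAGE_SIZE-sized) destination extent. */
	uint64_t align_size = 4096;
	uint64_t pos = 0x1200, length = 0x400;

	uint64_t head_off = pos & (align_size - 1);       /* 0x200 */
	uint64_t end = pos + length;                      /* 0x1600 */
	uint64_t pg_end = round_up_pow2(end, align_size); /* 0x2000 */
	int copy_all = (head_off == 0) && (end == pg_end);

	printf("copy_all: %d\n", copy_all);
	if (head_off)
		printf("head copy: dest+0x0 for 0x%llx bytes\n",
		       (unsigned long long)head_off);
	if (end < pg_end)
		printf("tail copy: dest+0x%llx for 0x%llx bytes\n",
		       (unsigned long long)(head_off + length),
		       (unsigned long long)(pg_end - end));
	return 0;
}

With these values the head copy covers the 0x200 bytes before the write and
the tail copy covers the 0xa00 bytes after it, while the written region in
between is left for the new data, matching the behaviour described in the
patch's commit message.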