From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vgoyal@redhat.com, miklos@szeredi.hu, stefanha@redhat.com, dgilbert@redhat.com, sweil@redhat.com, swhiteho@redhat.com
Subject: [PATCH 28/52] Do fallocate() to grow file before mapping for file growing writes
Date: Mon, 10 Dec 2018 12:12:54 -0500
Message-Id: <20181210171318.16998-29-vgoyal@redhat.com>
In-Reply-To: <20181210171318.16998-1-vgoyal@redhat.com>
References: <20181210171318.16998-1-vgoyal@redhat.com>

How should file-growing writes be handled? For now, this patch calls
fallocate() to grow the file and then maps it using DAX. We still need to
figure out the best way to handle this.

This patch does the fallocate() and mapping setup in fuse_dax_write_iter()
instead of iomap_begin(). I don't have access to the file pointer needed to
send a message to the fuse daemon in iomap_begin().

Dave Chinner has expressed concerns with this approach because it is not
atomic. If the guest crashes after fallocate() but before the data is
written, the user will think that the filesystem lost their data. So this is
still an outstanding issue.

Signed-off-by: Vivek Goyal
---
 fs/fuse/file.c | 71 +++++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 55 insertions(+), 16 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 94ad76382a6f..41d773ba2c72 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -28,6 +28,9 @@ INTERVAL_TREE_DEFINE(struct fuse_dax_mapping,
 		     rb, __u64, __subtree_last,
 		     START, LAST, static inline, fuse_dax_interval_tree);
 
+static long __fuse_file_fallocate(struct file *file, int mode,
+					loff_t offset, loff_t length);
+
 static int fuse_send_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
 			  int opcode, struct fuse_open_out *outargp)
 {
@@ -1819,6 +1822,22 @@ static ssize_t fuse_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	/* TODO file_update_time() but we don't want metadata I/O */
 
 	/* TODO handle growing the file */
+	/* Grow file here if need be. iomap_begin() does not have access
+	 * to file pointer
+	 */
+	if (iov_iter_rw(from) == WRITE &&
+	    ((iocb->ki_pos + iov_iter_count(from)) > i_size_read(inode))) {
+		ret = __fuse_file_fallocate(iocb->ki_filp, 0, iocb->ki_pos,
+						iov_iter_count(from));
+		if (ret < 0) {
+			printk("fallocate(offset=0x%llx length=0x%lx)"
+			" failed. err=%ld\n", iocb->ki_pos,
+			iov_iter_count(from), ret);
+			goto out;
+		}
+		pr_debug("fallocate(offset=0x%llx length=0x%lx)"
+		" succeed. ret=%ld\n", iocb->ki_pos, iov_iter_count(from), ret);
+	}
 
 	ret = dax_iomap_rw(iocb, from, &fuse_iomap_ops);
 
@@ -3331,8 +3350,12 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 	return ret;
 }
 
-static long fuse_file_fallocate(struct file *file, int mode, loff_t offset,
-				loff_t length)
+/*
+ * This variant does not take any inode lock and if locking is required,
+ * caller is supposed to hold lock
+ */
+static long __fuse_file_fallocate(struct file *file, int mode,
+					loff_t offset, loff_t length)
 {
 	struct fuse_file *ff = file->private_data;
 	struct inode *inode = file_inode(file);
@@ -3346,8 +3369,6 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset,
 		.mode = mode
 	};
 	int err;
-	bool lock_inode = !(mode & FALLOC_FL_KEEP_SIZE) ||
-			   (mode & FALLOC_FL_PUNCH_HOLE);
 
 	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
 		return -EOPNOTSUPP;
@@ -3355,17 +3376,13 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset,
 	if (fc->no_fallocate)
 		return -EOPNOTSUPP;
 
-	if (lock_inode) {
-		inode_lock(inode);
-		if (mode & FALLOC_FL_PUNCH_HOLE) {
-			loff_t endbyte = offset + length - 1;
-			err = filemap_write_and_wait_range(inode->i_mapping,
-							   offset, endbyte);
-			if (err)
-				goto out;
-
-			fuse_sync_writes(inode);
-		}
+	if (mode & FALLOC_FL_PUNCH_HOLE) {
+		loff_t endbyte = offset + length - 1;
+		err = filemap_write_and_wait_range(inode->i_mapping, offset,
+						   endbyte);
+		if (err)
+			goto out;
+		fuse_sync_writes(inode);
 	}
 
 	if (!(mode & FALLOC_FL_KEEP_SIZE))
@@ -3401,9 +3418,31 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset,
 	if (!(mode & FALLOC_FL_KEEP_SIZE))
 		clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
 
+	return err;
+}
+
+static long fuse_file_fallocate(struct file *file, int mode, loff_t offset,
+				loff_t length)
+{
+	struct fuse_file *ff = file->private_data;
+	struct inode *inode = file_inode(file);
+	struct fuse_conn *fc = ff->fc;
+	int err;
+	bool lock_inode = !(mode & FALLOC_FL_KEEP_SIZE) ||
+			   (mode & FALLOC_FL_PUNCH_HOLE);
+
+	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
+		return -EOPNOTSUPP;
+
+	if (fc->no_fallocate)
+		return -EOPNOTSUPP;
+
 	if (lock_inode)
-		inode_unlock(inode);
+		inode_lock(inode);
+	err = __fuse_file_fallocate(file, mode, offset, length);
+	if (lock_inode)
+		inode_unlock(inode);
 
 	return err;
 }
-- 
2.13.6
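
For readers who want to see the "grow first, then write" ordering outside
the kernel, here is a rough userspace sketch. It is not the code from this
patch: write_grows_file(), grow_then_write() and grow_demo() are hypothetical
names, and posix_fallocate() plus pwrite() stand in for
__fuse_file_fallocate() and dax_iomap_rw(). The comment marks the window
behind the atomicity concern described in the commit message.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Same test as the patch: does the write end past the current size? */
static bool write_grows_file(off_t pos, size_t count, off_t size)
{
	return pos + (off_t)count > size;
}

/*
 * Hypothetical helper: allocate the range first, then write the data.
 * Returns the number of bytes written, or -1 on error.
 */
static ssize_t grow_then_write(int fd, const void *buf, size_t count,
			       off_t pos)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;

	if (count > 0 && write_grows_file(pos, count, st.st_size)) {
		/* posix_fallocate() returns an errno value, not -1 */
		if (posix_fallocate(fd, pos, (off_t)count) != 0)
			return -1;
		/*
		 * Non-atomic window: the file size has already grown, but
		 * the new range holds no data yet. A crash here leaves a
		 * longer file whose tail reads back as zeroes.
		 */
	}

	return pwrite(fd, buf, count, pos);
}

/* Self-check: write 4 bytes at offset 8 of an empty temporary file. */
static int grow_demo(void)
{
	char path[] = "/tmp/grow-demo-XXXXXX";
	struct stat st;
	ssize_t n;
	int ok;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);	/* the file goes away once fd is closed */

	assert(write_grows_file(8, 4, 0));
	n = grow_then_write(fd, "data", 4, 8);
	ok = (n == 4 && fstat(fd, &st) == 0 && st.st_size == 12);

	close(fd);
	return ok ? 0 : -1;
}
```

Calling grow_demo() from a small main() should return 0 on a filesystem
where posix_fallocate() works. The sketch takes no lock at all; in the patch,
fuse_dax_write_iter() calls the unlocked __fuse_file_fallocate() variant,
whose comment says the caller is supposed to hold the lock.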