From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vgoyal@redhat.com, miklos@szeredi.hu, stefanha@redhat.com, dgilbert@redhat.com, sweil@redhat.com, swhiteho@redhat.com
Subject: [PATCH 35/52] fuse: Add logic to do direct reclaim of memory
Date: Mon, 10 Dec 2018 12:13:01 -0500
Message-Id: <20181210171318.16998-36-vgoyal@redhat.com>
In-Reply-To: <20181210171318.16998-1-vgoyal@redhat.com>
References: <20181210171318.16998-1-vgoyal@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Direct reclaim can be done only from the same inode. It can also be done
only in the read/write path and not in the fault path. The reason is
that, as of now, reclaim requires holding inode_lock,
fuse_inode->i_mmap_sem and the fuse_inode->dmap_tree lock, in that
order, and only the read/write path allows that (the fault path does
not).

Signed-off-by: Vivek Goyal
---
 fs/fuse/file.c | 121 +++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 105 insertions(+), 16 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 17becdff3014..13db83d105ff 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -30,6 +30,8 @@ INTERVAL_TREE_DEFINE(struct fuse_dax_mapping,
 
 static long __fuse_file_fallocate(struct file *file, int mode,
 				loff_t offset, loff_t length);
+static struct fuse_dax_mapping *alloc_dax_mapping_reclaim(struct fuse_conn *fc,
+					struct inode *inode);
 
 static int fuse_send_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
 			  int opcode, struct fuse_open_out *outargp)
@@ -1727,7 +1729,12 @@ static int fuse_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
 	if (pos >= i_size_read(inode))
 		goto iomap_hole;
 
-	alloc_dmap = alloc_dax_mapping(fc);
+	/* Can't do reclaim in fault path yet due to lock ordering */
+	if (flags & IOMAP_FAULT)
+		alloc_dmap = alloc_dax_mapping(fc);
+	else
+		alloc_dmap = alloc_dax_mapping_reclaim(fc, inode);
+
 	if (!alloc_dmap)
 		return -EBUSY;
 
@@ -3705,24 +3712,14 @@ void fuse_init_file_inode(struct inode *inode)
 	}
 }
 
-int fuse_dax_free_one_mapping_locked(struct fuse_conn *fc, struct inode *inode,
-				u64 dmap_start)
+int fuse_dax_reclaim_dmap_locked(struct fuse_conn *fc, struct inode *inode,
+				struct fuse_dax_mapping *dmap)
 {
 	int ret;
 	struct fuse_inode *fi = get_fuse_inode(inode);
-	struct fuse_dax_mapping *dmap;
-
-	WARN_ON(!inode_is_locked(inode));
-
-	/* Find fuse dax mapping at file offset inode. */
-	dmap = fuse_dax_interval_tree_iter_first(&fi->dmap_tree, dmap_start,
-							dmap_start);
-
-	/* Range already got cleaned up by somebody else */
-	if (!dmap)
-		return 0;
 
-	ret = filemap_fdatawrite_range(inode->i_mapping, dmap->start, dmap->end);
+	ret = filemap_fdatawrite_range(inode->i_mapping, dmap->start,
+					dmap->end);
 	if (ret) {
 		printk("filemap_fdatawrite_range() failed. err=%d start=0x%llx,"
 			" end=0x%llx\n", ret, dmap->start, dmap->end);
@@ -3743,6 +3740,99 @@ int fuse_dax_free_one_mapping_locked(struct fuse_conn *fc, struct inode *inode,
 	/* Remove dax mapping from inode interval tree now */
 	fuse_dax_interval_tree_remove(dmap, &fi->dmap_tree);
 	fi->nr_dmaps--;
+	return 0;
+}
+
+/* Find first mapping in the tree and free it. */
+struct fuse_dax_mapping *fuse_dax_reclaim_first_mapping_locked(
+				struct fuse_conn *fc, struct inode *inode)
+{
+	struct fuse_inode *fi = get_fuse_inode(inode);
+	struct fuse_dax_mapping *dmap;
+	int ret;
+
+	/* Find the first fuse dax mapping in the inode's interval tree. */
+	dmap = fuse_dax_interval_tree_iter_first(&fi->dmap_tree, 0, -1);
+	if (!dmap)
+		return NULL;
+
+	ret = fuse_dax_reclaim_dmap_locked(fc, inode, dmap);
+	if (ret < 0)
+		return ERR_PTR(ret);
+
+	/* Clean up dmap. Do not add back to free list */
+	spin_lock(&fc->lock);
+	list_del_init(&dmap->busy_list);
+	WARN_ON(fc->nr_busy_ranges == 0);
+	fc->nr_busy_ranges--;
+	dmap->inode = NULL;
+	dmap->start = dmap->end = 0;
+	spin_unlock(&fc->lock);
+
+	pr_debug("fuse: reclaimed memory range window_offset=0x%llx,"
+				" length=0x%llx\n", dmap->window_offset,
+				dmap->length);
+	return dmap;
+}
+
+/*
+ * Find first mapping in the tree, free it and return it. Do not add
+ * it back to free pool.
+ *
+ * This is called with inode lock held.
+ */
+struct fuse_dax_mapping *fuse_dax_reclaim_first_mapping(struct fuse_conn *fc,
+				struct inode *inode)
+{
+	struct fuse_inode *fi = get_fuse_inode(inode);
+	struct fuse_dax_mapping *dmap;
+
+	down_write(&fi->i_mmap_sem);
+	down_write(&fi->i_dmap_sem);
+	dmap = fuse_dax_reclaim_first_mapping_locked(fc, inode);
+	up_write(&fi->i_dmap_sem);
+	up_write(&fi->i_mmap_sem);
+	return dmap;
+}
+
+static struct fuse_dax_mapping *alloc_dax_mapping_reclaim(struct fuse_conn *fc,
+					struct inode *inode)
+{
+	struct fuse_dax_mapping *dmap;
+	struct fuse_inode *fi = get_fuse_inode(inode);
+
+	dmap = alloc_dax_mapping(fc);
+	if (dmap)
+		return dmap;
+
+	/* There are no mappings which can be reclaimed */
+	if (!fi->nr_dmaps)
+		return NULL;
+
+	/* Try to reclaim a fuse dax memory range */
+	return fuse_dax_reclaim_first_mapping(fc, inode);
+}
+
+int fuse_dax_free_one_mapping_locked(struct fuse_conn *fc, struct inode *inode,
+				u64 dmap_start)
+{
+	int ret;
+	struct fuse_inode *fi = get_fuse_inode(inode);
+	struct fuse_dax_mapping *dmap;
+
+	WARN_ON(!inode_is_locked(inode));
+
+	/* Find fuse dax mapping at file offset inode. */
+	dmap = fuse_dax_interval_tree_iter_first(&fi->dmap_tree, dmap_start,
+							dmap_start);
+
+	/* Range already got cleaned up by somebody else */
+	if (!dmap)
+		return 0;
+
+	ret = fuse_dax_reclaim_dmap_locked(fc, inode, dmap);
+	if (ret < 0)
+		return ret;
 
 	/* Cleanup dmap entry and add back to free list */
 	spin_lock(&fc->lock);
@@ -3757,7 +3847,6 @@ int fuse_dax_free_one_mapping_locked(struct fuse_conn *fc, struct inode *inode,
 	pr_debug("fuse: freed memory range window_offset=0x%llx,"
 				" length=0x%llx\n", dmap->window_offset,
 				dmap->length);
-
 	return ret;
 }
 
-- 
2.13.6