From: Luis Henriques
To: "Yan, Zheng", Sage Weil, Ilya Dryomov
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques
Subject: [PATCH] ceph: only allow punch hole mode in fallocate
Date: Tue, 9 Oct 2018 18:54:28 +0100
Message-Id: <20181009175428.18543-1-lhenriques@suse.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The current implementation of cephfs fallocate isn't correct: it doesn't
actually reserve space in the cluster, which means that a subsequent
write may still fail due to lack of space. In fact, it is currently
possible to fallocate an amount of space that is larger than the free
space in the cluster.

Since there is no easy fix for this at the moment, this patch simply
removes support for all fallocate operations except FALLOC_FL_PUNCH_HOLE
(which must be combined with FALLOC_FL_KEEP_SIZE).

Link: https://tracker.ceph.com/issues/36317
Cc: stable@vger.kernel.org
Fixes: ad7a60de882a ("ceph: punch hole support")
Signed-off-by: Luis Henriques
---
 fs/ceph/file.c | 45 +++++++++------------------------------------
 1 file changed, 9 insertions(+), 36 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 92ab20433682..91a7ad259bcf 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1735,7 +1735,6 @@ static long ceph_fallocate(struct file *file, int mode,
 	struct ceph_file_info *fi = file->private_data;
 	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
-	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	struct ceph_cap_flush *prealloc_cf;
 	int want, got = 0;
 	int dirty;
@@ -1743,10 +1742,7 @@ static long ceph_fallocate(struct file *file, int mode,
 	loff_t endoff = 0;
 	loff_t size;
 
-	if ((offset + length) > max(i_size_read(inode), fsc->max_file_size))
-		return -EFBIG;
-
-	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
+	if (mode != (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
 		return -EOPNOTSUPP;
 
 	if (!S_ISREG(inode->i_mode))
@@ -1763,18 +1759,6 @@ static long ceph_fallocate(struct file *file, int mode,
 		goto unlock;
 	}
 
-	if (!(mode & (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)) &&
-	    ceph_quota_is_max_bytes_exceeded(inode, offset + length)) {
-		ret = -EDQUOT;
-		goto unlock;
-	}
-
-	if (ceph_osdmap_flag(&fsc->client->osdc, CEPH_OSDMAP_FULL) &&
-	    !(mode & FALLOC_FL_PUNCH_HOLE)) {
-		ret = -ENOSPC;
-		goto unlock;
-	}
-
 	if (ci->i_inline_version != CEPH_INLINE_NONE) {
 		ret = ceph_uninline_data(file, NULL);
 		if (ret < 0)
@@ -1782,12 +1766,12 @@ static long ceph_fallocate(struct file *file, int mode,
 	}
 
 	size = i_size_read(inode);
-	if (!(mode & FALLOC_FL_KEEP_SIZE)) {
-		endoff = offset + length;
-		ret = inode_newsize_ok(inode, endoff);
-		if (ret)
-			goto unlock;
-	}
+
+	/* Are we punching a hole beyond EOF? */
+	if (offset >= size)
+		goto unlock;
+	if ((offset + length) > size)
+		length = size - offset;
 
 	if (fi->fmode & CEPH_FILE_MODE_LAZY)
 		want = CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO;
@@ -1798,16 +1782,8 @@ static long ceph_fallocate(struct file *file, int mode,
 	if (ret < 0)
 		goto unlock;
 
-	if (mode & FALLOC_FL_PUNCH_HOLE) {
-		if (offset < size)
-			ceph_zero_pagecache_range(inode, offset, length);
-		ret = ceph_zero_objects(inode, offset, length);
-	} else if (endoff > size) {
-		truncate_pagecache_range(inode, size, -1);
-		if (ceph_inode_set_size(inode, endoff))
-			ceph_check_caps(ceph_inode(inode),
-				CHECK_CAPS_AUTHONLY, NULL);
-	}
+	ceph_zero_pagecache_range(inode, offset, length);
+	ret = ceph_zero_objects(inode, offset, length);
 
 	if (!ret) {
 		spin_lock(&ci->i_ceph_lock);
@@ -1817,9 +1793,6 @@ static long ceph_fallocate(struct file *file, int mode,
 		spin_unlock(&ci->i_ceph_lock);
 		if (dirty)
 			__mark_inode_dirty(inode, dirty);
-		if ((endoff > size) &&
-		    ceph_quota_is_max_bytes_approaching(inode, endoff))
-			ceph_check_caps(ci, CHECK_CAPS_NODELAY, NULL);
 	}
 
 	ceph_put_cap_refs(ci, got);
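
For readers following the diff, here is a minimal user-space sketch of the two behavioural changes: the stricter mode test and the clamping of the punched range at EOF. It is not part of the patch. The helper names (old_mode_ok, new_mode_ok, clamp_punch_range) are hypothetical, and long long stands in for the kernel's loff_t; only the two FALLOC_FL_* flags and the comparisons themselves come from the code above.

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* FALLOC_FL_* on glibc when _GNU_SOURCE is set */
#include <stdbool.h>

/* Fallbacks using the values from the Linux UAPI header <linux/falloc.h>. */
#ifndef FALLOC_FL_KEEP_SIZE
#define FALLOC_FL_KEEP_SIZE  0x01
#endif
#ifndef FALLOC_FL_PUNCH_HOLE
#define FALLOC_FL_PUNCH_HOLE 0x02
#endif

/* Hypothetical helper: the old test, which only rejected unknown bits. */
static bool old_mode_ok(int mode)
{
	return !(mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE));
}

/* Hypothetical helper: the new test, which accepts exactly one mode value. */
static bool new_mode_ok(int mode)
{
	return mode == (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE);
}

/*
 * Hypothetical helper mirroring the EOF handling added by the patch.
 * Returns false when there is nothing to punch (offset at or past EOF);
 * otherwise trims *length so the hole ends at EOF and returns true.
 */
static bool clamp_punch_range(long long offset, long long *length,
			      long long size)
{
	if (offset >= size)
		return false;
	if (offset + *length > size)
		*length = size - offset;
	return true;
}
```

With these helpers, mode 0 (plain preallocation) and FALLOC_FL_KEEP_SIZE alone pass the old test but fail the new one, which is the point of the patch. A punch at offset 100 with length 50 on a 120-byte file is trimmed to length 20, and a punch starting at offset 200 on the same file is a no-op.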