Received: by 2002:a05:6a10:6d10:0:0:0:0 with SMTP id gq16csp400843pxb; Tue, 12 Apr 2022 04:42:14 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy+QuUFdDJcnryI51v5Xjh9OJAexwi6eM+RTbmVxdbp3SDldsjvzyy4bM6dknAk5kC/A4bk X-Received: by 2002:a17:90b:4f8d:b0:1c6:408b:6b0d with SMTP id qe13-20020a17090b4f8d00b001c6408b6b0dmr4506358pjb.90.1649763733936; Tue, 12 Apr 2022 04:42:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1649763733; cv=none; d=google.com; s=arc-20160816; b=aCtzJZDYHYXSt1A2qTh4w30RPMm0o5Zxpeh5jpdpAdBhFHnjdYPlcpKjTMJ61zlx2N gCrCyFl5JLyj7YUoiNXzhppQvjNFz+bPdjPIQQ6Mq54VY+uzfwJag3bnpm8aiCqct6xL q93TvqAgdVwv1R2ChB8Zu1LZ4zk2jk/vSmWPNpadt7brubytpTpsDyLenSCLztwLURqq HMf33PZARFs317zHeG39Lr6F51RtamV4DNH29G51EuNQCFzICbVEoI8ZGRJ9sx6ykr3n +HV9EI3tgzhSZA7rUzhpZaXIE4h7AK2+HHegkffCMmTiin7V+E+8X8uwUR9qQ3rfFct/ oVZw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=aHTlKmebmCucE9cDNkxQ9RXFfYusic3sh7BXIbl6ZsI=; b=IaM0D2lrsaAbU7F1GQqaj//82BLg7KaqSSHVb6bP7BLNeAVzqf0sxmUlDeOutmlT4F cR0alQ0S8vvOdstQucMMDTqKNKIfNn4xrUWqEBaQf0ILJGCHz+nEUiIHuQ2iJliNv4gP Ia7u6CAu+1rmwPpH355w+RtgKUORnWZzr1cR7QHWUOgolScaMhqa+ddbgbmFT6svTkJ7 Fi7yVfhkRV+wJ1Wn0oNzO0Fyyxy8tmCSqbXlmtoBPBLoVqpnt8B7O1dN2B7nWV3RPmvn VRIqafUNbmAoBpS++5WNslnowF6MHsfjnp2hC30eVya7WnWTnULeP0wwsGY87Q0KKcKe 24MA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l8-20020a170903120800b00156d899e1c8si14505755plh.615.2022.04.12.04.41.47; Tue, 12 Apr 2022 04:42:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245211AbiDLKOj (ORCPT + 99 others); Tue, 12 Apr 2022 06:14:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43744 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1352844AbiDLJRk (ORCPT ); Tue, 12 Apr 2022 05:17:40 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8EEBB220CE for ; Tue, 12 Apr 2022 01:25:12 -0700 (PDT) Received: from canpemm500005.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4KczHV6s5MzFpk9; Tue, 12 Apr 2022 16:22:46 +0800 (CST) Received: from huawei.com (10.175.127.227) by canpemm500005.china.huawei.com (7.192.104.229) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Tue, 12 Apr 2022 16:25:10 +0800 From: Zhang Yi To: CC: , , , , , Subject: [RFC PATCH v2] ext4: convert symlink external data block mapping to bdev Date: Tue, 12 Apr 2022 16:39:41 +0800 Message-ID: <20220412083941.2242143-1-yi.zhang@huawei.com> X-Mailer: git-send-email 2.31.1 MIME-Version: 1.0 Content-Transfer-Encoding: 7BIT Content-Type: text/plain; charset=US-ASCII X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To canpemm500005.china.huawei.com (7.192.104.229) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Symlink's external data block is one kind of metadata block, and now that almost all ext4 metadata block's page cache (e.g. directory blocks, quota blocks...) belongs to bdev backing inode except the symlink. It is essentially worked in data=journal mode like other regular file's data block because probably in order to make it simple for generic VFS code handling symlinks or some other historical reasons, but the logic of creating external data block in ext4_symlink() is complicated. and it also make things confused if user do not want to let the filesystem worked in data=journal mode. This patch convert the final exceptional case and make things clean, move the mapping of the symlink's external data block to bdev like any other metadata block does. Signed-off-by: Zhang Yi --- v2->v1: - Add comment to explain the credits of creating symlink. [v1]: https://lore.kernel.org/linux-ext4/20220406084503.1961686-1-yi.zhang@huawei.com/ fs/ext4/inode.c | 9 +--- fs/ext4/namei.c | 129 ++++++++++++++++++++++------------------------ fs/ext4/symlink.c | 44 +++++++++++++--- 3 files changed, 99 insertions(+), 83 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 13740f2d0e61..a6339cefbb1e 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -199,8 +199,7 @@ void ext4_evict_inode(struct inode *inode) */ if (inode->i_ino != EXT4_JOURNAL_INO && ext4_should_journal_data(inode) && - (S_ISLNK(inode->i_mode) || S_ISREG(inode->i_mode)) && - inode->i_data.nrpages) { + S_ISREG(inode->i_mode) && inode->i_data.nrpages) { journal_t *journal = EXT4_SB(inode->i_sb)->s_journal; tid_t commit_tid = EXT4_I(inode)->i_datasync_tid; @@ -2944,8 +2943,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping, index = pos >> PAGE_SHIFT; - if (ext4_nonda_switch(inode->i_sb) || S_ISLNK(inode->i_mode) || - ext4_verity_in_progress(inode)) { + if (ext4_nonda_switch(inode->i_sb) || ext4_verity_in_progress(inode)) { *fsdata = (void *)FALL_BACK_TO_NONDELALLOC; return ext4_write_begin(file, mapping, pos, len, flags, pagep, fsdata); @@ -4977,7 +4975,6 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, } if (IS_ENCRYPTED(inode)) { inode->i_op = &ext4_encrypted_symlink_inode_operations; - ext4_set_aops(inode); } else if (ext4_inode_is_fast_symlink(inode)) { inode->i_link = (char *)ei->i_data; inode->i_op = &ext4_fast_symlink_inode_operations; @@ -4985,9 +4982,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, sizeof(ei->i_data) - 1); } else { inode->i_op = &ext4_symlink_inode_operations; - ext4_set_aops(inode); } - inode_nohighmem(inode); } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) || S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) { inode->i_op = &ext4_special_inode_operations; diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index e37da8d5cd0c..fa8002aae829 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -3249,6 +3249,32 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry) return retval; } +static int ext4_init_symlink_block(handle_t *handle, struct inode *inode, + struct fscrypt_str *disk_link) +{ + struct buffer_head *bh; + char *kaddr; + int err = 0; + + bh = ext4_bread(handle, inode, 0, EXT4_GET_BLOCKS_CREATE); + if (IS_ERR(bh)) + return PTR_ERR(bh); + + BUFFER_TRACE(bh, "get_write_access"); + err = ext4_journal_get_write_access(handle, inode->i_sb, bh, EXT4_JTR_NONE); + if (err) + goto out; + + kaddr = (char *)bh->b_data; + memcpy(kaddr, disk_link->name, disk_link->len); + inode->i_size = disk_link->len - 1; + EXT4_I(inode)->i_disksize = inode->i_size; + err = ext4_handle_dirty_metadata(handle, inode, bh); +out: + brelse(bh); + return err; +} + static int ext4_symlink(struct user_namespace *mnt_userns, struct inode *dir, struct dentry *dentry, const char *symname) { @@ -3270,26 +3296,14 @@ static int ext4_symlink(struct user_namespace *mnt_userns, struct inode *dir, if (err) return err; - if ((disk_link.len > EXT4_N_BLOCKS * 4)) { - /* - * For non-fast symlinks, we just allocate inode and put it on - * orphan list in the first transaction => we need bitmap, - * group descriptor, sb, inode block, quota blocks, and - * possibly selinux xattr blocks. - */ - credits = 4 + EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb) + - EXT4_XATTR_TRANS_BLOCKS; - } else { - /* - * Fast symlink. We have to add entry to directory - * (EXT4_DATA_TRANS_BLOCKS + EXT4_INDEX_EXTRA_TRANS_BLOCKS), - * allocate new inode (bitmap, group descriptor, inode block, - * quota blocks, sb is already counted in previous macros). - */ - credits = EXT4_DATA_TRANS_BLOCKS(dir->i_sb) + - EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3; - } - + /* + * EXT4_INDEX_EXTRA_TRANS_BLOCKS for addition of entry into the + * directory. +3 for inode, inode bitmap, group descriptor allocation. + * EXT4_DATA_TRANS_BLOCKS for the data block allocation and + * modification. + */ + credits = EXT4_DATA_TRANS_BLOCKS(dir->i_sb) + + EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3; inode = ext4_new_inode_start_handle(mnt_userns, dir, S_IFLNK|S_IRWXUGO, &dentry->d_name, 0, NULL, EXT4_HT_DIR, credits); @@ -3305,73 +3319,52 @@ static int ext4_symlink(struct user_namespace *mnt_userns, struct inode *dir, if (err) goto err_drop_inode; inode->i_op = &ext4_encrypted_symlink_inode_operations; + } else { + if ((disk_link.len > EXT4_N_BLOCKS * 4)) { + inode->i_op = &ext4_symlink_inode_operations; + } else { + inode->i_op = &ext4_fast_symlink_inode_operations; + inode->i_link = (char *)&EXT4_I(inode)->i_data; + } } if ((disk_link.len > EXT4_N_BLOCKS * 4)) { - if (!IS_ENCRYPTED(inode)) - inode->i_op = &ext4_symlink_inode_operations; - inode_nohighmem(inode); - ext4_set_aops(inode); - /* - * We cannot call page_symlink() with transaction started - * because it calls into ext4_write_begin() which can wait - * for transaction commit if we are running out of space - * and thus we deadlock. So we have to stop transaction now - * and restart it when symlink contents is written. - * - * To keep fs consistent in case of crash, we have to put inode - * to orphan list in the mean time. - */ - drop_nlink(inode); - err = ext4_orphan_add(handle, inode); - if (handle) - ext4_journal_stop(handle); - handle = NULL; + /* alloc symlink block and fill it */ + err = ext4_init_symlink_block(handle, inode, &disk_link); if (err) goto err_drop_inode; - err = __page_symlink(inode, disk_link.name, disk_link.len, 1); - if (err) - goto err_drop_inode; - /* - * Now inode is being linked into dir (EXT4_DATA_TRANS_BLOCKS - * + EXT4_INDEX_EXTRA_TRANS_BLOCKS), inode is also modified - */ - handle = ext4_journal_start(dir, EXT4_HT_DIR, - EXT4_DATA_TRANS_BLOCKS(dir->i_sb) + - EXT4_INDEX_EXTRA_TRANS_BLOCKS + 1); - if (IS_ERR(handle)) { - err = PTR_ERR(handle); - handle = NULL; - goto err_drop_inode; - } - set_nlink(inode, 1); - err = ext4_orphan_del(handle, inode); + err = ext4_mark_inode_dirty(handle, inode); + if (!err) + err = ext4_add_entry(handle, dentry, inode); if (err) goto err_drop_inode; + + d_instantiate_new(dentry, inode); + if (IS_DIRSYNC(dir)) + ext4_handle_sync(handle); + if (handle) + ext4_journal_stop(handle); } else { /* clear the extent format for fast symlink */ ext4_clear_inode_flag(inode, EXT4_INODE_EXTENTS); - if (!IS_ENCRYPTED(inode)) { - inode->i_op = &ext4_fast_symlink_inode_operations; - inode->i_link = (char *)&EXT4_I(inode)->i_data; - } memcpy((char *)&EXT4_I(inode)->i_data, disk_link.name, disk_link.len); inode->i_size = disk_link.len - 1; + EXT4_I(inode)->i_disksize = inode->i_size; + err = ext4_add_nondir(handle, dentry, &inode); + if (handle) + ext4_journal_stop(handle); + if (inode) + iput(inode); } - EXT4_I(inode)->i_disksize = inode->i_size; - err = ext4_add_nondir(handle, dentry, &inode); - if (handle) - ext4_journal_stop(handle); - if (inode) - iput(inode); goto out_free_encrypted_link; err_drop_inode: - if (handle) - ext4_journal_stop(handle); clear_nlink(inode); + ext4_orphan_add(handle, inode); unlock_new_inode(inode); + if (handle) + ext4_journal_stop(handle); iput(inode); out_free_encrypted_link: if (disk_link.name != (unsigned char *)symname) diff --git a/fs/ext4/symlink.c b/fs/ext4/symlink.c index 69109746e6e2..f030f8705986 100644 --- a/fs/ext4/symlink.c +++ b/fs/ext4/symlink.c @@ -27,7 +27,7 @@ static const char *ext4_encrypted_get_link(struct dentry *dentry, struct inode *inode, struct delayed_call *done) { - struct page *cpage = NULL; + struct buffer_head *bh = NULL; const void *caddr; unsigned int max_size; const char *paddr; @@ -39,16 +39,19 @@ static const char *ext4_encrypted_get_link(struct dentry *dentry, caddr = EXT4_I(inode)->i_data; max_size = sizeof(EXT4_I(inode)->i_data); } else { - cpage = read_mapping_page(inode->i_mapping, 0, NULL); - if (IS_ERR(cpage)) - return ERR_CAST(cpage); - caddr = page_address(cpage); + bh = ext4_bread(NULL, inode, 0, 0); + if (IS_ERR(bh)) + return ERR_CAST(bh); + if (!bh) { + EXT4_ERROR_INODE(inode, "bad symlink."); + return ERR_PTR(-EFSCORRUPTED); + } + caddr = bh->b_data; max_size = inode->i_sb->s_blocksize; } paddr = fscrypt_get_symlink(inode, caddr, max_size, done); - if (cpage) - put_page(cpage); + brelse(bh); return paddr; } @@ -62,6 +65,31 @@ static int ext4_encrypted_symlink_getattr(struct user_namespace *mnt_userns, return fscrypt_symlink_getattr(path, stat); } +static void ext4_free_link(void *bh) +{ + brelse(bh); +} + +static const char *ext4_get_link(struct dentry *dentry, struct inode *inode, + struct delayed_call *callback) +{ + struct buffer_head *bh; + + if (!dentry) + return ERR_PTR(-ECHILD); + + bh = ext4_bread(NULL, inode, 0, 0); + if (IS_ERR(bh)) + return ERR_CAST(bh); + if (!bh) { + EXT4_ERROR_INODE(inode, "bad symlink."); + return ERR_PTR(-EFSCORRUPTED); + } + + set_delayed_call(callback, ext4_free_link, bh); + return bh->b_data; +} + const struct inode_operations ext4_encrypted_symlink_inode_operations = { .get_link = ext4_encrypted_get_link, .setattr = ext4_setattr, @@ -70,7 +98,7 @@ const struct inode_operations ext4_encrypted_symlink_inode_operations = { }; const struct inode_operations ext4_symlink_inode_operations = { - .get_link = page_get_link, + .get_link = ext4_get_link, .setattr = ext4_setattr, .getattr = ext4_getattr, .listxattr = ext4_listxattr, -- 2.31.1