From: Andiry Xu
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org
Cc: dan.j.williams@intel.com, andy.rudoff@intel.com, coughlan@redhat.com, swanson@cs.ucsd.edu, david@fromorbit.com, jack@suse.com, swhiteho@redhat.com, miklos@szeredi.hu, andiry.xu@gmail.com, Andiry Xu
Subject: [RFC v2 79/83] Normal recovery.
Date: Sat, 10 Mar 2018 10:19:00 -0800
Message-Id: <1520705944-6723-80-git-send-email-jix024@eng.ucsd.edu>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
References: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Andiry Xu

Upon umount, NOVA stores the allocator information and the in-use inode
list in reserved inodes. During remount, NOVA reads this information and
rebuilds the allocator and in-use inode list DRAM data structures.

Signed-off-by: Andiry Xu
---
 fs/nova/bbuild.c | 266 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nova/bbuild.h |   1 +
 fs/nova/super.c  |   3 +
 3 files changed, 270 insertions(+)

diff --git a/fs/nova/bbuild.c b/fs/nova/bbuild.c
index af1b352..ca51dca 100644
--- a/fs/nova/bbuild.c
+++ b/fs/nova/bbuild.c
@@ -52,6 +52,206 @@ void nova_init_header(struct super_block *sb,
 	init_rwsem(&sih->i_sem);
 }
 
+static inline int get_cpuid(struct nova_sb_info *sbi, unsigned long blocknr)
+{
+	return blocknr / sbi->per_list_blocks;
+}
+
+static void nova_destroy_range_node_tree(struct super_block *sb,
+	struct rb_root *tree)
+{
+	struct nova_range_node *curr;
+	struct rb_node *temp;
+
+	temp = rb_first(tree);
+	while (temp) {
+		curr = container_of(temp, struct nova_range_node, node);
+		temp = rb_next(temp);
+		rb_erase(&curr->node, tree);
+		nova_free_range_node(curr);
+	}
+}
+
+static void nova_destroy_blocknode_tree(struct super_block *sb, int cpu)
+{
+	struct free_list *free_list;
+
+	free_list = nova_get_free_list(sb, cpu);
+	nova_destroy_range_node_tree(sb, &free_list->block_free_tree);
+}
+
+static void nova_destroy_blocknode_trees(struct super_block *sb)
+{
+	struct nova_sb_info *sbi = NOVA_SB(sb);
+	int i;
+
+	for (i = 0; i < sbi->cpus; i++)
+		nova_destroy_blocknode_tree(sb, i);
+
+}
+
+static int nova_init_blockmap_from_inode(struct super_block *sb)
+{
+	struct nova_sb_info *sbi = NOVA_SB(sb);
+	struct nova_inode *pi = nova_get_inode_by_ino(sb, NOVA_BLOCKNODE_INO);
+	struct nova_inode_info_header sih;
+	struct free_list *free_list;
+	struct nova_range_node_lowhigh *entry;
+	struct nova_range_node *blknode;
+	size_t size = sizeof(struct nova_range_node_lowhigh);
+	u64 curr_p;
+	u64 cpuid;
+	int ret = 0;
+
+	/* FIXME: Backup inode for BLOCKNODE */
+	ret = nova_get_head_tail(sb, pi, &sih);
+	if (ret)
+		goto out;
+
+	curr_p = sih.log_head;
+	if (curr_p == 0) {
+		nova_dbg("%s: pi head is 0!\n", __func__);
+		return -EINVAL;
+	}
+
+	while (curr_p != sih.log_tail) {
+		if (is_last_entry(curr_p, size))
+			curr_p = next_log_page(sb, curr_p);
+
+		if (curr_p == 0) {
+			nova_dbg("%s: curr_p is NULL!\n", __func__);
+			NOVA_ASSERT(0);
+			ret = -EINVAL;
+			break;
+		}
+
+		entry = (struct nova_range_node_lowhigh *)nova_get_block(sb,
+							curr_p);
+		blknode = nova_alloc_blocknode(sb);
+		if (blknode == NULL)
+			NOVA_ASSERT(0);
+		blknode->range_low = le64_to_cpu(entry->range_low);
+		blknode->range_high = le64_to_cpu(entry->range_high);
+		cpuid = get_cpuid(sbi, blknode->range_low);
+
+		/* FIXME: Assume NR_CPUS not change */
+		free_list = nova_get_free_list(sb, cpuid);
+		ret = nova_insert_blocktree(sbi,
+				&free_list->block_free_tree, blknode);
+		if (ret) {
+			nova_err(sb, "%s failed\n", __func__);
+			nova_free_blocknode(sb, blknode);
+			NOVA_ASSERT(0);
+			nova_destroy_blocknode_trees(sb);
+			goto out;
+		}
+		free_list->num_blocknode++;
+		if (free_list->num_blocknode == 1)
+			free_list->first_node = blknode;
+		free_list->last_node = blknode;
+		free_list->num_free_blocks +=
+			blknode->range_high - blknode->range_low + 1;
+		curr_p += sizeof(struct nova_range_node_lowhigh);
+	}
+out:
+	nova_free_inode_log(sb, pi, &sih);
+	return ret;
+}
+
+static void nova_destroy_inode_trees(struct super_block *sb)
+{
+	struct nova_sb_info *sbi = NOVA_SB(sb);
+	struct inode_map *inode_map;
+	int i;
+
+	for (i = 0; i < sbi->cpus; i++) {
+		inode_map = &sbi->inode_maps[i];
+		nova_destroy_range_node_tree(sb,
+					&inode_map->inode_inuse_tree);
+	}
+}
+
+#define CPUID_MASK 0xff00000000000000
+
+static int nova_init_inode_list_from_inode(struct super_block *sb)
+{
+	struct nova_sb_info *sbi = NOVA_SB(sb);
+	struct nova_inode *pi = nova_get_inode_by_ino(sb, NOVA_INODELIST_INO);
+	struct nova_inode_info_header sih;
+	struct nova_range_node_lowhigh *entry;
+	struct nova_range_node *range_node;
+	struct inode_map *inode_map;
+	size_t size = sizeof(struct nova_range_node_lowhigh);
+	unsigned long num_inode_node = 0;
+	u64 curr_p;
+	unsigned long cpuid;
+	int ret;
+
+	/* FIXME: Backup inode for INODELIST */
+	ret = nova_get_head_tail(sb, pi, &sih);
+	if (ret)
+		goto out;
+
+	sbi->s_inodes_used_count = 0;
+	curr_p = sih.log_head;
+	if (curr_p == 0) {
+		nova_dbg("%s: pi head is 0!\n", __func__);
+		return -EINVAL;
+	}
+
+	while (curr_p != sih.log_tail) {
+		if (is_last_entry(curr_p, size))
+			curr_p = next_log_page(sb, curr_p);
+
+		if (curr_p == 0) {
+			nova_dbg("%s: curr_p is NULL!\n", __func__);
+			NOVA_ASSERT(0);
+		}
+
+		entry = (struct nova_range_node_lowhigh *)nova_get_block(sb,
+							curr_p);
+		range_node = nova_alloc_inode_node(sb);
+		if (range_node == NULL)
+			NOVA_ASSERT(0);
+
+		cpuid = (entry->range_low & CPUID_MASK) >> 56;
+		if (cpuid >= sbi->cpus) {
+			nova_err(sb, "Invalid cpuid %lu\n", cpuid);
+			nova_free_inode_node(sb, range_node);
+			NOVA_ASSERT(0);
+			nova_destroy_inode_trees(sb);
+			goto out;
+		}
+
+		range_node->range_low = entry->range_low & ~CPUID_MASK;
+		range_node->range_high = entry->range_high;
+		ret = nova_insert_inodetree(sbi, range_node, cpuid);
+		if (ret) {
+			nova_err(sb, "%s failed, %d\n", __func__, cpuid);
+			nova_free_inode_node(sb, range_node);
+			NOVA_ASSERT(0);
+			nova_destroy_inode_trees(sb);
+			goto out;
+		}
+
+		sbi->s_inodes_used_count +=
+			range_node->range_high - range_node->range_low + 1;
+		num_inode_node++;
+
+		inode_map = &sbi->inode_maps[cpuid];
+		inode_map->num_range_node_inode++;
+		if (!inode_map->first_inode_range)
+			inode_map->first_inode_range = range_node;
+
+		curr_p += sizeof(struct nova_range_node_lowhigh);
+	}
+
+	nova_dbg("%s: %lu inode nodes\n", __func__, num_inode_node);
+out:
+	nova_free_inode_log(sb, pi, &sih);
+	return ret;
+}
+
 static u64 nova_append_range_node_entry(struct super_block *sb,
 	struct nova_range_node *curr, u64 tail, unsigned long cpuid)
 {
@@ -214,3 +414,69 @@ void nova_save_blocknode_mappings_to_log(struct super_block *sb)
 		pi->log_head, pi->log_tail);
 }
 
+/*********************** Recovery entrance *************************/
+
+/* Return TRUE if we can do a normal unmount recovery */
+static bool nova_try_normal_recovery(struct super_block *sb)
+{
+	struct nova_inode *pi = nova_get_inode_by_ino(sb, NOVA_BLOCKNODE_INO);
+	int ret;
+
+	if (pi->log_head == 0 || pi->log_tail == 0)
+		return false;
+
+	ret = nova_init_blockmap_from_inode(sb);
+	if (ret) {
+		nova_err(sb, "init blockmap failed, fall back to failure recovery\n");
+		return false;
+	}
+
+	ret = nova_init_inode_list_from_inode(sb);
+	if (ret) {
+		nova_err(sb, "init inode list failed, fall back to failure recovery\n");
+		nova_destroy_blocknode_trees(sb);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Recovery routine has two tasks:
+ * 1. Restore inuse inode list;
+ * 2. Restore the NVMM allocator.
+ */
+int nova_recovery(struct super_block *sb)
+{
+	struct nova_sb_info *sbi = NOVA_SB(sb);
+	struct nova_super_block *super = sbi->nova_sb;
+	unsigned long initsize = le64_to_cpu(super->s_size);
+	bool value = false;
+	int ret = 0;
+	timing_t start, end;
+
+	nova_dbgv("%s\n", __func__);
+
+	/* Always check recovery time */
+	if (measure_timing == 0)
+		getrawmonotonic(&start);
+
+	NOVA_START_TIMING(recovery_t, start);
+	sbi->num_blocks = ((unsigned long)(initsize) >> PAGE_SHIFT);
+
+	/* initialize free list info */
+	nova_init_blockmap(sb, 1);
+
+	value = nova_try_normal_recovery(sb);
+
+	NOVA_END_TIMING(recovery_t, start);
+	if (measure_timing == 0) {
+		getrawmonotonic(&end);
+		Timingstats[recovery_t] +=
+			(end.tv_sec - start.tv_sec) * 1000000000 +
+			(end.tv_nsec - start.tv_nsec);
+	}
+
+	sbi->s_epoch_id = le64_to_cpu(super->s_epoch_id);
+	return ret;
+}
diff --git a/fs/nova/bbuild.h b/fs/nova/bbuild.h
index 5d2b5f0..2c3deb0 100644
--- a/fs/nova/bbuild.h
+++ b/fs/nova/bbuild.h
@@ -5,5 +5,6 @@ void nova_init_header(struct super_block *sb,
 	struct nova_inode_info_header *sih, u16 i_mode);
 void nova_save_inode_list_to_log(struct super_block *sb);
 void nova_save_blocknode_mappings_to_log(struct super_block *sb);
+int nova_recovery(struct super_block *sb);
 
 #endif
diff --git a/fs/nova/super.c b/fs/nova/super.c
index 980b1d7..14b4af6 100644
--- a/fs/nova/super.c
+++ b/fs/nova/super.c
@@ -642,6 +642,9 @@ static int nova_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_xattr = NULL;
 	sb->s_flags |= MS_NOSEC;
 
+	if ((sbi->s_mount_opt & NOVA_MOUNT_FORMAT) == 0)
+		nova_recovery(sb);
+
 	root_i = nova_iget(sb, NOVA_ROOT_INO);
 	if (IS_ERR(root_i)) {
 		retval = PTR_ERR(root_i);
-- 
2.7.4
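
Reader's note on the entry format that nova_init_inode_list_from_inode()
replays: each saved entry carries the owning CPU in the top 8 bits of
range_low (CPUID_MASK, shifted by 56), with the inode range itself in the
remaining bits. The following is a minimal user-space sketch of that
decoding, not code from the patch. struct range_entry, pack_range() and
replay_ranges() are hypothetical names introduced for illustration,
endianness conversion is omitted, and the assumption that the save path
tags range_low this way is inferred only from the mask and shift used in
the recovery code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same mask and shift as the patch: top 8 bits of range_low hold the CPU id. */
#define CPUID_MASK  0xff00000000000000ULL
#define CPUID_SHIFT 56

/* Hypothetical stand-in for struct nova_range_node_lowhigh (CPU byte order). */
struct range_entry {
	uint64_t range_low;
	uint64_t range_high;
};

/* Hypothetical helper: tag a range with its CPU id (assumed save-side format). */
static struct range_entry pack_range(unsigned int cpu, uint64_t low,
				     uint64_t high)
{
	struct range_entry e;

	/* The CPU id must fit in 8 bits and must not collide with the range. */
	assert(cpu < 256 && (low & CPUID_MASK) == 0);
	e.range_low = low | ((uint64_t)cpu << CPUID_SHIFT);
	e.range_high = high;
	return e;
}

/*
 * Replay a saved list and add the number of in-use inodes to each CPU's
 * counter. Returns 0 on success, or -1 as soon as an entry names a CPU
 * that is >= ncpus; counters updated before that point are left as-is.
 */
static int replay_ranges(const struct range_entry *log, size_t n,
			 unsigned int ncpus, uint64_t *used_per_cpu)
{
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int cpu =
			(unsigned int)((log[i].range_low & CPUID_MASK) >> CPUID_SHIFT);
		uint64_t low = log[i].range_low & ~CPUID_MASK;

		if (cpu >= ncpus)
			return -1;
		/* Ranges are inclusive, hence the +1. */
		used_per_cpu[cpu] += log[i].range_high - low + 1;
	}
	return 0;
}
```

A caller would zero used_per_cpu[] first and treat a -1 return as the cue
to abandon the replay, much as the patch tears down the partially built
inode trees when it sees an invalid cpuid.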