Received: by 10.223.185.111 with SMTP id b44csp1642101wrg; Sat, 10 Mar 2018 10:24:25 -0800 (PST) X-Google-Smtp-Source: AG47ELvFkJd9YChmUhhGq9yaGFr2bthNKQShhGpOBCU+roKskJHHUg3pL46HNU+dysseThNY/cc+ X-Received: by 10.99.96.130 with SMTP id u124mr1950533pgb.252.1520706265418; Sat, 10 Mar 2018 10:24:25 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1520706265; cv=none; d=google.com; s=arc-20160816; b=0OxrSE5b27zR9zP+WggpS8vmGn9H7RVLWsqBfJ9JIyrswyhMrmU7p7tTLYn+FZkqaW myjtQRJqsMTkr6iccr6PSEcwVl+QtvVinOcs3sdQZHGucvCC7qLKkxDfQ/x/u2ZOC6ly YqQkl2270uia9nYo45BUCDkyFDxW9X9WMnrPHWzAPOE/m5GjMV1xj5JmEr2SnLYph39g Z+YoAPxjZO+KO5dIJpyDgQCy9uiWihCKui1e3ItH2pSRbpFcV3zqfrlGO84q20TkSJ2h ys6IH9ewJOtOpnwcod7bYmbOKHdmSf7cicHDa3Ax61sDNOqeT2E42fNbRFjujxlESqIQ knKw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:dkim-signature:arc-authentication-results; bh=Irl8D3qCKsgZ0YsxM+Ed8lvL8Hr5YoNikg99Gt8qToc=; b=jmws/Kt5XyHoG54aYCXK5lpDtc3FoOZ2DI1NmNAYVUiJ8vechSYPSozQQGWG8p28Ju HefMM5tNFtl7zohaonqLIPoI29Tl1ky/21lpjaQndrpKuSew8OtaGUs7g2mhQ8xeRtc+ 1d2/FmHgPYrE6hSgTYnvU2s5Yyy9ZNlBQLJcpZ+9N9hUqx3OUAyeX5DH+Xztg7MJlFIl QdtnaKwpmMaH/GP7l94U6AE9iuCZU+3weGniH68kROtNySx6lIkVXGEZveNBQBS7QQBW J89lGBOvQh2OfTC8OtygFmUm6TBjwB3bcp8YWH/ikDWxj0YcV6Sdd8tagpVjRj+k2iTV dMSg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@eng.ucsd.edu header.s=google header.b=KqHVjDnp; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l18-v6si1635107pli.630.2018.03.10.10.24.11; Sat, 10 Mar 2018 10:24:25 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@eng.ucsd.edu header.s=google header.b=KqHVjDnp; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932977AbeCJSV6 (ORCPT + 99 others); Sat, 10 Mar 2018 13:21:58 -0500 Received: from mail-pf0-f196.google.com ([209.85.192.196]:41656 "EHLO mail-pf0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932940AbeCJSVv (ORCPT ); Sat, 10 Mar 2018 13:21:51 -0500 Received: by mail-pf0-f196.google.com with SMTP id f80so2618085pfa.8 for ; Sat, 10 Mar 2018 10:21:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eng.ucsd.edu; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Irl8D3qCKsgZ0YsxM+Ed8lvL8Hr5YoNikg99Gt8qToc=; b=KqHVjDnpN7uSLOOSGFiRJ4U/Pnbb3iRDgXBA/p9vd7t5ERyoj1ZDix62Sq8JU54sYB mQJpKOcRvLrlJIarf65TDcY1bnVk7KKiqJp6tPiBhPRJ0fW97WBnsEnCoCnbU+IHRa7V u/P3Kuk+1qzdgtCwtMllDXpWIqRXZ1KuWtMQc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Irl8D3qCKsgZ0YsxM+Ed8lvL8Hr5YoNikg99Gt8qToc=; b=ZO1RgYmcnDSGZBWmFqC6wvRGwdMOFLi9wt6VVhTEueW/aAkM6UNOFJ+PbLtejiTVVN aVwzAmvEjyRTd7wzzR2acU1v/6IXQjuN1bBwiV9kZq4KvUW/5D74htZk8M4LLC1jz8m0 by7VcbmpBbwm6/mCBiNvaO1NJI6GosbiQbmTCVz5uR7/iGZroilbd2+G9/Ln2OSdzXkq J1IsaFOmMvH/AzsrM9pz8oVSGOmdtrHzuB9lZ/qnYqFzH/KFdxfUIRTNXd0t6OcCKsHM ASfl0/D4l59EDn0UaDMNBG9DqhOpNbau0n4wZhpJN0cGQwhzto/dQ74t5LEEJNkwtht9 GyKw== X-Gm-Message-State: AElRT7E3BEqnT7jXjtzqVQ74M/yjRpvf30yNB1lDcf2LCI+JJdCSHZ+o RJcQ/cs7644J/TGgbtrHiZePRg== X-Received: by 10.98.242.6 with SMTP id m6mr2656086pfh.230.1520706110618; Sat, 10 Mar 2018 10:21:50 -0800 (PST) Received: from brienza-desktop.8.8.4.4 (andxu.ucsd.edu. [132.239.17.134]) by smtp.gmail.com with ESMTPSA id h80sm9210167pfj.181.2018.03.10.10.21.49 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 10 Mar 2018 10:21:50 -0800 (PST) From: Andiry Xu To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org Cc: dan.j.williams@intel.com, andy.rudoff@intel.com, coughlan@redhat.com, swanson@cs.ucsd.edu, david@fromorbit.com, jack@suse.com, swhiteho@redhat.com, miklos@szeredi.hu, andiry.xu@gmail.com, Andiry Xu Subject: [RFC v2 78/83] GC: Thorough garbage collection. Date: Sat, 10 Mar 2018 10:18:59 -0800 Message-Id: <1520705944-6723-79-git-send-email-jix024@eng.ucsd.edu> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu> References: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Andiry Xu After fast gc, if the valid log entries still account for less than the half of the log size, NOVA starts thorough garbage collection, allocates a new log, copies the live log entries to it, and switches to the new log atomically. The radix tree needs to be updated to point to the new log. Example: I = Invalid, V = Valid VIIV -> IIII -> VVII || || fast gc \/ VIIV -> VVII || || thorough gc \/ VVVV Signed-off-by: Andiry Xu --- fs/nova/gc.c | 273 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 273 insertions(+) diff --git a/fs/nova/gc.c b/fs/nova/gc.c index 1634c04..d74286e 100644 --- a/fs/nova/gc.c +++ b/fs/nova/gc.c @@ -18,6 +18,62 @@ #include "nova.h" #include "inode.h" +static bool curr_log_entry_invalid(struct super_block *sb, + struct nova_inode *pi, struct nova_inode_info_header *sih, + u64 curr_p, size_t *length) +{ + struct nova_file_write_entry *entry; + struct nova_dentry *dentry; + struct nova_setattr_logentry *setattr_entry; + struct nova_link_change_entry *linkc_entry; + void *entryc; + u8 type; + bool ret = true; + + entryc = (void *)nova_get_block(sb, curr_p); + type = nova_get_entry_type(entryc); + + switch (type) { + case SET_ATTR: + setattr_entry = (struct nova_setattr_logentry *) entryc; + if (setattr_entry->invalid == 0) + ret = false; + *length = sizeof(struct nova_setattr_logentry); + break; + case LINK_CHANGE: + linkc_entry = (struct nova_link_change_entry *) entryc; + if (linkc_entry->invalid == 0) + ret = false; + *length = sizeof(struct nova_link_change_entry); + break; + case FILE_WRITE: + entry = (struct nova_file_write_entry *) entryc; + if (entry->num_pages != entry->invalid_pages) + ret = false; + *length = sizeof(struct nova_file_write_entry); + break; + case DIR_LOG: + dentry = (struct nova_dentry *) entryc; + if (dentry->invalid == 0) + ret = false; + if (sih->last_dentry == curr_p) + ret = false; + *length = le16_to_cpu(dentry->de_len); + break; + case NEXT_PAGE: + /* No more entries in this page */ + *length = PAGE_SIZE - ENTRY_LOC(curr_p); + break; + default: + nova_dbg("%s: unknown type %d, 0x%llx\n", + __func__, type, curr_p); + NOVA_ASSERT(0); + *length = PAGE_SIZE - ENTRY_LOC(curr_p); + break; + } + + return ret; +} static bool curr_page_invalid(struct super_block *sb, struct nova_inode *pi, struct nova_inode_info_header *sih, @@ -68,6 +124,210 @@ static void free_curr_page(struct super_block *sb, nova_get_blocknr(sb, curr_head, btype), 1); } +static int nova_gc_assign_file_entry(struct super_block *sb, + struct nova_inode_info_header *sih, + struct nova_file_write_entry *old_entry, + struct nova_file_write_entry *new_entry) +{ + struct nova_file_write_entry *temp; + void **pentry; + unsigned long start_pgoff = old_entry->pgoff; + unsigned int num = old_entry->num_pages; + unsigned long curr_pgoff; + int i; + int ret = 0; + + for (i = 0; i < num; i++) { + curr_pgoff = start_pgoff + i; + + pentry = radix_tree_lookup_slot(&sih->tree, curr_pgoff); + if (pentry) { + temp = radix_tree_deref_slot(pentry); + if (temp == old_entry) + radix_tree_replace_slot(&sih->tree, pentry, + new_entry); + } + } + + return ret; +} + +static int nova_gc_assign_dentry(struct super_block *sb, + struct nova_inode_info_header *sih, struct nova_dentry *old_dentry, + struct nova_dentry *new_dentry) +{ + struct nova_dentry *temp; + void **pentry; + unsigned long hash; + int ret = 0; + + hash = BKDRHash(old_dentry->name, old_dentry->name_len); + nova_dbgv("%s: assign %s hash %lu\n", __func__, + old_dentry->name, hash); + + /* FIXME: hash collision ignored here */ + pentry = radix_tree_lookup_slot(&sih->tree, hash); + if (pentry) { + temp = radix_tree_deref_slot(pentry); + if (temp == old_dentry) + radix_tree_replace_slot(&sih->tree, pentry, new_dentry); + } + + return ret; +} + +static int nova_gc_assign_new_entry(struct super_block *sb, + struct nova_inode *pi, struct nova_inode_info_header *sih, + u64 curr_p, u64 new_curr) +{ + struct nova_file_write_entry *old_entry, *new_entry; + struct nova_dentry *old_dentry, *new_dentry; + void *addr, *new_addr; + u8 type; + int ret = 0; + + addr = (void *)nova_get_block(sb, curr_p); + type = nova_get_entry_type(addr); + switch (type) { + case SET_ATTR: + sih->last_setattr = new_curr; + break; + case LINK_CHANGE: + sih->last_link_change = new_curr; + break; + case FILE_WRITE: + new_addr = (void *)nova_get_block(sb, new_curr); + old_entry = (struct nova_file_write_entry *)addr; + new_entry = (struct nova_file_write_entry *)new_addr; + ret = nova_gc_assign_file_entry(sb, sih, old_entry, new_entry); + break; + case DIR_LOG: + new_addr = (void *)nova_get_block(sb, new_curr); + old_dentry = (struct nova_dentry *)addr; + new_dentry = (struct nova_dentry *)new_addr; + if (sih->last_dentry == curr_p) + sih->last_dentry = new_curr; + ret = nova_gc_assign_dentry(sb, sih, old_dentry, new_dentry); + break; + default: + nova_dbg("%s: unknown type %d, 0x%llx\n", + __func__, type, curr_p); + NOVA_ASSERT(0); + break; + } + + return ret; +} + +/* Copy live log entries to the new log and atomically replace the old log */ +static unsigned long nova_inode_log_thorough_gc(struct super_block *sb, + struct nova_inode *pi, struct nova_inode_info_header *sih, + unsigned long blocks, unsigned long checked_pages) +{ + struct nova_inode_log_page *curr_page = NULL; + size_t length; + u64 curr_p, new_curr; + u64 old_curr_p; + u64 tail_block; + u64 old_head; + u64 new_head = 0; + u64 next; + int allocated; + int extended = 0; + int ret; + timing_t gc_time; + + NOVA_START_TIMING(thorough_gc_t, gc_time); + + curr_p = sih->log_head; + old_curr_p = curr_p; + old_head = sih->log_head; + nova_dbg_verbose("Log head 0x%llx, tail 0x%llx\n", + curr_p, sih->log_tail); + if (curr_p == 0 && sih->log_tail == 0) + goto out; + + if (curr_p >> PAGE_SHIFT == sih->log_tail >> PAGE_SHIFT) + goto out; + + allocated = nova_allocate_inode_log_pages(sb, sih, blocks, + &new_head, ANY_CPU, 0); + if (allocated != blocks) { + nova_err(sb, "%s: ERROR: no inode log page available\n", + __func__); + goto out; + } + + new_curr = new_head; + while (curr_p != sih->log_tail) { + old_curr_p = curr_p; + if (goto_next_page(sb, curr_p)) + curr_p = next_log_page(sb, curr_p); + + if (curr_p >> PAGE_SHIFT == sih->log_tail >> PAGE_SHIFT) { + /* Don't recycle tail page */ + break; + } + + length = 0; + ret = curr_log_entry_invalid(sb, pi, sih, curr_p, &length); + if (!ret) { + extended = 0; + new_curr = nova_get_append_head(sb, pi, sih, + new_curr, length, MAIN_LOG, + 1, &extended); + if (extended) + blocks++; + /* Copy entry to the new log */ + memcpy_to_pmem_nocache(nova_get_block(sb, new_curr), + nova_get_block(sb, curr_p), length); + nova_inc_page_num_entries(sb, new_curr); + nova_gc_assign_new_entry(sb, pi, sih, curr_p, new_curr); + new_curr += length; + } + + curr_p += length; + } + + /* Step 1: Link new log to the tail block */ + tail_block = BLOCK_OFF(sih->log_tail); + curr_page = (struct nova_inode_log_page *)nova_get_block(sb, + BLOCK_OFF(new_curr)); + next = next_log_page(sb, new_curr); + if (next > 0) + nova_free_contiguous_log_blocks(sb, sih, next); + + nova_set_next_page_flag(sb, new_curr); + nova_set_next_page_address(sb, curr_page, tail_block, 0); + + /* Step 2: Atomically switch to the new log */ + pi->log_head = new_head; + nova_persist_inode(pi); + nova_flush_buffer(pi, sizeof(struct nova_inode), 1); + sih->log_head = new_head; + + /* Step 3: Unlink the old log */ + curr_page = (struct nova_inode_log_page *)nova_get_block(sb, + BLOCK_OFF(old_curr_p)); + next = next_log_page(sb, old_curr_p); + if (next != tail_block) + nova_err(sb, "Old log error: old curr_p 0x%lx, next 0x%lx ", + "curr_p 0x%lx, tail block 0x%lx\n", old_curr_p, + next, curr_p, tail_block); + + nova_set_next_page_address(sb, curr_page, 0, 1); + + /* Step 4: Free the old log */ + nova_free_contiguous_log_blocks(sb, sih, old_head); + + sih->log_pages = sih->log_pages + blocks - checked_pages; + NOVA_STATS_ADD(thorough_gc_pages, checked_pages - blocks); + NOVA_STATS_ADD(thorough_checked_pages, checked_pages); +out: + NOVA_END_TIMING(thorough_gc_t, gc_time); + return blocks; +} + /* * Scan pages in the log and remove those with no valid log entries. @@ -178,9 +438,22 @@ int nova_inode_log_fast_gc(struct super_block *sb, if (sih->num_entries == 0) return 0; + /* Estimate how many pages worth of valid entries the log contains. + * + * If it is less than half the number pages that remain in the log, + * compress them with thorough gc. + */ blocks = (sih->valid_entries * checked_pages) / sih->num_entries; if ((sih->valid_entries * checked_pages) % sih->num_entries) blocks++; + if (force_thorough || (blocks && blocks * 2 < checked_pages)) { + nova_dbgv("Thorough GC for inode %lu: checked pages %lu, valid pages %lu\n", + sih->ino, + checked_pages, blocks); + blocks = nova_inode_log_thorough_gc(sb, pi, sih, + blocks, checked_pages); + } + return 0; } -- 2.7.4