From: Andiry Xu
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org
Cc: dan.j.williams@intel.com, andy.rudoff@intel.com, coughlan@redhat.com, swanson@cs.ucsd.edu, david@fromorbit.com, jack@suse.com, swhiteho@redhat.com, miklos@szeredi.hu, andiry.xu@gmail.com, Andiry Xu
Subject: [RFC v2 05/83] Add NOVA filesystem definitions and useful helper routines.
Date: Sat, 10 Mar 2018 10:17:46 -0800
Message-Id: <1520705944-6723-6-git-send-email-jix024@eng.ucsd.edu>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
References: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Andiry Xu

NOVA stores offsets rather than absolute addresses in pmem.
nova_get_block() and nova_get_addr_off() translate between these
two kinds of addresses.

Signed-off-by: Andiry Xu
---
 fs/nova/nova.h | 299 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 299 insertions(+)
 create mode 100644 fs/nova/nova.h

diff --git a/fs/nova/nova.h b/fs/nova/nova.h
new file mode 100644
index 0000000..5eb696c
--- /dev/null
+++ b/fs/nova/nova.h
@@ -0,0 +1,299 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Definitions for the NOVA filesystem.
+ *
+ * Copyright 2015-2016 Regents of the University of California,
+ * UCSD Non-Volatile Systems Lab, Andiry Xu
+ * Copyright 2012-2013 Intel Corporation
+ * Copyright 2009-2011 Marco Stornelli
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc., Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+#ifndef __NOVA_H
+#define __NOVA_H
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "nova_def.h"
+
+#define PAGE_SHIFT_2M 21
+#define PAGE_SHIFT_1G 30
+
+
+/*
+ * Debug code
+ */
+#ifdef pr_fmt
+#undef pr_fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#endif
+
+/* #define nova_dbg(s, args...)		pr_debug(s, ## args) */
+#define nova_dbg(s, args ...)		pr_info(s, ## args)
+#define nova_err(sb, s, args ...)	nova_error_mng(sb, s, ## args)
+#define nova_warn(s, args ...)		pr_warn(s, ## args)
+#define nova_info(s, args ...)		pr_info(s, ## args)
+
+extern unsigned int nova_dbgmask;
+#define NOVA_DBGMASK_MMAPHUGE		(0x00000001)
+#define NOVA_DBGMASK_MMAP4K		(0x00000002)
+#define NOVA_DBGMASK_MMAPVERBOSE	(0x00000004)
+#define NOVA_DBGMASK_MMAPVVERBOSE	(0x00000008)
+#define NOVA_DBGMASK_VERBOSE		(0x00000010)
+#define NOVA_DBGMASK_TRANSACTION	(0x00000020)
+
+#define nova_dbg_mmap4k(s, args ...) \
+	((nova_dbgmask & NOVA_DBGMASK_MMAP4K) ? nova_dbg(s, ##args) : 0)
+#define nova_dbg_mmapv(s, args ...) \
+	((nova_dbgmask & NOVA_DBGMASK_MMAPVERBOSE) ? nova_dbg(s, ##args) : 0)
+#define nova_dbg_mmapvv(s, args ...) \
+	((nova_dbgmask & NOVA_DBGMASK_MMAPVVERBOSE) ? nova_dbg(s, ##args) : 0)
+
+#define nova_dbg_verbose(s, args ...) \
+	((nova_dbgmask & NOVA_DBGMASK_VERBOSE) ? nova_dbg(s, ##args) : 0)
+#define nova_dbgv(s, args ...)	nova_dbg_verbose(s, ##args)
+#define nova_dbg_trans(s, args ...) \
+	((nova_dbgmask & NOVA_DBGMASK_TRANSACTION) ? nova_dbg(s, ##args) : 0)
+
+#define NOVA_ASSERT(x) do {\
+		if (!(x))\
+			nova_warn("assertion failed %s:%d: %s\n", \
+				__FILE__, __LINE__, #x);\
+	} while (0)
+
+#define nova_set_bit			__test_and_set_bit_le
+#define nova_clear_bit			__test_and_clear_bit_le
+#define nova_find_next_zero_bit		find_next_zero_bit_le
+
+#define clear_opt(o, opt)	(o &= ~NOVA_MOUNT_ ## opt)
+#define set_opt(o, opt)		(o |= NOVA_MOUNT_ ## opt)
+#define test_opt(sb, opt)	(NOVA_SB(sb)->s_mount_opt & NOVA_MOUNT_ ## opt)
+
+#define NOVA_LARGE_INODE_TABLE_SIZE	(0x200000)
+/* NOVA size threshold for using 2M blocks for inode table */
+#define NOVA_LARGE_INODE_TABLE_THREASHOLD	(0x20000000)
+/*
+ * nova inode flags
+ *
+ * NOVA_EOFBLOCKS_FL	There are blocks allocated beyond eof
+ */
+#define NOVA_EOFBLOCKS_FL	0x20000000
+/* Flags that should be inherited by new inodes from their parent. */
+#define NOVA_FL_INHERITED (FS_SECRM_FL | FS_UNRM_FL | FS_COMPR_FL | \
+			   FS_SYNC_FL | FS_NODUMP_FL | FS_NOATIME_FL | \
+			   FS_COMPRBLK_FL | FS_NOCOMP_FL | \
+			   FS_JOURNAL_DATA_FL | FS_NOTAIL_FL | FS_DIRSYNC_FL)
+/* Flags that are appropriate for regular files (all but dir-specific ones). */
+#define NOVA_REG_FLMASK (~(FS_DIRSYNC_FL | FS_TOPDIR_FL))
+/* Flags that are appropriate for non-directories/regular files. */
+#define NOVA_OTHER_FLMASK (FS_NODUMP_FL | FS_NOATIME_FL)
+#define NOVA_FL_USER_VISIBLE (FS_FL_USER_VISIBLE | NOVA_EOFBLOCKS_FL)
+
+/* IOCTLs */
+#define NOVA_PRINT_TIMING		0xBCD00010
+#define NOVA_CLEAR_STATS		0xBCD00011
+#define NOVA_PRINT_LOG			0xBCD00013
+#define NOVA_PRINT_LOG_BLOCKNODE	0xBCD00014
+#define NOVA_PRINT_LOG_PAGES		0xBCD00015
+#define NOVA_PRINT_FREE_LISTS		0xBCD00018
+
+
+#define READDIR_END	(ULONG_MAX)
+#define ANY_CPU		(65536)
+#define FREE_BATCH	(16)
+
+extern unsigned int blk_type_to_shift[NOVA_BLOCK_TYPE_MAX];
+extern unsigned int blk_type_to_size[NOVA_BLOCK_TYPE_MAX];
+
+
+/* Mask out flags that are inappropriate for the given type of inode. */
+static inline __le32 nova_mask_flags(umode_t mode, __le32 flags)
+{
+	flags &= cpu_to_le32(NOVA_FL_INHERITED);
+	if (S_ISDIR(mode))
+		return flags;
+	else if (S_ISREG(mode))
+		return flags & cpu_to_le32(NOVA_REG_FLMASK);
+	else
+		return flags & cpu_to_le32(NOVA_OTHER_FLMASK);
+}
+
+static inline u32 nova_crc32c(u32 crc, const u8 *data, size_t len)
+{
+	u8 *ptr = (u8 *) data;
+	u64 acc = crc; /* accumulator, crc32c value in lower 32b */
+	u32 csum;
+
+	/* x86 instruction crc32 is part of SSE-4.2 */
+	if (static_cpu_has(X86_FEATURE_XMM4_2)) {
+		/* This inline assembly implementation should be equivalent
+		 * to the kernel's crc32c_intel_le_hw() function used by
+		 * crc32c(), but this performs better on test machines.
+		 */
+		while (len > 8) {
+			asm volatile(/* 64b quad words */
+				"crc32q (%1), %0"
+				: "=r" (acc)
+				: "r" (ptr), "0" (acc)
+			);
+			ptr += 8;
+			len -= 8;
+		}
+
+		while (len > 0) {
+			asm volatile(/* trailing bytes */
+				"crc32b (%1), %0"
+				: "=r" (acc)
+				: "r" (ptr), "0" (acc)
+			);
+			ptr++;
+			len--;
+		}
+
+		csum = (u32) acc;
+	} else {
+		/* The kernel's crc32c() function should also detect and use the
+		 * crc32 instruction of SSE-4.2. But calling in to this function
+		 * is about 3x to 5x slower than the inline assembly version on
+		 * some test machines.
+		 */
+		csum = crc32c(crc, data, len);
+	}
+
+	return csum;
+}
+
+static inline int memcpy_to_pmem_nocache(void *dst, const void *src,
+	unsigned int size)
+{
+	int ret;
+
+	ret = __copy_from_user_inatomic_nocache(dst, src, size);
+
+	return ret;
+}
+
+
+/* assumes the length to be 4-byte aligned */
+static inline void memset_nt(void *dest, uint32_t dword, size_t length)
+{
+	uint64_t dummy1, dummy2;
+	uint64_t qword = ((uint64_t)dword << 32) | dword;
+
+	asm volatile ("movl %%edx,%%ecx\n"
+		"andl $63,%%edx\n"
+		"shrl $6,%%ecx\n"
+		"jz 9f\n"
+		"1:	 movnti %%rax,(%%rdi)\n"
+		"2:	 movnti %%rax,1*8(%%rdi)\n"
+		"3:	 movnti %%rax,2*8(%%rdi)\n"
+		"4:	 movnti %%rax,3*8(%%rdi)\n"
+		"5:	 movnti %%rax,4*8(%%rdi)\n"
+		"6:	 movnti %%rax,5*8(%%rdi)\n"
+		"7:	 movnti %%rax,6*8(%%rdi)\n"
+		"8:	 movnti %%rax,7*8(%%rdi)\n"
+		"leaq 64(%%rdi),%%rdi\n"
+		"decl %%ecx\n"
+		"jnz 1b\n"
+		"9:	movl %%edx,%%ecx\n"
+		"andl $7,%%edx\n"
+		"shrl $3,%%ecx\n"
+		"jz 11f\n"
+		"10:	 movnti %%rax,(%%rdi)\n"
+		"leaq 8(%%rdi),%%rdi\n"
+		"decl %%ecx\n"
+		"jnz 10b\n"
+		"11:	 movl %%edx,%%ecx\n"
+		"shrl $2,%%ecx\n"
+		"jz 12f\n"
+		"movnti %%eax,(%%rdi)\n"
+		"12:\n"
+		: "=D"(dummy1), "=d" (dummy2)
+		: "D" (dest), "a" (qword), "d" (length)
+		: "memory", "rcx");
+}
+
+
+#include "super.h" // Remove when we factor out these and other functions.
+
+/* Translate an offset from the beginning of the NOVA instance to a PMEM address.
+ *
+ * If this is part of a read-modify-write of the block,
+ * call nova_memunlock_block() before calling!
+ */
+static inline void *nova_get_block(struct super_block *sb, u64 block)
+{
+	struct nova_super_block *ps = nova_get_super(sb);
+
+	return block ? ((void *)ps + block) : NULL;
+}
+
+static inline int nova_get_reference(struct super_block *sb, u64 block,
+	void *dram, void **nvmm, size_t size)
+{
+	int rc;
+
+	*nvmm = nova_get_block(sb, block);
+	rc = memcpy_mcsafe(dram, *nvmm, size);
+	return rc;
+}
+
+
+static inline u64
+nova_get_addr_off(struct nova_sb_info *sbi, void *addr)
+{
+	NOVA_ASSERT((addr >= sbi->virt_addr) &&
+			(addr < (sbi->virt_addr + sbi->initsize)));
+	return (u64)(addr - sbi->virt_addr);
+}
+
+static inline u64
+nova_get_block_off(struct super_block *sb, unsigned long blocknr,
+	unsigned short btype)
+{
+	return (u64)blocknr << PAGE_SHIFT;
+}
+
+
+static inline u64 nova_get_epoch_id(struct super_block *sb)
+{
+	struct nova_sb_info *sbi = NOVA_SB(sb);
+
+	return sbi->s_epoch_id;
+}
+
+#include "inode.h"
+#endif /* __NOVA_H */
-- 
2.7.4
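
For readers following the offset/address translation described in the commit message, here is a minimal userspace sketch of the same idea. It is not NOVA code: struct pmem_region, region_get_block() and region_get_addr_off() are hypothetical names invented for illustration, an ordinary buffer stands in for the pmem mapping, and the upper-bound check in region_get_block() is an addition that nova_get_block() in the patch does not perform. The sketch also uses assert(), which aborts, whereas NOVA_ASSERT only prints a warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the mapped pmem region; not a NOVA structure. */
struct pmem_region {
	unsigned char *virt_addr;	/* base address of the mapping */
	size_t size;			/* length of the mapping in bytes */
};

/*
 * Offset -> pointer. Offset 0 means "no block", mirroring the NULL
 * return of nova_get_block(). The range check is an assumption added
 * for this sketch only.
 */
static void *region_get_block(const struct pmem_region *r, uint64_t off)
{
	if (off == 0 || off >= r->size)
		return NULL;
	return r->virt_addr + off;
}

/* Pointer -> offset, mirroring nova_get_addr_off(). */
static uint64_t region_get_addr_off(const struct pmem_region *r,
				    const void *addr)
{
	const unsigned char *p = addr;

	/* Unlike NOVA_ASSERT, assert() aborts on failure. */
	assert(p >= r->virt_addr && p < r->virt_addr + r->size);
	return (uint64_t)(p - r->virt_addr);
}
```

Storing offsets instead of pointers keeps on-media structures valid even if the region is mapped at a different virtual address on the next mount; only the base changes, so a round trip through the two helpers returns the original offset.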