Date: Wed, 14 Mar 2018 21:54:01 -0700
From: "Darrick J. Wong"
To: Andiry Xu
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-nvdimm@lists.01.org, dan.j.williams@intel.com,
    andy.rudoff@intel.com, coughlan@redhat.com, swanson@cs.ucsd.edu,
    david@fromorbit.com, jack@suse.com, swhiteho@redhat.com,
    miklos@szeredi.hu, andiry.xu@gmail.com, Andiry Xu
Subject: Re: [RFC v2 03/83] Add super.h.
Message-ID: <20180315045401.GB4860@magnolia>
References: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
    <1520705944-6723-4-git-send-email-jix024@eng.ucsd.edu>
In-Reply-To: <1520705944-6723-4-git-send-email-jix024@eng.ucsd.edu>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Mar 10, 2018 at 10:17:44AM -0800, Andiry Xu wrote:
> From: Andiry Xu
>
> This header file defines NOVA persistent and volatile superblock
> data structures.
>
> It also defines NOVA block layout:
>
> Page 0: Superblock
> Page 1: Reserved inodes
> Page 2 - 15: Reserved
> Page 16 - 31: Inode table pointers
> Page 32 - 47: Journal address pointers
> Page 48 - 63: Reserved
> Pages n-2: Replicate reserved inodes
> Pages n-1: Replicate superblock
>
> Other pages are for normal inodes, logs and data.
>
> Signed-off-by: Andiry Xu
> ---
>  fs/nova/super.h | 149 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 149 insertions(+)
>  create mode 100644 fs/nova/super.h
>
> diff --git a/fs/nova/super.h b/fs/nova/super.h
> new file mode 100644
> index 0000000..cb53908
> --- /dev/null
> +++ b/fs/nova/super.h
> @@ -0,0 +1,149 @@
> +#ifndef __SUPER_H
> +#define __SUPER_H
> +/*
> + * Structure of the NOVA super block in PMEM
> + *
> + * The fields are partitioned into static and dynamic fields. The static fields
> + * never change after file system creation. This was primarily done because
> + * nova_get_block() returns NULL if the block offset is 0 (helps in catching
> + * bugs). So if we modify any field using journaling (for consistency), we
> + * will have to modify s_sum which is at offset 0. So journaling code fails.
> + * This (static+dynamic fields) is a temporary solution and can be avoided
> + * once the file system becomes stable and nova_get_block() returns correct
> + * pointers even for offset 0.
> + */
> +struct nova_super_block {
> +	/* static fields. they never change after file system creation.
> +	 * checksum only validates up to s_start_dynamic field below
> +	 */
> +	__le32 s_sum; /* checksum of this sb */
> +	__le32 s_magic; /* magic signature */
> +	__le32 s_padding32;
> +	__le32 s_blocksize; /* blocksize in bytes */
> +	__le64 s_size; /* total size of fs in bytes */
> +	char s_volume_name[16]; /* volume name */
> +
> +	/* all the dynamic fields should go here */
> +	__le64 s_epoch_id; /* Epoch ID */
> +
> +	/* s_mtime and s_wtime should be together and their order should not be
> +	 * changed. we use an 8 byte write to update both of them atomically
> +	 */
> +	__le32 s_mtime; /* mount time */
> +	__le32 s_wtime; /* write time */

Hmmm, 32-bit timestamps? 2038 isn't that far away...

> +} __attribute((__packed__));
> +
> +#define NOVA_SB_SIZE 512 /* must be power of two */
> +
> +/* ======================= Reserved blocks ========================= */
> +
> +/*
> + * Page 0 contains super blocks;
> + * Page 1 contains reserved inodes;
> + * Page 2 - 15 are reserved.
> + * Page 16 - 31 contain pointers to inode tables.
> + * Page 32 - 47 contain pointers to journal pages.
> + */
> +#define HEAD_RESERVED_BLOCKS 64
> +#define NUM_JOURNAL_PAGES 16
> +
> +#define SUPER_BLOCK_START 0 // Superblock
> +#define RESERVE_INODE_START 1 // Reserved inodes
> +#define INODE_TABLE_START 16 // inode table pointers
> +#define JOURNAL_START 32 // journal pointer table
> +
> +/* For replica super block and replica reserved inodes */
> +#define TAIL_RESERVED_BLOCKS 2
> +
> +/* ======================= Reserved inodes ========================= */
> +
> +/* We have space for 31 reserved inodes */
> +#define NOVA_ROOT_INO (1)
> +#define NOVA_INODETABLE_INO (2) /* Fake inode associated with inode
> +				 * storage. We need this because our
> +				 * allocator requires inode to be
> +				 * associated with each allocation.
> +				 * The data actually lives in linked
> +				 * lists in INODE_TABLE_START. */
> +#define NOVA_BLOCKNODE_INO (3) /* Storage for allocator state */
> +#define NOVA_LITEJOURNAL_INO (4) /* Storage for lightweight journals */
> +#define NOVA_INODELIST_INO (5) /* Storage for Inode free list */
> +
> +
> +/* Normal inode starts at 32 */
> +#define NOVA_NORMAL_INODE_START (32)

I've been wondering this whole time, why not make the inode number the
byte offset into the pmem?  Then you don't have to lose the last 8 bytes
of each inode block to point to the next one.
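
To make the byte-offset idea concrete, here is a rough, untested sketch.
Every name in it, and the 128-byte inode size, is made up for
illustration; none of it comes from the patch. It only shows that lookup
becomes plain pointer arithmetic once the inode number is the offset from
the pmem base.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed on-media inode size; illustrative only, not taken from the patch. */
#define SKETCH_INODE_SIZE 128u

/* Hypothetical helper: the inode number is the byte offset from the pmem base. */
static inline void *sketch_ino_to_addr(void *pmem_base, uint64_t ino)
{
	return (unsigned char *)pmem_base + ino;
}

/* Hypothetical inverse: recover the inode number from an in-pmem address. */
static inline uint64_t sketch_addr_to_ino(const void *pmem_base,
					  const void *inode)
{
	return (uint64_t)((const unsigned char *)inode -
			  (const unsigned char *)pmem_base);
}

/* A plausible inode number is inode-aligned and lies fully inside the device. */
static inline int sketch_ino_valid(uint64_t ino, uint64_t pmem_size)
{
	return pmem_size >= SKETCH_INODE_SIZE &&
	       ino % SKETCH_INODE_SIZE == 0 &&
	       ino <= pmem_size - SKETCH_INODE_SIZE;
}
```

With a scheme like this a caller would validate the number it was handed
before dereferencing it, e.g. sketch_ino_valid(ino, size) followed by
sketch_ino_to_addr(base, ino). Offset 0 would still need special
handling, since the comment in the patch says nova_get_block() returns
NULL for it.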
--D

> +
> +
> +
> +/*
> + * NOVA super-block data in DRAM
> + */
> +struct nova_sb_info {
> +	struct super_block *sb; /* VFS super block */
> +	struct nova_super_block *nova_sb; /* DRAM copy of SB */
> +	struct block_device *s_bdev;
> +	struct dax_device *s_dax_dev;
> +
> +	/*
> +	 * base physical and virtual address of NOVA (which is also
> +	 * the pointer to the super block)
> +	 */
> +	phys_addr_t phys_addr;
> +	void *virt_addr;
> +	void *replica_reserved_inodes_addr;
> +	void *replica_sb_addr;
> +
> +	unsigned long num_blocks;
> +
> +	/* Mount options */
> +	unsigned long bpi;
> +	unsigned long blocksize;
> +	unsigned long initsize;
> +	unsigned long s_mount_opt;
> +	kuid_t uid; /* Mount uid for root directory */
> +	kgid_t gid; /* Mount gid for root directory */
> +	umode_t mode; /* Mount mode for root directory */
> +	atomic_t next_generation;
> +	/* inode tracking */
> +	unsigned long s_inodes_used_count;
> +	unsigned long head_reserved_blocks;
> +	unsigned long tail_reserved_blocks;
> +
> +	struct mutex s_lock; /* protects the SB's buffer-head */
> +
> +	int cpus;
> +
> +	/* Current epoch. volatile guarantees visibility */
> +	volatile u64 s_epoch_id;
> +
> +	/* ZEROED page for cache page initialized */
> +	void *zeroed_page;
> +};
> +
> +static inline struct nova_sb_info *NOVA_SB(struct super_block *sb)
> +{
> +	return sb->s_fs_info;
> +}
> +
> +static inline struct nova_super_block
> +*nova_get_redund_super(struct super_block *sb)
> +{
> +	struct nova_sb_info *sbi = NOVA_SB(sb);
> +
> +	return (struct nova_super_block *)(sbi->replica_sb_addr);
> +}
> +
> +
> +/* If this is part of a read-modify-write of the super block,
> + * nova_memunlock_super() before calling!
> + */
> +static inline struct nova_super_block *nova_get_super(struct super_block *sb)
> +{
> +	struct nova_sb_info *sbi = NOVA_SB(sb);
> +
> +	return (struct nova_super_block *)sbi->virt_addr;
> +}
> +
> +extern void nova_error_mng(struct super_block *sb, const char *fmt, ...);
> +
> +#endif
> --
> 2.7.4
>
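
On the 32-bit timestamp point above, a small sketch of where the limit
bites and what a widened pair might look like. The helper names and the
struct are hypothetical and use plain C types rather than the kernel's
__le32/__le64; this is only meant to illustrate the concern, not to
propose the final layout.

```c
#include <stdint.h>

/* Returns 1 if a 64-bit seconds value still fits a signed 32-bit field. */
static inline int sketch_fits_time32(int64_t secs)
{
	return secs >= INT32_MIN && secs <= INT32_MAX;
}

/* What a 32-bit field ends up holding: only the low 32 bits of the value. */
static inline uint32_t sketch_store_time32(int64_t secs)
{
	return (uint32_t)secs;
}

/* Hypothetical widened layout: 64-bit mount and write times. */
struct sketch_sb_times {
	uint64_t s_mtime;	/* mount time, seconds */
	uint64_t s_wtime;	/* write time, seconds */
};
```

INT32_MAX seconds after the epoch is 2038-01-19 03:14:07 UTC; one second
later no longer fits a signed 32-bit counter. Note that the widened pair
is 16 bytes, so the single 8-byte store that the patch relies on to
update s_mtime and s_wtime together would no longer cover both fields and
the update scheme would need rethinking.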