Date: Tue, 15 Sep 2020 08:34:41 -0400 (EDT)
From: Mikulas Patocka
To: Linus Torvalds, Alexander Viro, Andrew Morton, Dan Williams,
    Vishal Verma, Dave Jiang, Ira Weiny, Matthew Wilcox, Jan Kara,
    Eric Sandeen, Dave Chinner, "Kani, Toshi", "Norton, Scott J",
    "Tadakamadla, Rajesh (DCIG/CDI/HPS Perf)"
cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org
Subject: [RFC] nvfs: a filesystem for persistent memory

Hi

I am developing a new filesystem suitable for persistent memory - nvfs.
The goal is to have a small and fast filesystem that can be used on
DAX-based devices. Nvfs maps the whole device into a linear address space
and completely bypasses the overhead of the block layer and the buffer
cache.

In the past, there was the nova filesystem for pmem, but it was abandoned
a year ago (the last version is for kernel 5.1 -
https://github.com/NVSL/linux-nova ). Nvfs is smaller and performs better.
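
As a rough user-space analogy for the direct mapping described above (this
is not nvfs code; nvfs does its mapping inside the kernel), the following
Python sketch maps a file and modifies it with a plain memory store instead
of a write() call. The helper names and the temporary file are assumptions
made for illustration. With an ordinary temporary file the store still goes
through the page cache; only on a DAX-capable filesystem would the mapping
reach the persistent memory itself.

```python
import mmap
import os
import tempfile


def store_via_mapping(path, offset, data):
    """Store `data` at `offset` through a shared mapping of the whole file."""
    with open(path, "r+b") as f:
        # Length 0 maps the entire file; ACCESS_WRITE gives a shared,
        # write-through mapping, so stores reach the underlying file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
            m[offset:offset + len(data)] = data  # plain store, no write()
            m.flush()                            # push the change out (msync)


def demo():
    """Hypothetical demo: write through the mapping, read back with read()."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 4096)   # the file must be non-empty to be mapped
        os.close(fd)
        store_via_mapping(path, 128, b"nvfs")
        with open(path, "rb") as f:
            f.seek(128)
            return f.read(4)
    finally:
        os.unlink(path)
```

Calling demo() returns the four bytes that were stored through the mapping,
read back through the regular file interface.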

The design of nvfs is similar to ext2/ext4, so that it fits into the VFS
layer naturally, without too much glue code.

I'd like to ask you to review it.


tarballs:
	http://people.redhat.com/~mpatocka/nvfs/
git:
	git://leontynka.twibright.com/nvfs.git
the description of filesystem internals:
	http://people.redhat.com/~mpatocka/nvfs/INTERNALS
benchmarks:
	http://people.redhat.com/~mpatocka/nvfs/BENCHMARKS


TODO:

- programs run approximately 4% slower when running from Optane-based
  persistent memory. Therefore, programs and libraries should use the page
  cache and not a DAX mapping.

- when the fsck.nvfs tool mmaps the device /dev/pmem0, the kernel uses the
  buffer cache for the mapping. The buffer cache slows down fsck by a
  factor of 5 to 10. Would it be possible to change the kernel so that it
  maps DAX-based block devices directly?

- __copy_from_user_inatomic_nocache doesn't flush the cache for leading
  and trailing bytes.

Mikulas
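
To make the last TODO item concrete: if a copy uses cache-bypassing stores
only for the aligned middle of the destination, the unaligned leading and
trailing bytes stay in the CPU cache and would need an explicit flush. The
sketch below does only the address arithmetic. The 8-byte alignment and the
function name are assumptions made for illustration, not taken from the
kernel source.

```python
def split_unaligned(dst, length, align=8):
    """Split [dst, dst+length) into (head, body, tail) half-open ranges.

    `body` is the part made of whole `align`-byte units; `head` and `tail`
    are the unaligned leftovers before and after it. Any range may be empty.
    """
    end = dst + length
    # Round the start up and the end down to the alignment boundary,
    # clamping so that very short copies yield an empty body.
    body_start = min(-(-dst // align) * align, end)
    body_end = max(end // align * align, body_start)
    return (dst, body_start), (body_start, body_end), (body_end, end)


def ranges_needing_flush(dst, length, align=8):
    """Return the non-empty head/tail ranges, i.e. the bytes left in cache."""
    head, _body, tail = split_unaligned(dst, length, align)
    return [r for r in (head, tail) if r[0] < r[1]]
```

For example, a 20-byte copy to address 3 has an aligned body of 8..16, with
3..8 and 16..23 as the leading and trailing pieces that would still need a
flush under this assumption.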