Date: Sun, 10 Jan 2021 16:20:08 +0000
From: Al Viro
To: Mikulas Patocka
Cc: Andrew Morton, Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny,
    Matthew Wilcox, Jan Kara, Steven Whitehouse, Eric Sandeen,
    Dave Chinner, Theodore Ts'o, Wang Jianchao, "Kani, Toshi",
    "Norton, Scott J", "Tadakamadla, Rajesh",
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org
Subject: Re: [RFC v2] nvfs: a filesystem for persistent memory
Message-ID: <20210110162008.GV3579531@ZenIV.linux.org.uk>
On Thu, Jan 07, 2021 at 08:15:41AM -0500, Mikulas Patocka wrote:

> Hi
>
> I announce a new version of NVFS - a filesystem for persistent memory.
> 	http://people.redhat.com/~mpatocka/nvfs/

Utilities, AFAICS

> 	git://leontynka.twibright.com/nvfs.git

Seems to hang on git pull at the moment...  Do you have it anywhere else?

> I found out that on NVFS, reading a file with the read method has 10%
> better performance than the read_iter method. The benchmark just reads the
> same 4k page over and over again - and the cost of creating and parsing
> the kiocb and iov_iter structures is just that high.

Apples and oranges...  What happens if you take

ssize_t read_iter_locked(struct file *file, struct iov_iter *to, loff_t *ppos)
{
	struct inode *inode = file_inode(file);
	struct nvfs_memory_inode *nmi = i_to_nmi(inode);
	struct nvfs_superblock *nvs = inode->i_sb->s_fs_info;
	ssize_t total = 0;
	loff_t pos = *ppos;
	int r;
	int shift = nvs->log2_page_size;
	size_t i_size;

	i_size = inode->i_size;
	if (pos >= i_size)
		return 0;
	/* don't read past EOF */
	iov_iter_truncate(to, i_size - pos);

	while (iov_iter_count(to)) {
		void *blk, *ptr;
		size_t page_mask = (1UL << shift) - 1;
		unsigned page_offset = pos & page_mask;
		unsigned prealloc = (iov_iter_count(to) + page_mask) >> shift;
		unsigned size;

		blk = nvfs_bmap(nmi, pos >> shift, &prealloc, NULL, NULL, NULL);
		if (unlikely(IS_ERR(blk))) {
			r = PTR_ERR(blk);
			goto ret_r;
		}

		size = ((size_t)prealloc << shift) - page_offset;
		ptr = blk + page_offset;
		if (unlikely(!blk)) {
			/* hole - feed zeroes, one page at a time */
			size = min(size, (unsigned)PAGE_SIZE);
			ptr = empty_zero_page;
		}

		/* copy_to_iter() advances the iterator and returns the
		 * number of bytes actually copied */
		size = copy_to_iter(ptr, size, to);
		if (unlikely(!size)) {
			r = -EFAULT;
			goto ret_r;
		}

		pos += size;
		total += size;
	}

	r = 0;

ret_r:
	*ppos = pos;
	if (file)
		file_accessed(file);
	return total ? total : r;
}

and use that instead of your nvfs_rw_iter_locked() in your ->read_iter() for
the DAX read case?  Then the same with s/copy_to_iter/_copy_to_iter/, to see
how much of that is "hardening" overhead.

Incidentally, what's the point of sharing nvfs_rw_iter() for read and write
cases?  They have practically no overlap - count the lines common to the wr
and !wr cases.  And if you do the same in nvfs_rw_iter_locked(), you'll see
that the shared parts _there_ are bloody pointless on the read side.

Not that it had been more useful on the write side, really, but that's
another story (nvfs_write_pages() handling of copyin is... interesting).
Let's figure out what's going on with the read overhead first...

lib/iov_iter.c primitives certainly could use massage for better code
generation, but let's find out how much of the PITA is due to those and how
much comes from you fighting the damn thing instead of using it sanely...
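
For reference on the setup cost the quoted benchmark is measuring: when a
file has no ->read method, a plain read(2) builds a kiocb and an iov_iter on
the stack before calling the filesystem's ->read_iter.  This is a trimmed
approximation of new_sync_read() in fs/read_write.c for the 5.10-era kernels
this thread concerns; details vary between versions:

static ssize_t new_sync_read(struct file *filp, char __user *buf,
			     size_t len, loff_t *ppos)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct kiocb kiocb;
	struct iov_iter iter;
	ssize_t ret;

	/* both structures are filled in on every read() call */
	init_sync_kiocb(&kiocb, filp);
	kiocb.ki_pos = (ppos ? *ppos : 0);
	iov_iter_init(&iter, READ, &iov, 1, len);

	ret = call_read_iter(filp, &kiocb, &iter);
	if (ppos)
		*ppos = kiocb.ki_pos;
	return ret;
}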
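
A minimal sketch of what plugging the helper above into ->read_iter for the
DAX read case could look like.  nvfs_file_read_iter(), the IS_DAX() fallback
and the shared-inode-lock choice are illustrative assumptions here, not taken
from the NVFS tree:

static ssize_t nvfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
	struct file *file = iocb->ki_filp;
	struct inode *inode = file_inode(file);
	ssize_t ret;

	if (!IS_DAX(inode))
		return generic_file_read_iter(iocb, to);

	/* assumption: reads only need the size/block map kept stable */
	inode_lock_shared(inode);
	ret = read_iter_locked(file, to, &iocb->ki_pos);
	inode_unlock_shared(inode);

	return ret;
}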
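
The suggested hardening comparison is a one-line swap inside the copy loop:
_copy_to_iter() takes the same arguments as copy_to_iter() but skips the
hardened-usercopy object checks, so the delta between the two runs is the
cost of that checking:

	/* default: with CONFIG_HARDENED_USERCOPY checks on the source object */
	size = copy_to_iter(ptr, size, to);

	/* comparison run only: same copy, minus the hardening checks */
	size = _copy_to_iter(ptr, size, to);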