From: Jan Ziak <0xe2.0x9a.0x9b@gmail.com>
Date: Mon, 6 Jul 2020 08:07:46 +0200
Subject: Re: [PATCH 0/3] readfile(2): a new syscall to make open/read/close faster
To: Greg KH
Cc: Matthew Wilcox, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-man@vger.kernel.org, mtk.manpages@gmail.com, shuah@kernel.org, viro@zeniv.linux.org.uk
In-Reply-To: <20200705115851.GB1227929@kroah.com>
References: <20200705021631.GR25523@casper.infradead.org> <20200705031208.GS25523@casper.infradead.org> <20200705032732.GT25523@casper.infradead.org> <20200705115851.GB1227929@kroah.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Jul 5, 2020 at 1:58 PM Greg KH wrote:
>
> On Sun, Jul 05, 2020 at 06:09:03AM +0200, Jan Ziak wrote:
> > On Sun, Jul 5, 2020 at 5:27 AM Matthew Wilcox wrote:
> > >
> > > On Sun, Jul 05, 2020 at 05:18:58AM +0200, Jan Ziak wrote:
> > > > On Sun, Jul 5, 2020 at 5:12 AM Matthew Wilcox wrote:
> > > > >
> > > > > You should probably take a look at io_uring. That has the level of
> > > > > complexity of this proposal and supports open/read/close along with many
> > > > > other opcodes.
> > > >
> > > > Then glibc can implement readfile using io_uring and there is no need
> > > > for a new single-file readfile syscall.
> > >
> > > It could, sure. But there's also a value in having a simple interface
> > > to accomplish a simple task. Your proposed API added a very complex
> > > interface to satisfy needs that clearly aren't part of the problem space
> > > that Greg is looking to address.
> >
> > I believe that we should look at the single-file readfile syscall from
> > a performance viewpoint. If an application is expecting to read a
> > couple of small/medium-size files per second, then neither readfile
> > nor readfiles makes sense in terms of improving performance. The
> > benefits start to show up only in case an application is expecting to
> > read at least a hundred of files per second. The "per second" part is
> > important, it cannot be left out. Because readfile only improves
> > performance for many-file reads, the syscall that applications
> > performing many-file reads actually want is the multi-file version,
> > not the single-file version.
>
> It also is a measurable increase over reading just a single file.
> Here's my really really fast AMD system doing just one call to readfile
> vs. one call sequence to open/read/close:
>
> $ ./readfile_speed -l 1
> Running readfile test on file /sys/devices/system/cpu/vulnerabilities/meltdown for 1 loops...
> Took 3410 ns
> Running open/read/close test on file /sys/devices/system/cpu/vulnerabilities/meltdown for 1 loops...
> Took 3780 ns
>
> 370ns isn't all that much, yes, but it is 370ns that could have been
> used for something else :)

I am curious how you amortized or accounted for the fact that
readfile() first needs the dirfd to be opened and later closed.

From a performance viewpoint, only code that calls readfile() multiple
times within a loop makes sense:

dirfd = open();
for(...) {
  readfile(dirfd, ...);
}
close(dirfd);

> Look at the overhead these days of a syscall using something like perf
> to see just how bad things have gotten on Intel-based systems (above was
> AMD which doesn't suffer all the syscall slowdowns, only some).
>
> I'm going to have to now dig up my old rpi to get the stats on that
> thing, as well as some Intel boxes to show the problem I'm trying to
> help out with here. I'll post that for the next round of this patch
> series.
>
> > I am not sure I understand why you think that a pointer to an array of
> > readfile_t structures is very complex. If it was very complex then it
> > would be a deep tree or a large graph.
>
> Of course you can make it more complex if you want, but look at the
> existing tools that currently do many open/read/close sequences. The
> apis there don't lend themselves very well to knowing the larger list of
> files ahead of time. But I could be looking at the wrong thing, what
> userspace programs are you thinking of that could be easily converted
> into using something like this?

Perhaps passing multiple filenames to tools on the command line is a
valid and quite general use case in which it is known ahead of time
that multiple files are going to be read, such as "gcc *.o", which is
commonly used to link shared libraries and executables.

In the case of "gcc *.o", however, some of the object files are likely
to be cached in memory and thus unlikely to need fetching from HDD/SSD.
The case where we could see a speedup (if gcc were to use the
multi-file readfiles() syscall) is therefore when the programmer or
Makefile invokes "gcc *.o" after rebuilding a small subset of the
object files, while the object files that did not have to be rebuilt
are stored only on HDD/SSD. Basically, this means the first use of a
project's Makefile on a particular day.
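
To make the dirfd amortization point concrete, here is a minimal sketch in plain C. It is not code from the patch series: readfile(2) is only a proposal, so readfile_emul() is a hypothetical userspace stand-in built from openat/read/close, and read_many() is an illustrative helper name. The only thing it is meant to show is that the directory is opened and closed once, outside the loop, so that cost is spread over all the per-file reads.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumption: glibc; exposes O_DIRECTORY and O_CLOEXEC */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the proposed readfile(2): three syscalls
 * (openat, read, close) where the proposal would need one.
 * Issues a single read(), so it returns at most len bytes, or -1 on error.
 */
ssize_t readfile_emul(int dirfd, const char *name, void *buf, size_t len)
{
    assert(name != NULL && buf != NULL);

    int fd = openat(dirfd, name, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, len);
    int saved_errno = errno; /* keep read()'s errno across close() */
    close(fd);
    errno = saved_errno;
    return n;
}

/*
 * Illustrative helper: open the directory once, read every named file
 * relative to it, then close the directory once.
 * Returns the total number of bytes read, or -1 on the first failure.
 * buf is reused for every file; only the byte counts are accumulated.
 */
ssize_t read_many(const char *dirpath, const char *const *names,
                  size_t count, char *buf, size_t len)
{
    int dirfd = open(dirpath, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (dirfd < 0)
        return -1;

    ssize_t total = 0;
    for (size_t i = 0; i < count; i++) {
        ssize_t n = readfile_emul(dirfd, names[i], buf, len);
        if (n < 0) {
            total = -1;
            break;
        }
        total += n;
    }

    close(dirfd);
    return total;
}
```

With count == 1 the open/close of the directory is paid in full for a single read, which is the case I am asking about above. With a large count it becomes negligible per file.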