From: Alexey Dobriyan
Date: Thu, 28 Sep 2017 12:10:31 +0200
Subject: Re: [PATCH 1/2 v2] fdmap(2)
To: Andy Lutomirski
Cc: "Michael Kerrisk (man-pages)", Andrew Morton, linux-kernel@vger.kernel.org, Linux API, Randy Dunlap, Thomas Gleixner, Djalal Harouni, Alexey Gladkov, Aliaksandr Patseyenak, Tatsiana Brouka
References: <20170924200620.GA24368@avx2> <9bc11ace-d111-cdef-5280-8cdda027ae9a@gmail.com> <20170926190018.GA30898@avx2>

On 9/27/17, Andy Lutomirski wrote:
> On Tue, Sep 26, 2017 at 12:00 PM, Alexey Dobriyan wrote:
>> On Mon, Sep 25, 2017 at 09:42:58AM +0200, Michael Kerrisk (man-pages) wrote:
>>> [Not sure why original author is not in CC; added]
>>>
>>> Hello Alexey,
>>>
>>> On 09/24/2017 10:06 PM, Alexey Dobriyan wrote:
>>> > From: Aliaksandr Patseyenak
>>> >
>>> > Implement system call for bulk retrieving of opened descriptors
>>> > in binary form.
>>> >
>>> > Some daemons could use it to reliably close file descriptors
>>> > before starting. Currently they close everything up to some number
>>> > which formally is not reliable.
>>> > Other natural users are lsof(1) and CRIU
>>> > (although lsof does so much in /proc that the effect is thoroughly
>>> > buried).
>>> >
>>> > /proc, the only way to learn anything about file descriptors, may not
>>> > be available. There is unavoidable overhead associated with
>>> > instantiating 3 dentries and 3 inodes and converting integers to
>>> > strings and back.
>>> >
>>> > Benchmark:
>>> >
>>> > 	N=1<<22 times
>>> > 	4 opened descriptors (0, 1, 2, 3)
>>> > 	opendir+readdir+closedir /proc/self/fd vs fdmap
>>> >
>>> > 	/proc 8.31 ± 0.37%
>>> > 	fdmap 0.32 ± 0.72%
>>>
>>> From the text above, I'm still trying to understand: whose problem
>>> does this solve? I mean, we've lived with the daemon-close-all-files
>>> technique forever (and I'm not sure that performance is really an
>>> important issue for the daemon case).
>>
>>> And you say that the effect for lsof(1) will be buried.
>>
>> If only fdmap(2) is added, then the effect will be negligible for lsof
>> because it has to go through /proc anyway.
>>
>> The idea is to start the process. In an ideal world, only binary system
>> calls would exist and shells could emulate /proc/* the same way bash
>> implements /dev/tcp
>
> Then start the process by doing it for real and making it obviously
> useful. We should not add a pair of vaguely useful, rather weak
> syscalls just to start a process of modernizing /proc.
>
>>
>>> So, who does this new system call
>>> really help? (Note: I'm not saying don't add the syscall, but from
>>> explanation given here, it's not clear why we should.)
>>
>> For fdmap(2) natural users are lsof(), CRIU.
>
> lsof does:
>
> int
> main(argc, argv)
> 	int argc;
> 	char *argv[];
> {
> ...
> 	if ((MaxFd = (int) GET_MAX_FD()) < 53)
> 		MaxFd = 53;
> 	for (i = 3; i < MaxFd; i++)
> 		(void) close(i);
>
> The solution isn't to wrangle fdmap(2) into this code. The solution
> is to remove the code entirely.

What do you think about this code from OpenSSH?

	/*
	 * Discard other fds that are hanging around.
	 * These can cause problem
	 * with backgrounded ssh processes started by ControlPersist.
	 */
	closefrom(STDERR_FILENO + 1);
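
For context, on a system without a native closefrom(), a fallback that wants to close exactly the open descriptors (rather than everything up to some guessed maximum) has to walk /proc/self/fd with opendir+readdir+closedir, the same sequence measured in the benchmark quoted above. A rough sketch follows. close_fds_from() is a hypothetical name used only for illustration, not the OpenSSH or libc implementation, and it assumes /proc is mounted:

```c
#define _POSIX_C_SOURCE 200809L	/* for dirfd() */

#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): close every open descriptor
 * >= lowfd by listing /proc/self/fd.
 *
 * Returns the number of descriptors closed, or -1 if /proc/self/fd
 * cannot be opened (e.g. /proc is not mounted).
 */
static int close_fds_from(int lowfd)
{
	DIR *dir = opendir("/proc/self/fd");
	struct dirent *de;
	int closed = 0;

	if (dir == NULL)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		char *end;
		long fd = strtol(de->d_name, &end, 10);

		/* Skip "." and ".." and anything that is not a number. */
		if (end == de->d_name || *end != '\0')
			continue;
		/* Keep low descriptors and the one backing this DIR. */
		if (fd < lowfd || fd == dirfd(dir))
			continue;
		if (close((int)fd) == 0)
			closed++;
	}

	closedir(dir);
	return closed;
}
```

A caller would use close_fds_from(STDERR_FILENO + 1) where the OpenSSH snippet uses closefrom(). The -1 return is the case where /proc is unavailable, and then the only option left is the close-everything-up-to-N loop.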