Date: Tue, 26 Sep 2017 21:46:43 +0300
From: Alexey Dobriyan
To: Andy Lutomirski
Cc: Andrew Morton, "linux-kernel@vger.kernel.org", Linux API, Randy Dunlap, Thomas Gleixner, Djalal Harouni, Alexey Gladkov, Tatsiana Brouka, Aliaksandr Patseyenak
Subject: Re: [PATCH v2 2/2] pidmap(2)
Message-ID: <20170926184643.GC14724@avx2>
References: <20170924200620.GA24368@avx2> <20170924200822.GB24368@avx2>

On Sun, Sep 24, 2017 at 02:27:00PM -0700, Andy Lutomirski wrote:
> On Sun, Sep 24, 2017 at 1:08 PM, Alexey Dobriyan wrote:
> > From: Tatsiana Brouka
> >
> > Implement system call for bulk retrieving of pids in binary form.
> >
> > Using /proc is slower than necessary: 3 syscalls + another 3 for each thread +
> > converting with atoi() + instantiating dentries and inodes.
> >
> > /proc may not be mounted, especially in containers. Natural extension of
> > hidepid=2 efforts is to not mount /proc at all.
> >
> > It could be used by programs like ps, top or CRIU. Speed increase will
> > become more drastic once combined with bulk retrieval of process statistics.
> >
> > Benchmark:
> >
> >     N=1<<16 times
> >     ~130 processes (~250 task_structs) on a regular desktop system
> >     opendir + readdir + closedir /proc + the same for every /proc/$PID/task
> >     (roughly what htop(1) does) vs pidmap
> >
> >     /proc   16.80 ± 0.73%
> >     pidmap   0.06 ± 0.31%
> >
> > PIDMAP_* flags are modelled after /proc/task_diag patchset.
> >
> >
> > PIDMAP(2)               Linux Programmer's Manual              PIDMAP(2)
> >
> > NAME
> >        pidmap - get allocated PIDs
> >
> > SYNOPSIS
> >        long pidmap(pid_t pid, int *pids, unsigned int count, unsigned int start, int flags);
>
> I think we will seriously regret a syscall that does this.  Djalal is
> working on fixing the turd that is hidepid, and this syscall is
> basically incompatible with ever fixing hidepids.  I think that, to
> make it less regrettable, it needs to take an fd to a proc mount as a
> parameter.  This makes me wonder why it's a syscall at all -- why not
> just create a new file like /proc/pids?

See reply to fdmap(2).

pidmap(2) is indeed a more complex case, exactly because of
pid/tgid/tid/everything else + pidnamespaces + ->hide_pid.

However the problem remains: query task tree without all the bullshit.

C/R people succumbed with /proc/*/children, it was a mistake IMO.
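
For reference, the /proc walk that the quoted benchmark compares against can be sketched roughly as below. This is only an illustration, not the benchmark code from the patch: the helper names parse_pid(), count_numeric_entries() and count_tasks() are made up here, the 4096-byte path buffer is an arbitrary choice, and the sketch merely counts task directories instead of collecting PIDs.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return the PID encoded in a directory entry name,
 * or -1 if the name is not a positive decimal number ("self", "cpuinfo").
 */
long parse_pid(const char *name)
{
    char *end;
    long pid;

    assert(name != NULL);
    if (!isdigit((unsigned char)name[0]))
        return -1;
    errno = 0;
    pid = strtol(name, &end, 10);
    if (errno == ERANGE || *end != '\0' || pid <= 0 || pid > INT_MAX)
        return -1;
    return pid;
}

/* opendir + readdir + closedir on one directory, counting numeric entries. */
long count_numeric_entries(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *de;
    long n = 0;

    if (dir == NULL)
        return -1;
    while ((de = readdir(dir)) != NULL) {
        if (parse_pid(de->d_name) > 0)
            n++;
    }
    closedir(dir);
    return n;
}

/*
 * Walk procroot (normally "/proc"), then repeat the same three calls for
 * every <procroot>/<pid>/task. Returns the number of tasks seen, or -1 if
 * procroot itself cannot be opened (for example, /proc is not mounted).
 */
long count_tasks(const char *procroot)
{
    DIR *proc = opendir(procroot);
    struct dirent *de;
    char path[4096]; /* arbitrary size chosen for this sketch */
    long total = 0;

    if (proc == NULL)
        return -1;
    while ((de = readdir(proc)) != NULL) {
        long pid = parse_pid(de->d_name);
        long n;
        int len;

        if (pid < 0)
            continue;
        len = snprintf(path, sizeof(path), "%s/%ld/task", procroot, pid);
        if (len < 0 || (size_t)len >= sizeof(path))
            continue;
        /* A process may exit between readdir() and opendir(); skip it. */
        n = count_numeric_entries(path);
        if (n > 0)
            total += n;
    }
    closedir(proc);
    return total;
}
```

Calling count_tasks("/proc") performs one directory scan of /proc plus one more scan per process, with a string-to-integer conversion for every entry name. That per-entry work is what the patch description objects to, and the -1 return when /proc is absent corresponds to the "not mounted" case.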