From: Alexey Dobriyan
Date: Thu, 7 Sep 2017 12:47:11 +0300
Subject: Re: [PATCH 1/2] pidmap(2)
To: Djalal Harouni
Cc: Andy Lutomirski, Randy Dunlap, Andrew Morton, Tatsiana Brouka, "linux-kernel@vger.kernel.org", Linux API, Aliaksandr Patseyenak, Alexey Gladkov
References: <20170905190500.GA13746@avx2> <20170905155320.a683a4853b21a3be32d8b529@linux-foundation.org>

On 9/7/17, Djalal Harouni wrote:
> Hi Alexey,
>
> On Thu, Sep 7, 2017 at 4:04 AM, Andy Lutomirski wrote:
>> On Wed, Sep 6, 2017 at 2:04 AM, Alexey Dobriyan wrote:
>>> On 9/6/17, Randy Dunlap wrote:
>>>> On 09/05/17 15:53, Andrew Morton wrote:
> [...]
>>>>
>>>> also, I expect that the tiny kernel people will want kconfig options
>>>> for these syscalls.
>>>
>>> We'll add it, but the question is whether it is a good idea. Ideally
>>> these system calls should be mandatory and /proc optional.
>>>
>>> $ size kernel/pidmap.o fs/fdmap.o
>>>    text    data     bss     dec     hex filename
>>>     560       0       0     560     230 kernel/pidmap.o
>>>     617       0       0     617     269 fs/fdmap.o
>>
>> After much discussion at LPC/KS last year, I thought the idea was to
>> try to speed up /proc rather than replacing it outright. The two
>> specific ideas I recall were:
>>
>> 1. Add a syscall like readfileat() that you can use to, in a single
>> operation, open, read, and close a /proc file (or other file). This
>> should vastly reduce locking and RCU overhead.
>>
>> 2. Add a /proc file that has a nice binary format for task info.
>> (nl_attr?)
>>
>> I don't see why pidmap() deserves to be significantly faster than
>> getdents().
>>
>> Also, a pidmap() syscall like this inherently bypasses any security
>> restrictions implied by the way that /proc is mounted. It can respect
>> hidepid, but hidepid (as a per-namespace concept) is an enormous turd
>> that badly needs to be deprecated, and Djalal is working on exactly
>> that.
>
> Yes, as noted by Andy, Alexey Gladkov and I are working on modernizing
> procfs [1] and on reducing/removing ties within pid namespaces, which has
> a lot of problems now.
> ...

Kudos for digging into this mess.

But the question will remain: how to get pids of existing processes quickly.
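
For reference, the getdents()-based baseline that pidmap(2) is being compared against looks roughly like this from userspace: scan /proc and keep only the entries whose names are all digits. This is only a minimal sketch. The helper name list_pids() and its caller-supplied buffer are illustrative assumptions and are not taken from either patch. readdir(3) is used here, which on Linux is implemented on top of getdents.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of the patch set): store up to max PIDs
 * found in /proc into pids[].
 * Returns the number of PIDs stored, or -1 if /proc cannot be opened.
 */
int list_pids(pid_t *pids, int max)
{
	DIR *d = opendir("/proc");
	struct dirent *de;
	int n = 0;

	if (!d)
		return -1;

	while (n < max && (de = readdir(d)) != NULL) {
		const char *p = de->d_name;

		/* Process directories have purely numeric names. */
		if (!isdigit((unsigned char)*p))
			continue;
		while (isdigit((unsigned char)*p))
			p++;
		if (*p != '\0')
			continue;

		pids[n++] = (pid_t)strtol(de->d_name, NULL, 10);
	}

	closedir(d);
	return n;
}
```

Every entry costs a name comparison and a string-to-integer conversion in userspace, on top of the work the kernel does to format the name in the first place. What this sketch sees is also subject to how /proc is mounted (hidepid included), which is the security point raised above.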