Date: Wed, 9 Jun 2021 16:31:37 -0400
From: Johannes Weiner
To: "Eric W. Biederman"
Cc: "Enrico Weigelt, metux IT consult", Chris Down, legion@kernel.org,
    LKML, Linux Containers, Linux Containers, Linux FS Devel,
    linux-mm@kvack.org, Andrew Morton, Christian Brauner, Michal Hocko
Subject: Re: [PATCH v1] proc: Implement /proc/self/meminfo
References: <87k0n2am0n.fsf@disp2133>
In-Reply-To: <87k0n2am0n.fsf@disp2133>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 09, 2021 at 02:14:16PM -0500, Eric W. Biederman wrote:
> "Enrico Weigelt, metux IT consult" writes:
>
> > On 03.06.21 13:33, Chris Down wrote:
> >
> > Hi folks,
> >
> >
> >> Putting stuff in /proc to get around the problem of "some other metric I need
> >> might not be exported to a container" is not a very compelling argument. If
> >> they want it, then export it to the container...
> >>
> >> Ultimately, if they're going to have to add support for a new
> >> /proc/self/meminfo file anyway, these use cases should just do it properly
> >> through the already supported APIs.
> >
> > It's even a bit more complex ...
> >
> > /proc/meminfo always tells what the *machine* has available, not what a
> > process can eat up. That has been this way even long before cgroups.
> > (eg. ulimits).
> >
> > Even if you want a container look more like a VM - /proc/meminfo showing
> > what the container (instead of the machine) has available - just looking
> > at the calling task's cgroup is also wrong. Because there're cgroups
> > outside containers (that really shouldn't be affected) and there're even
> > other cgroups inside the container (that further restrict below the
> > container's limits).
> >
> > BTW: applications trying to autotune themselves by looking at
> > /proc/meminfo are broken-by-design anyways. This never has been a valid
> > metric on how much memory individual processes can or should eat.
>
> Which brings us to the problem.
>
> Using /proc/meminfo is not valid unless your application can know it has
> the machine to itself. Something that is becoming increasingly less
> common.
>
> Unless something has changed in the last couple of years, reading values
> out of the cgroup filesystem is both difficult (v1 and v2 have some
> gratuitous differences) and is actively discouraged.
>
> So what should applications do?
>
> Alex has found applications that are trying to do something with
> meminfo, and the fields that those applications care about. I don't see
> anyone making the case that specifically what the applications are
> trying to do is buggy.
>
> Alex's suggestion is to have a /proc/self/meminfo that has the information
> that applications want, which would be something that would be easy
> to switch applications to. The patch to userspace at that point is
> as simple as 3 lines of code. I can imagine people take that patch into
> their userspace programs.

But is it actually what applications want?

Not all the information at the system level translates well to the
container level.
Things like available memory require a hierarchical
assessment rather than just a look at the local level, since there
could be limits higher up the tree.

Not all items in meminfo have a container equivalent, either.

The familiar format is likely a liability rather than an asset.

> The simple fact that people are using /proc/meminfo when it doesn't make
> sense for anything except system monitoring tools is a pretty solid bug
> report on the existing linux apis.

I agree that we likely need a better interface for applications to
query the memory state of their container. But I don't think we should
try to emulate a format that is a poor fit for this.

We should also not speculate what users intended to do with the
meminfo data right now. There is a surprising amount of misconception
around what these values actually mean. I'd rather have users show up
on the mailing list directly and outline the broader usecase.
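
To illustrate the point about hierarchical assessment, here is a minimal
sketch of what "looking up the tree" could mean for a cgroup v2 hierarchy.
It is not taken from the patch or from any existing tool. It assumes the
standard cgroup v2 files memory.max and memory.current; the function names
(read_limit, read_usage, effective_headroom) are hypothetical, and the
calculation ignores memory.high, swap, and reclaimable memory, so treat the
result as a rough upper bound only.

```python
from pathlib import Path

UNLIMITED = float("inf")


def read_limit(path: Path) -> float:
    """Read a cgroup v2 limit file; "max" or a missing file means unlimited."""
    try:
        text = path.read_text().strip()
    except FileNotFoundError:
        return UNLIMITED
    return UNLIMITED if text == "max" else float(text)


def read_usage(path: Path) -> float:
    """Read a usage counter in bytes; a missing file counts as zero."""
    try:
        return float(path.read_text().strip())
    except FileNotFoundError:
        return 0.0


def effective_headroom(root: Path, cgroup: str) -> float:
    """Smallest (memory.max - memory.current) over a cgroup and its ancestors.

    `root` is the cgroup v2 mount point (assumed; typically /sys/fs/cgroup)
    and `cgroup` is a path relative to it, e.g. "/a/b/c".
    """
    root = Path(root)
    node = root / cgroup.strip("/")
    headroom = UNLIMITED
    while True:
        limit = read_limit(node / "memory.max")
        if limit != UNLIMITED:
            usage = read_usage(node / "memory.current")
            # A limit higher up the tree can be tighter than the local one.
            headroom = min(headroom, max(limit - usage, 0.0))
        if node == root:
            break
        node = node.parent
    return headroom
```

On a real system the cgroup argument would come from the "0::" line of
/proc/self/cgroup. Inside a cgroup namespace the ancestors above the
namespace root are not visible, so even this walk can overestimate what a
containerized task may use.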