Subject: Re: [RFC] Per file OOM badness
From: Christian König
Reply-To: christian.koenig@amd.com
To: Michel Dänzer, Christian König, Michal Hocko, Eric Anholt
Cc: Andrey Grodzovsky, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org, linux-mm@kvack.org
Date: Fri, 19 Jan 2018 10:58:24 +0100
Message-ID: <11153f4f-8b9a-5780-6087-bc1e85459584@gmail.com>
References: <1516294072-17841-1-git-send-email-andrey.grodzovsky@amd.com> <20180118170006.GG6584@dhcp22.suse.cz> <20180118171355.GH6584@dhcp22.suse.cz> <87k1wfgcmb.fsf@anholt.net> <20180119082046.GL6584@dhcp22.suse.cz> <0cfaf256-928c-4cb8-8220-b8992592071b@amd.com>

On 19.01.2018 at 10:32, Michel Dänzer wrote:
> On 2018-01-19 09:39 AM, Christian König wrote:
>> On 19.01.2018 at 09:20, Michal Hocko wrote:
>>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>>> Michal Hocko writes:
>>>>
>>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>>> Hi, this series is a revised version of an RFC sent by Christian König
>>>>>>> a few years ago. The original RFC can be found at
>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>>
>>>>>>> This is the same idea and I've just addressed his concern from the
>>>>>>> original RFC and switched to a callback into file_ops instead of a
>>>>>>> new member in struct file.
>>>>>> Please add the full description to the cover letter and do not make
>>>>>> people hunt links.
>>>>>>
>>>>>> Here is the original cover letter text
>>>>>> : I'm currently working on the issue that when device drivers allocate
>>>>>> : memory on behalf of an application the OOM killer usually doesn't
>>>>>> : know about that unless the application also gets this memory mapped
>>>>>> : into its address space.
>>>>>> :
>>>>>> : This is especially annoying for graphics drivers where a lot of the
>>>>>> : VRAM usually isn't CPU accessible and so doesn't make sense to map
>>>>>> : into the address space of the process using it.
>>>>>> :
>>>>>> : The problem now is that when an application starts to use a lot of
>>>>>> : VRAM those buffer objects sooner or later get swapped out to system
>>>>>> : memory, but when we now run into an out of memory situation the OOM
>>>>>> : killer obviously doesn't know anything about that memory and so
>>>>>> : usually kills the wrong process.
>>>>> OK, but how do you attribute that memory to a particular OOM killable
>>>>> entity? And how do you actually enforce that those resources get freed
>>>>> on the oom killer action?
>>>>>
>>>>>> : The following set of patches tries to address this problem by
>>>>>> : introducing a per file OOM badness score, which device drivers can
>>>>>> : use to give the OOM killer a hint how many resources are bound to a
>>>>>> : file descriptor so that it can make better decisions which process
>>>>>> : to kill.
>>>>> But files are not killable, they can be shared... In other words this
>>>>> doesn't help the oom killer to make an educated guess at all.
>>>> Maybe some more context would help the discussion?
>>>>
>>>> The struct file in patch 3 is the DRM fd. That's effectively "my
>>>> process's interface to talking to the GPU" not "a single GPU resource".
>>>> Once that file is closed, all of the process's private, idle GPU buffers
>>>> will be immediately freed (this will be most of their allocations), and
>>>> some will be freed once the GPU completes some work (this will be most
>>>> of the rest of their allocations).
>>>>
>>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>>> shared between processes. Those are usually about 8-24MB total in a
>>>> process, rather than the GBs that modern apps use (or that our testcases
>>>> like to allocate and thus trigger oomkilling of the test harness instead
>>>> of the offending testcase...)
>>>>
>>>> Even if we just had the private+idle buffers being accounted in OOM
>>>> badness, that would be a huge step forward in system reliability.
>>> OK, in that case I would propose a different approach. We already
>>> have rss_stat. So why don't we simply add a new counter there
>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>> that such a memory is bound to the process life time. I guess we will
>>> find more users for this later.
>> I already tried that and the problem with that approach is that some
>> buffers are not created by the application which actually uses them.
>>
>> For example X/Wayland is creating and handing out render buffers to
>> applications which want to use OpenGL.
>>
>> So the result is when you always account the application which created
>> the buffer the OOM killer will certainly reap X/Wayland first. And that
>> is exactly what we want to avoid here.
> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
> then shared with the X / Wayland server.

Good point, when I initially looked at that problem DRI3 wasn't widely
used yet.

> Also, in all cases, the amount of memory allocated for buffers shared
> between DRI/Wayland clients and the server should be relatively small
> compared to the amount of memory allocated for buffers used only locally
> in the client, particularly for clients which create significant memory
> pressure.

That is unfortunately only partially true. With a single runaway
application which tries to allocate everything, it would indeed work as
you describe.

But when I tested this a few years ago on an X-based desktop, the
applications which actually used most of the memory were Firefox and
Thunderbird. Unfortunately, that memory was never accounted to them.

On my current Wayland-based desktop it doesn't look much better.
According to radeon_gem_info/amdgpu_gem_info, the majority of all memory
was allocated by either gnome-shell or Xwayland.

Regards,
Christian.
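
Illustration (not part of the original message or of the patch series):
a minimal Python sketch of the accounting problem discussed in this
thread. The buffer sizes, the process names, and the helpers badness()
and victim() are all hypothetical. The sketch only contrasts charging a
buffer to the process that created it with charging it to the process
that actually uses it; it does not model the kernel's real oom_badness
heuristic.

```python
from collections import defaultdict

# Hypothetical buffers: (size in MiB, creating process, using process).
# The numbers are made up for illustration only.
BUFFERS = [
    (512, "Xserver", "firefox"),
    (256, "Xserver", "thunderbird"),
    (64, "Xserver", "Xserver"),
    (128, "firefox", "firefox"),
]


def badness(buffers, charge_to):
    """Sum buffer sizes per process, charging either 'creator' or 'user'."""
    index = {"creator": 1, "user": 2}[charge_to]
    totals = defaultdict(int)
    for buf in buffers:
        totals[buf[index]] += buf[0]
    return dict(totals)


def victim(totals):
    """Return the process with the largest accounted memory."""
    return max(totals, key=totals.get)


by_creator = badness(BUFFERS, "creator")
by_user = badness(BUFFERS, "user")

print("charged to creator:", by_creator, "->", victim(by_creator))
print("charged to user:   ", by_user, "->", victim(by_user))
```

With these made-up numbers, charging the creator singles out the display
server, while charging the user singles out firefox.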