Date: Thu, 18 Jan 2018 18:13:55 +0100
From: Michal Hocko
To: Andrey Grodzovsky
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
    Christian.Koenig@amd.com
Subject: Re: [RFC] Per file OOM badness
Message-ID: <20180118171355.GH6584@dhcp22.suse.cz>
References: <1516294072-17841-1-git-send-email-andrey.grodzovsky@amd.com>
    <20180118170006.GG6584@dhcp22.suse.cz>
In-Reply-To: <20180118170006.GG6584@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 18-01-18 18:00:06, Michal Hocko wrote:
> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> > Hi, this series is a revised version of an RFC sent by Christian König a
> > few years ago. The original RFC can be found at
> > https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> >
> > This is the same idea and I've just addressed his concern from the original RFC
> > and switched to a callback into file_ops instead of a new member in struct file.
>
> Please add the full description to the cover letter and do not make
> people hunt links.
>
> Here is the original cover letter text
> : I'm currently working on the issue that when device drivers allocate memory on
> : behalf of an application the OOM killer usually doesn't know about that unless
> : the application also gets this memory mapped into its address space.
> :
> : This is especially annoying for graphics drivers where a lot of the VRAM
> : usually isn't CPU accessible and so doesn't make sense to map into the
> : address space of the process using it.
> :
> : The problem now is that when an application starts to use a lot of VRAM those
> : buffer objects sooner or later get swapped out to system memory, but when we
> : now run into an out of memory situation the OOM killer obviously doesn't know
> : anything about that memory and so usually kills the wrong process.

OK, but how do you attribute that memory to a particular OOM killable
entity? And how do you actually enforce that those resources get freed
on the oom killer action?

> : The following set of patches tries to address this problem by introducing a per
> : file OOM badness score, which device drivers can use to give the OOM killer a
> : hint how many resources are bound to a file descriptor so that it can make
> : better decisions which process to kill.

But files are not killable, they can be shared... In other words this
doesn't help the oom killer to make an educated guess at all.

> :
> : So question to everyone: What do you think about this approach?

I think it is just wrong semantically. Non-reclaimable memory is a pain,
especially when there is way too much of it.
If you can free that memory somehow then you can hook into the slab
shrinker API and react to memory pressure. If you can account such
memory to a particular process and make sure that the consumption is
bounded by the process lifetime then we can think of an accounting that
oom_badness can consider when selecting a victim.
-- 
Michal Hocko
SUSE Labs
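
The shared-file objection above can be sketched with a toy user-space
model. This is not kernel code and not the RFC's implementation: the
names DriverFile, Process, per_file_badness and kill are hypothetical,
and the score is a deliberately simplified stand-in for what a per-file
badness hook might add to a process's mapped memory.

```python
from dataclasses import dataclass, field


@dataclass
class DriverFile:
    """Hypothetical open file whose driver pins `pinned_pages` of memory."""
    pinned_pages: int
    holders: set = field(default_factory=set)  # names of processes holding an fd


@dataclass
class Process:
    name: str
    rss_pages: int
    files: list = field(default_factory=list)


def open_file(proc, f):
    """Give `proc` a reference to `f`, as if the fd had been inherited or passed."""
    proc.files.append(f)
    f.holders.add(proc.name)


def per_file_badness(proc):
    # Assumed toy score: mapped memory plus everything pinned by any file held.
    return proc.rss_pages + sum(f.pinned_pages for f in proc.files)


def kill(proc):
    """Return the number of pages actually freed when `proc` exits."""
    freed = proc.rss_pages
    for f in proc.files:
        f.holders.discard(proc.name)
        if not f.holders:  # driver memory is released only with the last reference
            freed += f.pinned_pages
    proc.files.clear()
    return freed


shared = DriverFile(pinned_pages=1000)
a = Process("a", rss_pages=10)
b = Process("b", rss_pages=20)
open_file(a, shared)
open_file(b, shared)

victim = max([a, b], key=per_file_badness)
expected = per_file_badness(victim)
freed = kill(victim)
print(victim.name, expected, freed)
```

In this model both holders are charged the full 1000 pinned pages, so
the chosen victim looks far more expensive than the memory its death
releases: the other holder keeps the file, and the driver memory, alive.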