Date: Mon, 9 Jul 2018 10:19:20 +0200
From: Michal Hocko
To: Waiman Long
Cc: Alexander Viro, Jonathan Corbet, "Luis R. Rodriguez", Kees Cook,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-doc@vger.kernel.org, Linus Torvalds,
    Jan Kara, "Paul E. McKenney", Andrew Morton, Ingo Molnar,
    Miklos Szeredi, Matthew Wilcox, Larry Woodman, James Bottomley,
    "Wangkai (Kevin C)"
Subject: Re: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries
Message-ID: <20180709081920.GD22049@dhcp22.suse.cz>
References: <1530905572-817-1-git-send-email-longman@redhat.com>
In-Reply-To: <1530905572-817-1-git-send-email-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 06-07-18 15:32:45, Waiman Long wrote:
[...]
> A rogue application can potentially create a large number of negative
> dentries in the system consuming most of the memory available if it
> is not under the direct control of a memory controller that enforce
> kernel memory limit.

How does this differ from other untracked allocations for untrusted
tasks in general? E.g. nothing really prevents a task from creating a
long chain of unreclaimable dentries and even going to OOM potentially.
Negative dentries, on the other hand, should be easily reclaimable. So
why does the latter need special treatment while the former is OK?

There are quite a few resources which allow an unprivileged user to
consume a lot of memory, and the memory controller is the only reliable
way to mitigate the risk.

> This patchset introduces changes to the dcache subsystem to track and
> optionally limit the number of negative dentries allowed to be created by
> background pruning of excess negative dentries or even kill it after use.
> This capability will help to limit the amount of memory that can be
> consumed by negative dentries.

How are you going to balance that between workloads? What prevents a
rogue application from simply consuming the limit and forcing all others
in the system onto the slow path?

> Patch 1 tracks the number of negative dentries present in the LRU
> lists and reports it in /proc/sys/fs/dentry-state.

If anything, I _think_ vmstat would benefit from this, because the
behavior of memory reclaim does depend on the number of negative
dentries.

> Patch 2 adds a "neg-dentry-pc" sysctl parameter that can be used to to
> specify a soft limit on the number of negative allowed as a percentage
> of total system memory. This parameter is 0 by default which means no
> negative dentry limiting will be performed.

Percentage has turned out to be a really wrong unit for many tunables
over time. Even 1% can be just too much on really large machines.

> Patch 3 enables automatic pruning of least recently used negative
> dentries when the total number is close to the preset limit.

Please explain why this cannot be done in the standard dcache shrinking
way. I strongly suspect that you are developing yet another reclaim
mechanism with its own set of tunables, bypassing the existing
infrastructure. I haven't read the patches yet, but the cover letter
doesn't really explain the design much, so I am only guessing.

-- 
Michal Hocko
SUSE Labs
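
To put the percentage concern in rough numbers, here is a small
back-of-the-envelope sketch. It is only an illustration, not how the
patchset computes its limit: the 192-byte dentry size, the machine sizes,
and the helper name limit_for() are all assumptions made for the example,
and the limit is assumed to be a plain percentage of total RAM.

```python
GIB = 1024 ** 3

# Assumption: approximate size of one dentry in bytes. The real value
# depends on the kernel version, architecture, and configuration.
DENTRY_BYTES = 192


def limit_for(total_bytes, percent):
    """Return (limit in bytes, approximate dentry count) for a soft limit
    expressed as an integer percentage of total memory."""
    limit_bytes = total_bytes * percent // 100
    return limit_bytes, limit_bytes // DENTRY_BYTES


if __name__ == "__main__":
    # Hypothetical machine sizes, from a small VM to a very large server.
    for mem_gib in (4, 64, 1024, 12 * 1024):
        limit_bytes, count = limit_for(mem_gib * GIB, 1)
        print(f"{mem_gib:>6} GiB RAM: 1% = {limit_bytes / GIB:8.2f} GiB, "
              f"~{count:,} negative dentries")
```

Under these assumptions, the smallest non-zero setting of an integer
percentage tunable already allows several gigabytes of negative dentries
on a terabyte-class machine, with no way to ask for less.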