From: Yang Shi
Date: Thu, 22 Feb 2018 08:44:29 -0800
Subject: Re: [PATCH v2 3/3] fs: fsnotify: account fsnotify metadata to kmemcg
To: Jan Kara
Cc: Michal Hocko, Shakeel Butt, Amir Goldstein, Christoph Lameter,
    Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton, Greg Thelen,
    Johannes Weiner, Vladimir Davydov, Mel Gorman, Vlastimil Babka,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    cgroups@vger.kernel.org, Linux Kernel Mailing List
In-Reply-To: <20180222144844.g4p2diu3cnbr7sx3@quack2.suse.cz>
References: <20180221030101.221206-1-shakeelb@google.com>
    <20180221030101.221206-4-shakeelb@google.com>
    <20180222134944.GK30681@dhcp22.suse.cz>
    <20180222144844.g4p2diu3cnbr7sx3@quack2.suse.cz>

On Thu, Feb 22, 2018 at 6:48 AM, Jan Kara wrote:
> On Thu 22-02-18 14:49:44, Michal Hocko wrote:
>> On Tue 20-02-18 19:01:01, Shakeel Butt wrote:
>> > A lot of memory can be consumed by the events generated for the huge or
>> > unlimited queues if there is either no or slow listener. This can cause
>> > system level memory pressure or OOMs. So, it's better to account the
>> > fsnotify kmem caches to the memcg of the listener.
>>
>> How much memory are we talking about here?
>
> 32 bytes per event (on 64-bit) which is small but the number of events is
> not limited in any way (if the creator uses a special flag and has
> CAP_SYS_ADMIN). In the thread [1] a guy from Alibaba wanted this feature so
> among cloud people there is apparently some demand to have a way to limit
> memory usage of such application...

Yes, I'm the guy from Alibaba :-) We did run into this issue
occasionally, so I proposed the patch to account fsnotify kmem to memcg,
although we later fixed the bug in the user space applications. Still,
such accounting sounds useful to me.

>
>> > There are seven fsnotify kmem caches and among them allocations from
>> > dnotify_struct_cache, dnotify_mark_cache, fanotify_mark_cache and
>> > inotify_inode_mark_cachep happens in the context of syscall from the
>> > listener. So, SLAB_ACCOUNT is enough for these caches.
>> >
>> > The objects from fsnotify_mark_connector_cachep are not accounted as
>> > they are small compared to the notification mark or events and it is
>> > unclear whom to account connector to since it is shared by all events
>> > attached to the inode.
>> >
>> > The allocations from the event caches happen in the context of the event
>> > producer. For such caches we will need to remote charge the allocations
>> > to the listener's memcg. Thus we save the memcg reference in the
>> > fsnotify_group structure of the listener.
>>
>> Is it typical that the listener lives in a different memcg and if yes
>> then cannot this cause one memcg to OOM/DoS the one with the listener?
>
> We have been through these discussions already in [1] back in November :).
> I can understand the wish to limit memory usage of an application using
> unlimited fanotify queues. And yes, it may mean that it will be easier for
> an attacker to get it oom-killed (currently the malicious app would drive
> the whole system oom which will presumably take a bit more effort as there
> is more memory to consume). But then I expect this is what admin prefers
> when he limits memory usage of fanotify listener.
>
> I cannot tell how common it is for producer and listener to be in different
> memcgs. From Alibaba request it seems it happens...

In our use case, producers and listeners were not in different memcgs
(please see the original discussion at
https://lkml.org/lkml/2017/10/20/819). The cross-memcg accounting
problem was raised by Amir: the accounting might be unfair if the
listeners don't consume events, and heuristic if producer and listener
are in different memcgs. However, we don't have a strong demand for this
for the time being, so I didn't move forward with this approach.

Regards,
Yang

>
>                                                                 Honza
>
> [1] https://lkml.org/lkml/2017/10/27/523
> --
> Jan Kara
> SUSE Labs, CR