Subject: Re: [PATCH RFC 0/3] mm: Reduce IO by improving algorithm of memcg pagecache pages eviction
To: Shakeel Butt
Cc: Andrew Morton, Johannes Weiner, josef@toxicpanda.com, Jan Kara, Hugh Dickins, "Darrick J. Wong", Michal Hocko, Andrey Ryabinin, Roman Gushchin, Mel Gorman, Linux MM, LKML
References: <154703479840.32690.6504699919905946726.stgit@localhost.localdomain>
From: Kirill Tkhai
Date: Thu, 10 Jan 2019 12:46:04 +0300

Hi, Shakeel,

On 09.01.2019 20:37, Shakeel Butt wrote:
> Hi Kirill,
>
> On Wed, Jan 9, 2019 at 4:20 AM Kirill Tkhai wrote:
>>
>> On nodes without memory overcommit it is a common situation that a
>> memcg exceeds its limit and pagecache pages are shrunk on reclaim,
>> while the node still has a lot of free memory. Further access to
>> those pages then requires real device IO, and that IO costs time,
>> worsens power usage, and reduces throughput for other users of the
>> device.
>>
>> Cleancache is not a good solution to this problem, since it implies
>> copying the page on every cleancache_put_page() and
>> cleancache_get_page(). It also requires internal per-cleancache_ops
>> data structures to track cached pages and their inode relationships,
>> which again adds overhead.
>>
>> This patchset introduces another solution, a new scheme for evicting
>> memcg pages:
>>
>> 1) __remove_mapping() uncharges an unmapped page from its memcg
>>    and leaves the page in the pagecache on memcg reclaim;
>>
>> 2) putback_lru_page() places the page on the root_mem_cgroup
>>    list, since its memcg is NULL. The page may then be evicted
>>    on global reclaim (and this will be easy, as the page is not
>>    mapped, so the shrinker will reclaim it with 100% probability
>>    of success);
>>
>> 3) pagecache_get_page() charges the page to the memcg of the
>>    first task that takes it.
>>
>
> From what I understand of the proposal, on memcg reclaim the file
> pages are uncharged but kept in memory, and if they are accessed
> again (either through mmap or a syscall) they are charged again, but
> to the requesting memcg. It is also assumed that global reclaim of
> such uncharged file pages is very fast and deterministic. Is that
> right?

Yes, that was my assumption. But Michal, Josef and Johannes pointed out
that diving into reclaim is, in general, not fast. So maybe we need
some more creativity here to minimize the effect of that dive.

Thanks,
Kirill