Date: Thu, 11 Jul 2019 11:25:55 -0400
From: Johannes Weiner
To: Minchan Kim
Cc: Andrew Morton, linux-mm, LKML, linux-api@vger.kernel.org, Michal Hocko, Tim Murray, Joel Fernandes, Suren Baghdasaryan, Daniel Colascione, Shakeel Butt, Sonny Rao, oleksandr@redhat.com, hdanton@sina.com, lizeb@google.com, Dave Hansen, "Kirill A. Shutemov"
Subject: Re: [PATCH v4 1/4] mm: introduce MADV_COLD
Message-ID: <20190711152555.GB20341@cmpxchg.org>
References: <20190711012528.176050-1-minchan@kernel.org> <20190711012528.176050-2-minchan@kernel.org>
In-Reply-To: <20190711012528.176050-2-minchan@kernel.org>

On Thu, Jul 11, 2019 at 10:25:25AM +0900, Minchan Kim wrote:
> When a process expects no accesses to a certain memory range, it could
> give a hint to kernel that the pages can be reclaimed when memory pressure
> happens but data should be preserved for future use.
> This could reduce workingset eviction, so it ends up increasing performance.
>
> This patch introduces the new MADV_COLD hint to the madvise(2) syscall.
> MADV_COLD can be used by a process to mark a memory range as not expected
> to be used in the near future. The hint can help the kernel decide which
> pages to evict early during memory pressure.
>
> It works for all LRU pages, like MADV_[DONTNEED|FREE]. IOW, it moves
>
> 	active file page -> inactive file LRU
> 	active anon page -> inactive anon LRU
>
> Unlike MADV_FREE, it doesn't move active anonymous pages to the inactive
> file LRU's head, because MADV_COLD has slightly different semantics.
> MADV_FREE means it's okay to discard the pages under memory pressure because
> the content of the page is *garbage*, so freeing such pages has almost zero
> overhead: we don't need to swap them out, and a later access causes just a
> minor fault. Thus, it makes sense to put those freeable pages on the
> inactive file LRU to compete with other used-once pages. It makes sense from
> an implementation point of view, too, because the memory is no longer
> swap-backed until it is re-dirtied. It could even give a bonus by letting
> them be reclaimed on a swapless system. However, MADV_COLD doesn't mean
> garbage, so reclaiming those pages requires swap-out/in in the end, which is
> a bigger cost. Since we have designed VM LRU aging based on a cost model,
> anonymous cold pages are better positioned on the inactive anon LRU list,
> not the file LRU. Furthermore, it would help to avoid unnecessary scanning
> if the system doesn't have a swap device. Let's start the simpler way,
> without adding complexity at this moment. However, keep in mind the caveat
> that workloads with a lot of page cache are likely to ignore MADV_COLD
> on anonymous memory because we rarely age anonymous LRU lists.
>
> * man-page material
>
> MADV_COLD (since Linux x.x)
>
> Pages in the specified regions will be treated as less-recently-accessed
> compared to pages in the system with similar access frequencies.
> In contrast to MADV_FREE, the contents of the region are preserved
> regardless of subsequent writes to pages.
>
> MADV_COLD cannot be applied to locked pages, Huge TLB pages, or VM_PFNMAP
> pages.
>
> * v2
>  * add the warning about workloads with lots of page cache - mhocko
>  * add man page stuff - dave
>
> * v1
>  * remove page_mapcount filter - hannes, mhocko
>  * remove idle page handling - joelaf
>
> * RFCv2
>  * add more description - mhocko
>
> * RFCv1
>  * renaming from MADV_COOL to MADV_COLD - hannes
>
> * internal review
>  * use clear_page_young in deactivate_page - joelaf
>  * Revise the description - surenb
>  * Renaming from MADV_WARM to MADV_COOL - surenb
>
> Acked-by: Michal Hocko
> Signed-off-by: Minchan Kim

Acked-by: Johannes Weiner
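
For illustration only, and not part of the patch: a minimal userspace sketch
of the hint described in the quoted changelog. It maps an anonymous region,
fills it, calls madvise(2) with MADV_COLD, and checks that the contents are
still readable afterwards, which is the property that separates it from
MADV_FREE. Assumptions: the helper name cold_hint_preserves_data() is
hypothetical; the fallback value 20 for MADV_COLD is what mainline uapi
headers define on most architectures since Linux 5.4 and may not match the
tree this v4 series applies to; a kernel that does not know the hint is
expected to reject it, and the sketch treats that as "hint not applied"
rather than as an error.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Assumption: 20 is the asm-generic value in mainline since Linux 5.4.
 * Older libc headers may not define MADV_COLD at all.
 */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif

/*
 * Hypothetical helper: hint that an anonymous region is cold, then verify
 * that its contents survived. Returns 0 if the data is intact, -1 if the
 * mapping failed or the data changed. *hint_accepted is set to 1 if the
 * kernel accepted MADV_COLD and 0 if it rejected it (for example an older
 * kernel that does not know the hint).
 */
int cold_hint_preserves_data(size_t len, int *hint_accepted)
{
	assert(len > 0);

	unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Touch every page so there is something to deactivate. */
	memset(buf, 0xA5, len);

	/* Advisory only: the pages stay mapped and keep their contents. */
	*hint_accepted = (madvise(buf, len, MADV_COLD) == 0);

	int ret = 0;
	for (size_t i = 0; i < len; i++) {
		if (buf[i] != 0xA5) {
			ret = -1;
			break;
		}
	}

	munmap(buf, len);
	return ret;
}
```

A caller would pass a length covering the range it does not expect to touch
soon, e.g. cold_hint_preserves_data(4 * 4096, &accepted). Whether the hint
actually changes reclaim order depends on memory pressure and, per the
caveat in the changelog, on how much page cache the workload has.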