Message-Id: <20180901112818.126790961@intel.com>
User-Agent: quilt/0.63-1
Date: Sat, 01 Sep 2018 19:28:18 +0800
From: Fengguang Wu
To: Andrew Morton
Cc: Linux Memory Management List
Cc: kvm@vger.kernel.org
Cc: Peng DongX
Cc: Liu Jingqi
Cc: Dong Eddie
Cc: Dave Hansen
Cc: Huang Ying
Cc: Brendan Gregg
Cc: Fengguang Wu, LKML
Subject: [RFC][PATCH 0/5] introduce /proc/PID/idle_bitmap
X-Mailing-List: linux-kernel@vger.kernel.org

This new /proc/PID/idle_bitmap interface aims to complement the current
global /sys/kernel/mm/page_idle/bitmap, in order to enable efficient
user space driven migrations.

The pros and cons are discussed in the changelog of "[PATCH] proc:
introduce /proc/PID/idle_bitmap".

The driving force is to improve efficiency by 10+ times, so that
hot/cold page tracking can be done at regular intervals in user space
without too much overhead. That makes it possible for a user space
daemon to do regular page migration between NUMA nodes of different
speeds.

Note it's not about NUMA migration between local and remote nodes; we
already have NUMA balancing for that. This interface and the user space
migration daemon target NUMA nodes made of different media, i.e. DIMM
and NVDIMM(*), with larger performance gaps. The basic policy will be
"move hot pages to DIMM; cold pages to NVDIMM".

Since NVDIMM capacity can easily reach several terabytes, the
efficiency of working set tracking will matter and be challenging.

(*) Here we use persistent memory (PMEM) without using its persistence.
Persistence is good to have, however it requires modifying
applications. Upcoming NVDIMM products like Intel Apache Pass (AEP)
will be more cost and energy effective than DRAM, but slower. Merely
using it in the form of a NUMA memory node could immediately benefit
many workloads. For example, warm but not hot apps, workloads with a
sharp hot/cold page distribution (good for migration), or workloads
that rely more on memory size than on latency and bandwidth and do more
reads than writes.

This is an early RFC version to collect feedback. It's complete enough
to demo the basic ideas and performance, however not usable yet.

Regards,
Fengguang
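
For context on the existing global interface: the kernel documentation describes /sys/kernel/mm/page_idle/bitmap as an array of 8-byte words in native byte order, where the page at PFN i maps to bit i % 64 of word i / 64. The sketch below illustrates that indexing and the "hot to DIMM, cold to NVDIMM" split on already-read bytes. It is only an illustration: the helper names are hypothetical, it does not open the real file (which needs root), and it assumes nothing about the format of the proposed /proc/PID/idle_bitmap.

```python
import struct

BITS_PER_WORD = 64   # one bit per page frame
WORD_BYTES = 8       # the bitmap is accessed in 8-byte units


def pfn_to_position(pfn):
    """Return (byte offset of the 8-byte word, bit index) for a PFN."""
    return (pfn // BITS_PER_WORD) * WORD_BYTES, pfn % BITS_PER_WORD


def idle_pfns(raw, first_pfn=0):
    """Decode bytes read from the bitmap into PFNs whose idle bit is set.

    `raw` must be a whole number of 8-byte words and `first_pfn`
    (the PFN of bit 0 of the first word) must be a multiple of 64.
    """
    if len(raw) % WORD_BYTES or first_pfn % BITS_PER_WORD:
        raise ValueError("bitmap access must be 8-byte aligned")
    result = []
    # "=Q": native byte order, unsigned 64-bit word
    for i, (word,) in enumerate(struct.iter_unpack("=Q", raw)):
        base = first_pfn + i * BITS_PER_WORD
        for bit in range(BITS_PER_WORD):
            if (word >> bit) & 1:
                result.append(base + bit)
    return result


def split_hot_cold(pfns, idle):
    """Hypothetical policy helper: pages still idle are cold, the rest hot."""
    idle = set(idle)
    hot = [p for p in pfns if p not in idle]    # candidates for DIMM
    cold = [p for p in pfns if p in idle]       # candidates for NVDIMM
    return hot, cold
```

A tracking pass with the global file would set the idle bits, wait for an interval, read the words back, and treat pages whose bit is still set as cold. How the resulting lists are turned into actual migrations is left open here.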