From: Frank van der Linden
Date: Wed, 12 Oct 2022 09:51:08 -0700
Subject: Re: [RFC] mm: add new syscall pidfd_set_mempolicy()
To: Vinicius Petrucci
Cc: Michal Hocko, Zhongkun He, corbet@lwn.net, akpm@linux-foundation.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-api@vger.kernel.org, linux-doc@vger.kernel.org,
    wuyun.abel@bytedance.com
References: <20221010094842.4123037-1-hezhongkun.hzk@bytedance.com>

On Wed, Oct 12, 2022 at 5:34 AM Vinicius Petrucci wrote:
>
> > Well, per address range operation is a completely different beast I
> > would say. External tool would need to a) understand what that range is
> > used for (e.g. stack/heap ranges, mmapped shared files like libraries or
> > private mappings) and b) be in sync with memory layout modifications
> > done by applications (e.g. that an mmap has been issued to back malloc
> > request). Quite a lot of understanding about the specific process. I
> > would say that with that intimate knowledge it is quite better to be
> > part of the process and do those changes from within the process
> > itself.
>
> Sorry, this may be a digression, but just wanted to mention a
> particular use case from a project I recently collaborated on (to
> appear next month at IISWC 2022:
> http://www.iiswc.org/iiswc2022/index.html).
>
> We carried out a performance analysis of the latest Linux AutoNUMA
> memory tiering on graph processing applications. We noticed that hot
> pages cannot be properly identified by the reactive approach used by
> AutoNUMA due to irregular/random memory access patterns. Thus, as a
> POC, we implemented and evaluated a simple idea of having an external
> user-level process/agent that, based on prior profiling results of
> memory regions, could make more effective memory chunk/object-based
> mappings (instead of page-level allocation/migration) in advance on
> either DRAM or CXL/PMEM (via mbind calls). This kind of tiering
> solution could deliver up to 2x more performance for graph analytics
> workloads. We plan to evaluate other workloads as well.
>
> Having a feature like "pidfd/process_mbind" would really simplify our
> user-level agent implementation moving forward, as right now we are
> adding a LD_PRELOAD wrapper (for signal handler) to listen and execute
> "mbind" requests from another process. If there's any other
> alternative solution to this already (via ptrace?), please let me
> know.
>

Interesting, looking forward to seeing your paper!

This is the kind of use case I was trying to describe for
pidfd_mbind() - a userspace orchestrator with some intimate knowledge
of the process' memory layout (through profiling, like in your case,
or otherwise), that can direct memory to the right nodes / memory
tiers.

- Frank
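
For illustration only, here is a minimal sketch of point a) from the
quoted text: how an external tool might work out what each address
range of a target process is used for by reading /proc/<pid>/maps. It
is not taken from the RFC or from anyone's agent in this thread. The
names Mapping, classify, parse_maps and read_maps are hypothetical, and
the categories are a simplification chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Mapping:
    """One line of /proc/<pid>/maps (hypothetical helper type)."""
    start: int
    end: int
    perms: str
    path: str
    kind: str


def classify(perms: str, path: str) -> str:
    """Coarse guess at what a range is used for (simplified categories)."""
    if path == "[heap]":
        return "heap"
    if path == "[stack]":
        return "stack"
    if path.startswith("["):
        return "special"  # e.g. [vdso], [vvar]
    shared = perms[3] == "s"  # fourth permission char: 's' shared, 'p' private
    if path:
        return "file-shared" if shared else "file-private"
    return "anon-shared" if shared else "anon-private"


def parse_maps(text: str) -> List[Mapping]:
    """Parse the text of a maps file into Mapping records."""
    mappings = []
    for line in text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        perms = fields[1]
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(
            Mapping(int(start_s, 16), int(end_s, 16), perms, path,
                    classify(perms, path))
        )
    return mappings


def read_maps(pid: int) -> List[Mapping]:
    """Snapshot the layout of a live process (needs permission to read it)."""
    with open(f"/proc/{pid}/maps") as f:
        return parse_maps(f.read())
```

Such a snapshot only addresses point a). It can go stale as soon as the
target calls mmap or munmap again, which is the synchronization concern
raised in point b).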