From: Gang Li
To: John Hubbard, Jonathan Corbet, Ingo Molnar, Peter Zijlstra,
    Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
    Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
    Valentin Schneider
Cc: linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org, Gang Li
Subject: [PATCH v6 0/2] sched/numa: add per-process numa_balancing
Date: Wed, 12 Apr 2023 22:06:58 +0800
Message-Id: <20230412140701.58337-1-ligang.bdlg@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

# Introduction
Add PR_NUMA_BALANCING to prctl.

A large number of page faults causes a performance loss while NUMA
balancing is running. Processes that care about worst-case performance
therefore need NUMA balancing disabled. Others, on the contrary, accept a
temporary performance loss in exchange for higher average performance, so
enabling NUMA balancing is better for them.

Currently, NUMA balancing can only be controlled globally through
/proc/sys/kernel/numa_balancing. Given the cases above, we want to enable
or disable numa_balancing per process instead.

Set per-process numa balancing:
	prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING_DISABLE); //disable
	prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING_ENABLE);  //enable
	prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING_DEFAULT); //follow global
Get numa_balancing state:
	prctl(PR_NUMA_BALANCING, PR_GET_NUMA_BALANCING, &ret);
	cat /proc//status | grep NumaB_mode

# Unixbench
These numbers are the overhead of this patch, not a performance improvement.
+-------------------+----------+
|       NAME        | OVERHEAD |
+-------------------+----------+
|  Pipe_Throughput  |  0.98%   |
| Context_Switching |  -0.96%  |
| Process_Creation  |  1.18%   |
+-------------------+----------+

# Changes
Changes in v6:
- rebase on top of next-20230411
- run Unixbench on physical machine
- acked by John Hubbard

Changes in v5:
- replace numab_enabled with numa_balancing_mode (Peter Zijlstra)
- make numa_balancing_enabled and numa_balancing_mode inline (Peter Zijlstra)
- use static_branch_inc/dec instead of static_branch_enable/disable (Peter Zijlstra)
- delete CONFIG_NUMA_BALANCING in task_tick_fair (Peter Zijlstra)
- reword commit, use imperative mood (Bagas Sanjaya)
- Unixbench overhead result

Changes in v4:
- code clean: add wrapper function `numa_balancing_enabled`

Changes in v3:
- Fix compile error.

Changes in v2:
- PR_NUMA_BALANCING now supports three states: enabled, disabled, default.
  enabled and disabled ignore the global setting, and default follows the
  global setting.

Gang Li (2):
  sched/numa: use static_branch_inc/dec for sched_numa_balancing
  sched/numa: add per-process numa_balancing

 Documentation/filesystems/proc.rst   |  2 ++
 fs/proc/task_mmu.c                   | 20 ++++++++++++
 include/linux/mm_types.h             |  3 ++
 include/linux/sched/numa_balancing.h | 45 ++++++++++++++++++++++++++
 include/uapi/linux/prctl.h           |  8 +++++
 kernel/fork.c                        |  4 +++
 kernel/sched/core.c                  | 26 +++++++--------
 kernel/sched/fair.c                  |  9 +++---
 kernel/sys.c                         | 47 ++++++++++++++++++++++++++++
 mm/mprotect.c                        |  6 ++--
 10 files changed, 151 insertions(+), 19 deletions(-)

-- 
2.20.1