From: Chen Yu
To: Peter Zijlstra, Mathieu Desnoyers, Ingo Molnar, Vincent Guittot,
    Juri Lelli
Cc: Tim Chen, Aaron Lu, Dietmar Eggemann, Steven Rostedt, Ben Segall,
    Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider,
    K Prateek Nayak, "Gautham R . Shenoy", linux-kernel@vger.kernel.org,
    Chen Yu
Subject: [PATCH 1/2] sched/fair: Record the short sleeping time of a task
Date: Tue, 26 Sep 2023 13:11:02 +0800
X-Mailer: git-send-email 2.25.1

During task wakeup, the wakee first checks whether its previous
running CPU is idle. If so, it takes that CPU as its first choice.
However, in most cases the wakee's previous CPU could be chosen by
another task, which breaks cache locality.

Propose a mechanism to reserve the task's previous CPU for a short
while. During this reservation period, other tasks are not allowed
to pick that CPU until a timeout expires. The reservation period is
defined as the average short sleep time of the task. To be more
specific, it is the time delta between this task being dequeued and
enqueued. Only sleep times shorter than sysctl_sched_migration_cost
are recorded. If the sleep time is longer than
sysctl_sched_migration_cost, penalize the reservation period by
shrinking it to half. In this way, the 'burst' sleeping time of the
task is honored. Meanwhile, if the task becomes a long sleeper, its
reservation time is shrunk to reduce the impact on task wakeup.

Suggested-by: Mathieu Desnoyers
Signed-off-by: Chen Yu
---
 include/linux/sched.h |  3 +++
 kernel/sched/fair.c   | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index dc37ae787e33..4a0ac0276384 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -561,6 +561,9 @@ struct sched_entity {
 	u64				vruntime;
 	s64				vlag;
 	u64				slice;
+	u64				prev_dequeue_time;
+	/* the reservation period of this task during wakeup */
+	u64				sis_rsv_avg;
 
 	u64				nr_migrations;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d0877878bcdb..297b9470829c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6456,6 +6456,24 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	struct sched_entity *se = &p->se;
 	int idle_h_nr_running = task_has_idle_policy(p);
 	int task_new = !(flags & ENQUEUE_WAKEUP);
+	u64 last_dequeue = p->se.prev_dequeue_time;
+	u64 now = sched_clock_cpu(task_cpu(p));
+
+	/*
+	 * If the task is a short-sleeping task, there is no need
+	 * to migrate it to other CPUs. Estimate the average short sleeping
+	 * time of the wakee. This sleep time is used as a hint to reserve
+	 * the dequeued task's previous CPU for a short while. During this
+	 * reservation period, select_idle_cpu() prevents other wakees from
+	 * choosing this CPU. This could bring a better cache locality.
+	 */
+	if ((flags & ENQUEUE_WAKEUP) && last_dequeue && cpu_online(task_cpu(p)) &&
+	    now > last_dequeue) {
+		if (now - last_dequeue < sysctl_sched_migration_cost)
+			update_avg(&p->se.sis_rsv_avg, now - last_dequeue);
+		else
+			p->se.sis_rsv_avg >>= 1;
+	}
 
 	/*
 	 * The code below (indirectly) updates schedutil which looks at
@@ -6550,6 +6568,7 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	int task_sleep = flags & DEQUEUE_SLEEP;
 	int idle_h_nr_running = task_has_idle_policy(p);
 	bool was_sched_idle = sched_idle_rq(rq);
+	u64 now;
 
 	util_est_dequeue(&rq->cfs, p);
 
@@ -6611,6 +6630,8 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 dequeue_throttle:
 	util_est_update(&rq->cfs, p, task_sleep);
 	hrtick_update(rq);
+	now = sched_clock_cpu(cpu_of(rq));
+	p->se.prev_dequeue_time = task_sleep ? now : 0;
 }
 
 #ifdef CONFIG_SMP
-- 
2.25.1
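
For readers who want to see how the reservation period evolves, the
update rule in enqueue_task_fair() can be modelled in userspace roughly
as below. This is only a sketch under stated assumptions, not the kernel
code: update_avg_sketch() assumes that update_avg() moves the average
one eighth of the way toward each new sample, MIGRATION_COST_NS is a
hypothetical stand-in for sysctl_sched_migration_cost (500000 ns is
assumed here), rsv_update() is a made-up name, and the ENQUEUE_WAKEUP
and cpu_online() checks are left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for sysctl_sched_migration_cost, in ns (assumed value). */
#define MIGRATION_COST_NS 500000ULL

/* Assumed behaviour of update_avg(): move *avg 1/8 of the way toward sample. */
static void update_avg_sketch(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}

/*
 * Model of the wakeup-time update: last_dequeue == 0 means the task did
 * not go to sleep at its last dequeue, so there is nothing to record.
 */
static void rsv_update(uint64_t *rsv_avg, uint64_t last_dequeue, uint64_t now)
{
	uint64_t sleep;

	if (!last_dequeue || now <= last_dequeue)
		return;

	sleep = now - last_dequeue;
	if (sleep < MIGRATION_COST_NS)
		update_avg_sketch(rsv_avg, sleep);	/* short sleep: fold into average */
	else
		*rsv_avg >>= 1;				/* long sleep: halve the reservation */
}
```

Under these assumptions, starting from an average of 0, two consecutive
80 us sleeps give 10000 ns and then 18750 ns, and a following 1 ms sleep
halves the value to 9375 ns.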