Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp4704761ybf; Wed, 4 Mar 2020 09:03:56 -0800 (PST) X-Google-Smtp-Source: ADFU+vv2W9k/iNKVYVfAgvsAx/ZhwUbV1bojjcEoXs+CaecrwxJ2dmxJyz2k4g7dnV63lms5/t9g X-Received: by 2002:a05:6830:1d6f:: with SMTP id l15mr3008120oti.299.1583341436187; Wed, 04 Mar 2020 09:03:56 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583341436; cv=none; d=google.com; s=arc-20160816; b=ubS9BERJHWRQ1+OJzQWxxcotKzibTJJ/q8Pkn/atRYW5aOTXT9vTGpmOgdGXceq9CX b8L8iTm/soJ2+weXaKgVbAkVVFzuaaSTLSOTcwr88CSSKEMgnJ4vrkfY1UDvLit84HNU /oMvpkTJqa4797WT8fZAjXRCRS7pcQ/06rfD+7qj67OoB/VEAych38K0WiwnrDXDBuHa C/RVOC9PmYwP/cirb0US9rlXZJ+EX9mMh9JFYo5a4ycfe8DtS1IX8vYopWbKQe4aWdem yC5R/YqB3FHQWAYnwe0EkIlsvt0zEPFVW7MTdqwpPbctJ34hQW+Z3TjJbzAIc+n3l9Gk 3H5w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:date:subject:cc:to:from :dkim-signature; bh=ZcWqV6JD+NpCxVyUMWr25+OOQvCNBSUoF3Hhx9DhCSI=; b=DMLaNEkeAkbtc602hKt/siPx+gk7tzU39rVBwlZSsW7nLnqMno+lTgSYpAqJFScGax v9QTHBZL7oLOszCtoqCVT0YZgeHZCCXh4Pr7u0U2JvXL6JrJIH49eohobjd4RNskK/dv 6dULtHjbxVcEOE1iMHLIpont3avXuIoer6no/tcjdNsNo48cNIOGmcw31aWI6myfMTIK s6r4rdwz4ZdAqQGrBCZdhh1v3Ok9608lENQDPkJDZ9HsZcjNH1BgHpiVk0kqxBdhVerR fInWIiY2wAI4Xii0FAgNxf+4crZ1nityhU2ekBJwRiXqNi38mNvCHdDcChywUBSNSddl Kq2Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@digitalocean.com header.s=google header.b=ds2gCVke; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=digitalocean.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r12si1420761oic.246.2020.03.04.09.03.41; Wed, 04 Mar 2020 09:03:56 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@digitalocean.com header.s=google header.b=ds2gCVke; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=digitalocean.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388494AbgCDRBF (ORCPT + 99 others); Wed, 4 Mar 2020 12:01:05 -0500 Received: from mail-qt1-f196.google.com ([209.85.160.196]:33300 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729811AbgCDRAI (ORCPT ); Wed, 4 Mar 2020 12:00:08 -0500 Received: by mail-qt1-f196.google.com with SMTP id d22so1907702qtn.0 for ; Wed, 04 Mar 2020 09:00:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=digitalocean.com; s=google; h=from:to:cc:subject:date:message-id; bh=ZcWqV6JD+NpCxVyUMWr25+OOQvCNBSUoF3Hhx9DhCSI=; b=ds2gCVke6rgnscwflfWsWa4/02xZE9rlbG0jpAqRtttP303WSpmGc5u+Leh2MJeIvY NLmAFZf2U12ZCqTKqF1yQ1OLbpOfhM9T0yI8PW+eWfdAMfk0+6XlCk+BRcv92NyjtTay gNlpaHjzcxXhsVFTX7lc64B713pIOAwPfXFrA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=ZcWqV6JD+NpCxVyUMWr25+OOQvCNBSUoF3Hhx9DhCSI=; b=BpFFwE3RO3TzjqWs8lVPifNvIfANw2ZOqdNkAvouF7B0KkZaftwV03NDhx2M8BrgUK AT2YJmJ81jLkkzC4BB8FsvYLfkYwpW2zul4B9kPFxLcgImcPKFkTcKp+9fh0gib1yT6N R/Mxmo9VUlnK8JoUnvfTgAnDGM1vkn8iPHKJ1p81O8YxvGXWlsx9krNvTmbXtabPYIY2 lrgzQIPTZtL5N2KM4Bu5la76LgzI9nf61NJoNZFHmfPF+KvRizH0sl858Z8MfiGCONGU oZtrgU4dzQmV9GmU/5jX05d+JJ5UdgLTOG+0Uu4kmgeHo9H7AtyPM3txdE25nEPDN9DX 40og== X-Gm-Message-State: ANhLgQ1k+7zj88ezoKyEeFkvfc6nkusSuRu46TFrYYXXNArJzP53NJl8 zsFjSvAJybYFodkUrfSXsrjl75s1S0I= X-Received: by 2002:aed:2a5c:: with SMTP id k28mr3316939qtf.132.1583341206525; Wed, 04 Mar 2020 09:00:06 -0800 (PST) Received: from s2r5node9 ([162.243.188.99]) by smtp.gmail.com with ESMTPSA id o6sm3506117qkj.96.2020.03.04.09.00.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Mar 2020 09:00:05 -0800 (PST) From: vpillai To: Nishanth Aravamudan , Julien Desfossez , Peter Zijlstra , Tim Chen , mingo@kernel.org, tglx@linutronix.de, pjt@google.com, torvalds@linux-foundation.org Cc: vpillai , linux-kernel@vger.kernel.org, fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com, Phil Auld , Aaron Lu , Aubrey Li , aubrey.li@linux.intel.com, Valentin Schneider , Mel Gorman , Pawan Gupta , Paolo Bonzini , Joel Fernandes , joel@joelfernandes.org Subject: [RFC PATCH 00/13] Core scheduling v5 Date: Wed, 4 Mar 2020 16:59:50 +0000 Message-Id: X-Mailer: git-send-email 2.17.1 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Fifth iteration of the Core-Scheduling feature. Core scheduling is a feature that only allows trusted tasks to run concurrently on cpus sharing compute resources(eg: hyperthreads on a core). The goal is to mitigate the core-level side-channel attacks without requiring to disable SMT (which has a significant impact on performance in some situations). So far, the feature mitigates user-space to user-space attacks but not user-space to kernel attack, when one of the hardware thread enters the kernel (syscall, interrupt etc). By default, the feature doesn't change any of the current scheduler behavior. The user decides which tasks can run simultaneously on the same core (for now by having them in the same tagged cgroup). When a tag is enabled in a cgroup and a task from that cgroup is running on a hardware thread, the scheduler ensures that only idle or trusted tasks run on the other sibling(s). Besides security concerns, this feature can also be beneficial for RT and performance applications where we want to control how tasks make use of SMT dynamically. This version was focusing on performance and stability. Couple of crashes related to task tagging and cpu hotplug path were fixed. This version also improves the performance considerably by making task migration and load balancing coresched aware. In terms of performance, the major difference since the last iteration is that now even IO-heavy and mixed-resources workloads are less impacted by core-scheduling than by disabling SMT. Both host-level and VM-level benchmarks were performed. Details in: https://lkml.org/lkml/2020/2/12/1194 https://lkml.org/lkml/2019/11/1/269 v5 is rebased on top of 5.5.5(449718782a46) https://github.com/digitalocean/linux-coresched/tree/coresched/v5-v5.5.y Changes in v5 ------------- - Fixes for cgroup/process tagging during corner cases like cgroup destroy, task moving across cgroups etc - Tim Chen - Coresched aware task migrations - Aubrey Li - Other minor stability fixes. Changes in v4 ------------- - Implement a core wide min_vruntime for vruntime comparison of tasks across cpus in a core. - Aaron Lu - Fixes a typo bug in setting the forced_idle cpu. - Aaron Lu Changes in v3 ------------- - Fixes the issue of sibling picking up an incompatible task - Aaron Lu - Vineeth Pillai - Julien Desfossez - Fixes the issue of starving threads due to forced idle - Peter Zijlstra - Fixes the refcounting issue when deleting a cgroup with tag - Julien Desfossez - Fixes a crash during cpu offline/online with coresched enabled - Vineeth Pillai - Fixes a comparison logic issue in sched_core_find - Aaron Lu Changes in v2 ------------- - Fixes for couple of NULL pointer dereference crashes - Subhra Mazumdar - Tim Chen - Improves priority comparison logic for process in different cpus - Peter Zijlstra - Aaron Lu - Fixes a hard lockup in rq locking - Vineeth Pillai - Julien Desfossez - Fixes a performance issue seen on IO heavy workloads - Vineeth Pillai - Julien Desfossez - Fix for 32bit build - Aubrey Li ISSUES ------ - Aaron(Intel) found an issue with load balancing when the tasks have different weights(nice or cgroup shares). Task weight is not considered in coresched aware load balancing and causes those higher weights task to starve. - Joel(ChromeOS) found an issue where RT task may be preempted by a lower class task. - Joel(ChromeOS) found a deadlock and crash on PREEMPT kernel in the coreshed idle balance logic TODO ---- - Work on merging patches that are ready to be merged - Decide on the API for exposing the feature to userland - Experiment with adding synchronization points in VMEXIT to mitigate the VM-to-host-kernel leaking - Investigate the source of the overhead even when no tasks are tagged: https://lkml.org/lkml/2019/10/29/242 --- Aaron Lu (2): sched/fair: wrapper for cfs_rq->min_vruntime sched/fair: core wide vruntime comparison Aubrey Li (1): sched: migration changes for core scheduling Peter Zijlstra (9): sched: Wrap rq::lock access sched: Introduce sched_class::pick_task() sched: Core-wide rq->lock sched/fair: Add a few assertions sched: Basic tracking of matching tasks sched: Add core wide task selection and scheduling. sched: Trivial forced-newidle balancer sched: cgroup tagging interface for core scheduling sched: Debug bits... Tim Chen (1): sched: Update core scheduler queue when taking cpu online/offline include/linux/sched.h | 9 +- kernel/Kconfig.preempt | 6 + kernel/sched/core.c | 1037 +++++++++++++++++++++++++++++++++++++- kernel/sched/cpuacct.c | 12 +- kernel/sched/deadline.c | 69 ++- kernel/sched/debug.c | 4 +- kernel/sched/fair.c | 387 +++++++++++--- kernel/sched/idle.c | 11 +- kernel/sched/pelt.h | 2 +- kernel/sched/rt.c | 65 ++- kernel/sched/sched.h | 248 +++++++-- kernel/sched/stop_task.c | 13 +- kernel/sched/topology.c | 4 +- 13 files changed, 1672 insertions(+), 195 deletions(-) -- 2.17.1