Date: Wed, 12 Feb 2020 18:07:05 -0500
From: Julien Desfossez
To: Tim Chen
Cc: Vineeth Remanan Pillai, Nishanth Aravamudan, Peter Zijlstra,
    Ingo Molnar, Thomas Gleixner, Paul Turner, Linus Torvalds,
    Linux List Kernel Mailing, Dario Faggioli, Frédéric Weisbecker,
    Kees Cook, Greg Kerr, Phil Auld, Aaron Lu, Aubrey Li,
    Valentin Schneider, Mel Gorman, Pawan Gupta, Paolo Bonzini
Subject: Re: [RFC PATCH v4 00/19] Core scheduling v4
Message-ID: <20200212230705.GA25315@sinkpad>
References: <5e3cea14-28d1-bf1e-cabe-fb5b48fdeadc@linux.intel.com>
    <3c3c56c1-b8dc-652c-535e-74f6dcf45560@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 05-Feb-2020 04:28:18 PM, Tim Chen wrote:
> On 1/14/20 7:40 AM, Vineeth Remanan Pillai wrote:
> > On Mon, Jan 13, 2020 at 8:12 PM Tim Chen wrote:
> >
> >> I also encountered kernel panic with the v4 code when taking cpu offline or online
> >> when core scheduler is running. I've refreshed the previous patch, along
> >> with 3 other patches to fix problems related to CPU online/offline.
> >>
> >> As a side effect of the fix, each core can now operate in core-scheduling
> >> mode or non core-scheduling mode, depending on how many online SMT threads it has.
> >>
> >> Vineet, are you guys planning to refresh v4 and update it to v5? Aubrey posted
> >> a port to the latest kernel earlier.
> >>
> > Thanks for the updated patch Tim.
> >
> > We have been testing with v4 rebased on 5.4.8 as RC kernels had given us
> > trouble in the past. v5 is due soon and we are planning to release v5 when
> > 5.5 comes out. As of now, v5 has your crash fixes and Aubrey's changes
> > related to load balancing. We are investigating a performance issue with
> > high overcommit io intensive workload and also we are trying to see if
> > we can add synchronization during VMEXITs so that a guest vm cannot
> > run alongside with host kernel. We also need to think about the userland
> > interface for corescheduling in preparation for upstreaming work.
> >
>
> Vineet,
>
> Have you guys been able to make progress on the issues with I/O intensive workload?

I finally have some results with the following branch:
https://github.com/digitalocean/linux-coresched/tree/coresched/v4-v5.5.y

We tested the following classes of workloads in VMs (all vcpus in the same
cgroup/tag):
- linpack (pure CPU work)
- sysbench TPC-C (MySQL benchmark, good mix of CPU/net/disk) with/without noise
  VMs around
- FIO randrw VM with/without noise VMs around

Our "noise VMs" are 1-vcpu VMs running a simple workload that wakes up every 30
seconds, sends a couple of metrics over a VPN and goes back to sleep. They use
between 0% and 30% of CPU on the host all the time, nothing sustained, just ups
and downs.

# linpack
3x 12-vcpus pinned on a 36 hwthreads NUMA node (with smt on):
- core scheduling manages to perform slightly better than the baseline, by up
  to 20% in some cases!
- with nosmt (so 2:1 overcommit) the performance drops by 24%

# sysbench TPC-C
1x 12-vcpus MySQL server on each NUMA node, 48 client threads (running on a
different server):
- without noise: no performance difference between the 3 configurations
- with 96 noise VMs on each NUMA node:
  - Performance drops by 54% with core scheduling
  - Performance drops by 75% with nosmt
We write at about 130MB/s on disk with that test.

# FIO randrw 50%, 1 thread, libaio, bs=128k, iodepth=32
1x 12-vcpus FIO VM, usually only requires up to 100% CPU overall (data thread
and all vcpus summed), we read and write at about 350MB/s
alone:
- coresched drops 5%
- nosmt drops 1%
1:1 vcpus vs hardware thread on the NUMA node (filled with noise VMs):
- coresched drops 7%
- nosmt drops 22%
3:1 ratio:
- coresched drops 16%
- nosmt drops 22%
5:1 ratio:
- coresched drops 51%
- nosmt drops 61%

So the main conclusion is that for all the test cases we have studied, core
scheduling performs better than nosmt! This is different from what we tested a
while back, so it's looking really good! Now I am looking for confirmation from
others. Dario, did you have time to re-run your test suite against that same
branch?

After that, our next step is to trash all that by adding VMEXIT synchronization
points ;-)

Thanks,

Julien