From: Jirka Hladky
Date: Wed, 20 May 2020 15:58:01 +0200
Subject: Re: [PATCH 00/13] Reconcile NUMA balancing decisions with the load balancer v6
To: Hillf Danton
Cc: Phil Auld, Mel Gorman, Peter Zijlstra, Ingo Molnar, Vincent Guittot,
    Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
    Valentin Schneider, LKML, Douglas Shakshober, Waiman Long, Joe Mario,
    Bill Gray, aokuliar@redhat.com, kkolakow@redhat.com

Hi Hillf, Mel and all,

thanks for the patch! It has produced really GOOD results.

1) It has fixed the performance problems with the 5.7 vanilla kernel for
single-tenant workloads and low-system-load scenarios, without degrading
performance for the multi-tenant tasks. It produces the same results as
the previous proof-of-concept patch, where the adjust_numa_imbalance
function was modified to be a no-op (returning the same value of
imbalance as it gets on the input); see the sketch below.

2) We have also added Mel's netperf-cstate-small-cross-socket test to
our test battery:
https://github.com/gormanm/mmtests/blob/master/configs/config-network-netperf-cstate-small-cross-socket

Mel told me that he had seen significant performance improvements with
5.7 over 5.6 for the netperf-cstate-small-cross-socket scenario.

Out of the 6 different patches we have tested, your patch has performed
the best for this scenario. Compared to vanilla, we see a minimal
performance degradation of 2.5% for the UDP stream throughput and 0.6%
for the TCP throughput. The testing was done on a dual-socket system
with Gold 6132 CPUs.

@Mel - could you please test Hillf's patch with your full testing suite?
So far it looks very promising, but I would like to check the patch
thoroughly to make sure it does not hurt performance in other areas.

Thanks a lot!
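For reference, the no-op proof of concept mentioned in point 1) boils
down to something like the sketch below. Treat it as an illustration
only: the prototype is reconstructed from the 5.7-era helper in
kernel/sched/fair.c and may not match the exact patch we carried.

static inline long adjust_numa_imbalance(int imbalance, int src_nr_running)
{
	/*
	 * Proof-of-concept no-op: report the computed imbalance unchanged
	 * instead of allowing a small imbalance to persist when the source
	 * node is nearly idle.
	 */
	return imbalance;
}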
Jirka

On Tue, May 19, 2020 at 6:32 AM Hillf Danton wrote:
>
>
> Hi Jirka
>
> On Mon, 18 May 2020 16:52:52 +0200 Jirka Hladky wrote:
> >
> > We have compared it against kernel with adjust_numa_imbalance disabled
> > [1], and both kernels perform at the same level for the single-tenant
> > jobs, but the proposed patch is bad for the multitenancy mode. The
> > kernel with adjust_numa_imbalance disabled is a clear winner here.
>
> Double thanks to you for the tests!
>
> > We would be very interested in what others think about disabling
> > adjust_numa_imbalance function. The patch is below. It would be great
>
> A minute...
>
> > to collect performance results for different scenarios to make sure
> > the results are objective.
>
> I don't have another test case but a diff trying to confine the tool
> in question back to the hard-coded 2's field.
>
> It's used in the first hunk below to detect imbalance before migrating
> a task, and a small churn of code is added at another call site when
> balancing idle CPUs.
>
> Thanks
> Hillf
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1916,20 +1916,26 @@ static void task_numa_find_cpu(struct ta
>  	 * imbalance that would be overruled by the load balancer.
>  	 */
>  	if (env->dst_stats.node_type == node_has_spare) {
> -		unsigned int imbalance;
> -		int src_running, dst_running;
> +		unsigned int imbalance = 2;
>
> -		/*
> -		 * Would movement cause an imbalance? Note that if src has
> -		 * more running tasks that the imbalance is ignored as the
> -		 * move improves the imbalance from the perspective of the
> -		 * CPU load balancer.
> -		 * */
> -		src_running = env->src_stats.nr_running - 1;
> -		dst_running = env->dst_stats.nr_running + 1;
> -		imbalance = max(0, dst_running - src_running);
> -		imbalance = adjust_numa_imbalance(imbalance, src_running);
> +		//No imbalance computed without spare capacity
> +		if (env->dst_stats.node_type != env->src_stats.node_type)
> +			goto check_imb;
> +
> +		imbalance = adjust_numa_imbalance(imbalance,
> +					env->src_stats.nr_running);
> +
> +		//Do nothing without imbalance
> +		if (!imbalance) {
> +			imbalance = 2;
> +			goto check_imb;
> +		}
> +
> +		//Migrate task if it's likely to grow balance
> +		if (env->dst_stats.nr_running + 1 < env->src_stats.nr_running)
> +			imbalance = 0;
>
> +check_imb:
>  		/* Use idle CPU if there is no imbalance */
>  		if (!imbalance) {
>  			maymove = true;
> @@ -9011,12 +9017,13 @@ static inline void calculate_imbalance(s
>  			env->migration_type = migrate_task;
>  			env->imbalance = max_t(long, 0, (local->idle_cpus -
>  						 busiest->idle_cpus) >> 1);
> -		}
>
> -		/* Consider allowing a small imbalance between NUMA groups */
> -		if (env->sd->flags & SD_NUMA)
> -			env->imbalance = adjust_numa_imbalance(env->imbalance,
> -						busiest->sum_nr_running);
> +			/* Consider allowing a small imbalance between NUMA groups */
> +			if (env->sd->flags & SD_NUMA &&
> +			    local->group_type == busiest->group_type)
> +				env->imbalance = adjust_numa_imbalance(env->imbalance,
> +						busiest->sum_nr_running);
> +		}
>
>  		return;
>  	}
> --
>

--
-Jirka
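For anyone following the first hunk above: ignoring the gotos, the
patched node_has_spare branch of task_numa_find_cpu() is roughly
equivalent to the simplified sketch below (field names as in
kernel/sched/fair.c; the surrounding code is elided, so this is an
illustration of the patched logic rather than the literal kernel code).

	if (env->dst_stats.node_type == node_has_spare) {
		unsigned int imbalance = 2;

		/* Only consult adjust_numa_imbalance() when the source
		 * node also has spare capacity. */
		if (env->dst_stats.node_type == env->src_stats.node_type) {
			imbalance = adjust_numa_imbalance(imbalance,
						env->src_stats.nr_running);
			if (!imbalance)
				/* Helper reports no imbalance: keep the
				 * hard-coded default of 2. */
				imbalance = 2;
			else if (env->dst_stats.nr_running + 1 <
				 env->src_stats.nr_running)
				/* Per the patch comment: migrate, as it is
				 * likely to improve the balance. */
				imbalance = 0;
		}

		/* Use an idle CPU on dst only if there is no imbalance. */
		if (!imbalance) {
			maymove = true;
			/* ... rest of the function unchanged ... */
		}
	}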