Subject: Re: [PATCH 1/3] sched/fair: Add tg_load_contrib cfs_rq decay checking
From: Odin Ugedal
Date: Thu, 27 May 2021 09:50:42 +0200
To: Vincent Guittot
Cc: Odin Ugedal, Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
    Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
    "open list:CONTROL GROUP (CGROUP)", linux-kernel
References: <20210518125202.78658-1-odin@uged.al> <20210518125202.78658-2-odin@uged.al>

Hi,

Did some quick testing with bpftrace, and it looks like it is
"update_tg_cfs_load" that does this:

kretfunc:attach_entity_load_avg:
  cfs_rq: 0xffff8a3e6773e400, se.load_sum: 0, se.load_avg: 0,
  se->load.weight: 1048576, diff.sum: 0, diff.load: 0,
  new_sum: 0, new_load: 0, period_contrib: 821

// quite some time passes

kretfunc:update_tg_cfs_load:
  cfs_rq: 0xffff8a3e6773e400, old load_sum 0, old load_avg: 0,
  new load_sum: 47876096, new load_avg: 1022,
  tg_load_avg_contrib: 0, period_contrib: 82, on_list: 0, throttled: 0/0
  stack:
    bpf_prog_a37cf01922e02958_update_tg_cfs_l+504
    bpf_get_stackid_raw_tp+115
    bpf_prog_a37cf01922e02958_update_tg_cfs_l+504
    bpf_trampoline_6442466486_1+144
    update_tg_cfs_load+5
    update_load_avg+450
    enqueue_entity+91
    enqueue_task_fair+134
    move_queued_task+172
    affine_move_task+1282
    __set_cpus_allowed_ptr+317
    update_tasks_cpumask+70
    update_cpumasks_hier+448
    update_cpumask+345
    cpuset_write_resmask+796
    cgroup_file_write+162
    kernfs_fop_write_iter+284
    new_sync_write+345
    vfs_write+511
    ksys_write+103
    do_syscall_64+51
    entry_SYSCALL_64_after_hwframe+68

// There might be more stuff happening here;
// __update_load_avg_cfs_rq runs way too often to be able to
// trace it without filtering, and I didn't look into that.

kretfunc:update_tg_cfs_load:
  cfs_rq: 0xffff8a3e6773e400, old load_sum 48117265, old load_avg: 1025,
  new load_sum: 0, new load_avg: 2,
  tg_load_avg_contrib: 1022, period_contrib: 223, on_list: 1, throttled: 1/1
  stack:
    bpf_prog_a37cf01922e02958_update_tg_cfs_l+504
    bpf_get_stackid_raw_tp+115
    bpf_prog_a37cf01922e02958_update_tg_cfs_l+504
    bpf_trampoline_6442466486_1+144
    update_tg_cfs_load+5
    update_load_avg+450
    update_blocked_averages+723
    newidle_balance+533
    pick_next_task_fair+57
    __schedule+376
    schedule+91
    schedule_hrtimeout_range_clock+164
    do_sys_poll+1043
    __x64_sys_poll+179
    do_syscall_64+51
    entry_SYSCALL_64_after_hwframe+68

This particular example resulted in a 75/25 share of CPU time for the
two groups. From this, it also looks like it doesn't matter whether the
new load_avg is 0 or not (when the new load_sum is 0), since
tg_load_avg_contrib will not be properly removed either way (without
this patch).
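To make that last point a bit more concrete, here is a small stand-alone
userspace sketch of the kind of decay check the patch title is about,
fed with the numbers from the second update_tg_cfs_load record above.
It is only an illustration: the struct and the cfs_rq_is_decayed_sketch()
helper are simplified stand-ins I made up, not the actual code in
kernel/sched/fair.c and not the exact change in this patch.

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for the kernel structures; illustration only. */
struct sched_avg {
        unsigned long load_avg;
        unsigned long load_sum;
};

struct cfs_rq_sketch {
        struct sched_avg avg;
        unsigned long tg_load_avg_contrib;
};

/*
 * Sketch of the idea: only treat a cfs_rq as fully decayed when the
 * load it has contributed to the task group is zero as well, not just
 * its own load_sum/load_avg.
 */
static bool cfs_rq_is_decayed_sketch(const struct cfs_rq_sketch *cfs_rq)
{
        if (cfs_rq->avg.load_sum || cfs_rq->avg.load_avg)
                return false;

        /* the additional condition: a stale contribution keeps it alive */
        if (cfs_rq->tg_load_avg_contrib)
                return false;

        return true;
}

int main(void)
{
        /* Values from the second kretfunc:update_tg_cfs_load record. */
        struct cfs_rq_sketch rq = {
                .avg = { .load_avg = 2, .load_sum = 0 },
                .tg_load_avg_contrib = 1022,
        };

        /* Prints 0: the stale contribution of 1022 has to go first. */
        printf("decayed: %d\n", cfs_rq_is_decayed_sketch(&rq));
        return 0;
}

The point is just that as long as tg_load_avg_contrib is still 1022,
the cfs_rq cannot be considered fully decayed, no matter what
load_sum/load_avg say.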
There might be some other precision issues as well.

> I have added a trace in cfs_rq_is_decayed() but I'm not able to
> reproduce a situation where load_sum == 0 but not load_avg even with
> the script in the cover letter

Hmm, strange. I am able to reproduce it on both v5.12 and v5.13. I am
running on a non-SMT 4-core CPU (4670k), with CONFIG_HZ=250.

Since this is timing sensitive, you might need to tweak the timings for
either the CFS bandwidth period/quota, or the time spent waiting between
changing which CPU the task runs on. If you have more than 4 CPUs, you
can also try to start on CPU 0, move the task through all the cores, and
then onto CPU n. Maybe that makes it possible to reproduce.

Since bpftrace doesn't work properly on v5.13, this particular test was
done on v5.12 with a cherry-pick of
0258bdfaff5bd13c4d2383150b7097aecd6b6d82 and the other patch in this
patchset.

testscript.bt:

kfunc:attach_entity_load_avg
{
	@a_sum[tid] = args->cfs_rq->avg.load_sum;
	@a_load[tid] = args->cfs_rq->avg.load_avg;
}

kretfunc:attach_entity_load_avg
{
	printf("%s: cfs_rq: %p, se.load_sum: %llu, se.load_avg: %llu, se->load.weight: %llu diff.sum: %llu, diff.load: %llu, new_sum: %llu, new_load: %llu, period_contrib: %llu\n",
	       probe, args->cfs_rq, args->se->avg.load_sum,
	       args->se->avg.load_avg, args->se->load.weight,
	       args->cfs_rq->avg.load_sum - @a_sum[tid],
	       args->cfs_rq->avg.load_avg - @a_load[tid],
	       args->cfs_rq->avg.load_sum, args->cfs_rq->avg.load_avg,
	       args->cfs_rq->avg.period_contrib);
}

kfunc:update_tg_cfs_load
{
	@beg_load_avg[tid] = args->cfs_rq->avg.load_avg;
	@beg_load_sum[tid] = args->cfs_rq->avg.load_sum;
}

kretfunc:update_tg_cfs_load
{
	printf("%s: cfs_rq: %p, old load_sum %llu, old load_avg: %llu, new load_sum: %llu, new load_avg: %llu tg_load_avg_contrib: %llu, period_contrib: %llu, on_list: %d, throttled: %d/%d stack: %s\n",
	       probe, args->cfs_rq, @beg_load_sum[tid], @beg_load_avg[tid],
	       args->cfs_rq->avg.load_sum, args->cfs_rq->avg.load_avg,
	       args->cfs_rq->tg_load_avg_contrib,
	       args->cfs_rq->avg.period_contrib, args->cfs_rq->on_list,
	       args->cfs_rq->throttled, args->cfs_rq->throttle_count,
	       kstack);
}

Thanks
Odin
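(For completeness: the script above is plain bpftrace and can be run as
root with "bpftrace testscript.bt"; the kfunc/kretfunc probes it uses
require a kernel built with CONFIG_DEBUG_INFO_BTF.)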