From: Vincent Guittot
Date: Tue, 25 Nov 2014 14:48:02 +0100
Subject: Re: [PATCH v9 05/10] sched: make scale_rt invariant with frequency
To: Morten Rasmussen
Cc: "peterz@infradead.org", "mingo@kernel.org",
    "linux-kernel@vger.kernel.org", "preeti@linux.vnet.ibm.com",
    "kamalesh@linux.vnet.ibm.com", "linux-arm-kernel@lists.infradead.org",
    "riel@redhat.com", "efault@gmx.de", "nicolas.pitre@linaro.org",
    "linaro-kernel@lists.linaro.org"
In-Reply-To: <20141124170502.GK23177@e105550-lin.cambridge.arm.com>
References: <1415033687-23294-1-git-send-email-vincent.guittot@linaro.org>
    <1415033687-23294-6-git-send-email-vincent.guittot@linaro.org>
    <20141121123559.GF23177@e105550-lin.cambridge.arm.com>
    <20141124170502.GK23177@e105550-lin.cambridge.arm.com>

On 24 November 2014 at 18:05, Morten Rasmussen wrote:
> On Mon, Nov 24, 2014 at 02:24:00PM +0000, Vincent Guittot wrote:
>> On 21 November 2014 at 13:35, Morten Rasmussen wrote:
>> > On Mon, Nov 03, 2014 at 04:54:42PM +0000, Vincent Guittot wrote:
>>
>> [snip]
>>
>> >> The average running time of RT tasks is used to estimate the remaining compute
>> >> @@ -5801,19 +5801,12 @@ static unsigned long scale_rt_capacity(int cpu)
>> >>
>> >>       total = sched_avg_period() + delta;
>> >>
>> >> -     if (unlikely(total < avg)) {
>> >> -             /* Ensures that capacity won't end up being negative */
>> >> -             available = 0;
>> >> -     } else {
>> >> -             available = total - avg;
>> >> -     }
>> >> +     used = div_u64(avg, total);
>> >
>> > I haven't looked through all the details of the rt avg tracking, but if
>> > 'used' is in the range [0..SCHED_CAPACITY_SCALE], I believe it should
>> > work. Is it guaranteed that total > 0 so we don't get division by zero?
>>
>> static inline u64 sched_avg_period(void)
>> {
>>         return (u64)sysctl_sched_time_avg * NSEC_PER_MSEC / 2;
>> }
>>
>
> I see.
>
>> >
>> > It does get slightly more complicated if we want to figure out the
>> > available capacity at the current frequency (current < max) later. Say,
>> > rt eats 25% of the compute capacity, but the current frequency is only
>> > 50%. In that case we get:
>> >
>> > curr_avail_capacity = (arch_scale_cpu_capacity() *
>> > (arch_scale_freq_capacity() - (SCHED_CAPACITY_SCALE - scale_rt_capacity())))
>> > >> SCHED_CAPACITY_SHIFT
>>
>> You don't have to be so complicated but simply need to do:
>> curr_avail_capacity for CFS = (capacity_of(CPU) *
>> arch_scale_freq_capacity()) >> SCHED_CAPACITY_SHIFT
>>
>> capacity_of(CPU) = 600 is the max available capacity for CFS tasks
>> once we have removed the 25% of capacity that is used by RT tasks
>> arch_scale_freq_capacity = 512 because we currently run at 50% of max freq
>>
>> so curr_avail_capacity for CFS = 300
>
> I don't think that is correct. It is at least not what I had in mind.
>
> capacity_orig_of(cpu) = 800, we run at 50% frequency which means:
>
> curr_capacity = capacity_orig_of(cpu) * arch_scale_freq_capacity()
> >> SCHED_CAPACITY_SHIFT
> = 400
>
> So the total capacity at the current frequency (50%) is 400, without
> considering RT. scale_rt_capacity() is frequency invariant, so it takes
> away capacity_orig_of(cpu) - capacity_of(cpu) = 200 worth of capacity
> for RT. We need to subtract that from the current capacity to get the
> available capacity at the current frequency.
>
> curr_available_capacity = curr_capacity - (capacity_orig_of(cpu) -
> capacity_of(cpu)) = 200

You're right, this one looks good to me too.

>
> In other words, 800 is the max capacity, we are currently running at 50%
> frequency, which gives us 400. RT takes away 25% of 800
> (frequency-invariant) from the 400, which leaves us with 200 left for
> CFS tasks at the current frequency.
>
> In your calculations you subtract the RT load before computing the
> current capacity using arch_scale_freq_capacity(), where I think it
> should be done after. You find the amount of spare capacity you would have
> at the maximum frequency when RT has been subtracted and then scale the
> result by frequency, which means indirectly scaling the RT load
> contribution again (the rt avg has already been scaled). So instead of
> taking away 200 of the 400 (current capacity @ 50% frequency), it only
> takes away 100, which isn't right.
>
> scale_rt_capacity() is frequency-invariant, so if the RT load is 50% and
> the frequency is 50%, there are no spare cycles left.
> curr_avail_capacity should be 0. But using your expression above you
> would get capacity_of(cpu) = 400 after removing RT,
> arch_scale_freq_capacity = 512 and you get 200. I don't think that is
> right.
>
> Morten
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
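
For reference, the two ways of computing the CFS capacity available at the
current frequency that are compared in this thread can be checked with a small
user-space sketch. This is not kernel code: the function names
avail_subtract_then_scale() and avail_scale_then_subtract() are hypothetical,
the capacities are passed in as plain numbers instead of being read through
capacity_of(), capacity_orig_of() and arch_scale_freq_capacity(), and the clamp
at zero is an assumption of the sketch that the thread does not discuss.

```c
#include <assert.h>

#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/*
 * First proposal in the thread: take the capacity left after RT has been
 * removed (capacity_of) and scale it by the current frequency.
 */
unsigned long avail_subtract_then_scale(unsigned long capacity,
                                        unsigned long freq_scale)
{
	assert(freq_scale <= SCHED_CAPACITY_SCALE);
	return (capacity * freq_scale) >> SCHED_CAPACITY_SHIFT;
}

/*
 * Second proposal: scale the original capacity by the current frequency,
 * then subtract the frequency-invariant RT share
 * (capacity_orig - capacity).
 */
unsigned long avail_scale_then_subtract(unsigned long capacity_orig,
                                        unsigned long capacity,
                                        unsigned long freq_scale)
{
	unsigned long curr, rt;

	assert(freq_scale <= SCHED_CAPACITY_SCALE);
	assert(capacity <= capacity_orig);

	curr = (capacity_orig * freq_scale) >> SCHED_CAPACITY_SHIFT;
	rt = capacity_orig - capacity;

	/* Clamp at zero: an assumption of this sketch, not from the thread. */
	return curr > rt ? curr - rt : 0;
}
```

With the numbers used above (original capacity 800, frequency scale 512), an RT
share of 25% (capacity 600) gives 300 with the first function and 200 with the
second; an RT share of 50% (capacity 400) gives 200 and 0 respectively, which
matches the figures quoted in the discussion.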