Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753205AbbEHV0f (ORCPT ); Fri, 8 May 2015 17:26:35 -0400 Received: from v094114.home.net.pl ([79.96.170.134]:64286 "HELO v094114.home.net.pl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1752795AbbEHV0d (ORCPT ); Fri, 8 May 2015 17:26:33 -0400 From: "Rafael J. Wysocki" To: Preeti U Murthy Cc: peterz@infradead.org, tglx@linutronix.de, rafael.j.wysocki@intel.com, daniel.lezcano@linaro.org, rlippert@google.com, linux-pm@vger.kernel.org, linus.walleij@linaro.org, linux-kernel@vger.kernel.org, mingo@redhat.com, sudeep.holla@arm.com, linuxppc-dev@lists.ozlabs.org Subject: Re: [PATCH V3] cpuidle: Handle tick_broadcast_enter() failure gracefully Date: Fri, 08 May 2015 23:51:38 +0200 Message-ID: <3981224.gz7iCfoVSM@vostro.rjw.lan> User-Agent: KMail/4.11.5 (Linux/4.0.0+; KDE/4.11.5; x86_64; ; ) In-Reply-To: <8680371.ymYIEaFYPT@vostro.rjw.lan> References: <20150508073418.28491.4150.stgit@preeti.in.ibm.com> <8680371.ymYIEaFYPT@vostro.rjw.lan> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="utf-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7588 Lines: 203 On Friday, May 08, 2015 04:18:02 PM Rafael J. Wysocki wrote: > On Friday, May 08, 2015 01:05:32 PM Preeti U Murthy wrote: > > When a CPU has to enter an idle state where tick stops, it makes a call > > to tick_broadcast_enter(). The call will fail if this CPU is the > > broadcast CPU. Today, under such a circumstance, the arch cpuidle code > > handles this CPU. This is not convincing because not only do we not > > know what the arch cpuidle code does, but we also do not account for the > > idle state residency time and usage of such a CPU. > > > > This scenario can be handled better by simply choosing an idle state > > where in ticks do not stop. To accommodate this change move the setting > > of runqueue idle state from the core to the cpuidle driver, else the > > rq->idle_state will be set wrong. > > > > Signed-off-by: Preeti U Murthy > > --- > > Changes from V2: https://lkml.org/lkml/2015/5/7/78 > > Introduce a function in cpuidle core to select an idle state where ticks do not > > stop rather than going through the governors. > > > > Changes from V1: https://lkml.org/lkml/2015/5/7/24 > > Rebased on the latest linux-pm/bleeding-edge branch > > > > drivers/cpuidle/cpuidle.c | 45 +++++++++++++++++++++++++++++++++++++++++++-- > > include/linux/sched.h | 16 ++++++++++++++++ > > kernel/sched/core.c | 17 +++++++++++++++++ > > kernel/sched/fair.c | 2 +- > > kernel/sched/idle.c | 6 ------ > > kernel/sched/sched.h | 24 ------------------------ > > 6 files changed, 77 insertions(+), 33 deletions(-) > > > > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c > > index 8c24f95..d1af760 100644 > > --- a/drivers/cpuidle/cpuidle.c > > +++ b/drivers/cpuidle/cpuidle.c > > @@ -21,6 +21,7 @@ > > #include > > #include > > #include > > +#include > > #include > > > > #include "cpuidle.h" > > @@ -146,6 +147,36 @@ int cpuidle_enter_freeze(struct cpuidle_driver *drv, struct cpuidle_device *dev) > > return index; > > } > > > > +/* > > + * find_tick_valid_state - select a state where tick does not stop > > + * @dev: cpuidle device for this cpu > > + * @drv: cpuidle driver for this cpu > > + */ > > +static int find_tick_valid_state(struct cpuidle_device *dev, > > + struct cpuidle_driver *drv) > > +{ > > + int i, ret = -1; > > + > > + for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) { > > + struct cpuidle_state *s = &drv->states[i]; > > + struct cpuidle_state_usage *su = &dev->states_usage[i]; > > + > > + /* > > + * We do not explicitly check for latency requirement > > + * since it is safe to assume that only shallower idle > > + * states will have the CPUIDLE_FLAG_TIMER_STOP bit > > + * cleared and they will invariably meet the latency > > + * requirement. > > + */ > > + if (s->disabled || su->disable || > > + (s->flags & CPUIDLE_FLAG_TIMER_STOP)) > > + continue; > > + > > + ret = i; > > + } > > + return ret; > > +} > > + > > /** > > * cpuidle_enter_state - enter the state and update stats > > * @dev: cpuidle device for this cpu > > @@ -168,10 +199,17 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, > > * CPU as a broadcast timer, this call may fail if it is not available. > > */ > > if (broadcast && tick_broadcast_enter()) { > > - default_idle_call(); > > - return -EBUSY; > > + index = find_tick_valid_state(dev, drv); > > Well, the new state needs to be deeper I should have said "shallower", sorry about that. The state chosen by the governor satisfies certain latency requirements and we can't violate those by choosing a deeper state here. But the patch I sent actually did the right thing. :-) > than the old one or you may violate the governor's choice and this doesn't > guarantee that. > > Also I don't quite see a reason to duplicate the find_deepest_state() functionality > here. > > > + if (index < 0) { > > + default_idle_call(); > > + return -EBUSY; > > + } > > + target_state = &drv->states[index]; > > } > > > > + /* Take note of the planned idle state. */ > > + idle_set_state(smp_processor_id(), target_state); > > And I wouldn't do this either. > > The behavior here is pretty much as though the driver demoted the state chosen > by the governor and we don't call idle_set_state() again in those cases. > > > + > > trace_cpu_idle_rcuidle(index, dev->cpu); > > time_start = ktime_get(); > > Overall, something like the patch below (untested) should work I suppose? > > --- > drivers/cpuidle/cpuidle.c | 21 ++++++++++++++------- > 1 file changed, 14 insertions(+), 7 deletions(-) > > Index: linux-pm/drivers/cpuidle/cpuidle.c > =================================================================== > --- linux-pm.orig/drivers/cpuidle/cpuidle.c > +++ linux-pm/drivers/cpuidle/cpuidle.c > @@ -73,17 +73,19 @@ int cpuidle_play_dead(void) > } > > static int find_deepest_state(struct cpuidle_driver *drv, > - struct cpuidle_device *dev, bool freeze) > + struct cpuidle_device *dev, bool freeze, > + int limit, unsigned int flags_to_avoid) > { > unsigned int latency_req = 0; > int i, ret = freeze ? -1 : CPUIDLE_DRIVER_STATE_START - 1; > > - for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) { > + for (i = CPUIDLE_DRIVER_STATE_START; i < limit; i++) { > struct cpuidle_state *s = &drv->states[i]; > struct cpuidle_state_usage *su = &dev->states_usage[i]; > > if (s->disabled || su->disable || s->exit_latency <= latency_req > - || (freeze && !s->enter_freeze)) > + || (freeze && !s->enter_freeze) > + || (s->flags & flags_to_avoid)) > continue; > > latency_req = s->exit_latency; > @@ -100,7 +102,7 @@ static int find_deepest_state(struct cpu > int cpuidle_find_deepest_state(struct cpuidle_driver *drv, > struct cpuidle_device *dev) > { > - return find_deepest_state(drv, dev, false); > + return find_deepest_state(drv, dev, false, drv->state_count, 0); > } > > static void enter_freeze_proper(struct cpuidle_driver *drv, > @@ -139,7 +141,7 @@ int cpuidle_enter_freeze(struct cpuidle_ > * that interrupts won't be enabled when it exits and allows the tick to > * be frozen safely. > */ > - index = find_deepest_state(drv, dev, true); > + index = find_deepest_state(drv, dev, true, drv->state_count, 0); > if (index >= 0) > enter_freeze_proper(drv, dev, index); > > @@ -168,8 +170,13 @@ int cpuidle_enter_state(struct cpuidle_d > * CPU as a broadcast timer, this call may fail if it is not available. > */ > if (broadcast && tick_broadcast_enter()) { > - default_idle_call(); > - return -EBUSY; > + index = find_deepest_state(drv, dev, false, index, > + CPUIDLE_FLAG_TIMER_STOP); > + if (index < 0) { > + default_idle_call(); > + return -EBUSY; > + } > + target_state = &drv->states[index]; > } > > trace_cpu_idle_rcuidle(index, dev->cpu); > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/