Date: Thu, 7 Apr 2016 23:37:19 +0200 (CEST)
From: Jiri Kosina
To: Jessica Yu
Cc: Josh Poimboeuf, Miroslav Benes, linux-kernel@vger.kernel.org,
    live-patching@vger.kernel.org, Vojtech Pavlik
Subject: Re: sched: horrible way to detect whether a task has been preempted
In-Reply-To: <20160407211525.GB25804@packer-debian-8-amd64.digitalocean.com>
References: <24db5a6ae5b63dfcd2096a12d18e1399a351348e.1458933243.git.jpoimboe@redhat.com>
    <20160407211525.GB25804@packer-debian-8-amd64.digitalocean.com>

On Thu, 7 Apr 2016, Jessica Yu wrote:

> Been sort of rattling my head over the scheduler code :-) Just following
> the calls in and out of __schedule() it doesn't look like there is a
> current flag/mechanism to tell whether or not a task has been
> preempted..

Performing a complete stack unwind just to determine whether a task has
been preempted involuntarily is slight overkill indeed :/

> Is there any reason why you didn't just create a new task flag,
> something like TIF_PREEMPTED_IRQ, which would be set once
> preempt_schedule_irq() is entered and unset after __schedule() returns
> (for that task)? This would roughly correspond to setting the task flag
> when the frame for preempt_schedule_irq() is pushed and unsetting it
> just before the frame preempt_schedule_irq() is popped for that task.
> This seems simpler than walking through all the frames just to see if
> in_preempt_schedule_irq() had been called. Would that work?

Alternatively, without eating up TIF_ flag space, it'd be possible to push
a magic value on top of the stack in preempt_schedule_irq() (and pop it
once we are returning from there); if such a magic value is detected, we
just don't bother and claim unreliability.

That has the advantages of both approaches combined, i.e. it's relatively
low-cost in terms of performance penalty, and it's reliable (in the sense
that you don't have false positives: a preempted task always carries the
magic, so its stack is never reported as reliable).

The small disadvantage is that you can (very rarely, depending on the
chosen magic) have false negatives, i.e. a stack that happens to contain
the magic value by coincidence gets treated as unreliable. That probably
doesn't hurt too much, given the high improbability and non-lethal
consequences.

How does that sound?

-- 
Jiri Kosina
SUSE Labs
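
For illustration only, the magic-value idea might be modelled in userspace
C roughly as below. This is a toy sketch, not kernel code: struct toy_task,
the toy_* helpers, PREEMPT_MAGIC and its value are all hypothetical names
made up for the example, and scanning every in-use word is just one
possible way the check could look for the marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker; a real choice would need care to avoid collisions. */
#define PREEMPT_MAGIC UINT64_C(0x5ca1ab1edeadbeef)
#define TOY_STACK_WORDS 64

/* Toy stand-in for a task's kernel stack (grows upwards for simplicity). */
struct toy_task {
	uint64_t stack[TOY_STACK_WORDS];
	size_t sp;	/* number of words in use; stack[sp - 1] is the top */
};

static void toy_push(struct toy_task *t, uint64_t word)
{
	assert(t->sp < TOY_STACK_WORDS);
	t->stack[t->sp++] = word;
}

static uint64_t toy_pop(struct toy_task *t)
{
	assert(t->sp > 0);
	return t->stack[--t->sp];
}

/* Models entry into preempt_schedule_irq(): leave the marker on the stack. */
static void toy_preempt_enter(struct toy_task *t)
{
	toy_push(t, PREEMPT_MAGIC);
}

/* Models the return path: remove the marker again. */
static void toy_preempt_exit(struct toy_task *t)
{
	uint64_t word = toy_pop(t);

	assert(word == PREEMPT_MAGIC);
	(void)word;	/* silence unused warning when NDEBUG is defined */
}

/*
 * Claim unreliability if the marker is found anywhere in the used part of
 * the stack. A preempted task always has it (no false positives); ordinary
 * data that happens to equal the marker yields a rare false negative.
 */
static bool toy_stack_reliable(const struct toy_task *t)
{
	for (size_t i = 0; i < t->sp; i++)
		if (t->stack[i] == PREEMPT_MAGIC)
			return false;
	return true;
}
```

In this model a task between toy_preempt_enter() and toy_preempt_exit() is
always reported as unreliable, while a task whose ordinary data happens to
equal PREEMPT_MAGIC is reported as unreliable by coincidence, which is the
false-negative case mentioned above.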