Date: Fri, 13 Feb 2015 08:55:25 -0600
From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Miroslav Benes <mbenes@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>, Seth Jennings <sjenning@redhat.com>,
        Vojtech Pavlik <vojtech@suse.cz>,
        Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
        live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model
Message-ID: <20150213145525.GE27180@treble.redhat.com>
References: <cover.1423499826.git.jpoimboe@redhat.com>
 <alpine.LNX.2.00.1502131109200.2423@pobox.suse.cz>
 <20150213141904.GB27180@treble.redhat.com>
 <alpine.LNX.2.00.1502131520390.2423@pobox.suse.cz>
 <alpine.LNX.2.00.1502131532020.14133@pobox.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <alpine.LNX.2.00.1502131532020.14133@pobox.suse.cz>
User-Agent: Mutt/1.5.23.1-rc1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2681
Lines: 59

On Fri, Feb 13, 2015 at 03:40:14PM +0100, Miroslav Benes wrote:
> On Fri, 13 Feb 2015, Jiri Kosina wrote:
> 
> > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > 
> > > > How about we take a slightly different approach -- put a probe (or ftrace)
> > > > on __switch_to() during a klp transition period, and examine stacktraces 
> > > > for tasks that are just about to start running from there?
> > > > 
> > > > The only tasks that would not be covered by this would be purely CPU-bound 
> > > > tasks that never schedule. But we are likely in trouble with those anyway, 
> > > > because odds are that non-rescheduling CPU-bound tasks are also 
> > > > RT-priority tasks running on isolated CPUs, which we will fail to handle 
> > > > anyway.
> > > > 
> > > > I think Masami used a similar trick in his kpatch-without-stopmachine
> > > > approach.
> > > 
> > > Yeah, that's definitely an option, though I'm really not too crazy about
> > > it.  Hooking into the scheduler is kind of scary and disruptive.  
> > 
> > This is basically about running a stack check on ->next before
> > switching to it, i.e. a read-only operation (admittedly inducing some
> > latency, but that's the same with locking the runqueue). And only during
> > the transition phase.
> > 
> > > We'd also have to wake up all the sleeping processes.
> > 
> > Yes, I don't think there is a way around that.
> 
> If I understand you correctly, I think there are two ways to do it.
> 
> 1. we would put a probe on __switch_to and afterwards wake up all the 
>    sleeping processes.
> 
> 2. we would do it in an asynchronous manner. We would put a probe and let
>    the processes wake up on their own. The transition delayed workqueue
>    would only check whether any non-migrated process remains. Of course, if
>    some process sleeps for a long time, it would take a long time to
>    complete the patching. It would be up to the user to send a signal to
>    the process to wake it up.
> 
> Does it make sense? If yes, I cannot decide which approach is better.

Option 2 wouldn't really work for kthreads because you can't signal them
to wake up from user space.  And I really want to avoid having to leave
the system in a partially patched state for a long period of time.

But option 1 also wouldn't necessarily result in the system being
immediately patched, since CPU-bound tasks that never schedule would
never hit the probe.  So some asynchronous patching is still needed.
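
To make the difference between the two options concrete, here is a
rough userspace model.  It is only a sketch, not kernel code: the
sim_task struct, the task kinds and every function name below are made
up for illustration, and the "stack check" is reduced to a single flag.
It assumes a task can be migrated the first time it passes through the
probe with no old function on its stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
enum task_kind { TASK_SLEEPING, TASK_CPU_BOUND, TASK_NORMAL };

struct sim_task {
	enum task_kind kind;
	bool old_func_on_stack;	/* a to-be-patched function is on the stack */
	bool migrated;		/* task already uses the patched code */
};

/* Models the probe on __switch_to(): a read-only check of ->next. */
static void switch_probe(struct sim_task *next)
{
	assert(next != NULL);
	if (!next->migrated && !next->old_func_on_stack)
		next->migrated = true;
}

/* One scheduling round: only tasks that actually switch hit the probe. */
static void schedule_round(struct sim_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].kind == TASK_NORMAL)
			switch_probe(&tasks[i]);
}

/* Option 1: wake every sleeper so it passes through the probe once. */
static void wake_all_sleepers(struct sim_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (tasks[i].kind == TASK_SLEEPING) {
			tasks[i].kind = TASK_NORMAL;
			switch_probe(&tasks[i]);
		}
	}
}

/* Models the delayed-work check: how many tasks are still unmigrated? */
static size_t count_unmigrated(const struct sim_task *tasks, size_t n)
{
	size_t left = 0;

	for (size_t i = 0; i < n; i++)
		if (!tasks[i].migrated)
			left++;
	return left;
}
```

With one normal, one sleeping and one CPU-bound task, schedule_round()
alone (option 2) leaves both the sleeper and the CPU-bound task
unmigrated.  Calling wake_all_sleepers() (option 1) picks up the
sleeper, but the CPU-bound task is still left over, so something has to
keep calling count_unmigrated() later either way.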

-- 
Josh