Message-ID: <54EC603C.5080003@hitachi.com>
Date: Tue, 24 Feb 2015 20:27:56 +0900
From: Masami Hiramatsu
Organization: Hitachi, Ltd., Japan
To: Josh Poimboeuf
Cc: Jiri Kosina, Seth Jennings, Vojtech Pavlik, live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model
References: <20150213141904.GB27180@treble.redhat.com> <20150213144152.GD27180@treble.redhat.com>
In-Reply-To: <20150213144152.GD27180@treble.redhat.com>

(2015/02/13 23:41), Josh Poimboeuf wrote:
> On Fri, Feb 13, 2015 at 03:22:15PM +0100, Jiri Kosina wrote:
>> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
>>
>>>> How about we take a slightly different aproach -- put a probe (or ftrace)
>>>> on __switch_to() during a klp transition period, and examine stacktraces
>>>> for tasks that are just about to start running from there?
>>>>
>>>> The only tasks that would not be covered by this would be purely CPU-bound
>>>> tasks that never schedule. But we are likely in trouble with those anyway,
>>>> because odds are that non-rescheduling CPU-bound tasks are also
>>>> RT-priority tasks running on isolated CPUs, which we will fail to handle
>>>> anyway.
>>>>
>>>> I think Masami used similar trick in his kpatch-without-stopmachine
>>>> aproach.
>>>
>>> Yeah, that's definitely an option, though I'm really not too crazy about
>>> it. Hooking into the scheduler is kind of scary and disruptive.
>>
>> This is basically about running a stack checking for ->next before
>> switching to it, i.e. read-only operation (admittedly inducing some
>> latency, but that's the same with locking the runqueue). And only when in
>> transition phase.
>
> Yes, but it would introduce much more latency than locking rq, since
> there would be at least some added latency to every schedule() call
> during the transition phase. Locking the rq would only add latency in
> those cases where another CPU is trying to do a context switch while
> we're holding the lock.

If we can implement the checking routine at the entry of the context
switch, it will not have such a big cost. My prototype code used kprobes
just as a hack, but we could do the same thing in the scheduler itself.

> It also seems much more dangerous. A bug in __switch_to() could easily
> do a lot of damage.

Indeed. For safety, it would require per-task locking in the scheduler
to avoid concurrent stack checking while the task is being switched to.

>>> We'd also have to wake up all the sleeping processes.
>>
>> Yes, I don't think there is a way around that.
>
> Actually this patch set is a way around that :-)

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com