Date: Fri, 20 Feb 2015 22:46:13 +0100
From: Vojtech Pavlik <vojtech@suse.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Kosina <jkosina@suse.cz>, Josh Poimboeuf <jpoimboe@redhat.com>,
        Peter Zijlstra <peterz@infradead.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Ingo Molnar <mingo@redhat.com>, Seth Jennings <sjenning@redhat.com>,
        linux-kernel@vger.kernel.org,
        Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: live patching design (was: Re: [PATCH 1/3] sched: add
 sched_task_call())

On Fri, Feb 20, 2015 at 08:49:01PM +0100, Ingo Molnar wrote:

> > ... the choice the sysadmins have here is either have the 
> > system running in an intermediate state, or have the 
> > system completely dead for the *same time*. Because to 
> > finish the transition successfully, all the tasks have to 
> > be woken up in any case.
> 
> That statement is false: an 'intermediate state' system 
> where 'new' tasks are still running is still running and 
> will interfere with the resolution of 'old' tasks.

Can you suggest a way in which they would interfere? The transition
happens when a task enters or returns from a syscall, so there is no
influence between individual tasks.

If you mean that the patch author has to consider the fact that both old
and new code will be running simultaneously, then yes, they have to.
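
A minimal sketch of that per-task switch, as a toy model in Python rather
than kernel code. The names Task, patched, cross_boundary, func_old and
func_new are made up for illustration and are not kGraft identifiers. It
only shows that each task consults its own flag, so one task's transition
does not change what another task runs, and that old and new code can be
live at the same time.

```python
from dataclasses import dataclass


def func_old(x):
    return x + 1  # stands in for the unpatched function


def func_new(x):
    return x + 2  # stands in for the patched function


@dataclass
class Task:
    name: str
    patched: bool = False  # hypothetical per-task flag

    def cross_boundary(self):
        # Syscall entry or return: the only place this task switches over.
        self.patched = True

    def call(self, x):
        # Dispatch on this task's own flag; no other task's state is read.
        return func_new(x) if self.patched else func_old(x)


tasks = [Task("a"), Task("b"), Task("c")]
tasks[0].cross_boundary()  # only task "a" has crossed the boundary so far
results = [t.call(10) for t in tasks]  # "a" runs new code, "b" and "c" old
```

Here task "a" already runs the new function while "b" and "c" still run
the old one, which is the situation the patch author has to account for.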

> > But I do get your point; what you are basically saying is 
> > that your preference is what kgraft is doing, and option 
> > to allow for a global synchronization point should be 
> > added in parallel to the gradual lazy migration.
> 
> I think you misunderstood: the 'simple' method I outlined 
> does not just 'synchronize', it actually executes the live 
> patching atomically, once all tasks are gathered and we 
> know they are _all_ in a safe state.

The 'simple' method has to catch and freeze all tasks one by one at
syscall entry/exit, at the kernel/userspace boundary, until all are
frozen, and only then patch the system atomically.

This means that each and every sleeping task in the system has to be
woken up in some way (sending a signal ...) to exit from a syscall it is
sleeping in. Same for CPU hogs. All kernel threads need to be parked.

This is exactly what you need to do for kGraft to complete patching.

This may take a significant amount of time to achieve, and you won't be
able to use a userspace script to send the signals, because the script
itself would be frozen immediately.
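
The comparison above can be put into a toy model. This is a sketch under
stated assumptions, not a measurement: the crossing times are invented
numbers in milliseconds, the function names are hypothetical, and the cost
of waking tasks, parking kernel threads and applying the patch itself is
ignored. It assumes only that both approaches finish once the last task
has reached the kernel/userspace boundary.

```python
def lazy_migration(crossing_ms):
    """Each task switches when it crosses the boundary; nothing is stopped."""
    done_at = max(crossing_ms)
    stalls = [0 for _ in crossing_ms]
    return done_at, stalls


def freeze_then_patch(crossing_ms):
    """Each task is frozen at the boundary until the last task arrives."""
    done_at = max(crossing_ms)
    stalls = [done_at - t for t in crossing_ms]
    return done_at, stalls


# Made-up times (ms) at which three tasks reach the boundary.
crossing_ms = [10, 500, 3000]

lazy_done, lazy_stalls = lazy_migration(crossing_ms)
freeze_done, freeze_stalls = freeze_then_patch(crossing_ms)
```

In this model both approaches complete at the same moment, but only the
freeze-then-patch variant stalls the tasks that arrived early, for as
long as the slowest task takes to arrive.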

> I.e. it's in essence the strong stop-all atomic patching 
> model of 'kpatch', combined with the reliable avoidance of 
> kernel stacks that 'kgraft' uses.

> That should be the starting point, because it's the most 
> reliable method.

In the consistency models discussion, this was labeled the
"LEAVE_KERNEL+SWITCH_KERNEL" model. It is indeed the strongest model of
all, but it also comes at the highest cost in terms of impact on running
tasks. That cost is so high (the interruption may be seconds or more)
that it was deemed not worth implementing.

-- 
Vojtech Pavlik
Director SUSE Labs