Date: Thu, 19 Feb 2015 10:24:29 -0600
From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>, Ingo Molnar <mingo@redhat.com>,
        Jiri Kosina <jkosina@suse.cz>, Seth Jennings <sjenning@redhat.com>,
        linux-kernel@vger.kernel.org, Vojtech Pavlik <vojtech@suse.cz>
Subject: Re: [PATCH 1/3] sched: add sched_task_call()
Message-ID: <20150219162429.GA15980@treble.redhat.com>
References: <20150216220505.GB11861@treble.redhat.com>
 <20150217092450.GI5029@twins.programming.kicks-ass.net>
 <20150217141211.GC11861@treble.redhat.com>
 <20150217181541.GP5029@twins.programming.kicks-ass.net>
 <20150217212532.GJ11861@treble.redhat.com>
 <20150218152100.GZ5029@twins.programming.kicks-ass.net>
 <20150218171256.GA28553@treble.hsd1.ky.comcast.net>
 <20150219002058.GD5029@twins.programming.kicks-ass.net>
 <20150219041753.GA13423@treble.redhat.com>
 <20150219101607.GG5029@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20150219101607.GG5029@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.23.1-rc1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3968
Lines: 93

On Thu, Feb 19, 2015 at 11:16:07AM +0100, Peter Zijlstra wrote:
> On Wed, Feb 18, 2015 at 10:17:53PM -0600, Josh Poimboeuf wrote:
> > On Thu, Feb 19, 2015 at 01:20:58AM +0100, Peter Zijlstra wrote:
> > > On Wed, Feb 18, 2015 at 11:12:56AM -0600, Josh Poimboeuf wrote:
> 
> > > > The next line of attack is patching tasks when exiting the kernel to
> > > > user space (system calls, interrupts, signals), to catch all CPU-bound
> > > > and some I/O-bound tasks.  That's done in patch 9 [1] of the consistency
> > > > model patch set.
> > > 
> > > So the HPC people are really into userspace that does for (;;) ; and
> > > isolate that on CPUs and have the tick interrupt stopped and all that.
> > > 
> > > You'll not catch those threads on the sysexit path.
> > > 
> > > And I'm fairly sure they'll not want to SIGSTOP/CONT their stuff either.
> > > 
> > > Now its fairly easy to also handle this; just mark those tasks with a
> > > _TIF_WORK_SYSCALL_ENTRY flag, have that slowpath wait for the flag to
> > > go-away, then flip their state and clear the flag.
> > 
> > I guess you mean patch the task when it makes a syscall?  I'm doing that
> > already on syscall exit with a bit in _TIF_ALLWORK_MASK and
> > _TIF_DO_NOTIFY_MASK.
> 
> No, these tasks will _never_ make syscalls. So you need to guarantee
> they don't accidentally enter the kernel while you flip them. Something
> like so should do.
> 
> You set TIF_ENTER_WAIT on them, check they're still in userspace, flip
> them then clear TIF_ENTER_WAIT.

Ah, that's a good idea.  But how do we check if they're in user space?

> > > > As a last resort, if there are still any tasks which are sleeping on a
> > > > to-be-patched function, the user can send them SIGSTOP and SIGCONT to
> > > > force them to be patched.
> > > 
> > > You typically cannot SIGSTOP/SIGCONT kernel threads. Also
> > > TASK_UNINTERRUPTIBLE sleeps are unaffected by signals.
> > > 
> > > Bit pesky that.. needs pondering.
> 
> I still absolutely hate you need to disturb userspace like that. Signals
> are quite visible and perturb userspace state.

Yeah, SIGSTOP on a sleeping task can be intrusive to user space if it
results in EINTR being returned from a system call.  It's not ideal, but
it's much less intrusive than something like suspend.

But anyway we leave it up to the user to decide whether they want to
take that risk, or wait for the task to wake up on its own, or cancel
the patch.

> Also, you cannot SIGCONT a task that was SIGSTOP'ed by userspace for
> what they thought was a good reason. You'd wreck their state.

Hm, that's a good point.  Maybe use the freezer instead of signals?

(Again this would only be for those user tasks which are sleeping on a
patched function)

> > But now I'm thinking that kthreads will almost never be a problem.  Most
> > kthreads are basically this:
> 
> You guys are way too optimistic; maybe its because I've worked on
> realtime stuff too much, but I'm always looking at worst cases. If you
> can't handle those, I feel you might as well not bother :-)

Well, I think we're already resigned to the fact that live patching
won't work for every patch, every time.  And that the patch author must
know what they're doing, and must do it carefully.

Our target worst case is that the patching fails gracefully and the user
can't patch their system with that particular patch.  Or that the system
remains in a partially patched state forever, if the user is ok with
that.

> > Patching thread_fn wouldn't be possible unless we killed the thread.
> 
> It is, see kthread_park().

When the kthread returns from kthread_parkme(), it'll still be running
the old thread_fn code, regardless of whether we flipped the task's
patch state.

-- 
Josh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/