2013-10-11 09:29:32

by Daniel Kiper

Subject: kexec: Clearing registers just before jumping into purgatory

Hi,

Could you explain why you clear all registers just before jumping
into purgatory (please look into arch/x86/kernel/relocate_kernel_64.S
for more details)? There is not a single word about why. I do not
count the comment which states what is going on. purgatory on entry
does not assume any value in registers. Are you going to use that
feature for something in the future (e.g. to differentiate between
callers and/or Linux versions if needed)?

By the way, interestingly it is not done if preserve_context is in force.

Daniel


2013-10-11 10:11:59

by Eric W. Biederman

Subject: Re: kexec: Clearing registers just before jumping into purgatory

Daniel Kiper <[email protected]> writes:

> Hi,
>
> Could you explain why you clear all registers just before jumping
> into purgatory (please look into arch/x86/kernel/relocate_kernel_64.S
> for more details)? There is not a single word about why. I do not
> count the comment which states what is going on. purgatory on entry
> does not assume any value in registers. Are you going to use that
> feature for something in the future (e.g. to differentiate between
> callers and/or Linux versions if needed)?

It has been a long time now, but as I recall the reason was just to
have things well defined and to make certain that we were not
accidentally exporting anything except the stack pointer for
applications to depend upon.

0/NULL is a good choice because if you are expecting a pointer for some
strange reason, interesting things happen.

purgatory is definitely not the only target and the C version of
purgatory was actually written well after kexec came into existence.

Is there any particular reason why you are asking?

> By the way, interestingly it is not done if preserve_context is in
> force.

Something different is done, and all of the registers should be
preserved for the return to Linux.

In theory you can swap between two kernels with the preserve_context
case. Technically I like the ability but I don't know that it has ever
achieved much uptake.

Eric
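
The register state described in the message above, everything zeroed except the stack pointer, can be modeled in a few lines of C. This is only an illustrative sketch: `struct gp_regs` and `clear_regs_keep_sp` are hypothetical names made up here, and the real work in arch/x86/kernel/relocate_kernel_64.S is done in assembly on the live CPU registers, not on a struct.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the x86-64 general-purpose registers. */
struct gp_regs {
    uint64_t rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
};

/*
 * Zero every modeled register except the stack pointer, so the next
 * stage sees 0/NULL everywhere and cannot come to depend on leftovers.
 */
static void clear_regs_keep_sp(struct gp_regs *regs)
{
    uint64_t sp = regs->rsp;   /* the only value deliberately exported */

    memset(regs, 0, sizeof(*regs));
    regs->rsp = sp;
}
```

With this model, a consumer that wrongly treated a leftover value in, say, rdi as a pointer would find NULL there and fail right away instead of appearing to work.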

2013-10-11 11:05:46

by Daniel Kiper

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 03:08:43AM -0700, [email protected] wrote:
> Daniel Kiper <[email protected]> writes:
>
> > Hi,
> >
> > Could you explain why you clear all registers just before jumping
> > into purgatory (please look into arch/x86/kernel/relocate_kernel_64.S
> > for more details)? There is not a single word about why. I do not
> > count the comment which states what is going on. purgatory on entry
> > does not assume any value in registers. Are you going to use that
> > feature for something in the future (e.g. to differentiate between
> > callers and/or Linux versions if needed)?
>
> It has been a long time now, but as I recall the reason was just to
> have things well defined and to make certain that we were not
> accidentally exporting anything except the stack pointer for
> applications to depend upon.
>
> 0/NULL is a good choice because if you are expecting a pointer for some
> strange reason, interesting things happen.

This more or less matches my expectations.

> purgatory is definitely not the only target and the C version of
> purgatory was actually written well after kexec came into existence.
>
> Is there any particular reason why you are asking?

Yes, we (Xen guys) are discussing whether it is worth doing in our
kexec implementation. I think it is, because we used the Linux kernel
kexec implementation as a base for our work and we use kexec-tools too.
So we should be aligned with what is currently in the wild. David does
not agree with me. You can find more here:

http://lists.xen.org/archives/html/xen-devel/2013-10/msg00710.html
http://lists.xen.org/archives/html/xen-devel/2013-10/msg00296.html

What is your opinion on this?

> > By the way, interestingly it is not done if preserve_context is in
> > force.
>
> Something different is done, and all of the registers should be
> preserved for the return to Linux.

I expected that but purgatory does nothing with them.
However, maybe I missed something.

> In theory you can swap between two kernels with the preserve_context
> case. Technically I like the ability but I don't know that it has ever
> achieved much uptake.

I think that this is a nice idea too. However, I have not seen it used in practice.
There was even once an idea to remove that stuff from the Linux kernel.

Daniel

2013-10-11 12:52:45

by Vivek Goyal

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 01:04:55PM +0200, Daniel Kiper wrote:

[..]
> > In theory you can swap between two kernels with the preserve_context
> > case. Technically I like the ability but I don't know that it has ever
> > achieved much uptake.
>
> I think that this is a nice idea too. However, I have not seen it used in practice.
> There was even once an idea to remove that stuff from the Linux kernel.

I have not seen anybody using it. I don't even know if it works or not. If
nobody is using it, I will advocate removing it. It introduces extra
complexity and makes new code changes more difficult.

Thanks
Vivek

2013-10-11 15:38:04

by Matthew Garrett

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 08:52:06AM -0400, Vivek Goyal wrote:
> On Fri, Oct 11, 2013 at 01:04:55PM +0200, Daniel Kiper wrote:
>
> [..]
> > > In theory you can swap between two kernels with the preserve_context
> > > case. Technically I like the ability but I don't know that it has ever
> > > achieved much uptake.
> >
> > I think that this is a nice idea too. However, I have not seen it used in practice.
> > There was even once an idea to remove that stuff from the Linux kernel.
>
> I have not seen anybody using it. I don't even know if it works or not.

It works. I'm using it.

--
Matthew Garrett | [email protected]

2013-10-11 15:46:36

by Vivek Goyal

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 04:37:27PM +0100, Matthew Garrett wrote:
> On Fri, Oct 11, 2013 at 08:52:06AM -0400, Vivek Goyal wrote:
> > On Fri, Oct 11, 2013 at 01:04:55PM +0200, Daniel Kiper wrote:
> >
> > [..]
> > > > In theory you can swap between two kernels with the preserve_context
> > > > case. Technically I like the ability but I don't know that it has ever
> > > > achieved much uptake.
> > >
> > > I think that this is a nice idea too. However, I have not seen it used in practice.
> > > There was even once an idea to remove that stuff from the Linux kernel.
> >
> > I have not seen anybody using it. I don't even know if it works or not.
>
> It works. I'm using it.

Hi Matthew,

Just Curious. How is it useful. IOW, what's your use case of booting a new
kernel and then jumping back.

Thanks
Vivek

2013-10-11 15:48:36

by Matthew Garrett

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 11:44:50AM -0400, Vivek Goyal wrote:

> Just Curious. How is it useful. IOW, what's your use case of booting a new
> kernel and then jumping back.

I'm kexecing into a kernel with a modified /dev/mem, modifying the
original kernel and then jumping back into it.

--
Matthew Garrett | [email protected]

2013-10-11 16:33:27

by Richard Weinberger

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 5:48 PM, Matthew Garrett <[email protected]> wrote:
> On Fri, Oct 11, 2013 at 11:44:50AM -0400, Vivek Goyal wrote:
>
>> Just Curious. How is it useful. IOW, what's your use case of booting a new
>> kernel and then jumping back.
>
> I'm kexecing into a kernel with a modified /dev/mem, modifying the
> original kernel and then jumping back into it.

How do you update the original kernel?
Sounds interesting.

--
Thanks,
//richard

2013-10-11 16:40:11

by Matthew Garrett

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 06:33:23PM +0200, Richard Weinberger wrote:
> On Fri, Oct 11, 2013 at 5:48 PM, Matthew Garrett <[email protected]> wrote:
> > On Fri, Oct 11, 2013 at 11:44:50AM -0400, Vivek Goyal wrote:
> >
> >> Just Curious. How is it useful. IOW, what's your use case of booting a new
> >> kernel and then jumping back.
> >
> > I'm kexecing into a kernel with a modified /dev/mem, modifying the
> > original kernel and then jumping back into it.
>
> How do you update the original kernel?

It's still in RAM, so the same way you'd modify any other arbitrary
physical address?

--
Matthew Garrett | [email protected]

2013-10-11 16:42:38

by Richard Weinberger

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 6:39 PM, Matthew Garrett <[email protected]> wrote:
> On Fri, Oct 11, 2013 at 06:33:23PM +0200, Richard Weinberger wrote:
>> On Fri, Oct 11, 2013 at 5:48 PM, Matthew Garrett <[email protected]> wrote:
>> > On Fri, Oct 11, 2013 at 11:44:50AM -0400, Vivek Goyal wrote:
>> >
>> >> Just Curious. How is it useful. IOW, what's your use case of booting a new
>> >> kernel and then jumping back.
>> >
>> > I'm kexecing into a kernel with a modified /dev/mem, modifying the
>> > original kernel and then jumping back into it.
>>
>> How do you update the original kernel?
>
> It's still in RAM, so the same way you'd modify any other arbitrary
> physical address?

So, you have a tool like ksplice which patches the kernel in RAM?

--
Thanks,
//richard

2013-10-11 16:44:33

by Matthew Garrett

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 06:42:36PM +0200, Richard Weinberger wrote:
> On Fri, Oct 11, 2013 at 6:39 PM, Matthew Garrett <[email protected]> wrote:
> > On Fri, Oct 11, 2013 at 06:33:23PM +0200, Richard Weinberger wrote:
> >> On Fri, Oct 11, 2013 at 5:48 PM, Matthew Garrett <[email protected]> wrote:
> >> > On Fri, Oct 11, 2013 at 11:44:50AM -0400, Vivek Goyal wrote:
> >> >
> >> >> Just Curious. How is it useful. IOW, what's your use case of booting a new
> >> >> kernel and then jumping back.
> >> >
> >> > I'm kexecing into a kernel with a modified /dev/mem, modifying the
> >> > original kernel and then jumping back into it.
> >>
> >> How do you update the original kernel?
> >
> > It's still in RAM, so the same way you'd modify any other arbitrary
> > physical address?
>
> So, you have a tool like ksplice which patches the kernel in RAM?

I have /dev/mem and a list of addresses I want to modify.

--
Matthew Garrett | [email protected]

2013-10-11 16:47:32

by Richard Weinberger

Subject: Re: kexec: Clearing registers just before jumping into purgatory

Am 11.10.2013 18:44, schrieb Matthew Garrett:
> On Fri, Oct 11, 2013 at 06:42:36PM +0200, Richard Weinberger wrote:
>> On Fri, Oct 11, 2013 at 6:39 PM, Matthew Garrett <[email protected]> wrote:
>>> On Fri, Oct 11, 2013 at 06:33:23PM +0200, Richard Weinberger wrote:
>>>> On Fri, Oct 11, 2013 at 5:48 PM, Matthew Garrett <[email protected]> wrote:
>>>>> On Fri, Oct 11, 2013 at 11:44:50AM -0400, Vivek Goyal wrote:
>>>>>
>>>>>> Just Curious. How is it useful. IOW, what's your use case of booting a new
>>>>>> kernel and then jumping back.
>>>>>
>>>>> I'm kexecing into a kernel with a modified /dev/mem, modifying the
>>>>> original kernel and then jumping back into it.
>>>>
>>>> How do you update the original kernel?
>>>
>>> It's still in RAM, so the same way you'd modify any other arbitrary
>>> physical address?
>>
>> So, you have a tool like ksplice which patches the kernel in RAM?
>
> I have /dev/mem and a list of addresses I want to modify.

But you still need a magic tool which creates this list for you.
If you have a tool which takes two kernel images and creates such
a delta, fine.
I'm interested in that tool. :-)

Thanks,
//richard

2013-10-11 16:54:31

by Vivek Goyal

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 05:44:00PM +0100, Matthew Garrett wrote:
> On Fri, Oct 11, 2013 at 06:42:36PM +0200, Richard Weinberger wrote:
> > On Fri, Oct 11, 2013 at 6:39 PM, Matthew Garrett <[email protected]> wrote:
> > > On Fri, Oct 11, 2013 at 06:33:23PM +0200, Richard Weinberger wrote:
> > >> On Fri, Oct 11, 2013 at 5:48 PM, Matthew Garrett <[email protected]> wrote:
> > >> > On Fri, Oct 11, 2013 at 11:44:50AM -0400, Vivek Goyal wrote:
> > >> >
> > >> >> Just Curious. How is it useful. IOW, what's your use case of booting a new
> > >> >> kernel and then jumping back.
> > >> >
> > >> > I'm kexecing into a kernel with a modified /dev/mem, modifying the
> > >> > original kernel and then jumping back into it.
> > >>
> > >> How do you update the original kernel?
> > >
> > > It's still in RAM, so the same way you'd modify any other arbitrary
> > > physical address?
> >
> > So, you have a tool like ksplice which patches the kernel in RAM?
>
> I have /dev/mem and a list of addresses I want to modify.

Why boot into a second kernel to modify the first kernel's RAM? Why not
do it directly from the first kernel itself (unless we want the
first kernel to be stopped while doing those modifications)?

I guess one could hibernate, modify the image and resume to do a
similar thing.

Thanks
Vivek

2013-10-11 16:56:13

by Matthew Garrett

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 06:47:19PM +0200, Richard Weinberger wrote:

> But you still need a magic tool which creates this list for you.

I just read /proc/kallsyms. I'm really not doing anything complicated.

> If you have a tool which takes two kernel images and creates such
> a delta, fine.

Isn't that ksplice?

--
Matthew Garrett | [email protected]

2013-10-11 16:56:46

by Matthew Garrett

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 12:53:51PM -0400, Vivek Goyal wrote:
> On Fri, Oct 11, 2013 at 05:44:00PM +0100, Matthew Garrett wrote:
> > I have /dev/mem and a list of addresses I want to modify.
>
> Why boot into a second kernel to modify the first kernel's RAM? Why not
> do it directly from the first kernel itself (unless we want the
> first kernel to be stopped while doing those modifications)?

Because the kernel in question won't let me do that.

--
Matthew Garrett | [email protected]

2013-10-11 16:59:49

by Richard Weinberger

Subject: Re: kexec: Clearing registers just before jumping into purgatory

Am 11.10.2013 18:55, schrieb Matthew Garrett:
> On Fri, Oct 11, 2013 at 06:47:19PM +0200, Richard Weinberger wrote:
>
>> But you still need a magic tool which creates this list for you.
>
> I just read /proc/kallsyms. I'm really not doing anything complicated.
>
>> If you have a tool which takes two kernel images and creates such
>> a delta, fine.
>
> Isn't that ksplice?

So, you have a variant of ksplice which is able to kexec?

Thanks,
//richard

2013-10-11 17:02:23

by Matthew Garrett

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 06:59:41PM +0200, Richard Weinberger wrote:
> Am 11.10.2013 18:55, schrieb Matthew Garrett:
> > On Fri, Oct 11, 2013 at 06:47:19PM +0200, Richard Weinberger wrote:
> >
> >> But you still need a magic tool which creates this list for you.
> >
> > I just read /proc/kallsyms. I'm really not doing anything complicated.
> >
> >> If you have a tool which takes two kernel images and creates such
> >> a delta, fine.
> >
> > Isn't that ksplice?
>
> So, you have a variant of ksplice which is able to kexec?

No, I manually look up some addresses from /proc/kallsyms and then
modify them in the second kernel.

--
Matthew Garrett | [email protected]
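
The lookup step mentioned above can be sketched in C. This is a hedged illustration, not the procedure actually used in the thread (which is described as manual): `kallsyms_match` and `kallsyms_lookup` are hypothetical helper names, and the sketch assumes each line has the form `<hex address> <type> <name>`, optionally followed by a module tag. On many systems unprivileged readers of /proc/kallsyms see zeroed addresses (controlled by kptr_restrict), so a real lookup would typically need root.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line in the assumed /proc/kallsyms layout. Returns 1 and
 * stores the address in *addr if the line names `wanted`, else 0.
 */
static int kallsyms_match(const char *line, const char *wanted, uint64_t *addr)
{
    uint64_t a;
    char type;
    char name[256];

    /* Three fields are required; a trailing "[module]" tag is ignored. */
    if (sscanf(line, "%" SCNx64 " %c %255s", &a, &type, name) != 3)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;
    *addr = a;
    return 1;
}

/* Scan an open stream line by line until `wanted` is found. */
static int kallsyms_lookup(FILE *f, const char *wanted, uint64_t *addr)
{
    char line[512];

    while (fgets(line, sizeof(line), f))
        if (kallsyms_match(line, wanted, addr))
            return 1;
    return 0;
}
```

A caller would open /proc/kallsyms with `fopen` and pass the stream to `kallsyms_lookup`. The result is a kernel virtual address; translating it to the physical offset needed for /dev/mem is a separate step that this sketch does not cover.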

2013-10-11 20:47:42

by Eric W. Biederman

Subject: Re: kexec: Clearing registers just before jumping into purgatory

Matthew Garrett <[email protected]> writes:

> On Fri, Oct 11, 2013 at 06:59:41PM +0200, Richard Weinberger wrote:
>> Am 11.10.2013 18:55, schrieb Matthew Garrett:
>> > On Fri, Oct 11, 2013 at 06:47:19PM +0200, Richard Weinberger wrote:
>> >
>> >> But you still need a magic tool which creates this list for you.
>> >
>> > I just read /proc/kallsyms. I'm really not doing anything complicated.
>> >
>> >> If you have a tool which takes two kernel images and creates such
>> >> a delta, fine.
>> >
>> > Isn't that ksplice?
>>
>> So, you have a variant of ksplice which is able to kexec?
>
> No, I manually look up some addresses from /proc/kallsyms and then
> modify them in the second kernel.

An interesting approach. I think most of the rest of us would have just
built a module, or rebuilt our kernels.

Now if this is a backwards argument to remove that silly code path it
totally fails because now we know the code has not bit-rotted and
that there are active users.

If you are still pushing the signed-boot agenda I eagerly await your
patches to make all of this work in a sensible way with signed binaries.

Eric

2013-10-11 20:51:15

by Matthew Garrett

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 01:44:19PM -0700, Eric W. Biederman wrote:
> Matthew Garrett <[email protected]> writes:
> > No, I manually look up some addresses from /proc/kallsyms and then
> > modify them in the second kernel.
>
> An interesting approach. I think most of the rest of us would have just
> built a module, or rebuilt our kernels.

Well yeah, but my kernel refuses to load unsigned modules, so.

> Now if this is a backwards argument to remove that silly code path it
> totally fails because now we know the code has not bit-rotted and
> that there are active users.

No, it's not any argument of the kind.

> If you are still pushing the signed-boot agenda I eagerly await your
> patches to make all of this work in a sensible way with signed binaries.

Vivek's working on a separate kexec system call for that, as we agreed
with Linus at LPC.

--
Matthew Garrett | [email protected]

2013-10-11 22:15:56

by Eric W. Biederman

Subject: Re: kexec: Clearing registers just before jumping into purgatory

Daniel Kiper <[email protected]> writes:

> On Fri, Oct 11, 2013 at 03:08:43AM -0700, [email protected] wrote:
>> Daniel Kiper <[email protected]> writes:
>>
>> > Hi,
>> >
>> > Could you explain why you clear all registers just before jumping
>> > into purgatory (please look into arch/x86/kernel/relocate_kernel_64.S
>> > for more details)? There is not a single word about why. I do not
>> > count the comment which states what is going on. purgatory on entry
>> > does not assume any value in registers. Are you going to use that
>> > feature for something in the future (e.g. to differentiate between
>> > callers and/or Linux versions if needed)?
>>
>> It has been a long time now, but as I recall the reason was just to
>> have things well defined and to make certain that we were not
>> accidentally exporting anything except the stack pointer for
>> applications to depend upon.
>>
>> 0/NULL is a good choice because if you are expecting a pointer for some
>> strange reason, interesting things happen.
>
> This more or less matches my expectations.
>
>> purgatory is definitely not the only target and the C version of
>> purgatory was actually written well after kexec came into existence.
>>
>> Is there any particular reason why you are asking?
>
> Yes, we (Xen guys) are discussing whether it is worth doing in our
> kexec implementation. I think it is, because we used the Linux kernel
> kexec implementation as a base for our work and we use kexec-tools too.
> So we should be aligned with what is currently in the wild. David does
> not agree with me. You can find more here:
>
> http://lists.xen.org/archives/html/xen-devel/2013-10/msg00710.html
> http://lists.xen.org/archives/html/xen-devel/2013-10/msg00296.html
>
> What is your opinion on this?

I can see documenting the registers other than the stack pointer
as undefined. (A stack pointer is needed to implement PIC code).

For the implementation I recommend setting these registers to known
values. The issue is that your implementation will not change much and
if you don't set the registers to known values someone may develop a
dependency on what you happen to have those registers set to.

It is easier to force a fixed value that isn't hard to maintain into
your registers than to discover, when you make a change, that there is
some odd client that depends on some value that just happened to be in
your register, and that your necessary change is now made 10x harder by
a client you can't afford to break that depends on a bug in the
previous implementation.

So yes I strongly recommend setting the registers to 0 in this case.

>> Something different is done, and all of the registers should be
>> preserved for the return to Linux.
>
> I expected that but purgatory does nothing with them.
> However, maybe I missed something.

Yes. I think I am mostly in the camp of documenting that you can't
depend on them, but keeping them fixed to prevent problematic
dependencies creeping in because something just works...

Eric

2013-10-14 18:25:35

by Daniel Kiper

Subject: Re: kexec: Clearing registers just before jumping into purgatory

On Fri, Oct 11, 2013 at 03:15:38PM -0700, [email protected] wrote:
> Daniel Kiper <[email protected]> writes:
>
> > On Fri, Oct 11, 2013 at 03:08:43AM -0700, [email protected] wrote:
> >> Daniel Kiper <[email protected]> writes:

[...]

> > What is your opinion on this?
>
> I can see documenting the registers other than the stack pointer
> as undefined. (A stack pointer is needed to implement PIC code).
>
> For the implementation I recommend setting these registers to known
> values. The issue is that your implementation will not change much and
> if you don't set the registers to known values someone may develop a
> dependency on what you happen to have those registers set to.
>
> It is easier to force a fixed value that isn't hard to maintain into
> your registers than to discover, when you make a change, that there is
> some odd client that depends on some value that just happened to be in
> your register, and that your necessary change is now made 10x harder by
> a client you can't afford to break that depends on a bug in the
> previous implementation.
>
> So yes I strongly recommend setting the registers to 0 in this case.

Thank you for this explanation. I think that it is worth adding a relevant
comment to arch/x86/kernel/relocate_kernel_*.S and the purgatory entry.
I will try to prepare something once we work out a nice solution for Xen.

Daniel