2014-01-20 16:48:21

by H. Peter Anvin

Subject: [GIT PULL] x86/kaslr for v3.14

Hi Linus,

This enables kernel address space randomization for x86.

The following changes since commit d0e639c9e06d44e713170031fe05fb60ebe680af:

Linux 3.12-rc4 (2013-10-06 14:00:20 -0700)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-kaslr-for-linus

The head of this tree is da2b6fb990cf782b18952f534ec7323453bc4fc9.

x86, kaslr: Clarify RANDOMIZE_BASE_MAX_OFFSET (2014-01-14 10:45:56 -0800)

----------------------------------------------------------------
H. Peter Anvin (2):
x86, boot: Rename get_flags() and check_flags() to *_cpuflags()
x86, kaslr: Add a circular multiply for better bit diffusion

Kees Cook (10):
x86, boot: Move CPU flags out of cpucheck
x86, kaslr: Return location from decompress_kernel
x86, kaslr: Provide randomness functions
x86, kaslr: Select random position from e820 maps
x86, kaslr: Report kernel offset on panic
x86, kaslr: Raise the maximum virtual address to -1 GiB on x86_64
x86/relocs: Add percpu fixup for GNU ld 2.23
x86, kaslr: Mix entropy sources together as needed
x86, kaslr: Use char array to gain sizeof sanity
x86, kaslr: Clarify RANDOMIZE_BASE_MAX_OFFSET

Michael Davidson (1):
x86, relocs: Add more per-cpu gold special cases

Wei Yongjun (1):
x86, kaslr: Remove unused including <linux/version.h>

Documentation/kernel-parameters.txt | 4 +
arch/x86/Kconfig | 59 +++++-
arch/x86/boot/Makefile | 2 +-
arch/x86/boot/boot.h | 10 +-
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/compressed/aslr.c | 316 ++++++++++++++++++++++++++++++++
arch/x86/boot/compressed/cmdline.c | 2 +-
arch/x86/boot/compressed/cpuflags.c | 12 ++
arch/x86/boot/compressed/head_32.S | 10 +-
arch/x86/boot/compressed/head_64.S | 16 +-
arch/x86/boot/compressed/misc.c | 18 +-
arch/x86/boot/compressed/misc.h | 37 +++-
arch/x86/boot/cpucheck.c | 100 +---------
arch/x86/boot/cpuflags.c | 104 +++++++++++
arch/x86/boot/cpuflags.h | 19 ++
arch/x86/include/asm/archrandom.h | 21 +++
arch/x86/include/asm/page_64_types.h | 15 +-
arch/x86/include/asm/pgtable_64_types.h | 2 +-
arch/x86/kernel/cpu/rdrand.c | 14 --
arch/x86/kernel/setup.c | 26 +++
arch/x86/mm/init_32.c | 3 +
arch/x86/tools/relocs.c | 20 +-
22 files changed, 654 insertions(+), 158 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index fcbb736d55fe..773fc4c077b4 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1975,6 +1975,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
noapic [SMP,APIC] Tells the kernel to not make use of any
IOAPICs that may be present in the system.

+ nokaslr [X86]
+ Disable kernel base offset ASLR (Address Space
+ Layout Randomization) if built into the kernel.
+
noautogroup Disable scheduler automatic task group creation.

nobats [PPC] Do not use BATs for mapping kernel lowmem
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ee2fb9d37745..5c9e19dccf2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1722,16 +1722,67 @@ config RELOCATABLE

Note: If CONFIG_RELOCATABLE=y, then the kernel runs from the address
it has been loaded at and the compile time physical address
- (CONFIG_PHYSICAL_START) is ignored.
+ (CONFIG_PHYSICAL_START) is used as the minimum location.

-# Relocation on x86-32 needs some additional build support
+config RANDOMIZE_BASE
+ bool "Randomize the address of the kernel image"
+ depends on RELOCATABLE
+ depends on !HIBERNATION
+ default n
+ ---help---
+ Randomizes the physical and virtual address at which the
+ kernel image is decompressed, as a security feature that
+ deters exploit attempts relying on knowledge of the location
+ of kernel internals.
+
+ Entropy is generated using the RDRAND instruction if it is
+ supported. If RDTSC is supported, it is used as well. If
+ neither RDRAND nor RDTSC are supported, then randomness is
+ read from the i8254 timer.
+
+ The kernel will be offset by up to RANDOMIZE_BASE_MAX_OFFSET,
+ and aligned according to PHYSICAL_ALIGN. Since the kernel is
+ built using 2GiB addressing, and PHYSICAL_ALIGN must be at a
+ minimum of 2MiB, only 10 bits of entropy is theoretically
+ possible. At best, due to page table layouts, 64-bit can use
+ 9 bits of entropy and 32-bit uses 8 bits.
+
+ If unsure, say N.
+
+config RANDOMIZE_BASE_MAX_OFFSET
+ hex "Maximum kASLR offset allowed" if EXPERT
+ depends on RANDOMIZE_BASE
+ range 0x0 0x20000000 if X86_32
+ default "0x20000000" if X86_32
+ range 0x0 0x40000000 if X86_64
+ default "0x40000000" if X86_64
+ ---help---
+ The lesser of RANDOMIZE_BASE_MAX_OFFSET and available physical
+ memory is used to determine the maximal offset in bytes that will
+ be applied to the kernel when kernel Address Space Layout
+ Randomization (kASLR) is active. This must be a multiple of
+ PHYSICAL_ALIGN.
+
+ On 32-bit this is limited to 512MiB by page table layouts. The
+ default is 512MiB.
+
+ On 64-bit this is limited by how the kernel fixmap page table is
+ positioned, so this cannot be larger than 1GiB currently. Without
+ RANDOMIZE_BASE, there is a 512MiB to 1.5GiB split between kernel
+ and modules. When RANDOMIZE_BASE_MAX_OFFSET is above 512MiB, the
+ modules area will shrink to compensate, up to the current maximum
+ 1GiB to 1GiB split. The default is 1GiB.
+
+ If unsure, leave at the default value.
+
+# Relocation on x86 needs some additional build support
config X86_NEED_RELOCS
def_bool y
- depends on X86_32 && RELOCATABLE
+ depends on RANDOMIZE_BASE || (X86_32 && RELOCATABLE)

config PHYSICAL_ALIGN
hex "Alignment value to which kernel should be aligned"
- default "0x1000000"
+ default "0x200000"
range 0x2000 0x1000000 if X86_32
range 0x200000 0x1000000 if X86_64
---help---
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index 379814bc41e3..0da2e37b37c3 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -20,7 +20,7 @@ targets := vmlinux.bin setup.bin setup.elf bzImage
targets += fdimage fdimage144 fdimage288 image.iso mtools.conf
subdir- := compressed

-setup-y += a20.o bioscall.o cmdline.o copy.o cpu.o cpucheck.o
+setup-y += a20.o bioscall.o cmdline.o copy.o cpu.o cpuflags.o cpucheck.o
setup-y += early_serial_console.o edd.o header.o main.o mca.o memory.o
setup-y += pm.o pmjump.o printf.o regs.o string.o tty.o video.o
setup-y += video-mode.o version.o
diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index ef72baeff484..50f8c5e0f37e 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -26,9 +26,8 @@
#include <asm/boot.h>
#include <asm/setup.h>
#include "bitops.h"
-#include <asm/cpufeature.h>
-#include <asm/processor-flags.h>
#include "ctype.h"
+#include "cpuflags.h"

/* Useful macros */
#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
@@ -307,14 +306,7 @@ static inline int cmdline_find_option_bool(const char *option)
return __cmdline_find_option_bool(cmd_line_ptr, option);
}

-
/* cpu.c, cpucheck.c */
-struct cpu_features {
- int level; /* Family, or 64 for x86-64 */
- int model;
- u32 flags[NCAPINTS];
-};
-extern struct cpu_features cpu;
int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr);
int validate_cpu(void);

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index dcd90df10ab4..ae8b5dbbd8c5 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -27,7 +27,7 @@ HOST_EXTRACFLAGS += -I$(srctree)/tools/include

VMLINUX_OBJS = $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
$(obj)/string.o $(obj)/cmdline.o $(obj)/early_serial_console.o \
- $(obj)/piggy.o
+ $(obj)/piggy.o $(obj)/cpuflags.o $(obj)/aslr.o

$(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
new file mode 100644
index 000000000000..90a21f430117
--- /dev/null
+++ b/arch/x86/boot/compressed/aslr.c
@@ -0,0 +1,316 @@
+#include "misc.h"
+
+#ifdef CONFIG_RANDOMIZE_BASE
+#include <asm/msr.h>
+#include <asm/archrandom.h>
+#include <asm/e820.h>
+
+#include <generated/compile.h>
+#include <linux/module.h>
+#include <linux/uts.h>
+#include <linux/utsname.h>
+#include <generated/utsrelease.h>
+
+/* Simplified build-specific string for starting entropy. */
+static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
+ LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
+
+#define I8254_PORT_CONTROL 0x43
+#define I8254_PORT_COUNTER0 0x40
+#define I8254_CMD_READBACK 0xC0
+#define I8254_SELECT_COUNTER0 0x02
+#define I8254_STATUS_NOTREADY 0x40
+static inline u16 i8254(void)
+{
+ u16 status, timer;
+
+ do {
+ outb(I8254_PORT_CONTROL,
+ I8254_CMD_READBACK | I8254_SELECT_COUNTER0);
+ status = inb(I8254_PORT_COUNTER0);
+ timer = inb(I8254_PORT_COUNTER0);
+ timer |= inb(I8254_PORT_COUNTER0) << 8;
+ } while (status & I8254_STATUS_NOTREADY);
+
+ return timer;
+}
+
+static unsigned long rotate_xor(unsigned long hash, const void *area,
+ size_t size)
+{
+ size_t i;
+ unsigned long *ptr = (unsigned long *)area;
+
+ for (i = 0; i < size / sizeof(hash); i++) {
+ /* Rotate by odd number of bits and XOR. */
+ hash = (hash << ((sizeof(hash) * 8) - 7)) | (hash >> 7);
+ hash ^= ptr[i];
+ }
+
+ return hash;
+}
+
+/* Attempt to create a simple but unpredictable starting entropy. */
+static unsigned long get_random_boot(void)
+{
+ unsigned long hash = 0;
+
+ hash = rotate_xor(hash, build_str, sizeof(build_str));
+ hash = rotate_xor(hash, real_mode, sizeof(*real_mode));
+
+ return hash;
+}
+
+static unsigned long get_random_long(void)
+{
+#ifdef CONFIG_X86_64
+ const unsigned long mix_const = 0x5d6008cbf3848dd3UL;
+#else
+ const unsigned long mix_const = 0x3f39e593UL;
+#endif
+ unsigned long raw, random = get_random_boot();
+ bool use_i8254 = true;
+
+ debug_putstr("KASLR using");
+
+ if (has_cpuflag(X86_FEATURE_RDRAND)) {
+ debug_putstr(" RDRAND");
+ if (rdrand_long(&raw)) {
+ random ^= raw;
+ use_i8254 = false;
+ }
+ }
+
+ if (has_cpuflag(X86_FEATURE_TSC)) {
+ debug_putstr(" RDTSC");
+ rdtscll(raw);
+
+ random ^= raw;
+ use_i8254 = false;
+ }
+
+ if (use_i8254) {
+ debug_putstr(" i8254");
+ random ^= i8254();
+ }
+
+ /* Circular multiply for better bit diffusion */
+ asm("mul %3"
+ : "=a" (random), "=d" (raw)
+ : "a" (random), "rm" (mix_const));
+ random += raw;
+
+ debug_putstr("...\n");
+
+ return random;
+}
+
+struct mem_vector {
+ unsigned long start;
+ unsigned long size;
+};
+
+#define MEM_AVOID_MAX 5
+struct mem_vector mem_avoid[MEM_AVOID_MAX];
+
+static bool mem_contains(struct mem_vector *region, struct mem_vector *item)
+{
+ /* Item at least partially before region. */
+ if (item->start < region->start)
+ return false;
+ /* Item at least partially after region. */
+ if (item->start + item->size > region->start + region->size)
+ return false;
+ return true;
+}
+
+static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
+{
+ /* Item one is entirely before item two. */
+ if (one->start + one->size <= two->start)
+ return false;
+ /* Item one is entirely after item two. */
+ if (one->start >= two->start + two->size)
+ return false;
+ return true;
+}
+
+static void mem_avoid_init(unsigned long input, unsigned long input_size,
+ unsigned long output, unsigned long output_size)
+{
+ u64 initrd_start, initrd_size;
+ u64 cmd_line, cmd_line_size;
+ unsigned long unsafe, unsafe_len;
+ char *ptr;
+
+ /*
+ * Avoid the region that is unsafe to overlap during
+ * decompression (see calculations at top of misc.c).
+ */
+ unsafe_len = (output_size >> 12) + 32768 + 18;
+ unsafe = (unsigned long)input + input_size - unsafe_len;
+ mem_avoid[0].start = unsafe;
+ mem_avoid[0].size = unsafe_len;
+
+ /* Avoid initrd. */
+ initrd_start = (u64)real_mode->ext_ramdisk_image << 32;
+ initrd_start |= real_mode->hdr.ramdisk_image;
+ initrd_size = (u64)real_mode->ext_ramdisk_size << 32;
+ initrd_size |= real_mode->hdr.ramdisk_size;
+ mem_avoid[1].start = initrd_start;
+ mem_avoid[1].size = initrd_size;
+
+ /* Avoid kernel command line. */
+ cmd_line = (u64)real_mode->ext_cmd_line_ptr << 32;
+ cmd_line |= real_mode->hdr.cmd_line_ptr;
+ /* Calculate size of cmd_line. */
+ ptr = (char *)(unsigned long)cmd_line;
+ for (cmd_line_size = 0; ptr[cmd_line_size++]; )
+ ;
+ mem_avoid[2].start = cmd_line;
+ mem_avoid[2].size = cmd_line_size;
+
+ /* Avoid heap memory. */
+ mem_avoid[3].start = (unsigned long)free_mem_ptr;
+ mem_avoid[3].size = BOOT_HEAP_SIZE;
+
+ /* Avoid stack memory. */
+ mem_avoid[4].start = (unsigned long)free_mem_end_ptr;
+ mem_avoid[4].size = BOOT_STACK_SIZE;
+}
+
+/* Does this memory vector overlap a known avoided area? */
+bool mem_avoid_overlap(struct mem_vector *img)
+{
+ int i;
+
+ for (i = 0; i < MEM_AVOID_MAX; i++) {
+ if (mem_overlaps(img, &mem_avoid[i]))
+ return true;
+ }
+
+ return false;
+}
+
+unsigned long slots[CONFIG_RANDOMIZE_BASE_MAX_OFFSET / CONFIG_PHYSICAL_ALIGN];
+unsigned long slot_max = 0;
+
+static void slots_append(unsigned long addr)
+{
+ /* Overflowing the slots list should be impossible. */
+ if (slot_max >= CONFIG_RANDOMIZE_BASE_MAX_OFFSET /
+ CONFIG_PHYSICAL_ALIGN)
+ return;
+
+ slots[slot_max++] = addr;
+}
+
+static unsigned long slots_fetch_random(void)
+{
+ /* Handle case of no slots stored. */
+ if (slot_max == 0)
+ return 0;
+
+ return slots[get_random_long() % slot_max];
+}
+
+static void process_e820_entry(struct e820entry *entry,
+ unsigned long minimum,
+ unsigned long image_size)
+{
+ struct mem_vector region, img;
+
+ /* Skip non-RAM entries. */
+ if (entry->type != E820_RAM)
+ return;
+
+ /* Ignore entries entirely above our maximum. */
+ if (entry->addr >= CONFIG_RANDOMIZE_BASE_MAX_OFFSET)
+ return;
+
+ /* Ignore entries entirely below our minimum. */
+ if (entry->addr + entry->size < minimum)
+ return;
+
+ region.start = entry->addr;
+ region.size = entry->size;
+
+ /* Potentially raise address to minimum location. */
+ if (region.start < minimum)
+ region.start = minimum;
+
+ /* Potentially raise address to meet alignment requirements. */
+ region.start = ALIGN(region.start, CONFIG_PHYSICAL_ALIGN);
+
+ /* Did we raise the address above the bounds of this e820 region? */
+ if (region.start > entry->addr + entry->size)
+ return;
+
+ /* Reduce size by any delta from the original address. */
+ region.size -= region.start - entry->addr;
+
+ /* Reduce maximum size to fit end of image within maximum limit. */
+ if (region.start + region.size > CONFIG_RANDOMIZE_BASE_MAX_OFFSET)
+ region.size = CONFIG_RANDOMIZE_BASE_MAX_OFFSET - region.start;
+
+ /* Walk each aligned slot and check for avoided areas. */
+ for (img.start = region.start, img.size = image_size ;
+ mem_contains(&region, &img) ;
+ img.start += CONFIG_PHYSICAL_ALIGN) {
+ if (mem_avoid_overlap(&img))
+ continue;
+ slots_append(img.start);
+ }
+}
+
+static unsigned long find_random_addr(unsigned long minimum,
+ unsigned long size)
+{
+ int i;
+ unsigned long addr;
+
+ /* Make sure minimum is aligned. */
+ minimum = ALIGN(minimum, CONFIG_PHYSICAL_ALIGN);
+
+ /* Verify potential e820 positions, appending to slots list. */
+ for (i = 0; i < real_mode->e820_entries; i++) {
+ process_e820_entry(&real_mode->e820_map[i], minimum, size);
+ }
+
+ return slots_fetch_random();
+}
+
+unsigned char *choose_kernel_location(unsigned char *input,
+ unsigned long input_size,
+ unsigned char *output,
+ unsigned long output_size)
+{
+ unsigned long choice = (unsigned long)output;
+ unsigned long random;
+
+ if (cmdline_find_option_bool("nokaslr")) {
+ debug_putstr("KASLR disabled...\n");
+ goto out;
+ }
+
+ /* Record the various known unsafe memory ranges. */
+ mem_avoid_init((unsigned long)input, input_size,
+ (unsigned long)output, output_size);
+
+ /* Walk e820 and find a random address. */
+ random = find_random_addr(choice, output_size);
+ if (!random) {
+ debug_putstr("KASLR could not find suitable E820 region...\n");
+ goto out;
+ }
+
+ /* Always enforce the minimum. */
+ if (random < choice)
+ goto out;
+
+ choice = random;
+out:
+ return (unsigned char *)choice;
+}
+
+#endif /* CONFIG_RANDOMIZE_BASE */
diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
index bffd73b45b1f..b68e3033e6b9 100644
--- a/arch/x86/boot/compressed/cmdline.c
+++ b/arch/x86/boot/compressed/cmdline.c
@@ -1,6 +1,6 @@
#include "misc.h"

-#ifdef CONFIG_EARLY_PRINTK
+#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE

static unsigned long fs;
static inline void set_fs(unsigned long seg)
diff --git a/arch/x86/boot/compressed/cpuflags.c b/arch/x86/boot/compressed/cpuflags.c
new file mode 100644
index 000000000000..aa313466118b
--- /dev/null
+++ b/arch/x86/boot/compressed/cpuflags.c
@@ -0,0 +1,12 @@
+#ifdef CONFIG_RANDOMIZE_BASE
+
+#include "../cpuflags.c"
+
+bool has_cpuflag(int flag)
+{
+ get_cpuflags();
+
+ return test_bit(flag, cpu.flags);
+}
+
+#endif
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 5d6f6891b188..9116aac232c7 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -117,9 +117,11 @@ preferred_addr:
addl %eax, %ebx
notl %eax
andl %eax, %ebx
-#else
- movl $LOAD_PHYSICAL_ADDR, %ebx
+ cmpl $LOAD_PHYSICAL_ADDR, %ebx
+ jge 1f
#endif
+ movl $LOAD_PHYSICAL_ADDR, %ebx
+1:

/* Target address to relocate to for decompression */
addl $z_extract_offset, %ebx
@@ -191,14 +193,14 @@ relocated:
leal boot_heap(%ebx), %eax
pushl %eax /* heap area */
pushl %esi /* real mode pointer */
- call decompress_kernel
+ call decompress_kernel /* returns kernel location in %eax */
addl $24, %esp

/*
* Jump to the decompressed kernel.
*/
xorl %ebx, %ebx
- jmp *%ebp
+ jmp *%eax

/*
* Stack and heap for uncompression
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index c337422b575d..c5c1ae0997e7 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -94,9 +94,11 @@ ENTRY(startup_32)
addl %eax, %ebx
notl %eax
andl %eax, %ebx
-#else
- movl $LOAD_PHYSICAL_ADDR, %ebx
+ cmpl $LOAD_PHYSICAL_ADDR, %ebx
+ jge 1f
#endif
+ movl $LOAD_PHYSICAL_ADDR, %ebx
+1:

/* Target address to relocate to for decompression */
addl $z_extract_offset, %ebx
@@ -269,9 +271,11 @@ preferred_addr:
addq %rax, %rbp
notq %rax
andq %rax, %rbp
-#else
- movq $LOAD_PHYSICAL_ADDR, %rbp
+ cmpq $LOAD_PHYSICAL_ADDR, %rbp
+ jge 1f
#endif
+ movq $LOAD_PHYSICAL_ADDR, %rbp
+1:

/* Target address to relocate to for decompression */
leaq z_extract_offset(%rbp), %rbx
@@ -339,13 +343,13 @@ relocated:
movl $z_input_len, %ecx /* input_len */
movq %rbp, %r8 /* output target address */
movq $z_output_len, %r9 /* decompressed length */
- call decompress_kernel
+ call decompress_kernel /* returns kernel location in %rax */
popq %rsi

/*
* Jump to the decompressed kernel.
*/
- jmp *%rbp
+ jmp *%rax

.code32
no_longmode:
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 434f077d2c4d..196eaf373a06 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -112,14 +112,8 @@ struct boot_params *real_mode; /* Pointer to real-mode data */
void *memset(void *s, int c, size_t n);
void *memcpy(void *dest, const void *src, size_t n);

-#ifdef CONFIG_X86_64
-#define memptr long
-#else
-#define memptr unsigned
-#endif
-
-static memptr free_mem_ptr;
-static memptr free_mem_end_ptr;
+memptr free_mem_ptr;
+memptr free_mem_end_ptr;

static char *vidmem;
static int vidport;
@@ -395,7 +389,7 @@ static void parse_elf(void *output)
free(phdrs);
}

-asmlinkage void decompress_kernel(void *rmode, memptr heap,
+asmlinkage void *decompress_kernel(void *rmode, memptr heap,
unsigned char *input_data,
unsigned long input_len,
unsigned char *output,
@@ -422,6 +416,10 @@ asmlinkage void decompress_kernel(void *rmode, memptr heap,
free_mem_ptr = heap; /* Heap */
free_mem_end_ptr = heap + BOOT_HEAP_SIZE;

+ output = choose_kernel_location(input_data, input_len,
+ output, output_len);
+
+ /* Validate memory location choices. */
if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
error("Destination address inappropriately aligned");
#ifdef CONFIG_X86_64
@@ -441,5 +439,5 @@ asmlinkage void decompress_kernel(void *rmode, memptr heap,
parse_elf(output);
handle_relocations(output, output_len);
debug_putstr("done.\nBooting the kernel.\n");
- return;
+ return output;
}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 674019d8e235..24e3e569a13c 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -23,7 +23,15 @@
#define BOOT_BOOT_H
#include "../ctype.h"

+#ifdef CONFIG_X86_64
+#define memptr long
+#else
+#define memptr unsigned
+#endif
+
/* misc.c */
+extern memptr free_mem_ptr;
+extern memptr free_mem_end_ptr;
extern struct boot_params *real_mode; /* Pointer to real-mode data */
void __putstr(const char *s);
#define error_putstr(__x) __putstr(__x)
@@ -39,23 +47,40 @@ static inline void debug_putstr(const char *s)

#endif

-#ifdef CONFIG_EARLY_PRINTK
-
+#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE
/* cmdline.c */
int cmdline_find_option(const char *option, char *buffer, int bufsize);
int cmdline_find_option_bool(const char *option);
+#endif

-/* early_serial_console.c */
-extern int early_serial_base;
-void console_init(void);

+#if CONFIG_RANDOMIZE_BASE
+/* aslr.c */
+unsigned char *choose_kernel_location(unsigned char *input,
+ unsigned long input_size,
+ unsigned char *output,
+ unsigned long output_size);
+/* cpuflags.c */
+bool has_cpuflag(int flag);
#else
+static inline
+unsigned char *choose_kernel_location(unsigned char *input,
+ unsigned long input_size,
+ unsigned char *output,
+ unsigned long output_size)
+{
+ return output;
+}
+#endif

+#ifdef CONFIG_EARLY_PRINTK
/* early_serial_console.c */
+extern int early_serial_base;
+void console_init(void);
+#else
static const int early_serial_base;
static inline void console_init(void)
{ }
-
#endif

#endif
diff --git a/arch/x86/boot/cpucheck.c b/arch/x86/boot/cpucheck.c
index 4d3ff037201f..100a9a10076a 100644
--- a/arch/x86/boot/cpucheck.c
+++ b/arch/x86/boot/cpucheck.c
@@ -28,8 +28,6 @@
#include <asm/required-features.h>
#include <asm/msr-index.h>

-struct cpu_features cpu;
-static u32 cpu_vendor[3];
static u32 err_flags[NCAPINTS];

static const int req_level = CONFIG_X86_MINIMUM_CPU_FAMILY;
@@ -69,92 +67,8 @@ static int is_transmeta(void)
cpu_vendor[2] == A32('M', 'x', '8', '6');
}

-static int has_fpu(void)
-{
- u16 fcw = -1, fsw = -1;
- u32 cr0;
-
- asm("movl %%cr0,%0" : "=r" (cr0));
- if (cr0 & (X86_CR0_EM|X86_CR0_TS)) {
- cr0 &= ~(X86_CR0_EM|X86_CR0_TS);
- asm volatile("movl %0,%%cr0" : : "r" (cr0));
- }
-
- asm volatile("fninit ; fnstsw %0 ; fnstcw %1"
- : "+m" (fsw), "+m" (fcw));
-
- return fsw == 0 && (fcw & 0x103f) == 0x003f;
-}
-
-static int has_eflag(u32 mask)
-{
- u32 f0, f1;
-
- asm("pushfl ; "
- "pushfl ; "
- "popl %0 ; "
- "movl %0,%1 ; "
- "xorl %2,%1 ; "
- "pushl %1 ; "
- "popfl ; "
- "pushfl ; "
- "popl %1 ; "
- "popfl"
- : "=&r" (f0), "=&r" (f1)
- : "ri" (mask));
-
- return !!((f0^f1) & mask);
-}
-
-static void get_flags(void)
-{
- u32 max_intel_level, max_amd_level;
- u32 tfms;
-
- if (has_fpu())
- set_bit(X86_FEATURE_FPU, cpu.flags);
-
- if (has_eflag(X86_EFLAGS_ID)) {
- asm("cpuid"
- : "=a" (max_intel_level),
- "=b" (cpu_vendor[0]),
- "=d" (cpu_vendor[1]),
- "=c" (cpu_vendor[2])
- : "a" (0));
-
- if (max_intel_level >= 0x00000001 &&
- max_intel_level <= 0x0000ffff) {
- asm("cpuid"
- : "=a" (tfms),
- "=c" (cpu.flags[4]),
- "=d" (cpu.flags[0])
- : "a" (0x00000001)
- : "ebx");
- cpu.level = (tfms >> 8) & 15;
- cpu.model = (tfms >> 4) & 15;
- if (cpu.level >= 6)
- cpu.model += ((tfms >> 16) & 0xf) << 4;
- }
-
- asm("cpuid"
- : "=a" (max_amd_level)
- : "a" (0x80000000)
- : "ebx", "ecx", "edx");
-
- if (max_amd_level >= 0x80000001 &&
- max_amd_level <= 0x8000ffff) {
- u32 eax = 0x80000001;
- asm("cpuid"
- : "+a" (eax),
- "=c" (cpu.flags[6]),
- "=d" (cpu.flags[1])
- : : "ebx");
- }
- }
-}
-
/* Returns a bitmask of which words we have error bits in */
-static int check_flags(void)
+static int check_cpuflags(void)
{
u32 err;
int i;
@@ -187,8 +101,8 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
if (has_eflag(X86_EFLAGS_AC))
cpu.level = 4;

- get_flags();
- err = check_flags();
+ get_cpuflags();
+ err = check_cpuflags();

if (test_bit(X86_FEATURE_LM, cpu.flags))
cpu.level = 64;
@@ -207,8 +121,8 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
eax &= ~(1 << 15);
asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));

- get_flags(); /* Make sure it really did something */
- err = check_flags();
+ get_cpuflags(); /* Make sure it really did something */
+ err = check_cpuflags();
} else if (err == 0x01 &&
!(err_flags[0] & ~(1 << X86_FEATURE_CX8)) &&
is_centaur() && cpu.model >= 6) {
@@ -223,7 +137,7 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));

set_bit(X86_FEATURE_CX8, cpu.flags);
- err = check_flags();
+ err = check_cpuflags();
} else if (err == 0x01 && is_transmeta()) {
/* Transmeta might have masked feature bits in word 0 */

@@ -238,7 +152,7 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
: : "ecx", "ebx");
asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));

- err = check_flags();
+ err = check_cpuflags();
}

if (err_flags_ptr)
diff --git a/arch/x86/boot/cpuflags.c b/arch/x86/boot/cpuflags.c
new file mode 100644
index 000000000000..a9fcb7cfb241
--- /dev/null
+++ b/arch/x86/boot/cpuflags.c
@@ -0,0 +1,104 @@
+#include <linux/types.h>
+#include "bitops.h"
+
+#include <asm/processor-flags.h>
+#include <asm/required-features.h>
+#include <asm/msr-index.h>
+#include "cpuflags.h"
+
+struct cpu_features cpu;
+u32 cpu_vendor[3];
+
+static bool loaded_flags;
+
+static int has_fpu(void)
+{
+ u16 fcw = -1, fsw = -1;
+ unsigned long cr0;
+
+ asm volatile("mov %%cr0,%0" : "=r" (cr0));
+ if (cr0 & (X86_CR0_EM|X86_CR0_TS)) {
+ cr0 &= ~(X86_CR0_EM|X86_CR0_TS);
+ asm volatile("mov %0,%%cr0" : : "r" (cr0));
+ }
+
+ asm volatile("fninit ; fnstsw %0 ; fnstcw %1"
+ : "+m" (fsw), "+m" (fcw));
+
+ return fsw == 0 && (fcw & 0x103f) == 0x003f;
+}
+
+int has_eflag(unsigned long mask)
+{
+ unsigned long f0, f1;
+
+ asm volatile("pushf \n\t"
+ "pushf \n\t"
+ "pop %0 \n\t"
+ "mov %0,%1 \n\t"
+ "xor %2,%1 \n\t"
+ "push %1 \n\t"
+ "popf \n\t"
+ "pushf \n\t"
+ "pop %1 \n\t"
+ "popf"
+ : "=&r" (f0), "=&r" (f1)
+ : "ri" (mask));
+
+ return !!((f0^f1) & mask);
+}
+
+/* Handle x86_32 PIC using ebx. */
+#if defined(__i386__) && defined(__PIC__)
+# define EBX_REG "=r"
+#else
+# define EBX_REG "=b"
+#endif
+
+static inline void cpuid(u32 id, u32 *a, u32 *b, u32 *c, u32 *d)
+{
+ asm volatile(".ifnc %%ebx,%3 ; movl %%ebx,%3 ; .endif \n\t"
+ "cpuid \n\t"
+ ".ifnc %%ebx,%3 ; xchgl %%ebx,%3 ; .endif \n\t"
+ : "=a" (*a), "=c" (*c), "=d" (*d), EBX_REG (*b)
+ : "a" (id)
+ );
+}
+
+void get_cpuflags(void)
+{
+ u32 max_intel_level, max_amd_level;
+ u32 tfms;
+ u32 ignored;
+
+ if (loaded_flags)
+ return;
+ loaded_flags = true;
+
+ if (has_fpu())
+ set_bit(X86_FEATURE_FPU, cpu.flags);
+
+ if (has_eflag(X86_EFLAGS_ID)) {
+ cpuid(0x0, &max_intel_level, &cpu_vendor[0], &cpu_vendor[2],
+ &cpu_vendor[1]);
+
+ if (max_intel_level >= 0x00000001 &&
+ max_intel_level <= 0x0000ffff) {
+ cpuid(0x1, &tfms, &ignored, &cpu.flags[4],
+ &cpu.flags[0]);
+ cpu.level = (tfms >> 8) & 15;
+ cpu.model = (tfms >> 4) & 15;
+ if (cpu.level >= 6)
+ cpu.model += ((tfms >> 16) & 0xf) << 4;
+ }
+
+ cpuid(0x80000000, &max_amd_level, &ignored, &ignored,
+ &ignored);
+
+ if (max_amd_level >= 0x80000001 &&
+ max_amd_level <= 0x8000ffff) {
+ cpuid(0x80000001, &ignored, &ignored, &cpu.flags[6],
+ &cpu.flags[1]);
+ }
+ }
+}
diff --git a/arch/x86/boot/cpuflags.h b/arch/x86/boot/cpuflags.h
new file mode 100644
index 000000000000..ea97697e51e4
--- /dev/null
+++ b/arch/x86/boot/cpuflags.h
@@ -0,0 +1,19 @@
+#ifndef BOOT_CPUFLAGS_H
+#define BOOT_CPUFLAGS_H
+
+#include <asm/cpufeature.h>
+#include <asm/processor-flags.h>
+
+struct cpu_features {
+ int level; /* Family, or 64 for x86-64 */
+ int model;
+ u32 flags[NCAPINTS];
+};
+
+extern struct cpu_features cpu;
+extern u32 cpu_vendor[3];
+
+int has_eflag(unsigned long mask);
+void get_cpuflags(void);
+
+#endif
diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 0d9ec770f2f8..e6a92455740e 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -39,6 +39,20 @@

#ifdef CONFIG_ARCH_RANDOM

+/* Instead of arch_get_random_long() when alternatives haven't run. */
+static inline int rdrand_long(unsigned long *v)
+{
+ int ok;
+ asm volatile("1: " RDRAND_LONG "\n\t"
+ "jc 2f\n\t"
+ "decl %0\n\t"
+ "jnz 1b\n\t"
+ "2:"
+ : "=r" (ok), "=a" (*v)
+ : "0" (RDRAND_RETRY_LOOPS));
+ return ok;
+}
+
#define GET_RANDOM(name, type, rdrand, nop) \
static inline int name(type *v) \
{ \
@@ -68,6 +82,13 @@ GET_RANDOM(arch_get_random_int, unsigned int, RDRAND_INT, ASM_NOP3);

#endif /* CONFIG_X86_64 */

+#else
+
+static inline int rdrand_long(unsigned long *v)
+{
+ return 0;
+}
+
#endif /* CONFIG_ARCH_RANDOM */

extern void x86_init_rdrand(struct cpuinfo_x86 *c);
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 43dcd804ebd5..8de6d9cf3b95 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -39,9 +39,18 @@
#define __VIRTUAL_MASK_SHIFT 47

/*
- * Kernel image size is limited to 512 MB (see level2_kernel_pgt in
- * arch/x86/kernel/head_64.S), and it is mapped here:
+ * Kernel image size is limited to 1GiB due to the fixmap living in the
+ * next 1GiB (see level2_kernel_pgt in arch/x86/kernel/head_64.S). Use
+ * 512MiB by default, leaving 1.5GiB for modules once the page tables
+ * are fully set up. If kernel ASLR is configured, it can extend the
+ * kernel page table mapping, reducing the size of the modules area.
*/
-#define KERNEL_IMAGE_SIZE (512 * 1024 * 1024)
+#define KERNEL_IMAGE_SIZE_DEFAULT (512 * 1024 * 1024)
+#if defined(CONFIG_RANDOMIZE_BASE) && \
+ CONFIG_RANDOMIZE_BASE_MAX_OFFSET > KERNEL_IMAGE_SIZE_DEFAULT
+#define KERNEL_IMAGE_SIZE CONFIG_RANDOMIZE_BASE_MAX_OFFSET
+#else
+#define KERNEL_IMAGE_SIZE KERNEL_IMAGE_SIZE_DEFAULT
+#endif

#endif /* _ASM_X86_PAGE_64_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 2d883440cb9a..c883bf726398 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -58,7 +58,7 @@ typedef struct { pteval_t pte; } pte_t;
#define VMALLOC_START _AC(0xffffc90000000000, UL)
#define VMALLOC_END _AC(0xffffe8ffffffffff, UL)
#define VMEMMAP_START _AC(0xffffea0000000000, UL)
-#define MODULES_VADDR _AC(0xffffffffa0000000, UL)
+#define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
#define MODULES_END _AC(0xffffffffff000000, UL)
#define MODULES_LEN (MODULES_END - MODULES_VADDR)

diff --git a/arch/x86/kernel/cpu/rdrand.c b/arch/x86/kernel/cpu/rdrand.c
index 88db010845cb..384df5105fbc 100644
--- a/arch/x86/kernel/cpu/rdrand.c
+++ b/arch/x86/kernel/cpu/rdrand.c
@@ -31,20 +31,6 @@ static int __init x86_rdrand_setup(char *s)
}
__setup("nordrand", x86_rdrand_setup);

-/* We can't use arch_get_random_long() here since alternatives haven't run */
-static inline int rdrand_long(unsigned long *v)
-{
- int ok;
- asm volatile("1: " RDRAND_LONG "\n\t"
- "jc 2f\n\t"
- "decl %0\n\t"
- "jnz 1b\n\t"
- "2:"
- : "=r" (ok), "=a" (*v)
- : "0" (RDRAND_RETRY_LOOPS));
- return ok;
-}
-
/*
* Force a reseed cycle; we are architecturally guaranteed a reseed
* after no more than 512 128-bit chunks of random data. This also
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f0de6294b955..1708862fc40d 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -824,6 +824,20 @@ static void __init trim_low_memory_range(void)
}

/*
+ * Dump out kernel offset information on panic.
+ */
+static int
+dump_kernel_offset(struct notifier_block *self, unsigned long v, void *p)
+{
+ pr_emerg("Kernel Offset: 0x%lx from 0x%lx "
+ "(relocation range: 0x%lx-0x%lx)\n",
+ (unsigned long)&_text - __START_KERNEL, __START_KERNEL,
+ __START_KERNEL_map, MODULES_VADDR-1);
+
+ return 0;
+}
+
+/*
* Determine if we were loaded by an EFI loader. If so, then we have also been
* passed the efi memmap, systab, etc., so we should use these data structures
* for initialization. Note, the efi init code path is determined by the
@@ -1242,3 +1256,15 @@ void __init i386_reserve_resources(void)
}

#endif /* CONFIG_X86_32 */
+
+static struct notifier_block kernel_offset_notifier = {
+ .notifier_call = dump_kernel_offset
+};
+
+static int __init register_kernel_offset_dumper(void)
+{
+ atomic_notifier_chain_register(&panic_notifier_list,
+ &kernel_offset_notifier);
+ return 0;
+}
+__initcall(register_kernel_offset_dumper);
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 4287f1ffba7e..5bdc5430597c 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -806,6 +806,9 @@ void __init mem_init(void)
BUILD_BUG_ON(VMALLOC_START >= VMALLOC_END);
#undef high_memory
#undef __FIXADDR_TOP
+#ifdef CONFIG_RANDOMIZE_BASE
+ BUILD_BUG_ON(CONFIG_RANDOMIZE_BASE_MAX_OFFSET > KERNEL_IMAGE_SIZE);
+#endif

#ifdef CONFIG_HIGHMEM
BUG_ON(PKMAP_BASE + LAST_PKMAP*PAGE_SIZE > FIXADDR_START);
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index f7bab68a4b83..11f9285a2ff6 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -722,15 +722,25 @@ static void percpu_init(void)

/*
* Check to see if a symbol lies in the .data..percpu section.
- * For some as yet not understood reason the "__init_begin"
- * symbol which immediately preceeds the .data..percpu section
- * also shows up as it it were part of it so we do an explict
- * check for that symbol name and ignore it.
+ *
+ * The linker incorrectly associates some symbols with the
+ * .data..percpu section so we also need to check the symbol
+ * name to make sure that we classify the symbol correctly.
+ *
+ * The GNU linker incorrectly associates:
+ * __init_begin
+ * __per_cpu_load
+ *
+ * The "gold" linker incorrectly associates:
+ * init_per_cpu__irq_stack_union
+ * init_per_cpu__gdt_page
*/
static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
{
return (sym->st_shndx == per_cpu_shndx) &&
- strcmp(symname, "__init_begin");
+ strcmp(symname, "__init_begin") &&
+ strcmp(symname, "__per_cpu_load") &&
+ strncmp(symname, "init_per_cpu_", 13);
}
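
As a rough aside on the entropy numbers quoted in the RANDOMIZE_BASE help
text above: the little program below is illustrative only and is not part of
the patch, but it performs the same division that aslr.c uses to size its
slots[] array (maximum offset over physical alignment), which is where the
"9 bits / 8 bits" figures come from.

#include <stdio.h>

int main(void)
{
        /* Defaults taken from the Kconfig entries added above. */
        unsigned long max_offset_64 = 0x40000000UL; /* 1 GiB   on x86_64 */
        unsigned long max_offset_32 = 0x20000000UL; /* 512 MiB on x86_32 */
        unsigned long align         = 0x200000UL;   /* CONFIG_PHYSICAL_ALIGN, 2 MiB */

        /* Candidate load slots and the bits of entropy they provide. */
        printf("x86_64: %lu slots (~9 bits)\n", max_offset_64 / align); /* 512 */
        printf("x86_32: %lu slots (~8 bits)\n", max_offset_32 / align); /* 256 */
        return 0;
}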


2014-01-20 22:54:15

by Linus Torvalds

Subject: Re: [GIT PULL] x86/kaslr for v3.14

So I pulled this, but one question:

On Mon, Jan 20, 2014 at 8:47 AM, H. Peter Anvin <[email protected]> wrote:
> +config RANDOMIZE_BASE
> + bool "Randomize the address of the kernel image"
> + depends on RELOCATABLE
> + depends on !HIBERNATION

How fundamental is that "!HIBERNATION" issue? Right now that
anti-dependency on hibernation support will mean that no distro kernel
will actually use the kernel address space randomization. Which
long-term is a problem.

I'm not sure HIBERNATION is really getting all that much use, but I
suspect distros would still want to support it.

Is it just a temporary "I wasn't able to make it work, need to get
some PM people involved", or is it something really fundamental?

Linus

2014-01-20 23:00:55

by H. Peter Anvin

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/20/2014 02:54 PM, Linus Torvalds wrote:
> So I pulled this, but one question:
>
> On Mon, Jan 20, 2014 at 8:47 AM, H. Peter Anvin <[email protected]> wrote:
>> +config RANDOMIZE_BASE
>> + bool "Randomize the address of the kernel image"
>> + depends on RELOCATABLE
>> + depends on !HIBERNATION
>
> How fundamental is that "!HIBERNATION" issue? Right now that
> anti-dependency on hibernation support will mean that no distro kernel
> will actually use the kernel address space randomization. Which
> long-term is a problem.
>
> I'm not sure HIBERNATION is really getting all that much use, but I
> suspect distros would still want to support it.
>
> Is it just a temporary "I wasn't able to make it work, need to get
> some PM people involved", or is it something really fundamental?
>

Kees, could you comment?

-hpa

2014-01-20 23:12:17

by Linus Torvalds

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Mon, Jan 20, 2014 at 2:54 PM, Linus Torvalds
<[email protected]> wrote:
> So I pulled this, but one question:

.. oh, and since I decided to test it, and was looking for problems:
enabling kaslr breaks "perf". The *profile* looks fine, but the
disassembly doesn't work.

I'm not entirely surprised. I decided I wanted to test it for a
reason, after all. So it's not unexpected, but perhaps people hadn't
thought about it, and clearly hadn't tested it.

Kernel modules disassemble fine, so clearly perf knows about code that
moves around, but apparently it gets surprised when the core vmlinux
file disassembly doesn't match addresses.

Linus

2014-01-20 23:14:19

by H. Peter Anvin

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/20/2014 03:12 PM, Linus Torvalds wrote:
> On Mon, Jan 20, 2014 at 2:54 PM, Linus Torvalds
> <[email protected]> wrote:
>> So I pulled this, but one question:
>
> .. oh, and since I decided to test it, and was looking for problems:
> enabling kaslr breaks "perf". The *profile* looks fine, but the
> disassembly doesn't work.
>
> I'm not entirely surprised. I decided I wanted to test it for a
> reason, after all. So it's not unexpected, but perhaps people hadn't
> thought about it, and clearly hadn't tested it.
>
> Kernel modules disassemble fine, so clearly perf knows about code that
> moves around, but apparently it gets surprised when the core vmlinux
> file disassembly doesn't match addresses.
>

So this is presumably something that needs to be fixed in perf?

-hpa

2014-01-21 05:18:48

by Kees Cook

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Mon, Jan 20, 2014 at 2:54 PM, Linus Torvalds
<[email protected]> wrote:
> So I pulled this, but one question:
>
> On Mon, Jan 20, 2014 at 8:47 AM, H. Peter Anvin <[email protected]> wrote:
>> +config RANDOMIZE_BASE
>> + bool "Randomize the address of the kernel image"
>> + depends on RELOCATABLE
>> + depends on !HIBERNATION
>
> How fundamental is that "!HIBERNATION" issue? Right now that
> anti-dependency on hibernation support will mean that no distro kernel
> will actually use the kernel address space randomization. Which
> long-term is a problem.
>
> I'm not sure HIBERNATION is really getting all that much use, but I
> suspect distros would still want to support it.
>
> Is it just a temporary "I wasn't able to make it work, need to get
> some PM people involved", or is it something really fundamental?

Right, this is a "need to get PM people involved" situation. When
kASLR was being designed, hibernation learning about the kernel base
looked like a separable problem, and given the very long list of
requirements for making it work at all, I carved this out as "future
work".

As for perf, it's similar -- it's another entirely solvable problem,
but perf needs to be untaught some of its assumptions.

We've had a static kernel base forever, so I'm expecting some bumps in
the road here. I'm hopeful none of it will be too painful, though.

-Kees

--
Kees Cook
Chrome OS Security

2014-01-21 09:00:43

by Peter Zijlstra

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Mon, Jan 20, 2014 at 03:13:33PM -0800, H. Peter Anvin wrote:
> On 01/20/2014 03:12 PM, Linus Torvalds wrote:
> > On Mon, Jan 20, 2014 at 2:54 PM, Linus Torvalds
> > <[email protected]> wrote:
> >> So I pulled this, but one question:
> >
> > .. oh, and since I decided to test it, and was looking for problems:
> > enabling kaslr breaks "perf". The *profile* looks fine, but the
> > disassembly doesn't work.
> >
> > I'm not entirely surprised. I decided I wanted to test it for a
> > reason, after all. So it's not unexpected, but perhaps people hadn't
> > thought about it, and clearly hadn't tested it.
> >
> > Kernel modules disassemble fine, so clearly perf knows about code that
> > moves around, but apparently it gets surprised when the core vmlinux
> > file disassembly doesn't match addresses.
> >
>
> So this is presumably something that needs to be fixed in perf?

Where do we learn about the offset from userspace?

2014-01-21 10:27:11

by Ingo Molnar

Subject: Re: [GIT PULL] x86/kaslr for v3.14


* Linus Torvalds <[email protected]> wrote:

> On Mon, Jan 20, 2014 at 2:54 PM, Linus Torvalds
> <[email protected]> wrote:
> > So I pulled this, but one question:
>
> .. oh, and since I decided to test it, and was looking for problems:
> enabling kaslr breaks "perf". The *profile* looks fine, but the
> disassembly doesn't work.
>
> I'm not entirely surprised. I decided I wanted to test it for a
> reason, after all. So it's not unexpected, but perhaps people hadn't
> thought about it, and clearly hadn't tested it.
>
> Kernel modules disassemble fine, so clearly perf knows about code
> that moves around, but apparently it gets surprised when the core
> vmlinux file disassembly doesn't match addresses.

Hm, live annotation of the kernel image is a relatively new perf
feature, and KASLR predated that (by years) - which would at least in
part explain why it went unnoticed. (Although it does not excuse the
lack of testing.)

I've Cc:-ed Adrian Hunter, the author of the /proc/kcore annotation
feature.

Adrian: enabling x86 KASLR in Linus's latestest kernel via
CONFIG_RANDOMIZE_BASE=y [this is also present in the -tip tree] breaks
/proc/kcore annotation, because the kernel image position is now per
bootup randomized and perf's annotation code gets surprised by that.

Thanks,

Ingo

2014-01-21 13:56:56

by H. Peter Anvin

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/21/2014 02:27 AM, Ingo Molnar wrote:
>
> Hm, live annotation of the kernel image is a relatively new perf
> feature, and KASLR predated that (by years) - which would at least in
> part explain why it went unnoticed. (Although it does not excuse the
> lack of testing.)
>

kASLR is new, but on 32 bits we have relocated the kernel for a long
time. kASLR is the first use case of relocating the 64-bit kernel, though.

-hpa

2014-01-21 14:03:18

by Ingo Molnar

Subject: Re: [GIT PULL] x86/kaslr for v3.14


* H. Peter Anvin <[email protected]> wrote:

> On 01/21/2014 02:27 AM, Ingo Molnar wrote:
> >
> > Hm, live annotation of the kernel image is a relatively new perf
> > feature, and KASLR predated that (by years) - which would at least in
> > part explain why it went unnoticed. (Although it does not excuse the
> > lack of testing.)
>
> kASLR is new, but on 32 bits we have relocated the kernel for a long
> time. [...]

I doubt many people develop on 32-bit x86, and the group of people
looking at annotated 32-bit assembly kernel profiles ought to be
another order of magnitude smaller than that ...

> [...] kASLR is the first use case of relocating the 64-bit kernel,
> though.

Yeah.

Thanks,

Ingo

2014-01-21 14:06:42

by H. Peter Anvin

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/21/2014 06:03 AM, Ingo Molnar wrote:
>
> * H. Peter Anvin <[email protected]> wrote:
>
>> On 01/21/2014 02:27 AM, Ingo Molnar wrote:
>>>
>>> Hm, live annotation of the kernel image is a relatively new perf
>>> feature, and KASLR predated that (by years) - which would at least in
>>> part explain why it went unnoticed. (Although it does not excuse the
>>> lack of testing.)
>>
>> kASLR is new, but on 32 bits we have relocated the kernel for a long
>> time. [...]
>
> I doubt many people develop on 32-bit x86, and the group of people
> looking at annotated 32-bit assembly kernel profiles ought to be
> another order of magnitude smaller than that ...
>

Yes... I was commenting on the statement that "kASLR predated that by
years". It hasn't been common.

-hpa

2014-01-21 14:14:26

by Ingo Molnar

Subject: Re: [GIT PULL] x86/kaslr for v3.14


* H. Peter Anvin <[email protected]> wrote:

> On 01/21/2014 06:03 AM, Ingo Molnar wrote:
> >
> > * H. Peter Anvin <[email protected]> wrote:
> >
> >> On 01/21/2014 02:27 AM, Ingo Molnar wrote:
> >>>
> >>> Hm, live annotation of the kernel image is a relatively new perf
> >>> feature, and KASLR predated that (by years) - which would at least in
> >>> part explain why it went unnoticed. (Although it does not excuse the
> >>> lack of testing.)
> >>
> >> kASLR is new, but on 32 bits we have relocated the kernel for a long
> >> time. [...]
> >
> > I doubt many people develop on 32-bit x86, and the group of people
> > looking at annotated 32-bit assembly kernel profiles ought to be
> > another order of magnitude smaller than that ...
>
> Yes... I was commenting on the statement that "kASLR predated that
> by years". It hasn't been common.

Ah, I didn't mean to suggest that it's an old upstream feature: what I
mean is that the KASLR patch is pretty old, and it has been deployed
by the Chromium guys for quite some time, and by others?

It was just never combined with perf live annotation which is a recent
perf feature.

Anyway ... I suspect it's the fixing of the bug that matters most, not
its genealogy ;)

Thanks,

Ingo

2014-01-21 14:18:46

by H. Peter Anvin

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/21/2014 06:14 AM, Ingo Molnar wrote:
>
> Ah, I didn't mean to suggest that it's an old upstream feature: what I
> mean is that the KASLR patch is pretty old, and it has been deployed
> by the Chromium guys for quite some time, and by others?
>
> It was just never combined with perf live annotation which is a recent
> perf feature.
>

Ah, right. Parallel convergence.

> Anyway ... I suspect it's the fixing of the bug that matters most, not
> its genealogy ;)

Indeed.

-hpa

2014-01-21 14:21:26

by H. Peter Anvin

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/21/2014 01:00 AM, Peter Zijlstra wrote:
>>
>> So this is presumably something that needs to be fixed in perf?
>
> Where do we learn about the offset from userspace?
>

Now this is tricky... if this offset is too easy to get it completely
defeats kASLR. On the other hand, I presume that if we are exporting
/proc/kcore we're not secure anyway. Kees, I assume that in "secure"
mode perf annotations simply wouldn't work anyway?

-hpa

2014-01-21 14:39:45

by Ingo Molnar

Subject: Re: [GIT PULL] x86/kaslr for v3.14


* H. Peter Anvin <[email protected]> wrote:

> On 01/21/2014 01:00 AM, Peter Zijlstra wrote:
> >>
> >> So this is presumably something that needs to be fixed in perf?
> >
> > Where do we learn about the offset from userspace?
> >
>
> Now this is tricky... if this offset is too easy to get it
> completely defeats kASLR. On the other hand, I presume that if we
> are exporting /proc/kcore we're not secure anyway. Kees, I assume
> that in "secure" mode perf annotations simply wouldn't work anyway?

So /proc/kcore is:

aldebaran:~> ls -l /proc/kcore
-r-------- 1 root root 140737486266368 Jan 21 15:35 /proc/kcore

i.e. root only.

The thing is, one of my first remarks on this whole KASLR series was
that tooling needs to work. I suggested that the kernel should only
expose non-randomized addresses and that all facilities need to
continue to 'just work' with those. That argument was ignored AFAICS
and the problem still isn't solved.

I'd argue that solving it in the kernel instead of making all tooling
variants aware of KASLR one by one is a far more intelligent and
efficient solution ...

Thanks,

Ingo

2014-01-21 14:52:46

by H. Peter Anvin

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/21/2014 06:39 AM, Ingo Molnar wrote:
>>
>> Now this is tricky... if this offset is too easy to get it
>> completely defeats kASLR. On the other hand, I presume that if we
>> are exporting /proc/kcore we're not secure anyway. Kees, I assume
>> that in "secure" mode perf annotations simply wouldn't work anyway?
>
> So /proc/kcore is:
>
> aldebaran:~> ls -l /proc/kcore
> -r-------- 1 root root 140737486266368 Jan 21 15:35 /proc/kcore
>
> i.e. root only.
>

Yes, I mean if we want to be sealed against user space intrusion I'm
assuming /proc/kcore is unavailable, but that we don't

> The thing is, one of my first remarks on this whole KASLR series was
> that tooling needs to work. I suggested that the kernel should only
> expose non-randomized addresses and that all facilities need to
> continue to 'just work' with those. That argument was ignored AFAICS
> and the problem still isn't solved.
>
> I'd argue that solving it in the kernel instead of making all tooling
> variants aware of KASLR one by one is a far more intelligent and
> efficient solution ...

Not ignored, but found not to really work all that well (we had that
discussion in the context of relocated kernels, too.) The problem you
end up with is that as soon as you run into situations where you have to
deal with pointers during debugging, be it using kgdb, stack dumps or
whatever, all the work that you have done in the kernel to try to hide
relocation from the debug infrastructure all of a sudden becomes a huge
liability, and ends up backfiring in a horrific way.

-hpa

2014-01-21 14:56:33

by Ingo Molnar

Subject: Re: [GIT PULL] x86/kaslr for v3.14


* H. Peter Anvin <[email protected]> wrote:

> > The thing is, one of my first remarks on this whole KASLR series
> > was that tooling needs to work. I suggested that the kernel should
> > only expose non-randomized addresses and that all facilities need
> > to continue to 'just work' with those. That argument was ignored
> > AFAICS and the problem still isn't solved.
> >
> > I'd argue that solving it in the kernel instead of making all
> > tooling variants aware of KASLR one by one is a far more
> > intelligent and efficient solution ...
>
> Not ignored, but found not to really work all that well (we had that
> discussion in the context of relocated kernels, too.) The problem
> you end up with is that as soon as you run into situations where you
> have to deal with pointers during debugging, be it using kgdb, stack
> dumps or whatever, all the work that you have done in the kernel to
> try to hide relocation from the debug infrastructure all of a sudden
> becomes a huge liability, and ends up backfiring in a horrific way.

The thing is, that 'huge liability' is now pushed into tooling, which
isn't in any better position to judge a piece of data in a backtrace
than the kernel - in fact it's in an arguably worse position, as it
does not generate that data.

kgdb is an entirely different animal, I'm talking about the 99%
usecase: code profiling and tooling interpreting code addresses that
come from the kernel.

Thanks,

Ingo

2014-01-21 18:37:04

by Kees Cook

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 21, 2014 at 6:20 AM, H. Peter Anvin <[email protected]> wrote:
> On 01/21/2014 01:00 AM, Peter Zijlstra wrote:
>>>
>>> So this is presumably something that needs to be fixed in perf?
>>
>> Where do we learn about the offset from userspace?
>
> Now this is tricky... if this offset is too easy to get it completely
> defeats kASLR. On the other hand, I presume that if we are exporting
> /proc/kcore we're not secure anyway. Kees, I assume that in "secure"
> mode perf annotations simply wouldn't work anyway?

The goal of the kernel base address randomization is to keep it
secret from non-root users, confined processes, and/or remote systems.
For local secrecy, if you're running with kaslr and you haven't set
kptr_restrict, dmesg_restrict, and perf_event_paranoid, that's a
problem since you're likely leaking things trivially through
/proc/kallsyms, dmesg, and/or perf.
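
(For illustration, the knobs in question are the usual sysctls; something
along these lines in /etc/sysctl.conf, with the exact values being a policy
choice rather than anything prescribed by this series:

        kernel.kptr_restrict = 1
        kernel.dmesg_restrict = 1
        kernel.perf_event_paranoid = 2

kptr_restrict hides %pK-formatted pointers, such as the addresses in
/proc/kallsyms, from unprivileged users; dmesg_restrict limits dmesg to
privileged users; and perf_event_paranoid=2 restricts unprivileged perf to
user-space-only measurement.)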

-Kees

--
Kees Cook
Chrome OS Security

2014-01-23 09:39:25

by Pavel Machek

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Mon 2014-01-20 14:54:12, Linus Torvalds wrote:
> So I pulled this, but one question:
>
> On Mon, Jan 20, 2014 at 8:47 AM, H. Peter Anvin <[email protected]> wrote:
> > +config RANDOMIZE_BASE
> > + bool "Randomize the address of the kernel image"
> > + depends on RELOCATABLE
> > + depends on !HIBERNATION
>
> How fundamental is that "!HIBERNATION" issue? Right now that
> anti-dependency on hibernation support will mean that no distro kernel
> will actually use the kernel address space randomization. Which
> long-term is a problem.

Hibernation does some interesting tricks, and it did depend on image
layout staying the same.

Rafael did some great work on x86-64 to enable resuming a different
kernel version, so this should no longer be a problem (on
x86-64). Just test it; failure will not be subtle.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-01-26 10:16:33

by Richard Weinberger

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Mon, Jan 20, 2014 at 5:47 PM, H. Peter Anvin <[email protected]> wrote:
> Hi Linus,
>
> This enables kernel address space randomization for x86.
>
> The following changes since commit d0e639c9e06d44e713170031fe05fb60ebe680af:
>
> Linux 3.12-rc4 (2013-10-06 14:00:20 -0700)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-kaslr-for-linus
>
> The head of this tree is da2b6fb990cf782b18952f534ec7323453bc4fc9.
>
> x86, kaslr: Clarify RANDOMIZE_BASE_MAX_OFFSET (2014-01-14 10:45:56 -0800)
>
> ----------------------------------------------------------------
> H. Peter Anvin (2):
> x86, boot: Rename get_flags() and check_flags() to *_cpuflags()
> x86, kaslr: Add a circular multiply for better bit diffusion
>
> Kees Cook (10):
> x86, boot: Move CPU flags out of cpucheck
> x86, kaslr: Return location from decompress_kernel
> x86, kaslr: Provide randomness functions
> x86, kaslr: Select random position from e820 maps
> x86, kaslr: Report kernel offset on panic
> x86, kaslr: Raise the maximum virtual address to -1 GiB on x86_64
> x86/relocs: Add percpu fixup for GNU ld 2.23
> x86, kaslr: Mix entropy sources together as needed
> x86, kaslr: Use char array to gain sizeof sanity
> x86, kaslr: Clarify RANDOMIZE_BASE_MAX_OFFSET
>
> Michael Davidson (1):
> x86, relocs: Add more per-cpu gold special cases
>
> Wei Yongjun (1):
> x86, kaslr: Remove unused including <linux/version.h>

Currently we print the kernel offset only upon a panic() using the
panic notifier list.
This way it does not show up if the kernel hits a BUG() in process
context or something less critical.
Wouldn't it make more sense to report the offset in every dump_stack() or
show_regs() call?
Or at least add a hint that the kernel was built with kaslr enabled.

I'm a heavy user of addr2line -e vmlinux <booboo-happened-here>, so I
want to make sure that I don't waste too much time finding out that the
addresses are useless. :-)
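
(As an aside, with made-up example numbers: the offset printed by the panic
notifier can be subtracted by hand. If a trace shows ffffffffa064fbf6 and the
report says "Kernel Offset: 0x1f000000 from 0xffffffff81000000", resolving

        addr2line -e vmlinux 0xffffffff8164fbf6

i.e. 0xffffffffa064fbf6 minus 0x1f000000, gets you back to the un-randomized
vmlinux address.)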

--
Thanks,
//richard

2014-01-27 05:33:39

by H. Peter Anvin

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/26/2014 02:16 AM, Richard Weinberger wrote:
>
> Currently we print the kernel offset only upon a panic() using the
> panic notifier list.
> This way it does not show up if the kernel hits a BUG() in process
> context or something less critical.
>> Wouldn't it make more sense to report the offset in every dump_stack() or
> show_regs() call?

No, because that information is available to user space unless we panic.

-hpa

2014-01-27 06:49:14

by Richard Weinberger

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Mon, Jan 27, 2014 at 6:33 AM, H. Peter Anvin <[email protected]> wrote:
> On 01/26/2014 02:16 AM, Richard Weinberger wrote:
>>
>> Currently we print the kernel offset only upon a panic() using the
>> panic notifier list.
>> This way it does not show up if the kernel hits a BUG() in process
>> context or something less critical.
>> Wouldn't it make more sense to report the offset in every dump_stack() or
>> show_regs() call?
>
> No, because that information is available to user space unless we panic.

Didn't you mean non-root?
I thought one has to set dmesg_restrict anyways if kASLR is used.

And isn't the offset available to perf too?
Of course only for root, but still user space.

--
Thanks,
//richard

2014-01-27 06:51:46

by H. Peter Anvin

Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/26/2014 10:49 PM, Richard Weinberger wrote:
>>
>> No, because that information is available to user space unless we panic.
>
> Didn't you mean non-root?
> I thought one has to set dmesg_restrict anyways if kASLR is used.
>
> And isn't the offset available to perf too?
> Of course only for root, but still user space.
>

For certain system security levels one wants to protect even from a rogue
root. In those cases, leaking that information via dmesg and perf isn't
going to work, either.

With lower security settings, by all means...

-hpa

2014-01-27 06:53:23

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

Of course, stack traces themselves contain that information, so one
could argue that oops=panic is required in order for kASLR to provide
any kind of security against root. (oops=panic is probably a good idea
in secure environments anyway...)

-hpa

2014-01-27 07:34:35

by Richard Weinberger

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

Am 27.01.2014 07:52, schrieb H. Peter Anvin:
> Of course, stack traces themselves contain that information, so one
> could argue that oops=panic is required in order for kASLR to provide
> any kind of security against root. (oops=panic is probably a good idea
> in secure environments anyway...)

Now I understand your point.

/proc/<pid>/stack and a world-readable /boot also need to be disabled.
Deploying a secure kASLR is not easy, especially for end-user distros.

Maybe a CONFIG_RANDOMIZE_BASE_I_MEAN_IT which disables various sources of
information leakage would help too. ;-)

Thanks,
//richard

2014-01-27 07:38:42

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14


* H. Peter Anvin <[email protected]> wrote:

> On 01/26/2014 10:49 PM, Richard Weinberger wrote:
> >>
> >> No, because that information is available to user space unless we panic.
> >
> > Didn't you mean non-root?
> > I thought one has to set dmesg_restrict anyways if kASLR is used.
> >
> > And isn't the offset available to perf too?
> > Of course only for root, but still user space.
> >
>
> For certain system security levels one want to protect even from a
> rogue root. In those cases, leaking that information via dmesg and
> perf isn't going to work, either.
>
> With lower security settings, by all means...

The 'no' was categorical and unconditional though, so is the right
answer perhaps something more along the lines of:

'Yes, the random offset can be reported in an oops, as long as
high security setups can turn off the reporting of the offset,
in their idealistic attempt to protect the system against root.'

?

I also still think that in addition to reporting the offset,
automatically 'un-randomizing' the oopses and warnings would be useful
as well: with a clearly recognizable indicator used for every
unrandomized value, such as capitalizing its first hex digit.

Let me show a mockup of how I think it could work:

raw 64-bit original:

[ 246.085174] <IRQ> [<ffffffff8264fbf6>] dump_stack+0x46/0x58
[ 246.098352] [<ffffffff82054fb6>] warn_slowpath_fmt+0x46/0x50
[ 246.104786] [<ffffffff825710d6>] dev_watchdog+0x246/0x250
[ 246.110923] [<ffffffff82570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[ 246.119097] [<ffffffff8206092a>] call_timer_fn+0x3a/0x110
[ 246.125224] [<ffffffff8206280f>] ? update_process_times+0x6f/0x80

64-bit un-randomized:

[ 246.085174] <IRQ> [<FFFFFFFF8164fbf6>] dump_stack+0x46/0x58
[ 246.091633] [<FFFFFFFF81054ecc>] warn_slowpath_common+0x8c/0xc0
[ 246.098352] [<FFFFFFFF81054fb6>] warn_slowpath_fmt+0x46/0x50
[ 246.104786] [<FFFFFFFF815710d6>] dev_watchdog+0x246/0x250
[ 246.110923] [<FFFFFFFF81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[ 246.119097] [<FFFFFFFF8106092a>] call_timer_fn+0x3a/0x110
[ 246.125224] [<FFFFFFFF8106280f>] ? update_process_times+0x6f/0x80

Note how the hex values of unrandomized kernel text start with capital
letters, and how their values match up with System.map and vmlinux symbol
values.

raw 32-bit randomized:

[ 39.054098] [<c20ded55>] ? __jump_label_update+0x45/0x60
[ 39.064852] [<c2057aa2>] ? queue_work_on+0x32/0x70
[ 39.074570] [<c20085b1>] ? mark_tsc_unstable+0x21/0x60
[ 39.084980] [<c2f03af6>] ? tsc_init+0x326/0x344
[ 39.094175] [<c2eff9c5>] ? start_kernel+0x2c7/0x356

32-bit un-randomized:

[ 39.054098] [<C10ded55>] ? __jump_label_update+0x45/0x60
[ 39.064852] [<C1057aa2>] ? queue_work_on+0x32/0x70
[ 39.074570] [<C10085b1>] ? mark_tsc_unstable+0x21/0x60
[ 39.084980] [<C1f03af6>] ? tsc_init+0x326/0x344
[ 39.094175] [<C1eff9c5>] ? start_kernel+0x2c7/0x356

This looks eminently useful to me: I could plug those hex values into
gdb straight away to look up a symbol, instead of having to subtract
the random offset first.

This would do 99% of the unrandomizing job for the user/developer (and,
not least, for tooling), without obfuscating oopses, as it would be
clear which values the unrandomizing was performed on, without losing
information.

But only reporting the random offset with every raw oops and stack
dump would provide equivalent information as well. We can tweak this
as we go; it's not like these details are ABIs - I just still think
that the KASLR feature isn't nearly as utility-oriented as it should
be.

Thanks,

Ingo
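
For illustration, a minimal sketch of what such a formatting helper could
look like, assuming the random slide is recorded somewhere at boot
(kernel_slide below is a hypothetical placeholder, not an existing kernel
variable); printk's %lX and %pS formats would do the upper-casing and the
symbol lookup:

#include <linux/kernel.h>
#include <linux/printk.h>

static unsigned long kernel_slide;	/* hypothetical: recorded at boot */

/*
 * Print one trace entry with the slide subtracted; %lX renders the hex
 * letters in upper case, marking the value as un-randomized, while %pS
 * still prints the symbol+offset form for the real (randomized) address.
 */
static void printk_unrandomized(unsigned long addr)
{
	printk(KERN_DEFAULT " [<%016lX>] %pS\n",
	       addr - kernel_slide, (void *)addr);
}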

2014-01-27 07:43:41

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14


* Ingo Molnar <[email protected]> wrote:

>
> * H. Peter Anvin <[email protected]> wrote:
>
> > On 01/26/2014 10:49 PM, Richard Weinberger wrote:
> > >>
> > >> No, because that information is available to user space unless we panic.
> > >
> > > Didn't you mean non-root?
> > > I thought one has to set dmesg_restrict anyways if kASLR is used.
> > >
> > > And isn't the offset available to perf too?
> > > Of course only for root, but still user space.
> > >
> >
> > For certain system security levels one want to protect even from a
> > rogue root. In those cases, leaking that information via dmesg and
> > perf isn't going to work, either.
> >
> > With lower security settings, by all means...
>
> The 'no' was categorical and unconditional though, so is the right
> answer perhaps something more along the lines of:
>
> 'Yes, the random offset can be reported in an oops, as long as
> high security setups can turn off the reporting of the offset,
> in their idealistic attempt to protect the system against root.'
>
> ?

'reporting of the offset' should probably be 'reporting kernel data' -
there are many possible ways an oops (and its associated raw stack dump)
can leak the offset; I'm not sure this can ever be made 'safe' against
a rogue root.

Not giving out kernel-originated debug information at all would be. (At the
cost of reducing the utility of having that root user around, of
course.)

Thanks,

Ingo

2014-01-27 07:59:48

by Richard Weinberger

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

Am 27.01.2014 08:38, schrieb Ingo Molnar:
>
> * H. Peter Anvin <[email protected]> wrote:
>
>> On 01/26/2014 10:49 PM, Richard Weinberger wrote:
>>>>
>>>> No, because that information is available to user space unless we panic.
>>>
>>> Didn't you mean non-root?
>>> I thought one has to set dmesg_restrict anyways if kASLR is used.
>>>
>>> And isn't the offset available to perf too?
>>> Of course only for root, but still user space.
>>>
>>
>> For certain system security levels one want to protect even from a
>> rogue root. In those cases, leaking that information via dmesg and
>> perf isn't going to work, either.
>>
>> With lower security settings, by all means...
>
> The 'no' was categorical and unconditional though, so is the right
> answer perhaps something more along the lines of:
>
> 'Yes, the random offset can be reported in an oops, as long as
> high security setups can turn off the reporting of the offset,
> in their idealistic attempt to protect the system against root.'
>
> ?
>
> I also still think that in addition to reporting the offset,
> automatically 'un-randomizing' the oopses and warnings would be useful
> as well: with a clear to recognize indicator used for every value
> unrandomized, such as capitalizing their first hexa digit.
>
> Let me show a mockup of how I think it could work:
>
> raw 64-bit original:
>
> [ 246.085174] <IRQ> [<ffffffff8264fbf6>] dump_stack+0x46/0x58
> [ 246.098352] [<ffffffff82054fb6>] warn_slowpath_fmt+0x46/0x50
> [ 246.104786] [<ffffffff825710d6>] dev_watchdog+0x246/0x250
> [ 246.110923] [<ffffffff82570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
> [ 246.119097] [<ffffffff8206092a>] call_timer_fn+0x3a/0x110
> [ 246.125224] [<ffffffff8206280f>] ? update_process_times+0x6f/0x80
>
> 64-bit un-randomized:
>
> [ 246.085174] <IRQ> [<FFFFFFFF8164fbf6>] dump_stack+0x46/0x58
> [ 246.091633] [<FFFFFFFF81054ecc>] warn_slowpath_common+0x8c/0xc0
> [ 246.098352] [<FFFFFFFF81054fb6>] warn_slowpath_fmt+0x46/0x50
> [ 246.104786] [<FFFFFFFF815710d6>] dev_watchdog+0x246/0x250
> [ 246.110923] [<FFFFFFFF81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
> [ 246.119097] [<FFFFFFFF8106092a>] call_timer_fn+0x3a/0x110
> [ 246.125224] [<FFFFFFFF8106280f>] ? update_process_times+0x6f/0x80
>
> Note how the hex values of unrandomized kernel text start with capital
> letters, and how their values match up System.map and vmlinux symbol
> values.
>
> raw 32-bit randomized:
>
> [ 39.054098] [<c20ded55>] ? __jump_label_update+0x45/0x60
> [ 39.064852] [<c2057aa2>] ? queue_work_on+0x32/0x70
> [ 39.074570] [<c20085b1>] ? mark_tsc_unstable+0x21/0x60
> [ 39.084980] [<c2f03af6>] ? tsc_init+0x326/0x344
> [ 39.094175] [<c2eff9c5>] ? start_kernel+0x2c7/0x356
>
> 32-bit un-randomized:
>
> [ 39.054098] [<C10ded55>] ? __jump_label_update+0x45/0x60
> [ 39.064852] [<C1057aa2>] ? queue_work_on+0x32/0x70
> [ 39.074570] [<C10085b1>] ? mark_tsc_unstable+0x21/0x60
> [ 39.084980] [<C1f03af6>] ? tsc_init+0x326/0x344
> [ 39.094175] [<C1eff9c5>] ? start_kernel+0x2c7/0x356
>
> This looks eminently useful to me, I could plug those hexa values into
> gdb straight away to look up a symbol instead of having to subtract
> the random offset first.
>
> This would do 99% of the unrandomizing job for the user/developer (and
> not the least, for tooling), without obfuscating oopses as it would be
> clear on which values the unrandomizing was performed, without losing
> information.

I like this idea.

Hopefully nothing breaks if we mix lower and upper case hex numbers. =)
If something does, we could still inject a line like
"[<fffffffffffffffe>] __unrandomize_addr+0x0/0x0" into the trace
to mark an un-randomized one.
Or an <UN-RANDOM> marker, like <IRQ> on x86_64...

Thanks,
//richard

2014-01-27 17:05:28

by Kees Cook

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Sun, Jan 26, 2014 at 10:49 PM, Richard Weinberger
<[email protected]> wrote:
> On Mon, Jan 27, 2014 at 6:33 AM, H. Peter Anvin <[email protected]> wrote:
>> On 01/26/2014 02:16 AM, Richard Weinberger wrote:
>>>
>>> Currently we print the kernel offset only upon a panic() using the
>>> panic notifier list.
>>> This way it does not show up if the kernel hits a BUG() in process
>>> context or something less critical.
>>> Wouldn't make more sense to report the offset in every dump_stack() or
>>> show_regs() call?
>>
>> No, because that information is available to user space unless we panic.
>
> Didn't you mean non-root?
> I thought one has to set dmesg_restrict anyways if kASLR is used.
>
> And isn't the offset available to perf too?
> Of course only for root, but still user space.

Setting dmesg_restrict is done mostly in an effort to try to lock down
access to dmesg since it'll likely contain enough clues to help an
attacker. System owners need to avoid dmesg getting sprayed into
/var/log world-readable, or available via privileged debugging
daemons, etc. Since keeping dmesg secret from non-root users is going
to be error-prone, I had a goal of keeping the offset out of dmesg
while the system is still running -- hence doing it only at panic
time.

Finding the offset as the (unconfined) root user is extremely easy, so
I personally see no reason to hide it from root (and it would be very
irritating for things like perf, too). I view kASLR as a tool for
statistical defense against confined processes or remote attacks.

I would argue that decoding a non-panic oops on a running system is
entirely possible as-is, since the offset can be found from
/proc/kallsyms as root. It was the dead system that needed the offset
exported: via text in the panic, or via an ELF note in a core.

-Kees

--
Kees Cook
Chrome OS Security
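
For reference, the panic-time reporting described above can be wired up
through the panic notifier list; a rough sketch, close in spirit to the
"Report kernel offset on panic" patch but simplified (kaslr_slide is a
placeholder for wherever the offset is actually recorded, not the real
setup.c code):

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/notifier.h>

static unsigned long kaslr_slide;	/* placeholder: set during early boot */

static int dump_kernel_offset(struct notifier_block *self,
			      unsigned long event, void *data)
{
	pr_emerg("Kernel Offset: 0x%lx\n", kaslr_slide);
	return NOTIFY_DONE;
}

static struct notifier_block kernel_offset_notifier = {
	.notifier_call = dump_kernel_offset,
};

static int __init register_kernel_offset_dumper(void)
{
	atomic_notifier_chain_register(&panic_notifier_list,
				       &kernel_offset_notifier);
	return 0;
}
__initcall(register_kernel_offset_dumper);

Because the notifier only fires from panic(), the offset never hits dmesg
on a live system, which is exactly the property discussed here.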

2014-01-27 17:20:23

by Richard Weinberger

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

Am 27.01.2014 18:05, schrieb Kees Cook:
> I would argue that decoding a non-panic oops on a running system is
> entirely possible as-is, since the offset can be found from
> /proc/kallsyms as root. It was the dead system that needed the offset
> exported: via text in the panic, or via an ELF note in a core.

The problem is that you have to pick up information from two sources.
As a kernel developer, users/customers often show you a backtrace (oops or panic)
and want you to find the problem.
They barely manage to copy&paste the topmost full trace from dmesg or /var/log/messages.
If I have to ask them a bit later to tell me the offset from /proc/kallsyms or something else,
I'm lost. Mostly because they have already rebooted the box...

Thanks,
//richard

2014-01-27 17:24:19

by Kees Cook

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Mon, Jan 27, 2014 at 9:20 AM, Richard Weinberger <[email protected]> wrote:
> Am 27.01.2014 18:05, schrieb Kees Cook:
>> I would argue that decoding a non-panic oops on a running system is
>> entirely possible as-is, since the offset can be found from
>> /proc/kallsyms as root. It was the dead system that needed the offset
>> exported: via text in the panic, or via an ELF note in a core.
>
> The problem is that you have to pickup information from two sources.
> As a kernel developer users/customers often show you a backtrace (oops or panic)
> and want you do find the problem.
> They barley manage it copy&paste the topmost full trace from dmesg or /var/log/messages.
> If I have to ask them a bit later to tell me the offset from /proc/kallsyms or something else
> I'm lost. Mostly because they have already rebooted the box...

As long as I can turn it off, I'd be happy. :)
/proc/sys/kernel/kaslr_in_oops or something?

-Kees

--
Kees Cook
Chrome OS Security

2014-01-28 06:28:12

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14


* Kees Cook <[email protected]> wrote:

> On Mon, Jan 27, 2014 at 9:20 AM, Richard Weinberger <[email protected]> wrote:
> > Am 27.01.2014 18:05, schrieb Kees Cook:
> >> I would argue that decoding a non-panic oops on a running system is
> >> entirely possible as-is, since the offset can be found from
> >> /proc/kallsyms as root. It was the dead system that needed the offset
> >> exported: via text in the panic, or via an ELF note in a core.
> >
> > The problem is that you have to pickup information from two sources.
> > As a kernel developer users/customers often show you a backtrace (oops or panic)
> > and want you do find the problem.
> > They barley manage it copy&paste the topmost full trace from dmesg or /var/log/messages.
> > If I have to ask them a bit later to tell me the offset from /proc/kallsyms or something else
> > I'm lost. Mostly because they have already rebooted the box...
>
> As long as I can turn it off, I'd be happy. :)
> /proc/sys/kernel/kaslr_in_oops or something?

Yeah, as long as it decodes by default.

Thanks,

Ingo

2014-01-28 08:26:11

by Richard Weinberger

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

Am 28.01.2014 07:28, schrieb Ingo Molnar:
>
> * Kees Cook <[email protected]> wrote:
>
>> On Mon, Jan 27, 2014 at 9:20 AM, Richard Weinberger <[email protected]> wrote:
>>> Am 27.01.2014 18:05, schrieb Kees Cook:
>>>> I would argue that decoding a non-panic oops on a running system is
>>>> entirely possible as-is, since the offset can be found from
>>>> /proc/kallsyms as root. It was the dead system that needed the offset
>>>> exported: via text in the panic, or via an ELF note in a core.
>>>
>>> The problem is that you have to pickup information from two sources.
>>> As a kernel developer users/customers often show you a backtrace (oops or panic)
>>> and want you do find the problem.
>>> They barley manage it copy&paste the topmost full trace from dmesg or /var/log/messages.
>>> If I have to ask them a bit later to tell me the offset from /proc/kallsyms or something else
>>> I'm lost. Mostly because they have already rebooted the box...
>>
>> As long as I can turn it off, I'd be happy. :)
>> /proc/sys/kernel/kaslr_in_oops or something?

Would be nice to have! :)

> Yeah, as long as it decodes by default.

Yep.
I like Ingo's idea (capital letters as indicators).
Are we all fine with that?

Thanks,
//richard

2014-01-28 15:55:21

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/28/2014 12:25 AM, Richard Weinberger wrote:
>
> Yep.
> I like Ingo's idea (capital letters as indicators).
> Are we all fine with that?
>

I guess it is extremely unlikely that we'd have a kernel address without
any letters... the "general solutions only please" part of my brain
wants to scream to handle that corner case, though. The most logical
way would be to change the brackets.

It doesn't matter much, though.

-hpa

2014-01-28 16:25:43

by Richard Weinberger

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

Am 28.01.2014 16:55, schrieb H. Peter Anvin:
> On 01/28/2014 12:25 AM, Richard Weinberger wrote:
>>
>> Yep.
>> I like Ingo's idea (capital letters as indicators).
>> Are we all fine with that?
>>
>
> I guess it is extremely unlikely that we'd have a kernel address without
> any letters... the "general solutions only please" part of my brain
> wants to scream to handle that corner case, though. The most logical
> way would be to change the brackets.
>
> It doesn't matter much, though.

I fear both solutions will break various scripts.
For example scripts/markup_oops.pl looks for \[\<([a-z0-9]+)\>\].

Thanks,
//richard

2014-01-28 16:31:09

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/28/2014 08:25 AM, Richard Weinberger wrote:
>
> I fear both solutions will break various scripts.
> For example scripts/markup_oops.pl looks for \[\<([a-z0-9]+)\>\].
>

I don't think there is any way to not break anything... we're
introducing a new kind of object ("normalized kernel pointer") here.

-hpa


2014-01-28 16:51:19

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 8:30 AM, H. Peter Anvin <[email protected]> wrote:
>
> I don't think there is any way to not break anything... we're
> introducing a new kind of object ("normalized kernel pointer") here.

I suspect we could just drop the addresses entirely if we have a
symbolic version. It's not like the hex addresses are all that
interesting, and in commit messages I actually end up editing them out
just because they are worthless noise.

So we could just do "schedule+0x45/0x368" without the actual hex
address in brackets..

Linus
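
As a rough illustration (not the actual dumpstack code), printk's existing
%pS specifier already produces the "symbol+offset/size" form via kallsyms,
so a trace entry could be emitted with or without the raw hex along these
lines:

#include <linux/kernel.h>
#include <linux/printk.h>
#include <linux/types.h>

/* Sketch: one stack-trace entry, first in the current style (raw hex
 * plus %pS symbol), then with the hex dropped as suggested above. */
static void print_trace_entry(unsigned long addr, bool reliable)
{
	printk(KERN_DEFAULT " [<%016lx>] %s%pS\n",
	       addr, reliable ? "" : "? ", (void *)addr);

	printk(KERN_DEFAULT " %s%pS\n",
	       reliable ? "" : "? ", (void *)addr);
}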

2014-01-28 17:05:16

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14


* Linus Torvalds <[email protected]> wrote:

> On Tue, Jan 28, 2014 at 8:30 AM, H. Peter Anvin <[email protected]> wrote:
> >
> > I don't think there is any way to not break anything... we're
> > introducing a new kind of object ("normalized kernel pointer") here.
>
> I suspect we could just drop the addresses entirely if we have a
> symbolic version. It's not like the hex addresses are all that
> interesting, and in commit messages I actually end up editing them
> out just because they are worthless noise.

Well, I often use the hex numbers to look them up and disassemble them
in a vmlinux via gdb and 'list *0x1234123412341234' - where the
vmlinux has no debuginfo. (Debuginfo takes longer to build so I
generally build without it.)

I'm quite sure other kernel developers do that as well.

> So we could just do "schedule+0x45/0x368" without the actual hex
> address in brackets..

AFAICS this won't work in a symbol-less vmlinux. Is there some trick
to do it with gdb?

Thanks,

Ingo

2014-01-28 17:12:59

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 9:05 AM, Ingo Molnar <[email protected]> wrote:
>
> Well, I often use the hex numbers to look them up and disassemble them
> in a vmlinux via gdb and 'list *0x1234123412341234' - where the
> vmlinux has no debuginfo. (Debuginfo takes longer to build so I
> generally build without it.)

Why the heck wouldn't you do that? Just do

list schedule+0x45

instead.

> AFAICS this won't work in a symbol-less vmlinux. Is there some trick
> to do it with gdb?

Why would you have a symbol-less vmlinux? The only reason to strip
vmlinux is because you were crazy enough to build with
CONFIG_DEBUG_INFO and the damn debug info is so large that it won't
fit on your root partition. But dammit, if you build with debug_info
and then strip the end result, you're just insane. You made your build
take ten times longer, use ten times more diskspace, and then you
throw it all away. Crazy.

So I don't think the symbol-less version is worth even worrying about.
You do want to build with KALLSYMS (or whatever the config option is
called), so that the symbolic name is worth something, but once you
have the symbolic name, you're good unless you did something terminally
stupid.

Btw, we should make it harder to enable CONFIG_DEBUG_INFO. It's a
f*cking pain. It's particularly nasty when you do "make allmodconfig"
and it enables debug-info and makes the build take forever and waste
diskspace - but nobody sane actually *boots* the end result, so that
debug info is all pointless.

Linus

2014-01-28 17:24:46

by Richard Weinberger

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

Am 28.01.2014 18:12, schrieb Linus Torvalds:
> On Tue, Jan 28, 2014 at 9:05 AM, Ingo Molnar <[email protected]> wrote:
>>
>> Well, I often use the hex numbers to look them up and disassemble them
>> in a vmlinux via gdb and 'list *0x1234123412341234' - where the
>> vmlinux has no debuginfo. (Debuginfo takes longer to build so I
>> generally build without it.)
>
> Why the heck wouldn't you do that? Just do
>
> list schedule+0x45
>
> instead.

Hmm, my gdb does not like that notation.

(gdb) list schedule+0x45
Function "schedule+0x45" not defined.

It thinks +0x45 is part of the function name; adding spaces does not help either.
Or am I doing something very stupid?

Thanks,
//richard

2014-01-28 17:35:43

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 9:24 AM, Richard Weinberger <[email protected]> wrote:
>
> Hmm, my gdb does not like that notion.
>
> (gdb) list schedule+0x45
> Function "schedule+0x45" not defined.

I don't have a debug build, so maybe it's something specific to gdb
actually seeing type information and then being confused..

Can you do "x/10i schedule+0x45"? That definitely works for me.
But I literally detest DEBUG_INFO builds, because it's useless crap.
99% of everything I debug is from people reporting problems on their
kernels, so it's not like my local debug info would match that anyway.

Anyway, if it's a type information thing due to DEBUG_INFO that
confuses gdb and makes it think that the "+0x45" part doesn't make
sense, you may need to add a cast to get gdb to ignore the type
information. IOW, does

list (void *)schedule + 0x45

work for you?

I happen to have gdb-7.6.50 here that I tested with, but I've used
"symbol+offset" forever afaik, so it's definitely not something new.

Linus

2014-01-28 17:52:26

by Richard Weinberger

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

Am 28.01.2014 18:35, schrieb Linus Torvalds:
> On Tue, Jan 28, 2014 at 9:24 AM, Richard Weinberger <[email protected]> wrote:
>>
>> Hmm, my gdb does not like that notion.
>>
>> (gdb) list schedule+0x45
>> Function "schedule+0x45" not defined.
>
> I don't have a debug build, so maybe it's something specific to gdb
> actually seeing type information and then being confused..
>
> Can you do "x/10i schedule+0x45" because that definitely works for me.
> But I literally detest DBUG_INFO builds, because it's useless crap.
> 99% of everything I debug is from people reporting problems on their
> kernels, so it's not like my local debug info would match that anyway.

x/10i schedule+0x45 works.

> Anyway, if it's a type information thing due to DEBUG_INFO that
> confuses gdb and makes it think that the "+0x45" part doesn't make
> sense, you may need to add a cast to get gdb to ignore the type
> information. IOW, does
>
> list (void *)schedule + 0x45
>
> work for you?

Nope.

(gdb) list (void *)schedule + 0x45
Function "(void *)schedule + 0x45" not defined.

> I happen to have gdb-7.6.50 here that I tested with, but I've used
> "symbol+offset" forever afaik, so it's definitely not something new.

I'm on 7.5.1.

Thanks,
//richard

2014-01-28 17:56:23

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 9:52 AM, Richard Weinberger <[email protected]> wrote:
>
> x/10i schedule+0x45 works.

Ok, so it's just list that is braindamaged. Maybe it wants a
*(schedule+0x45) or something. I dunno. You can obviously get the hex
number from just doing "p schedule+0x45" and then do that.

Whatever. I still claim that the hex numbers in the oopses are
worthless noise. The fact that gdb has "UI issues" is an unrelated
issue and perhaps not surprising.

Linus

2014-01-28 18:55:07

by Richard Weinberger

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14



Am 28.01.2014 18:56, schrieb Linus Torvalds:
> On Tue, Jan 28, 2014 at 9:52 AM, Richard Weinberger <[email protected]> wrote:
>>
>> x/10i schedule+0x45 works.
>
> Ok, so it's just list that is braindamaged. Maybe it wants a
> *(schedule+0x45) or something. I dunno. You can obviously get the hex
> number from just doing "p schedule+0x45" and then do that.

Ah, "list *(schedule+0x45)" works.
This reminds me why I never use gdb and like addr2line so much. ;D

Thanks,
//richard

2014-01-28 19:48:29

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14


* Linus Torvalds <[email protected]> wrote:

> On Tue, Jan 28, 2014 at 9:05 AM, Ingo Molnar <[email protected]> wrote:
> >
> > Well, I often use the hex numbers to look them up and disassemble them
> > in a vmlinux via gdb and 'list *0x1234123412341234' - where the
> > vmlinux has no debuginfo. (Debuginfo takes longer to build so I
> > generally build without it.)
>
> Why the heck wouldn't you do that? Just do
>
> list schedule+0x45
>
> instead.
>
> > AFAICS this won't work in a symbol-less vmlinux. Is there some trick
> > to do it with gdb?
>
> Why would you have a symbol-less vmlinux? The only reason to strip
> vmlinux is because you were crazy enough to build with

I don't think I ever stripped a vmlinux in my life, and I definitely
didn't strip this one:

phoenix:~/linux/linux> file vmlinux
vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=0x239d56b40bb654ddd2dd704e6b585d5c90de4e12, not stripped

> CONFIG_DEBUG_INFO and the damn debug info is so large that it won't
> fit on your root partition. But dammit, if you build with debug_info
> and then strip the end result, you're just insane. [...]

So, to quote myself from 2 sentences ago:

> > vmlinux has no debuginfo. (Debuginfo takes longer to build so I
> > generally build without it.)

I really meant it when I said I build without debuginfo! :)

So, when I build a kernel, such as with a regular 'make defconfig',
the following happens in gdb:

Reading symbols from /home/mingo/tip/vmlinux...(no debugging symbols found)...done.
(gdb) list schedule+0x45
No symbol table is loaded. Use the "file" command.

Is there a way to resolve schedule+0x45 in a regular vmlinux? It was
an honest question.

> [...] You made your build take ten times longer, use ten times more
> diskspace, and then you throw it all away. Crazy.

It's so crazy that I in fact try to force off debuginfo for all my
builds, even randconfig ones:

config DEBUG_INFO
bool "Compile the kernel with debug info"
depends on DEBUG_KERNEL
# too slow build in QA
depends on 0

> So I don't think the symbol-less version is worth even worrying
> about. You do want to build with KALLSYMS (or whatever the config
> option is called), so that the symbolic name is worth something, but
> once you have the symbolc name, you're good unless you did something
> terminally stupid.

Hm, I have kallsyms on - it's a regular 'make defconfig':

phoenix:~/linux/linux> grep KALL .config
CONFIG_KALLSYMS=y

I might be doing something terminally stupid.

> Btw, we should make it harder to enable CONFIG_DEBUG_INFO. [...]

It's really not set:

phoenix:~/linux/linux> grep DEBUG_INFO .config
# CONFIG_DEBUG_INFO is not set

> [...] It's a f*cking pain. It's particularly nasty when you do "make
> allmodconfig" and it enables debug-info and makes the build take
> forever and waste diskspace - but nobody sane actually *boots* the
> end result, so that debug info is all pointless.

I actually boot (almost-)allmod and allyesconfigs - still I disable
DEBUG_INFO because it's such a strain on our planet's climate.

Thanks,

Ingo

2014-01-28 20:07:28

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 11:48 AM, Ingo Molnar <[email protected]> wrote:
>
> I really meant it when I said I build without debuginfo! :)

Ok, but so what?

As mentioned, nobody sane should build with DEBUG_INFO. But a normal
vmlinux file has the symbol information even without it.

> So, when I build a kernel, such as with a regular 'make defconfig',
> the following happens in gdb:
>
> Reading symbols from /home/mingo/tip/vmlinux...(no debugging symbols found)...done.
> (gdb) list schedule+0x45
> No symbol table is loaded. Use the "file" command.
>
> Is there a way to resolve schedule+0x45 in a regular vmlinux? It was
> an honest question.

That seems to be just a gdb bug (or "UI feature"), in that gdb likes
to give misleading error messages and requires odd syntax for some
things.

The symbols are there (see the first line):

Reading symbols from /home/mingo/tip/vmlinux...(no debugging symbols
found)...done.

and that "no debugging symbols found" is just because you don't have
the extra *debug* info.

The "list" command requires debug info to work (since that's where the
line number information is), and will not work with hex symbols either
if you don't have that. So when it says "No symbol table is loaded" it
really means "no debug information is loaded".

But you can see that the symbol is perfectly fine:

(gdb) list *(schedule+0x45)
No symbol table is loaded. Use the "file" command.
(gdb) x/3i schedule+0x45
0xffffffff81616d05 <schedule+69>: adc 0x1(%rsi),%bh
0xffffffff81616d0b <schedule+75>: callq 0xffffffff812c0b30
<blk_flush_plug_list>
0xffffffff81616d10 <schedule+80>: jmp 0xffffffff81616cdf <schedule+31>

(ok, so 0x45 wasn't the start of a real instruction, but you get the idea).

So my point is that the hex address doesn't give you *anything* that
the symbolic address doesn't give you. Unless you do truly crazy
things like actively strip the kernel.

Linus

2014-01-28 20:15:49

by Borislav Petkov

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 12:07:26PM -0800, Linus Torvalds wrote:
> On Tue, Jan 28, 2014 at 11:48 AM, Ingo Molnar <[email protected]> wrote:
> >
> > I really meant it when I said I build without debuginfo! :)
>
> Ok, but so what?
>
> As mentioned, nobody sane should build with DEBUG_INFO.

Shouldn't we hold that down in the Kconfig help text of DEBUG_INFO?
Something like:

"You don't need to enable this if you want symbolic names for kernel
objects. Enable CONFIG_KALLSYMS instead."

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-01-28 20:25:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 12:15 PM, Borislav Petkov <[email protected]> wrote:
>
> Shouldn't we hold that down in the Kconfig help text of DEBUG_INFO?
> Something like:
>
> "You don't need to enable this if you want symbolic names for kernel
> objects. Enable CONFIG_KALLSYMS instead."

Probably. And then we should make sure that allyesconfig/allmodconfig
don't pick it.

There *are* reasonable uses for DEBUG_INFO:

- if you very actively and extensively use kgdb, DEBUG_INFO is very useful.

- distros might want to build distro kernels with DEBUG_INFO for the
kernel debug package

but for most kernel developers, DEBUG_INFO really just bloats things
and makes the build much *much* slower, for very little gain. Sure,
you get line numbers, but let's face it, it's not that hard to just do
"x/20i <address>" and trying to match it up with the generated code
instead. And since "just match it up" model works with user-reported
oopses of random kernels, you had better be able and willing to do
that *anyway*.

If most of the oopses you decode are on your own machine with your own
kernel, you might want to try to learn to be more careful when writing
code. And I'm not even kidding.

Linus

2014-01-28 20:28:42

by Richard Weinberger

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

Am 28.01.2014 21:25, schrieb Linus Torvalds:
> On Tue, Jan 28, 2014 at 12:15 PM, Borislav Petkov <[email protected]> wrote:
>>
>> Shouldn't we hold that down in the Kconfig help text of DEBUG_INFO?
>> Something like:
>>
>> "You don't need to enable this if you want symbolic names for kernel
>> objects. Enable CONFIG_KALLSYMS instead."
>
> Probably. And then we should make sure that allyesconfig/allmodconfig
> don't pick it.

Let's make it depend on CONFIG_EXPERT.

Thanks,
//richard

2014-01-28 20:39:13

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 01/28/2014 12:28 PM, Richard Weinberger wrote:
> Am 28.01.2014 21:25, schrieb Linus Torvalds:
>> On Tue, Jan 28, 2014 at 12:15 PM, Borislav Petkov <[email protected]> wrote:
>>>
>>> Shouldn't we hold that down in the Kconfig help text of DEBUG_INFO?
>>> Something like:
>>>
>>> "You don't need to enable this if you want symbolic names for kernel
>>> objects. Enable CONFIG_KALLSYMS instead."
>>
>> Probably. And then we should make sure that allyesconfig/allmodconfig
>> don't pick it.
>
> Let's make it depend on CONFIG_EXPERT.
>

That doesn't solve all*config. My scripts for all*config just override
it; maybe we need to do something like that by default?

-hpa

2014-01-28 20:49:19

by Borislav Petkov

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 12:25:15PM -0800, Linus Torvalds wrote:
> Probably. And then we should make sure that allyesconfig/allmodconfig
> don't pick it.

I'd need to think about that a bit longer as scripts/kconfig/conf.c goes
and sets those. Unless someone has a better idea...

> There *are* reasonable uses for DEBUG_INFO:
>
> - if you very actively and extensively use kgdb, DEBUG_INFO is very useful.

Right, and at least KGDB depends on it. I was going to suggest hiding
it and making only tools select it but...

> - distros might want to build distro kernels with DEBUG_INFO for the
> kernel debug package

Yes. SUSE uses crash to analyze dumps, and for that it needs -g symbols;
I'm sure other distros do too, in some fashion.

> but for most kernel developers, DEBUG_INFO really just bloats things
> and makes the build much *much* slower, for very little gain.

Right, your discussion kinda made me aware of this - I hadn't realized
that KALLSYMS is enough.

> Sure, you get line numbers, but let's face it, it's not that hard
> to just do "x/20i <address>" and trying to match it up with the
> generated code instead. And since "just match it up" model works with
> user-reported oopses of random kernels, you had better be able and
> willing to do that *anyway*.

Hohumm.

> If most of the oopses you decode are on your own machine with your own
> kernel, you might want to try to learn to be more careful when writing
> code. And I'm not even kidding.

Haha.

Ok, here's the help text hunk:

---
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index dbf94a7d25a8..73b36a1624ec 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -126,7 +126,11 @@ config DEBUG_INFO
This adds debug symbols to the kernel and modules (gcc -g), and
is needed if you intend to use kernel crashdump or binary object
tools like crash, kgdb, LKCD, gdb, etc on the kernel.
- Say Y here only if you plan to debug the kernel.
+
+ If you only want to have symbols in kernel traces and are not
+ going to need support for those tools above, you don't need
+ to enable this as it is a huge bloat and build slowdown.
+ Enable CONFIG_KALLSYMS instead.

If unsure, say N.


--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-01-28 21:10:00

by Dave Jones

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 12:25:15PM -0800, Linus Torvalds wrote:
> On Tue, Jan 28, 2014 at 12:15 PM, Borislav Petkov <[email protected]> wrote:
> >
> > Shouldn't we hold that down in the Kconfig help text of DEBUG_INFO?
> > Something like:
> >
> > "You don't need to enable this if you want symbolic names for kernel
> > objects. Enable CONFIG_KALLSYMS instead."
>
> Probably. And then we should make sure that allyesconfig/allmodconfig
> don't pick it.
>
> There *are* reasonable uses for DEBUG_INFO:
>
> - if you very actively and extensively use kgdb, DEBUG_INFO is very useful.
>
> - distros might want to build distro kernels with DEBUG_INFO for the
> kernel debug package

- objdump -S is kind of useful. I find myself using that quite often.
(like at least a dozen times each merge window)

Pretty sure it works with DEBUG_INFO_REDUCED though, which is somewhat
faster than full DEBUG_INFO.

Dave

2014-01-28 23:37:35

by Borislav Petkov

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, Jan 28, 2014 at 09:49:06PM +0100, Borislav Petkov wrote:
> On Tue, Jan 28, 2014 at 12:25:15PM -0800, Linus Torvalds wrote:
> > Probably. And then we should make sure that allyesconfig/allmodconfig
> > don't pick it.
>
> I'd need to think about that a bit longer as scripts/kconfig/conf.c goes
> and sets those. Unless someone has a better idea...

Maybe we can do something simple like this; it works here. We could
even generate a random name for the all.config file so that there are
no conflicts with existing stuff and even test for its presence. Or we
could hide all that in a script and so on and so on...

---
diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
index 844bc9da08da..76e889fc6384 100644
--- a/scripts/kconfig/Makefile
+++ b/scripts/kconfig/Makefile
@@ -77,7 +77,9 @@ update-po-config: $(obj)/kxgettext $(obj)/gconf.glade.h
PHONY += allnoconfig allyesconfig allmodconfig alldefconfig randconfig

allnoconfig allyesconfig allmodconfig alldefconfig randconfig: $(obj)/conf
- $< --$@ $(Kconfig)
+ $(shell echo "# CONFIG_DEBUG_INFO is not set" > /tmp/all.config)
+ $(Q)KCONFIG_ALLCONFIG=/tmp/all.config $< --$@ $(Kconfig)
+ $(Q)rm -rf /tmp/all.config

PHONY += listnewconfig olddefconfig oldnoconfig savedefconfig defconfig


--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-01-29 06:38:19

by Mike Galbraith

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Tue, 2014-01-28 at 16:08 -0500, Dave Jones wrote:
> On Tue, Jan 28, 2014 at 12:25:15PM -0800, Linus Torvalds wrote:
> > On Tue, Jan 28, 2014 at 12:15 PM, Borislav Petkov <[email protected]> wrote:
> > >
> > > Shouldn't we hold that down in the Kconfig help text of DEBUG_INFO?
> > > Something like:
> > >
> > > "You don't need to enable this if you want symbolic names for kernel
> > > objects. Enable CONFIG_KALLSYMS instead."
> >
> > Probably. And then we should make sure that allyesconfig/allmodconfig
> > don't pick it.
> >
> > There *are* reasonable uses for DEBUG_INFO:
> >
> > - if you very actively and extensively use kgdb, DEBUG_INFO is very useful.
> >
> > - distros might want to build distro kernels with DEBUG_INFO for the
> > kernel debug package
>
> - objdump -S is kind of useful. I find myself using that quite often.
> (like at least a dozen times each merge window)

For those with wimpy asm muscles (/me), it's very useful indeed, as are
gdb list *foo()+0x10 or *0xfeedbabedeadbeef. I always build with full
DEBUG_INFO, just keep configs lean enough that lots of kernels fit in a
2G /boot, and _never ever_ set a box up with a microscopic root.

> pretty sure it works with DEBUG_INFO_REDUCED though, which is somewhat
> faster than full DEBUG_INFO

Yeah, it does.

-Mike

2014-01-29 08:11:34

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14


* Linus Torvalds <[email protected]> wrote:

> On Tue, Jan 28, 2014 at 11:48 AM, Ingo Molnar <[email protected]> wrote:
> >
> > I really meant it when I said I build without debuginfo! :)
>
> Ok, but so what?
>
> As mentioned, nobody sane should build with DEBUG_INFO. But a normal
> vmlinux file has the symbol information even without it.

So, your mail sure read to me as a rant directed at me, so I thought
I'd defend myself or something :)

I now realize that the whole episode was caused by me calling the
vmlinux 'symbol-less':

> > > > AFAICS this won't work in a symbol-less vmlinux. Is there some
> > > > trick to do it with gdb?

while I should have said 'debuginfo-less'. Mea culpa.

> > So, when I build a kernel, such as with a regular 'make defconfig',
> > the following happens in gdb:
> >
> > Reading symbols from /home/mingo/tip/vmlinux...(no debugging symbols found)...done.
> > (gdb) list schedule+0x45
> > No symbol table is loaded. Use the "file" command.
> >
> > Is there a way to resolve schedule+0x45 in a regular vmlinux? It
> > was an honest question.
>
> That seems to be just a gdb bug (or "UI feature"), in that gdb likes
> to give misleading error messages and requires odd syntax for some
> things.

Yeah. Almost as if they worked hard to make annoying users go away or
something. (LLVM is IMO a blessing because, despite its somewhat
broken licensing, it cured a similar attitude of the GCC folks. In a
way competition is more important than licensing details!)

> But you can see that the symbol is perfectly fine:
>
> (gdb) list *(schedule+0x45)

Oh, cool. Thanks for that trick - this will save me quite some time in
the future.

So we can strip absolute addresses just fine from oopses - cool.

I'd even argue to strip the hex on non-randomized kernels as long as
there's kallsyms around, and only print hex if we don't have any
symbols.

> So my point is that the hex address doesn't give you *anything* that
> the symbolic address doesn't give you. [...]

Yeah, and with your trick that's now the case for my debugging as
well, which is a nice touch.

> [...] Unless you do truly crazy things like actively strip the
> kernel.

Being crazy is something I try to avoid. (Beyond being a maintainer
of a software project as busy and stressful as the Linux kernel that is.)

Thanks,

Ingo

2014-01-29 08:25:12

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14


* H. Peter Anvin <[email protected]> wrote:

> On 01/28/2014 12:28 PM, Richard Weinberger wrote:
> > Am 28.01.2014 21:25, schrieb Linus Torvalds:
> >> On Tue, Jan 28, 2014 at 12:15 PM, Borislav Petkov <[email protected]> wrote:
> >>>
> >>> Shouldn't we hold that down in the Kconfig help text of DEBUG_INFO?
> >>> Something like:
> >>>
> >>> "You don't need to enable this if you want symbolic names for kernel
> >>> objects. Enable CONFIG_KALLSYMS instead."
> >>
> >> Probably. And then we should make sure that allyesconfig/allmodconfig
> >> don't pick it.
> >
> > Let's make it depend on CONFIG_EXPERT.
> >
>
> That doesn't solve all*config. My scripts for all*config just
> overrides it, maybe we need to do something like that by default?

For features we want to exclude from allyesconfig we can define them
as logical negatives. (I've used this technique in the past to hide
silly options from allyesconfig and it works well.)

So to rein in debuginfo in allyesconfig/allmodconfig builds we could
do something like:

config SAVE_TIME_AND_DISK_SPACE
bool "Faster and leaner kernel build: compile without debug info"
default y

choice
prompt "Choose DEBUGINFO bloat level"
depends on !SAVE_TIME_AND_DISK_SPACE
default DEBUG_INFO_REDUCED
help
This option allows to select the debuginfo model.

config DEBUG_INFO_REDUCED
bool "Reduced amount of debugging information"

config DEBUG_INFO
bool "Full DEBUG info, 'planet killer' version"

endchoice

(or so, perhaps with slightly more PC text.)

That would default to Y and would disable debuginfo by default.

( And yeah, it's a bit ugly, but it beats having to hack the kconfig
language! )

Thanks,

Ingo

2014-01-29 08:27:14

by Mathias Krause

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 29 January 2014 09:11, Ingo Molnar <[email protected]> wrote:
>> But you can see that the symbol is perfectly fine:
>>
>> (gdb) list *(schedule+0x45)
>
> Oh, cool. Thanks for that trick - this will save me quite some time in
> the future.
>
> So we can strip absolute addresses just fine from oopses - cool.
>
> I'd even argue to strip the hex on non-randomized kernels as long as
> there's kallsyms around, and only print hex if we don't have any
> symbols.

Please, don't do so! I do find the hex values in the backtrace *very*
useful as I'm using 'objdump -wdr vmlinux | less' quite often to
"browse around" in the kernel binary. Grepping for addresses from a
backtrace works quite nicely this way. Having to look up symbols and do
base-16 arithmetic in my head (or a shell, for that matter) would
only slow down this process. So, please leave the hex values in place.
They do help a lot -- at least in the non-kASLR case.

Regards,
Mathias

2014-01-29 10:40:59

by Borislav Petkov

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Wed, Jan 29, 2014 at 09:25:05AM +0100, Ingo Molnar wrote:
> That would default to Y and would disable debuginfo by default.
>
> ( And yeah, it's a bit ugly, but it beats having to hack the kconfig
> language! )

Yeah, that could be another solution - I came up with another hack last
night: http://marc.info/?l=linux-kernel&m=139095226124549

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-01-30 09:23:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14


* Mathias Krause <[email protected]> wrote:

> On 29 January 2014 09:11, Ingo Molnar <[email protected]> wrote:
> >> But you can see that the symbol is perfectly fine:
> >>
> >> (gdb) list *(schedule+0x45)
> >
> > Oh, cool. Thanks for that trick - this will save me quite some time in
> > the future.
> >
> > So we can strip absolute addresses just fine from oopses - cool.
> >
> > I'd even argue to strip the hex on non-randomized kernels as long as
> > there's kallsyms around, and only print hex if we don't have any
> > symbols.
>
> Please, don't do so! I do find the hex values in the backtrace
> *very* useful as I'm using 'objdump -wdr vmlinux | less' quite often
> to "browse around" in the kernel binary. Grepping for addresses from
> a backtrace works quite nicely this way. Having to lookup symbols
> and do base-16 arithmetics in the head (or a shell, for that matter)
> would only slow down this process. So, please leave the hex values
> in place. They do help a lot -- at least in the non-kASLR case.

Well, if the consensus is that they help, then we'd better make them
correct in the KASLR case as well ...

Thanks,

Ingo

2014-01-30 18:15:22

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Thu, Jan 30, 2014 at 1:23 AM, Ingo Molnar <[email protected]> wrote:
>
> Well, if the consensus is that they help then we better make them
> correct in the KASLR case as well ...

In the kaslr case, the hex values cannot possibly help, since they are
meaningless due to the random offset (the external tools will *not* be
able to use them without relocation information anyway, and if the
tools can take relocation into account, they might as well just take
the symbol name into account anyway).

I'd really suggest dropping the damn thing entirely. Even if you do
use "objdump" (and I admit to doing that quite often myself), the
symbol version isn't really all that hard to use. And the thing is,
once again, the symbol version works even for kaslr, while the raw hex
does not.

I do agree that having to do the arithmetic can be annoying, but it
wouldn't actually be all that hard to write a script that turns the
"objdump" format from hex address to "symbol+offset" format. It would
probably be three lines of perl:

- parse the "hex: rest-of-line" thing
- if "rest-of-line" is of the format "<symbol>:", then remember the
hex and the symbol name.
- print out "symbol+offset: rest-of-line" where "offset" is "hex -
remembered_hex"

Even I could probably do it, and my perl knowledge is really limited
to "I can edit other people's perl code if I'm lucky". Somebody who
actually does perl probably goes "Three lines? That's a one-liner".

Linus
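
A quick sketch of that post-processing filter, done in C rather than the
perl Linus suggests; the assumptions about objdump's "address <symbol>:"
and "address:" line formats are mine and only lightly checked:

#include <stdio.h>
#include <string.h>

/*
 * Read "objdump -d vmlinux" output on stdin and rewrite each instruction
 * line's leading address as "symbol+offset", using the most recent
 * "address <symbol>:" line as the base.
 */
int main(void)
{
	char line[1024], sym[256] = "";
	unsigned long base = 0, addr;

	while (fgets(line, sizeof(line), stdin)) {
		char *colon = strchr(line, ':');

		if (sscanf(line, "%lx <%255[^>]>:", &addr, sym) == 2) {
			base = addr;		/* start of a new symbol */
			fputs(line, stdout);
		} else if (base && colon &&
			   sscanf(line, "%lx:", &addr) == 1) {
			printf("%s+0x%lx:%s", sym, addr - base, colon + 1);
		} else {
			fputs(line, stdout);
		}
	}
	return 0;
}

Piping objdump output through something like this would give
"schedule+0x45:" style prefixes that line up directly with the symbolic
form in an oops.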

2014-01-30 22:07:33

by Vivek Goyal

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Sun, Jan 26, 2014 at 10:51:06PM -0800, H. Peter Anvin wrote:
> On 01/26/2014 10:49 PM, Richard Weinberger wrote:
> >>
> >> No, because that information is available to user space unless we panic.
> >
> > Didn't you mean non-root?
> > I thought one has to set dmesg_restrict anyways if kASLR is used.
> >
> > And isn't the offset available to perf too?
> > Of course only for root, but still user space.
> >
>
> For certain system security levels one want to protect even from a rogue
> root. In those cases, leaking that information via dmesg and perf isn't
> going to work, either.
>
> With lower security settings, by all means...

I am wondering if kdump functionality is impacted by this change.

Kexec tools prepares ELF headers for kernel memory ranges and for the
range where the kernel text is mapped. So it needs to know the virtual
address of the region where the kernel is mapped (obtained from /proc/kcore),
and it gets the physical address where the kernel is loaded from /proc/iomem.

So with this change, are we planning to hide the kernel text virtual address
and physical address information from root in /proc/kcore and
/proc/iomem in any way?

Thanks
Vivek

2014-01-31 16:57:07

by Kees Cook

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Thu, Jan 30, 2014 at 2:07 PM, Vivek Goyal <[email protected]> wrote:
> On Sun, Jan 26, 2014 at 10:51:06PM -0800, H. Peter Anvin wrote:
>> On 01/26/2014 10:49 PM, Richard Weinberger wrote:
>> >>
>> >> No, because that information is available to user space unless we panic.
>> >
>> > Didn't you mean non-root?
>> > I thought one has to set dmesg_restrict anyways if kASLR is used.
>> >
>> > And isn't the offset available to perf too?
>> > Of course only for root, but still user space.
>> >
>>
>> For certain system security levels one want to protect even from a rogue
>> root. In those cases, leaking that information via dmesg and perf isn't
>> going to work, either.
>>
>> With lower security settings, by all means...
>
> I am wondering if kdump functionality is impacted with this change.
>
> Kexec tools prepares ELF headers for kernel memory ranges and for the
> range where kernel text is mapped. So it needs to know virtual address
> of the region where kernel is mapped (obtained by /proc/kcore) and
> it gets the physical address where kernel is loaded from /proc/iomem.
>
> So with this change are we planning to hide kernel text virtual address and
> physical address information information from root in /proc/kcore and
> /proc/iomem in anyway?

I have no intention of that. As mentioned earlier in the thread, hiding
it from root will be pretty ugly/hard/pointless:
https://lkml.org/lkml/2014/1/27/287
I would like to just keep the offset out of dmesg.

-Kees

--
Kees Cook
Chrome OS Security

2014-02-07 14:49:58

by Vivek Goyal

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Fri, Jan 31, 2014 at 08:57:03AM -0800, Kees Cook wrote:

[..]
> I have no intention of that. Mentioned earlier in the thread, hiding
> it from root will be pretty ugly/hard/pointless:
> https://lkml.org/lkml/2014/1/27/287
> I would like to just keep the offset out of dmesg.

[ CC Dave Young ]

Hi Kees,

Dave Young is testing kdump with kaslr enabled. He is facing some issues.

One issue he mentioned is that when the second kernel boots, it might be
placed in an area which is outside the area reserved for the second kernel.

We reserve a certain amount of memory for the second kernel and modify the
memory map of the second kernel using the memmap=exactmap parameter. It looks
like kernel placement happens before memmap=exactmap takes effect, and that
seems to be the reason the second kernel can be placed outside the reserved
memory.

IOW, memmap=exactmap and kaslr don't work together. Is it possible to
first let memmap=exactmap take effect and then let kaslr do its job? Or is
it too late by the time memmap=exactmap is parsed?

As a workaround, Dave is currently using the "nokaslr" command line parameter
for the second kernel. He is still facing issues where makedumpfile
segfaults. He is looking into it further.

I thought I would at least bring up the issue of memmap=exactmap and kaslr
being incompatible.

Thanks
Vivek

2014-02-07 16:04:30

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 02/07/2014 06:49 AM, Vivek Goyal wrote:
>
> Hi Kees,
>
> Dave Young is testing kdump with kaslr enabled. He is facing some issues.
>
> One issue he mentioned is that when second kernel boots, it might be
> placed in an area which is outside the reserved area for second kernel.
>
> We reserve a certain memory for second kernel. And modify memory map of
> second kernel using memmap=exactmap parameter. Looks like kernel placement
> is happening before memmap=exactmap takes effect. And that seems to be
> the reason that second kernel can be placed outside the reserved memory.
>
> IOW, memmap=exactmap and kaslr don't work together. Is it possible to
> first let memmap=exactmap take affect and then kaslr does its job. Or it
> is too late by the time memmap=exactmap is parsed.
>
> As a workaround, Dave is currently using "nokaslr" command line parameter
> for second kernel. He is still facing issues where makedumpfile segment
> faults. He is looking into it further.
>
> I thought I will atleast bring up with issue of memmap=exactmap and kaslr
> being incompatible.
>

Yes, because memmap=exactmap gets parsed too late; kaslr assumes that
the e820 information passed to it is actually correct.

Yet another instance of breakage caused by the decision on the part of
kdump to rely on command-line options.

-hpa

2014-02-07 16:25:14

by Vivek Goyal

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Fri, Feb 07, 2014 at 08:04:13AM -0800, H. Peter Anvin wrote:
> On 02/07/2014 06:49 AM, Vivek Goyal wrote:
> >
> > Hi Kees,
> >
> > Dave Young is testing kdump with kaslr enabled. He is facing some issues.
> >
> > One issue he mentioned is that when second kernel boots, it might be
> > placed in an area which is outside the reserved area for second kernel.
> >
> > We reserve a certain memory for second kernel. And modify memory map of
> > second kernel using memmap=exactmap parameter. Looks like kernel placement
> > is happening before memmap=exactmap takes effect. And that seems to be
> > the reason that second kernel can be placed outside the reserved memory.
> >
> > IOW, memmap=exactmap and kaslr don't work together. Is it possible to
> > first let memmap=exactmap take affect and then kaslr does its job. Or it
> > is too late by the time memmap=exactmap is parsed.
> >
> > As a workaround, Dave is currently using "nokaslr" command line parameter
> > for second kernel. He is still facing issues where makedumpfile segment
> > faults. He is looking into it further.
> >
> > I thought I will atleast bring up with issue of memmap=exactmap and kaslr
> > being incompatible.
> >
>
> Yes, because memmap=exactmap gets parsed too late; kaslr assumes that
> the e820 information passed to it is actually correct.
>
> Yet another cause of breakage caused by the decision on the part of
> kdump to rely on command-line options.

[CC kexec mailing list]

Ok, I think it is high time we changed kexec-tools to stop using
memmap=exactmap and start passing the modified memory map in bootparams.
I think the only concern with that change was backward compatibility of
kexec-tools with older kernels.

IIUC, the only thing impacted by this change is users of saved_max_pfn,
which determines the highest accessible pfn in the first kernel. Some
Calgary IOMMU code seems to be the only user of it now.

So maybe we can create a new kexec-tools command line option, say
--pass-memmap-cmdline, which forces the old behavior, and pass the
memmap in bootparams by default.

Or create --do-not-pass-memmap-cmdline for new users to opt in, keeping
the old behavior as the default.

Thanks
Vivek
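
To make the bootparams idea concrete, here is a hypothetical sketch of
what kexec-tools could do instead of building memmap=exactmap strings:
describe the crash reservation directly in the e820 table of the
boot_params it hands to the second kernel. The structure below is a
cut-down stand-in, not the real struct boot_params and not actual
kexec-tools code.

#include <stdint.h>
#include <string.h>

#define E820MAX		128	/* e820 slots available in struct boot_params */
#define E820_RAM	1
#define E820_RESERVED	2

struct e820entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/* Only the fields this sketch touches; the real struct boot_params is
 * defined in the kernel's asm/bootparam.h. */
struct fake_boot_params {
	uint8_t e820_entries;
	struct e820entry e820_map[E820MAX];
};

static int add_e820(struct fake_boot_params *bp, uint64_t addr,
		    uint64_t size, uint32_t type)
{
	if (bp->e820_entries >= E820MAX)
		return -1;	/* overflow would need setup_data, see later in the thread */
	bp->e820_map[bp->e820_entries].addr = addr;
	bp->e820_map[bp->e820_entries].size = size;
	bp->e820_map[bp->e820_entries].type = type;
	bp->e820_entries++;
	return 0;
}

/* Describe only the crash reservation (plus the low 1 MiB the boot
 * path expects) instead of appending memmap=exactmap strings to the
 * kernel command line. */
static void build_crash_memmap(struct fake_boot_params *bp,
			       uint64_t crash_base, uint64_t crash_size)
{
	memset(bp, 0, sizeof(*bp));
	add_e820(bp, 0, 0x100000, E820_RESERVED);
	add_e820(bp, crash_base, crash_size, E820_RAM);
}

int main(void)
{
	struct fake_boot_params bp;

	/* 128 MiB reservation at 768 MiB (illustrative figures) */
	build_crash_memmap(&bp, 0x30000000ULL, 0x08000000ULL);
	return bp.e820_entries == 2 ? 0 : 1;
}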

2014-02-07 19:07:13

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 02/07/2014 06:49 AM, Vivek Goyal wrote:
>
> As a workaround, Dave is currently using "nokaslr" command line parameter
> for second kernel. He is still facing issues where makedumpfile segment
> faults. He is looking into it further.
>

Now, let's state this: kaslr for kdump is almost certainly useless (the
amount of reserved memory is not enough to provide any meaningful
randomization, so any randomization needs to happen during the memory
reservation phase.) So disabling kaslr in the kdump kernel is entirely
appropriate.

-hpa
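
Rough numbers behind that argument (all figures are illustrative
assumptions, not taken from the thread): with a 2 MiB slot size
(CONFIG_PHYSICAL_ALIGN), a 128 MiB crashkernel reservation offers about
64 positions, versus roughly 512 across a 1 GiB randomization range,
and the kernel's own footprint shrinks the kdump number further. A
tiny calculation (link with -lm):

#include <math.h>
#include <stdio.h>

int main(void)
{
	/* Assumed figures: 2 MiB slot size, 128 MiB crashkernel
	 * reservation, 1 GiB maximum randomization range for a
	 * normal boot. */
	double slot_mib = 2.0, crash_mib = 128.0, full_mib = 1024.0;

	printf("kdump:  %3.0f slots (~%.1f bits of entropy)\n",
	       crash_mib / slot_mib, log2(crash_mib / slot_mib));
	printf("normal: %3.0f slots (~%.1f bits of entropy)\n",
	       full_mib / slot_mib, log2(full_mib / slot_mib));
	return 0;
}

Under those assumptions the kdump case gets about 6 bits of entropy at
best, which supports the conclusion that disabling it there loses
little.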

2014-02-07 19:44:50

by Kees Cook

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On Fri, Feb 7, 2014 at 11:07 AM, H. Peter Anvin <[email protected]> wrote:
> On 02/07/2014 06:49 AM, Vivek Goyal wrote:
>>
>> As a workaround, Dave is currently using "nokaslr" command line parameter
>> for second kernel. He is still facing issues where makedumpfile segment
>> faults. He is looking into it further.
>>
>
> Now, let's state this: kaslr for kdump is almost certainly useless (the
> amount of reserved memory is not enough to provide any meaningful
> randomization, so any randomization needs to happen during the memory
> reservation phase.) So disabling kaslr in the kdump kernel is entirely
> appropriate.

Peter covered everything already, but yeah, kaslr and kdump may not
make a lot of sense together. Regardless, yes, kaslr only examines
e820 for memory space. It has to do all this work before the kernel is
decompressed, so it runs very early.

-Kees

--
Kees Cook
Chrome OS Security
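
The "nokaslr" workaround mentioned throughout the thread hooks in at
this same early stage. Below is a simplified stand-alone sketch of the
idea; the real decompressor checks the command line taken from
boot_params (via its cmdline_find_option_bool() helper) before doing
any e820 work, while the matcher here is deliberately naive.

#include <stdbool.h>
#include <string.h>

/* Naive whole-word match on a whitespace-delimited command line;
 * stand-in for the boot code's command-line scanning. */
static bool cmdline_has_word(const char *cmdline, const char *word)
{
	size_t len = strlen(word);
	const char *p = cmdline;

	while ((p = strstr(p, word)) != NULL) {
		bool at_start = (p == cmdline || p[-1] == ' ');
		bool at_end = (p[len] == '\0' || p[len] == ' ');

		if (at_start && at_end)
			return true;
		p += len;
	}
	return false;
}

/* Hypothetical wrapper with a simplified signature: bail out of
 * randomization early when "nokaslr" is present, otherwise fall
 * through to the e820 slot selection sketched earlier in the thread. */
unsigned long choose_load_address(const char *cmdline,
				  unsigned long default_addr)
{
	if (cmdline_has_word(cmdline, "nokaslr"))
		return default_addr;

	/* ... randomize based on the e820 map here ... */
	return default_addr;
}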

2014-02-07 23:16:41

by Dave Young

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 02/07/14 at 11:24am, Vivek Goyal wrote:
> On Fri, Feb 07, 2014 at 08:04:13AM -0800, H. Peter Anvin wrote:
> > On 02/07/2014 06:49 AM, Vivek Goyal wrote:
> > >
> > > Hi Kees,
> > >
> > > Dave Young is testing kdump with kaslr enabled. He is facing some issues.
> > >
> > > One issue he mentioned is that when second kernel boots, it might be
> > > placed in an area which is outside the reserved area for second kernel.
> > >
> > > We reserve a certain memory for second kernel. And modify memory map of
> > > second kernel using memmap=exactmap parameter. Looks like kernel placement
> > > is happening before memmap=exactmap takes effect. And that seems to be
> > > the reason that second kernel can be placed outside the reserved memory.
> > >
> > > IOW, memmap=exactmap and kaslr don't work together. Is it possible to
> > > first let memmap=exactmap take affect and then kaslr does its job. Or it
> > > is too late by the time memmap=exactmap is parsed.
> > >
> > > As a workaround, Dave is currently using "nokaslr" command line parameter
> > > for second kernel. He is still facing issues where makedumpfile segment
> > > faults. He is looking into it further.
> > >
> > > I thought I will atleast bring up with issue of memmap=exactmap and kaslr
> > > being incompatible.
> > >
> >
> > Yes, because memmap=exactmap gets parsed too late; kaslr assumes that
> > the e820 information passed to it is actually correct.
> >
> > Yet another cause of breakage caused by the decision on the part of
> > kdump to rely on command-line options.
>
> [CC kexec mailing list]
>
> Ok, I think this is high time we change kexec-tools to not use
> memmap=exactmap and start passing modified memory map in bootparams. I
> think only concern with that change was backward compatibility of
> kexec-tools with older kernels.
>
> IIUC, only thing which will be impacted by this change is users of
> saved_max_pfn which determine the highest accessible pfn in first
> kernel. Some calgary IOMMU code seems to be the only user of it now.
>
> So may be we can create a new command line option say --pass-memmap-cmdline
> to kexec-tools which forces old behavior and by default we pass memmap
> in bootparams.
>
> Or create --do-not-pass-memmap-cmdline and new users will use it. Default
> will be old behavior.

Hi Vivek,

Chaowang <[email protected]> is looking into passing setup_data SETUP_E820_EXT
instead of using exactmap. Previously Thomas Renninger tried passing them in e820.
I could not find the old thread, but I remember it was not enough because of the
128-entry limitation.

As for kaslr, I agree that it does not make sense for the kdump kernel.
IMHO it's better to disable it automatically for the kdump kernel, but we
can use nokaslr as a workaround for now.

Thanks
Dave

2014-02-07 23:21:45

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 02/07/2014 03:16 PM, Dave Young wrote:
>
> Hi Vivek,
>
> Chaowang <[email protected]> is looking into passing setup_data SETUP_E820_EXT
> instead of using exactmap. Previously Thomas Renninger tried passing them in e820.
> I did not find the old thread, but I remember it's not enough because of the
> 128 entries limitation.
>

THERE IS NO SUCH LIMITATION.

Only 128 slots fit in struct boot_params, but additional entries can be
added to the setup_data linked list.

-hpa
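
Concretely, the mechanism referred to here looks roughly like the
sketch below. The layouts are abbreviated versions of the structures
defined in the kernel's bootparam.h and e820.h headers, and the walker
is illustrative only: each setup_data node carries a type, a length and
a payload, and SETUP_E820_EXT payloads hold additional e820 entries
beyond the 128 that fit in struct boot_params.

#include <stdint.h>

#define SETUP_E820_EXT	1	/* value as in asm/bootparam.h */

struct setup_data {
	uint64_t next;		/* phys addr of the next node, 0 == end */
	uint32_t type;		/* e.g. SETUP_E820_EXT */
	uint32_t len;		/* length of data[] in bytes */
	uint8_t  data[];	/* for SETUP_E820_EXT: extra e820 entries */
};

struct e820entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
} __attribute__((packed));	/* 20 bytes, as in the boot protocol */

/*
 * Count the extra e820 entries passed through the chain hanging off
 * boot_params.hdr.setup_data.  In real code the 'next' physical
 * addresses would have to be mapped first; this sketch pretends they
 * are directly dereferenceable.
 */
unsigned long count_ext_e820(const struct setup_data *sd)
{
	unsigned long n = 0;

	for (; sd; sd = (const struct setup_data *)(uintptr_t)sd->next)
		if (sd->type == SETUP_E820_EXT)
			n += sd->len / sizeof(struct e820entry);
	return n;
}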

2014-02-07 23:28:47

by Dave Young

[permalink] [raw]
Subject: Re: [GIT PULL] x86/kaslr for v3.14

On 02/07/14 at 03:20pm, H. Peter Anvin wrote:
> On 02/07/2014 03:16 PM, Dave Young wrote:
> >
> > Hi Vivek,
> >
> > Chaowang <[email protected]> is looking into passing setup_data SETUP_E820_EXT
> > instead of using exactmap. Previously Thomas Renninger tried passing them in e820.
> > I did not find the old thread, but I remember it's not enough because of the
> > 128 entries limitation.
> >
>
> THERE IS NO SUCH LIMITATION.
>
> Only 128 slots fit in struct boot_params, but additional entries can be
> added to the setup_data linked list.

Ok, I did not describe it right; I actually meant the space limit. Thanks for
clarifying.