2006-10-26 09:50:57

by Magnus Damm

Subject: [PATCH] Kdump: Align 64-bit ELF crash notes correctly (x86_64, powerpc)

Kdump: Align 64-bit ELF crash notes correctly (x86_64, powerpc)

The current ELF code aligns data to 32-bit addresses, regardless of whether
ELFCLASS32 or ELFCLASS64 is used. This works well for the 32-bit case, but
for 64-bit notes we should (of course) align to 64-bit addresses. At least if
we intend to follow the "ELF-64 Object File Format, Version 1.5 Draft 2,
May 27, 1998".

Unfortunately this change affects 3 pieces of code:
- The regular Linux kernel: See x86_64 and powerpc changes below.
- The "crash" kernel: Needs to align properly when merging notes, see below.
- The utilities that read the vmcore files: Crash, GDB and so on.

I am sure that this change will cause all sorts of trouble if someone is using
a certain combination of kernels and tools, but I believe the best long-term
solution is simply to fix this properly as soon as possible and live with the
fact that 64-bit vmcore files may have been broken up until now.

Signed-off-by: Magnus Damm <[email protected]>
---

Compiles on x86_64, powerpc code only dry-coded.
Applies on top of 2.6.19-rc3.
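
To make the alignment rule from the changelog concrete, here is a small
userspace sketch (not part of the patch). The helper names are invented for
illustration; the class values 1 and 2 are the standard ELFCLASS32 and
ELFCLASS64 constants.

```c
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t note_align_up(size_t n, size_t align)
{
	return (n + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: padding granularity for a given ELF class.
 * 1 is ELFCLASS32 and 2 is ELFCLASS64 (the e_ident[EI_CLASS] values).
 */
static size_t note_align_for_class(int elf_class)
{
	return elf_class == 2 ? 8 : 4;
}
```

With these, note_align_up(5, 4) and note_align_up(5, 8) both give 8, while
note_align_up(9, 4) gives 12 and note_align_up(9, 8) gives 16.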

arch/powerpc/kernel/crash.c | 18 +++++++++++-------
arch/x86_64/kernel/crash.c | 14 +++++++-------
fs/proc/vmcore.c | 4 ++--
3 files changed, 20 insertions(+), 16 deletions(-)

--- 0001/arch/powerpc/kernel/crash.c
+++ work/arch/powerpc/kernel/crash.c 2006-10-26 17:09:33.000000000 +0900
@@ -41,12 +41,16 @@
#define DBG(fmt...)
#endif

+#define ELF_ALIGN(x) ((x + (sizeof(elf_addr_t) - 1)) \
+ & ~(sizeof(elf_addr_t) - 1))
+
/* This keeps a track of which one is crashing cpu. */
int crashing_cpu = -1;
static cpumask_t cpus_in_crash = CPU_MASK_NONE;
cpumask_t cpus_in_sr = CPU_MASK_NONE;

-static u32 *append_elf_note(u32 *buf, char *name, unsigned type, void *data,
+static unsigned char *
+append_elf_note(unsigned char *buf, char *name, unsigned type, void *data,
size_t data_len)
{
struct elf_note note;
@@ -55,16 +59,16 @@ static u32 *append_elf_note(u32 *buf, ch
note.n_descsz = data_len;
note.n_type = type;
memcpy(buf, &note, sizeof(note));
- buf += (sizeof(note) +3)/4;
+ buf += ELF_ALIGN(sizeof(note));
memcpy(buf, name, note.n_namesz);
- buf += (note.n_namesz + 3)/4;
+ buf += ELF_ALIGN(note.n_namesz);
memcpy(buf, data, note.n_descsz);
- buf += (note.n_descsz + 3)/4;
+ buf += ELF_ALIGN(note.n_descsz);

return buf;
}

-static void final_note(u32 *buf)
+static void final_note(unsigned char *buf)
{
struct elf_note note;

@@ -77,7 +81,7 @@ static void final_note(u32 *buf)
static void crash_save_this_cpu(struct pt_regs *regs, int cpu)
{
struct elf_prstatus prstatus;
- u32 *buf;
+ unsigned char *buf;

if ((cpu < 0) || (cpu >= NR_CPUS))
return;
@@ -89,7 +93,7 @@ static void crash_save_this_cpu(struct p
* squirrelled away. ELF notes happen to provide
* all of that that no need to invent something new.
*/
- buf = (u32*)per_cpu_ptr(crash_notes, cpu);
+ buf = (unsigned char *)per_cpu_ptr(crash_notes, cpu);
if (!buf)
return;

--- 0002/arch/x86_64/kernel/crash.c
+++ work/arch/x86_64/kernel/crash.c 2006-10-26 16:58:18.000000000 +0900
@@ -28,7 +28,7 @@
/* This keeps a track of which one is crashing cpu. */
static int crashing_cpu;

-static u32 *append_elf_note(u32 *buf, char *name, unsigned type,
+static u64 *append_elf_note(u64 *buf, char *name, unsigned type,
void *data, size_t data_len)
{
struct elf_note note;
@@ -37,16 +37,16 @@ static u32 *append_elf_note(u32 *buf, ch
note.n_descsz = data_len;
note.n_type = type;
memcpy(buf, &note, sizeof(note));
- buf += (sizeof(note) +3)/4;
+ buf += (sizeof(note) + 7) / 8;
memcpy(buf, name, note.n_namesz);
- buf += (note.n_namesz + 3)/4;
+ buf += (note.n_namesz + 7) / 8;
memcpy(buf, data, note.n_descsz);
- buf += (note.n_descsz + 3)/4;
+ buf += (note.n_descsz + 7) / 8;

return buf;
}

-static void final_note(u32 *buf)
+static void final_note(u64 *buf)
{
struct elf_note note;

@@ -59,7 +59,7 @@ static void final_note(u32 *buf)
static void crash_save_this_cpu(struct pt_regs *regs, int cpu)
{
struct elf_prstatus prstatus;
- u32 *buf;
+ u64 *buf;

if ((cpu < 0) || (cpu >= NR_CPUS))
return;
@@ -72,7 +72,7 @@ static void crash_save_this_cpu(struct p
* all of that, no need to invent something new.
*/

- buf = (u32*)per_cpu_ptr(crash_notes, cpu);
+ buf = (u64*)per_cpu_ptr(crash_notes, cpu);

if (!buf)
return;
--- 0001/fs/proc/vmcore.c
+++ work/fs/proc/vmcore.c 2006-10-26 17:31:36.000000000 +0900
@@ -256,8 +256,8 @@ static int __init merge_note_headers_elf
if (nhdr_ptr->n_namesz == 0)
break;
sz = sizeof(Elf64_Nhdr) +
- ((nhdr_ptr->n_namesz + 3) & ~3) +
- ((nhdr_ptr->n_descsz + 3) & ~3);
+ ((nhdr_ptr->n_namesz + 7) & ~7) +
+ ((nhdr_ptr->n_descsz + 7) & ~7);
real_sz += sz;
nhdr_ptr = (Elf64_Nhdr*)((char*)nhdr_ptr + sz);
}
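
The size computation in the vmcore.c hunk above can be tried out in userspace
with a sketch like the following. It is only an approximation under stated
assumptions: struct note_hdr stands in for Elf64_Nhdr (three 32-bit words),
notes_total_size() is a made-up name, the header is counted unpadded as in
the hunk, and the walk stops at the first header with n_namesz == 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for Elf64_Nhdr: three 32-bit words, 12 bytes in total. */
struct note_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/*
 * Return the number of bytes occupied by the notes in buf.
 * align is the padding granularity for name and desc (4 or 8).
 * The walk ends at an empty header, at len, or at a truncated note.
 */
static size_t notes_total_size(const unsigned char *buf, size_t len,
			       size_t align)
{
	size_t off = 0;

	while (off + sizeof(struct note_hdr) <= len) {
		struct note_hdr hdr;
		size_t sz;

		/* memcpy avoids unaligned access through a cast pointer. */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.n_namesz == 0)
			break;

		sz = sizeof(hdr) +
		     ((hdr.n_namesz + align - 1) & ~(align - 1)) +
		     ((hdr.n_descsz + align - 1) & ~(align - 1));
		if (sz > len - off)
			break;	/* truncated note: stop rather than overrun */
		off += sz;
	}

	return off;
}
```

For a single note with a 5-byte name and a 4-byte descriptor this yields 24
bytes with align 4 and 28 bytes with align 8, so reader and writer have to
agree on the granularity.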


2006-10-26 14:40:59

by Vivek Goyal

Subject: Re: [PATCH] Kdump: Align 64-bit ELF crash notes correctly (x86_64, powerpc)

On Thu, Oct 26, 2006 at 06:49:57PM +0900, Magnus Damm wrote:
> Kdump: Align 64-bit ELF crash notes correctly (x86_64, powerpc)
>
> The current ELF code aligns data to 32-bit addresses, regardless of whether
> ELFCLASS32 or ELFCLASS64 is used. This works well for the 32-bit case, but
> for 64-bit notes we should (of course) align to 64-bit addresses. At least if
> we intend to follow the "ELF-64 Object File Format, Version 1.5 Draft 2,
> May 27, 1998".
>
> Unfortunately this change affects 3 pieces of code:
> - The regular Linux kernel: See x86_64 and powerpc changes below.
> - The "crash" kernel: Needs to align properly when merging notes, see below.
> - The utilities that read the vmcore files: Crash, GDB and so on.
>
Hi Magnus,

Interesting observation. Going through the ELF-64 Object File Format,
version 1.5, it does look like note data should be aligned to an
8-byte boundary for 64-bit, not a 4-byte boundary.

But given that gdb and readelf parse the notes correctly as of today,
does that mean they are broken too and need to be changed?

I just looked at the process core dumper (binfmt_elf.c), and it too
seems to create notes aligned to a 4-byte boundary (alignfile()).

The same seems to be the case for /proc/kcore (storenote()). Notes seem to
be 4-byte aligned.

So it looks like everywhere in the kernel and tool chain we still follow
the assumption that notes are 4-byte aligned, even for 64-bit.

I think if you are fixing it, then please fix it for /proc/kcore and
process core dumps too, so that the kernel exports a consistent image and
the tool chain folks can then make the modifications.
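
To illustrate what a consistent layout would mean, here is a rough userspace
sketch of a note writer that takes the padding granularity as a parameter.
It is not code from binfmt_elf.c, kcore or kdump. put_note() and struct
wnote_hdr are hypothetical names, the 12-byte header is written unpadded
(whether the header should also be padded is left open here), and the caller
is assumed to provide a large enough buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the note header: three 32-bit words. */
struct wnote_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/*
 * Append one note at buf and return a pointer just past it.
 * align is 4 (what the kernel writes today) or 8 (ELF-64 spec reading).
 * Name and descriptor are zero-padded up to the chosen granularity.
 */
static unsigned char *put_note(unsigned char *buf, const char *name,
			       uint32_t type, const void *desc,
			       size_t desc_len, size_t align)
{
	struct wnote_hdr hdr;
	size_t name_len = strlen(name) + 1;	/* include the NUL */
	size_t name_pad = (name_len + align - 1) & ~(align - 1);
	size_t desc_pad = (desc_len + align - 1) & ~(align - 1);

	hdr.n_namesz = (uint32_t)name_len;
	hdr.n_descsz = (uint32_t)desc_len;
	hdr.n_type = type;
	memcpy(buf, &hdr, sizeof(hdr));
	buf += sizeof(hdr);

	memset(buf, 0, name_pad);
	memcpy(buf, name, name_len);
	buf += name_pad;

	memset(buf, 0, desc_pad);
	memcpy(buf, desc, desc_len);
	buf += desc_pad;

	return buf;
}
```

Writing a "CORE" note with a 4-byte descriptor advances the pointer by 24
bytes with align 4 and by 28 bytes with align 8.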

Thanks
Vivek


2006-10-27 05:56:04

by Magnus Damm

Subject: Re: [PATCH] Kdump: Align 64-bit ELF crash notes correctly (x86_64, powerpc)

Hi Vivek,

Thanks for the quick reply!

On 10/26/06, Vivek Goyal <[email protected]> wrote:
> On Thu, Oct 26, 2006 at 06:49:57PM +0900, Magnus Damm wrote:
> > Kdump: Align 64-bit ELF crash notes correctly (x86_64, powerpc)
> >
> > The current ELF code aligns data to 32-bit addresses, regardless of whether
> > ELFCLASS32 or ELFCLASS64 is used. This works well for the 32-bit case, but
> > for 64-bit notes we should (of course) align to 64-bit addresses. At least if
> > we intend to follow the "ELF-64 Object File Format, Version 1.5 Draft 2,
> > May 27, 1998".
> >
> > Unfortunately this change affects 3 pieces of code:
> > - The regular Linux kernel: See x86_64 and powerpc changes below.
> > - The "crash" kernel: Needs to align properly when merging notes, see below.
> > - The utilities that read the vmcore files: Crash, GDB and so on.
> >
> Hi Magnus,
>
> Interesting observation. Going through the ELF-64 Object File Format,
> version 1.5, it does look like note data should be aligned to an
> 8-byte boundary for 64-bit, not a 4-byte boundary.
>
> But given that gdb and readelf parse the notes correctly as of today,
> does that mean they are broken too and need to be changed?

Well, yes, other tools need to be updated as well.

I just quickly went through the code for readelf from binutils-2.16.1
and in readelf.c, process_corefile_note_segment() always treats
alignment as 4. Broken. The bfd code in the same package seems to be
correct in most 64-bit cases though, struct elf_size_info ->
log_file_align seems to be the value we are looking for. It is set to
3 for at least elf64-{alpha, hppa, mips, s390, sparc}. I've not been
able to figure out the status for ppc and x86_64.
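
For reference, a log2 alignment of 3 corresponds to 8-byte padding and 2 to
4-byte padding. A minimal sketch of that conversion follows; the helper names
are made up for illustration and are not bfd functions.

```c
#include <stddef.h>

/* Convert a log2 alignment value (e.g. 3) to a byte count (8). */
static size_t align_from_log2(unsigned int log_align)
{
	return (size_t)1 << log_align;
}

/* Pad a note field length using a log2 alignment value. */
static size_t pad_with_log2(size_t len, unsigned int log_align)
{
	size_t a = align_from_log2(log_align);

	return (len + a - 1) & ~(a - 1);
}
```

So a 5-byte name pads to 8 bytes with either value, but a 9-byte name pads to
12 bytes with log2 alignment 2 and to 16 bytes with log2 alignment 3.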

I do however believe that no one has been troubled by this so far
because we happen to have well-formed data. It seems like the only
string that is passed to append_elf_note() in the kdump case is "CORE",
which is 5 bytes including the terminating zero.

The old code adds 3 bytes and divides by 4 to get 2 32-bit values.
With my fix we would add 7 bytes and divide by 8 to get 1 64-bit value.

In both cases we pad to a total of 8 bytes. So we seem to be lucky with "CORE".
This also means that my fix doesn't affect the format of the data in
the current case.
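
The arithmetic can be checked with a tiny userspace sketch. The function
names are invented for illustration; they only model how far a u32 or u64
pointer advances in bytes.

```c
#include <stddef.h>

/* Old code: buf is a u32 pointer advanced by (len + 3) / 4 elements. */
static size_t old_advance_bytes(size_t len)
{
	return ((len + 3) / 4) * 4;
}

/* Patched x86_64 code: buf is a u64 pointer advanced by (len + 7) / 8. */
static size_t new_advance_bytes(size_t len)
{
	return ((len + 7) / 8) * 8;
}
```

For len = 5 ("CORE" plus the terminating zero) both return 8. A 4-byte name
would give 4 versus 8. One thing worth double-checking with the same
arithmetic is the header step: assuming struct elf_note is three 32-bit
words (12 bytes), the two functions give 12 versus 16 for sizeof(note).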

> I just looked at the process core dumper (binfmt_elf.c), and it too
> seems to create notes aligned to a 4-byte boundary (alignfile()).
>
> The same seems to be the case for /proc/kcore (storenote()). Notes seem to
> be 4-byte aligned.
>
> So it looks like everywhere in the kernel and tool chain we still follow
> the assumption that notes are 4-byte aligned, even for 64-bit.
>
> I think if you are fixing it, then please fix it for /proc/kcore and
> process core dumps too, so that the kernel exports a consistent image and
> the tool chain folks can then make the modifications.

Yes, that makes sense. I will have a look at it. I assume that
following the spec is what we want to do, but if someone disagrees
please shout _now_!

Thanks,

/ magnus