2005-01-19 23:13:29

by Janos Farkas

Subject: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

Hi Andi!

I had difficulties booting recent rc1-bkN kernels on at least two
Athlon machines (but somehow an *old* Pentium laptop with a very
similar system booted just fine).

The kernel just hung very early, right after lilo (22.6.1) displayed
"BIOS data check successful". Ctrl-Alt-Del worked to reboot, but
nothing else was shown.

It is a similar experience to Chris Bruner's post here:
> http://article.gmane.org/gmane.linux.kernel/271352

I also recall someone having a similar problem with Opterons, but I
can't find it just now.

rc1-bk6 didn't boot, and thus I started checking revisions:
rc1-bk3 did boot (as well as plain rc1)
rc1-bk4 didn't boot
rc1-bk7 booted *after* reverting the patch below:

> 4 days ak 1.2329.1.38 [PATCH] x86_64/i386: increase command line size
> Enlarge i386/x86-64 kernel command line to 2k
> This is useful when the kernel command line is used to pass other
> information to initrds or installers.
> On i386 it was duplicated for unknown reasons.
> Signed-off-by: Andi Kleen
> Signed-off-by: Andrew Morton
> Signed-off-by: Linus Torvalds

While arguably it's not a completely scientific approach (no plain bk7,
and no bk6 reverted was tested), I'm inclined to say this was my
problem...

Isn't this define a lilo dependence?

--
Janos | romfs is at http://romfs.sourceforge.net/ | Don't talk about silence.


2005-01-20 04:21:27

by Chris Bruner

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

FYI, I found that the problem I was having was caused by having the "BIOS
Enhanced Disk Drives" option turned on. It was on in previous versions as
well, and they worked OK, so I assume that something has changed. In any
case, turning it off fixed my problem.

Chris Bruner

On Wed January 19 2005 06:13 pm, Janos Farkas wrote:
> Hi Andi!
>
> I had difficulties booting recent rc1-bkN kernels on at least two
> Athlon machines (but somehow, on an *old* Pentium laptop booted with the
> a very similar system just fine).
>
> The kernel just hung very early, just after displaying "BIOS data check
> successful" by lilo (22.6.1). Ctrl-Alt-Del worked to reboot, but
> nothing else was shown.
>
> It is a similar experience to Chris Bruner's post here:
> > http://article.gmane.org/gmane.linux.kernel/271352
>
> I also recall someone having similar problem with Opterons too, but
> can't find just now..
>
> rc1-bk6 didn't boot, and thus I started checking revisions:
> rc1-bk3 did boot (as well as plain rc1)
> rc1-bk4 didn't boot
>
> rc1-bk7 booted *after* reverting the patch below:
> > 4 days ak 1.2329.1.38 [PATCH] x86_64/i386: increase command line size
> > Enlarge i386/x86-64 kernel command line to 2k
> > This is useful when the kernel command line is used to pass other
> > information to initrds or installers.
> > On i386 it was duplicated for unknown reasons.
> > Signed-off-by: Andi Kleen
> > Signed-off-by: Andrew Morton
> > Signed-off-by: Linus Torvalds
>
> While arguably it's not a completely scientific approach (no plain bk7,
> and no bk6 reverted was tested), I'm inclined to say this was my
> problem...
>
> Isn't this define a lilo dependence?

--
I say, if your knees aren't green by the end of the day, you ought to
seriously re-examine your life. -- Calvin

2005-01-20 16:30:27

by Adrian Bunk

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

On Thu, Jan 20, 2005 at 12:13:22AM +0100, Janos Farkas wrote:

> Hi Andi!
>
> I had difficulties booting recent rc1-bkN kernels on at least two
> Athlon machines (but somehow, on an *old* Pentium laptop booted with the
> a very similar system just fine).
>
> The kernel just hung very early, just after displaying "BIOS data check
> successful" by lilo (22.6.1). Ctrl-Alt-Del worked to reboot, but
> nothing else was shown.
>
> It is a similar experience to Chris Bruner's post here:
> > http://article.gmane.org/gmane.linux.kernel/271352
>
> I also recall someone having similar problem with Opterons too, but
> can't find just now..
>
> rc1-bk6 didn't boot, and thus I started checking revisions:
> rc1-bk3 did boot (as well as plain rc1)
> rc1-bk4 didn't boot
> rc1-bk7 booted *after* reverting the patch below:
>
> > 4 days ak 1.2329.1.38 [PATCH] x86_64/i386: increase command line size
> > Enlarge i386/x86-64 kernel command line to 2k
> > This is useful when the kernel command line is used to pass other
> > information to initrds or installers.
> > On i386 it was duplicated for unknown reasons.
> > Signed-off-by: Andi Kleen
> > Signed-off-by: Andrew Morton
> > Signed-off-by: Linus Torvalds
>
> While arguably it's not a completely scientific approach (no plain bk7,
> and no bk6 reverted was tested), I'm inclined to say this was my
> problem...
>
> Isn't this define a lilo dependence?

AOL:
- lilo 22.6.1
- CONFIG_EDD=y
- 2.6.10-mm1 and 2.6.11-rc1 did boot
- 2.6.11-rc1-mm1 and 2.6.11-rc1-mm2 didn't boot
- 2.6.11-rc1-mm2 with this ChangeSet reverted boots.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-01-20 16:54:46

by Andi Kleen

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

> AOL:
> - lilo 22.6.1
> - CONFIG_EDD=y
> - 2.6.10-mm1 and 2.6.11-rc1 did boot
> - 2.6.11-rc1-mm1 and 2.6.11-rc1-mm2 didn't boot
> - 2.6.11-rc1-mm2 with this ChangeSet reverted boots.

From what I gather so far, the problem seems to happen only with lilo
and EDD together. grub appears to work. Or did anyone
see problems with grub too?

I'll dig a bit, but reverting for now is probably best.
Thanks Linus.

-Andi

2005-01-20 16:50:07

by Linus Torvalds

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6



On Thu, 20 Jan 2005, Adrian Bunk wrote:
>
> On Thu, Jan 20, 2005 at 12:13:22AM +0100, Janos Farkas wrote:
> >
> > Isn't this define a lilo dependence?
>
> AOL:
> - lilo 22.6.1
> - CONFIG_EDD=y
> - 2.6.10-mm1 and 2.6.11-rc1 did boot
> - 2.6.11-rc1-mm1 and 2.6.11-rc1-mm2 didn't boot
> - 2.6.11-rc1-mm2 with this ChangeSet reverted boots.

Thanks. Reverted.

Linus

2005-01-20 20:54:39

by Eric Dumazet

Subject: Something very strange on x86_64 2.6.X kernels

Hi Andi

I have very strange coredumps happening in a big 64-bit program.

Some background :
- This program is multi-threaded
- Machine is a dual Opteron 248 machine, 12GB ram.
- Kernel 2.6.6 (tried 2.6.10 too, with the same problems)
- The program uses hugetlb pages.
- The program uses prefetchnta
- The program uses about 8GB of ram.

After numerous different core dumps of this program, and gdb debugging,
I found:

Every time, the crash occurs when one thread is using some RAM located at
virtual address 0xffffe6xx.

When examining the core image, the data saved on this page seems correct
(i.e. contains coherent user data). But one register (%rbx) is usually
corrupted and contains a small value (like 0x3c).

The last instruction using this register is :
prefetchnta 0x18(,%rbx,4)
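
For reference, the AT&T operand `0x18(,%rbx,4)` has no base register, so the address it names is `0x18 + 4 * %rbx`. A minimal sketch of that arithmetic in C (the helper name `prefetch_target` is made up for illustration):

```c
#include <stdint.h>

/* Effective address of the AT&T operand disp(,index,scale):
 * there is no base register, so it is disp + index * scale.
 * The name prefetch_target is hypothetical. */
static uint64_t prefetch_target(uint64_t rbx)
{
	return UINT64_C(0x18) + rbx * 4;
}
```

With the small value 0x3c this yields 0x108, whereas an index of 0x3ffff97a would yield 0xffffe600.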


Examining the Linux sources, I found that 0xffffe000 is 'special' (the ia32
vsyscall page) and 0xffffe600 is in the sigreturn subsection of this special area.
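
To make the relationship between the two addresses explicit, here is a small sketch that splits an address into its page base and in-page offset. It assumes 4 KiB pages; the helper names are illustrative only.

```c
#include <stdint.h>

#define PAGE_SIZE_4K UINT64_C(0x1000)  /* assumption: 4 KiB pages */

/* Start of the 4 KiB page that contains addr. */
static uint64_t page_base(uint64_t addr)
{
	return addr & ~(PAGE_SIZE_4K - 1);
}

/* Byte offset of addr within its 4 KiB page. */
static uint64_t page_offset(uint64_t addr)
{
	return addr & (PAGE_SIZE_4K - 1);
}
```

Any address of the form 0xffffe6xx therefore has page base 0xffffe000, which is the vsyscall page address named above.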

Is it possible that some VM trick kicks in and corrupts my true 64-bit
program?

Thank you
Eric Dumazet

2005-01-20 21:09:24

by Andrew Morton

Subject: Re: Something very strange on x86_64 2.6.X kernels

Eric Dumazet <[email protected]> wrote:
>
> Hi Andi
>
> I have very strange coredumps happening on a big 64bits program.
>
> Some background :
> - This program is multi-threaded
> - Machine is a dual Opteron 248 machine, 12GB ram.
> - Kernel 2.6.6 (tried 2.6.10 too but problems too)
> - The program uses hugetlb pages.
> - The program uses prefetchnta
> - The program uses about 8GB of ram.
>
> After numerous differents core dumps of this program, and gdb debugging
> I found :
>
> Every time the crash occurs when one thread is using some ram located at
> virtual address 0xffffe6xx

What does "using" mean? Is the program executing from that location?

> When examining the core image, the data saved on this page seems correct
> (ie countains coherent user data). But one register (%rbx) is usually
> corrupted and contains a small value (like 0x3c)
>
> The last instruction using this register is :
> prefetchnta 0x18(,%rbx,4)
>
>
> Examining linux sources, I found that 0xffffe000 is 'special' (ia 32
> vsyscall) and 0xffffe600 is about sigreturn subsection of this special area.
>
> Is it possible some vm trick just kicks in and corrupts my true 64bits
> program ?
>

Interesting. IIRC, opterons will very occasionally (and incorrectly) take
a fault when performing a prefetch against a dud pointer. The kernel will
fix that up. At a guess, I'd say that the fixup code isn't doing the right
thing when the faulting EIP is in the vsyscall page.

2005-01-20 21:19:51

by Eric Dumazet

Subject: Re: Something very strange on x86_64 2.6.X kernels

Andrew Morton wrote:

> Eric Dumazet <[email protected]> wrote:

>>
>>Every time the crash occurs when one thread is using some ram located at
>>virtual address 0xffffe6xx
>
>
> What does "using" mean? Is the program executing from that location?

No, the program text is located between 0x00100000 and 0x001c6000 (no
shared libs)

0xffffe6xx is READ|WRITE data, mapped on hugetlbfs.

Extract from /proc/pid/maps:
ff400000-100400000 rw-s 82000000 00:0b 12960938 /huge/file
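
As a quick check that the reported address really falls inside this mapping, the range field of a maps line can be parsed and compared. This is only a sketch; the function names are made up, and only the leading "start-end" field is read.

```c
#include <stdio.h>

/* Parse the "start-end" field of one /proc/<pid>/maps line.
 * Returns 0 on success, -1 if the line does not start with a range. */
static int parse_maps_range(const char *line,
			    unsigned long long *start,
			    unsigned long long *end)
{
	if (sscanf(line, "%llx-%llx", start, end) != 2)
		return -1;
	return 0;
}

/* The end address shown in maps is exclusive. */
static int range_contains(unsigned long long start,
			  unsigned long long end,
			  unsigned long long addr)
{
	return addr >= start && addr < end;
}
```

For the line above this gives a 16 MiB range from 0xff400000 up to (but not including) 0x100400000, which contains 0xffffe600.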

>
> Interesting. IIRC, opterons will very occasionally (and incorrectly) take
> a fault when performing a prefetch against a dud pointer. The kernel will
> fix that up. At a guess, I'd say tha the fixup code isn't doing the right
> thing when the faulting EIP is in the vsyscall page.

Maybe, but I should point out that in this case the 'prefetched' address is
valid (i.e. mapped read/write by the program, on a huge page too).

Thanks
Eric Dumazet

by Catalin(ux aka Dino) BOIE

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

On Thu, 20 Jan 2005, Andi Kleen wrote:

>> AOL:
>> - lilo 22.6.1
>> - CONFIG_EDD=y
>> - 2.6.10-mm1 and 2.6.11-rc1 did boot
>> - 2.6.11-rc1-mm1 and 2.6.11-rc1-mm2 didn't boot
>> - 2.6.11-rc1-mm2 with this ChangeSet reverted boots.
>
> What I gather so far the problem seems to only happen with lilo
> and EDID together. grub appears to work. Or did anyone
> see problems with grub too?
>
> I'll dig a bit, but reverting for now is probably best.
> Thanks Linus.

I would really suggest pushing this limit to 4k. My reason is that under UML
I need to put a lot of stuff on the command line, and UML crashes if I do not
extend this limit. Can we make it depend on the architecture?

Thanks.
---
Catalin(ux aka Dino) BOIE
catab at deuroconsult.ro
http://kernel.umbrella.ro/

2005-01-21 07:12:15

by Andi Kleen

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

> I really suggest to push this limit to 4k. My reason is that under UML I
> need to put a lot of stuff in command line and uml crash if I not extend
> this limit. Can we make it depend on arhitecture?

It's dependent on the architecture already. I would like to enable
it on i386/x86-64 because the kernel command line is often used
to pass parameters to installers, and having a small limit there
can be awkward.

But first I need to figure out what went wrong with EDD.

Matt D., do you have thoughts on this?

-Andi

2005-01-21 16:27:28

by Petr Vandrovec

Subject: Re: Something very strange on x86_64 2.6.X kernels

On Thu, Jan 20, 2005 at 09:53:36PM +0100, Eric Dumazet wrote:
>
> Examining linux sources, I found that 0xffffe000 is 'special' (ia 32
> vsyscall) and 0xffffe600 is about sigreturn subsection of this special area.
>
> Is it possible some vm trick just kicks in and corrupts my true 64bits
> program ?

Maybe I already missed the answer, but try the patch below. It is definitely
bad to mark the syscall page as a global one...

When you build the program below, once as 64bit and once as 32bit, the 32bit
one should print 464C457F and the 64bit one should die with SIGSEGV. But when
you run both in parallel, the 64bit one sometimes gets SIGSEGV as it should,
and sometimes it gets 464C457F. (Actually, the results below are from an SMP
system; I believe that on a UP system you'll get a reproducible 464C457F...)

vana:~/64bit-test# ./tpg32
Memory at ffffe000 is 464C457F
vana:~/64bit-test# ./tpg
Segmentation fault
vana:~/64bit-test# ./tpg32 & ./tpg
[1] 8450
Memory at ffffe000 is 464C457F
Memory at ffffe000 is 464C457F
[1]+ Exit 31 ./tpg32
vana:~/64bit-test# ./tpg32 & ./tpg
[1] 8454
Memory at ffffe000 is 464C457F
[1]+ Exit 31 ./tpg32
Segmentation fault
vana:~/64bit-test# ./tpg32 & ./tpg
[1] 8456
Memory at ffffe000 is 464C457F
Memory at ffffe000 is 464C457F
[1]+ Exit 31 ./tpg32
vana:~/64bit-test# ./tpg32 & ./tpg
[1] 8458
Memory at ffffe000 is 464C457F
Memory at ffffe000 is 464C457F
[1]+ Exit 31 ./tpg32
vana:~/64bit-test#


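#include <stdio.h>

/* void main is kept as posted, so the exit status is whatever printf()
   left behind (presumably the "Exit 31" in the log above). */
void main(void) {
	int acc;
	int i;

	/* Delay so that the 32bit and 64bit runs overlap. */
	for (i = 0; i < 100000000; i++) ;
	/* Read the first word of the ia32 vsyscall page. */
	acc = *(volatile unsigned long*)(0xffffe000);
	printf("Memory at ffffe000 is %08X\n", acc);
}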

Petr


diff -urdN linux/arch/x86_64/ia32/syscall32.c linux/arch/x86_64/ia32/syscall32.c
--- linux/arch/x86_64/ia32/syscall32.c 2005-01-17 12:29:05.000000000 +0000
+++ linux/arch/x86_64/ia32/syscall32.c 2005-01-21 16:15:04.000000000 +0000
@@ -55,7 +55,7 @@
if (pte_none(*pte)) {
set_pte(pte,
mk_pte(virt_to_page(syscall32_page),
- PAGE_KERNEL_VSYSCALL));
+ PAGE_KERNEL_VSYSCALL32));
}
/* Flush only the local CPU. Other CPUs taking a fault
will just end up here again
diff -urdN linux/include/asm-x86_64/pgtable.h linux/include/asm-x86_64/pgtable.h
--- linux/include/asm-x86_64/pgtable.h 2005-01-17 12:29:11.000000000 +0000
+++ linux/include/asm-x86_64/pgtable.h 2005-01-21 16:14:44.000000000 +0000
@@ -182,6 +182,7 @@
#define PAGE_KERNEL_EXEC MAKE_GLOBAL(__PAGE_KERNEL_EXEC)
#define PAGE_KERNEL_RO MAKE_GLOBAL(__PAGE_KERNEL_RO)
#define PAGE_KERNEL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
+#define PAGE_KERNEL_VSYSCALL32 __pgprot(__PAGE_KERNEL_VSYSCALL)
#define PAGE_KERNEL_VSYSCALL MAKE_GLOBAL(__PAGE_KERNEL_VSYSCALL)
#define PAGE_KERNEL_LARGE MAKE_GLOBAL(__PAGE_KERNEL_LARGE)
#define PAGE_KERNEL_VSYSCALL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_VSYSCALL_NOCACHE)
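
The effect of the patch can be pictured as leaving one bit out of the page-table entry. The sketch below is only a model: the bit values are the x86 page-table flag bits, but the macro and function names are invented, and the flag set used as a stand-in for `__PAGE_KERNEL_VSYSCALL` is an assumption.

```c
#include <stdint.h>

/* x86 page-table entry flag bits. */
#define PTE_PRESENT  UINT64_C(0x001)
#define PTE_USER     UINT64_C(0x004)
#define PTE_ACCESSED UINT64_C(0x020)
#define PTE_GLOBAL   UINT64_C(0x100)

/* Illustrative stand-in for __PAGE_KERNEL_VSYSCALL; the exact
 * flag set here is an assumption. */
#define VSYSCALL_FLAGS (PTE_PRESENT | PTE_USER | PTE_ACCESSED)

/* Model of what a MAKE_GLOBAL-style wrapper adds. */
static uint64_t make_global(uint64_t flags)
{
	return flags | PTE_GLOBAL;
}

static int is_global(uint64_t flags)
{
	return (flags & PTE_GLOBAL) != 0;
}
```

In this model the old mapping corresponds to `make_global(VSYSCALL_FLAGS)`, and the patched 32bit mapping to plain `VSYSCALL_FLAGS`.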

2005-01-21 16:51:25

by Eric Dumazet

Subject: Re: Something very strange on x86_64 2.6.X kernels

Petr Vandrovec wrote:

>
> Maybe I already missed answer, but try patch below. It is definitely bad
> to mark syscall page as global one...
>

Hi Petr

If I follow you, any 64-bit program can be corrupted as soon as one 32-bit
program using sysenter starts?

Thank you for the patch, I will try it as soon as possible.
I tried your tpg program and saw the same behavior you describe.

I confirm that avoiding the 0xFFFFE000 - 0x100000000 VM range also
works: the program never crashes...

Eric

2005-01-21 17:48:18

by Matt Domsch

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

On Fri, Jan 21, 2005 at 08:11:44AM +0100, Andi Kleen wrote:
> > I really suggest to push this limit to 4k. My reason is that under UML I
> > need to put a lot of stuff in command line and uml crash if I not extend
> > this limit. Can we make it depend on arhitecture?
>
> It's dependent on the architecture already. I would like to enable
> it on i386/x86-64 because the kernel command line is often used
> to pass parameters to installers, and having a small limit there
> can be awkward.
>
> But first need to figure out what went wrong with EDD.
>
> Matt D., do you have thoughts on this?

It is definitely boot-loader dependent. Simply changing
COMMAND_LINE_SIZE from 256 to 2048 in the kernel isn't enough.

There are 2 ways the command line is passed from the boot loader into
the kernel.

Boot loader version <= 0x0201 (which LILO uses)
I believe the command line is located at the end of what was known as
the 'empty zero page', now known as the boot parameters. This part is
black magic to me.

Boot loader version >= 0x0202 (which GRUB uses)
command line can be essentially any size, located anywhere in memory,
and the boot loader tells the kernel where to find it. The EDD real
mode code uses only this case for parsing the command line, and if an
older loader is used, EDD skips parsing the command line looking
for its options.
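
The two cases above can be sketched as a simple version check. This is a simplified model, not the real boot header: the struct, its field names, and the function names are all hypothetical, and only the 0x0201/0x0202 split comes from the description above.

```c
#include <stdint.h>

/* Hypothetical, simplified view of what a boot loader hands over.
 * Field names are illustrative, not the real header layout. */
struct boot_info_sketch {
	uint16_t protocol_version;  /* e.g. 0x0201 or 0x0202 */
	uint32_t cmd_line_ptr;      /* only meaningful for >= 0x0202 */
};

enum cmdline_source {
	CMDLINE_LEGACY_AREA,  /* fixed spot in the boot parameters */
	CMDLINE_POINTER       /* loader supplies the address */
};

static enum cmdline_source pick_cmdline_source(const struct boot_info_sketch *bi)
{
	return bi->protocol_version >= 0x0202 ? CMDLINE_POINTER
					      : CMDLINE_LEGACY_AREA;
}

/* Mirrors the behaviour described for the EDD real-mode code:
 * it only looks at the command line in the pointer case. */
static int edd_parses_cmdline(const struct boot_info_sketch *bi)
{
	return pick_cmdline_source(bi) == CMDLINE_POINTER;
}
```

In this model a loader reporting 0x0201 ends up in the legacy case, so the EDD code would skip its command line parsing.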


There's little space left in the boot parameters block; my EDD code
uses nearly all that was remaining, and could use some more if it were
available. Having a longer command line would be nice too. I spoke
with hpa at OLS last summer about this, and he offered to help.
Peter?


Thanks,
Matt

--
Matt Domsch
Software Architect
Dell Linux Solutions linux.dell.com & http://www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

2005-01-21 18:32:42

by Petr Vandrovec

Subject: Re: Something very strange on x86_64 2.6.X kernels

On Fri, Jan 21, 2005 at 05:49:25PM +0100, Eric Dumazet wrote:
> Petr Vandrovec wrote:
>
> >
> >Maybe I already missed answer, but try patch below. It is definitely bad
> >to mark syscall page as global one...
> >
>
> Hi Petr
>
> If I follow you, any 64 bits program is corrupted as soon one 32bits
> program using sysenter starts ?

Yes. As soon as a 32bit app touches the sysenter page (execution, read,
whatever), its translation is loaded into the processor's TLB, and as the page
is marked global it is not flushed when the kernel switches address space to
another app, such as a 64bit one. Fortunately the TLB is not that big, so for
most real-world workloads you'll not notice, but if you are doing context
switches really often, sooner or later you'll hit the vsyscall page instead of
the data page your process has mapped, and bad things happen.
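
A toy model may help picture this. The sketch below is not how real hardware or the kernel works in detail; it only assumes a tiny TLB array in which an address-space switch drops every entry except those flagged global. All names are invented.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy TLB entry; real hardware is far more involved. */
struct tlb_entry {
	uint64_t vpage;   /* virtual page number */
	int valid;
	int global;       /* copied from the PTE's global bit */
};

/* Model of an address-space switch: non-global entries are
 * dropped, global ones survive. */
static void switch_address_space(struct tlb_entry *tlb, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!tlb[i].global)
			tlb[i].valid = 0;
}

/* Returns 1 if the toy TLB still holds a translation for vpage. */
static int tlb_hit(const struct tlb_entry *tlb, size_t n, uint64_t vpage)
{
	for (size_t i = 0; i < n; i++)
		if (tlb[i].valid && tlb[i].vpage == vpage)
			return 1;
	return 0;
}
```

If the entry for virtual page 0xffffe is global, it still hits after the switch, so the next process would see the stale translation rather than its own mapping.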

To get your app (or any other 64bit app...) to work reliably on unpatched
kernels you should mmap one page at 0xffffe000 and forget about that page
forever...
Petr

2005-01-21 19:09:55

by H. Peter Anvin

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

Matt Domsch wrote:
> On Fri, Jan 21, 2005 at 08:11:44AM +0100, Andi Kleen wrote:
>
>>>I really suggest to push this limit to 4k. My reason is that under UML I
>>>need to put a lot of stuff in command line and uml crash if I not extend
>>>this limit. Can we make it depend on arhitecture?
>>
>>It's dependent on the architecture already. I would like to enable
>>it on i386/x86-64 because the kernel command line is often used
>>to pass parameters to installers, and having a small limit there
>>can be awkward.
>>
>>But first need to figure out what went wrong with EDD.
>>
>>Matt D., do you have thoughts on this?
>
>
> It is definitely boot-loader dependent. Simply changing
> COMMAND_LINE_SIZE from 256 to 2048 in the kernel isn't enough.
>
> There are 2 ways the command line is passed from the boot loader into
> the kernel.
>
> Boot loader version <= 0x0201 (which LILO uses)
> I believe the command line is located at the end of what was known as
> the 'empty zero page', now known as the boot parameters. This part is
> black magic to me.
>
> Boot loader version >= 0x0202 (which GRUB uses)
> command line can be essentially any size, located anywhere in memory,
> and the boot loader tells the kernel where to find it. The EDD real
> mode code uses only this case for parsing the command line, and if an
> older loader is used, EDD skips parsing the command line looking
> for its options.
>
>
> There's little space left in the boot parameters block, my EDD code
> uses nearly all that was remaining, and could use some more if it were
> available. Having a longer command line would be nice too. I spoke
> with hpa at OLS last summer about this, and he offered to help.
> Peter?
>

The protocol itself doesn't encode it, but before we extend it for
protocol >= 0x0202 we need to make sure that older kernels don't break
if they get a very long command line (truncation is OK, crashing is
not.) If they do crash, we need to add a field in the header.

I don't see any reason why the boot parameter block can't be more than
one page long. I think today that it's just a static structure.
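
The "truncation is OK, crashing is not" requirement can be illustrated with a bounded copy. This is a generic sketch, not the kernel's actual code; the function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst (dst_size bytes), truncating if needed and
 * always NUL-terminating when there is room for at least one byte.
 * Returns 1 if the command line had to be cut short, 0 otherwise. */
static int copy_cmdline(char *dst, size_t dst_size, const char *src)
{
	size_t len = strlen(src);
	int truncated = 0;

	if (dst_size == 0)
		return 1;              /* nothing can be stored */
	if (len >= dst_size) {
		len = dst_size - 1;    /* leave room for the NUL */
		truncated = 1;
	}
	memcpy(dst, src, len);
	dst[len] = '\0';
	return truncated;
}
```

A kernel with a small buffer that behaves like this would simply lose the tail of an over-long command line instead of overrunning its buffer.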

-hpa

2005-01-22 01:54:32

by Andi Kleen

Subject: Re: Something very strange on x86_64 2.6.X kernels

On Fri, Jan 21, 2005 at 05:26:01PM +0100, Petr Vandrovec wrote:
> On Thu, Jan 20, 2005 at 09:53:36PM +0100, Eric Dumazet wrote:
> >
> > Examining linux sources, I found that 0xffffe000 is 'special' (ia 32
> > vsyscall) and 0xffffe600 is about sigreturn subsection of this special area.
> >
> > Is it possible some vm trick just kicks in and corrupts my true 64bits
> > program ?
>
> Maybe I already missed answer, but try patch below. It is definitely bad
> to mark syscall page as global one...

Patch looks good, thanks. Ugh, what a stupid bug.

I applied the patch to my tree.

-Andi

2005-01-22 02:15:57

by Linus Torvalds

Subject: Re: Something very strange on x86_64 2.6.X kernels



On Sat, 22 Jan 2005, Andi Kleen wrote:
>
> I applied the patch to my tree.

I already applied it as obvious ;)

Linus

2005-02-07 06:59:35

by Werner Almesberger

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

Andi Kleen wrote:
> It's dependent on the architecture already. I would like to enable
> it on i386/x86-64 because the kernel command line is often used
> to pass parameters to installers, and having a small limit there
> can be awkward.

Something to keep in mind when extending the command line is that
we'll probably need a mechanism for passing additional (and
possibly large) data blocks from the boot loader soon.

The reason for this is that, if booting through kexec, it would be
attractive to pass device scan results, so that the second kernel
doesn't have to repeat the work. As an obvious extension, anyone
who wants to boot *quickly* could also pass such data from
persistent storage without actually performing the device scan at
all when the machine is booted.

The command line may be suitable for this, but to allow for passing
a lot of data, its place in memory should perhaps just be reserved,
at least until the system has passed initialization, without trying
to copy it to a "safe" place early in kernel startup.

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://www.almesberger.net/____________________________________________/

2005-02-12 13:58:56

by Eric W. Biederman

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

Werner Almesberger <[email protected]> writes:

> Andi Kleen wrote:
> > It's dependent on the architecture already. I would like to enable
> > it on i386/x86-64 because the kernel command line is often used
> > to pass parameters to installers, and having a small limit there
> > can be awkward.
>
> Something to keep in mind when extending the command line is that
> we'll probably need a mechanism for passing additional (and
> possibly large) data blocks from the boot loader soon.
>
> The reason for this is that, if booting through kexec, it would be
> attractive to pass device scan results, so that the second kernel
> doesn't have to repeat the work. As an obvious extension, anyone
> who wants to boot *quickly* could also pass such data from
> persistent storage without actually performing the device scan at
> all when the machine is booted.
>
> The command line may be suitable for this, but to allow for passing
> a lot of data, its place in memory should perhaps just be reserved,
> at least until the system has passed initialization, without trying
> to copy it to a "safe" place early in kernel startup.

Actually, this is trivial to do by using a file in initramfs,
if we need something in a well-defined format anyway.

Eric

2005-02-12 14:53:54

by Werner Almesberger

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

Eric W. Biederman wrote:
> Actually this is trivial to do by using a file in initramfs.
> If we need something in a well defined format anyway.

Yes, constructing an additional initramfs, or modifying an existing
one to hold such data is certainly a possibility.

I think there are mainly three choices:
1) the command line
2) an initramfs
3) some other, yet to be defined data structure

1) is relatively easy to do, but leads to more little parsers and
doesn't scale too well. 2) scales well but has a relatively high
overhead (constructing/scanning a cpio archive, etc., particularly
for items needed early in the boot process), and does not work too
well for discontiguous data structures. 3) is of course what we
should try to avoid :-)
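
To give an idea of what one of the "little parsers" in option 1) looks like, here is a minimal sketch that looks up a `key=value` pair in a space-separated command line. It is illustrative only (the function name is made up) and ignores quoting.

```c
#include <stddef.h>
#include <string.h>

/* Find "key=value" in a space-separated command line.
 * On success returns a pointer to the value inside cmdline and
 * stores its length in *len; returns NULL if key is absent. */
static const char *cmdline_get(const char *cmdline, const char *key,
			       size_t *len)
{
	size_t klen = strlen(key);
	const char *p = cmdline;

	while (*p) {
		while (*p == ' ')
			p++;                    /* skip separators */
		if (strncmp(p, key, klen) == 0 && p[klen] == '=') {
			const char *val = p + klen + 1;

			*len = strcspn(val, " ");
			return val;
		}
		p += strcspn(p, " ");           /* skip this word */
	}
	return NULL;
}
```

Every consumer of a new option needs something of this kind, which is part of why the command line scales poorly for larger data.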

So far, I also think that using an initramfs, or at least
something that looks like one, even if not normally used as such,
is the thing to try first.

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://www.almesberger.net/____________________________________________/

2005-02-12 15:21:15

by Eric W. Biederman

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

Werner Almesberger <[email protected]> writes:

> Eric W. Biederman wrote:
> > Actually this is trivial to do by using a file in initramfs.
> > If we need something in a well defined format anyway.
>
> Yes, constructing an additional initramfs, or modifying an existing
> one to hold such data is certainly a possibility.
>
> I think there are mainly three choices:
> 1) the command line
> 2) an initramfs
> 3) some other, yet to be defined data structure
>
> 1) is relatively easy to do, but leads to more little parsers and
> doesn't scale too well. 2) scales well but has a relatively high
> overhead (constructing/scanning a cpio archive, etc., particularly
> for items needed early in the boot process), and does not work too
> well for discontiguous data structures.

There is certainly an issue with reading it early. But constructing
an additional cpio and sticking it into the initrd block is fairly
simple. Detecting devices, especially in the case where it takes
a while, isn't something we need to do early
in the boot process.
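
As a rough idea of how little is involved, the sketch below formats the header of a single archive member, assuming the "newc" cpio format that initramfs accepts. The function name and the example values (mode, zeroed inode and times) are illustrative; a complete archive would also need the padded file name, the padded data, and a trailer entry.

```c
#include <stdio.h>
#include <string.h>

#define CPIO_NEWC_HDR_LEN 110   /* 6-byte magic + 13 fields of 8 hex digits */

/* Format a "newc" cpio header for a regular file into out, which
 * must hold at least CPIO_NEWC_HDR_LEN + 1 bytes.
 * Returns the number of characters written (110 on success). */
static int cpio_newc_header(char *out, size_t out_size,
			    const char *name, unsigned filesize)
{
	return snprintf(out, out_size,
		"070701"          /* magic */
		"%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X",
		0u,               /* ino */
		0100644u,         /* mode: regular file, rw-r--r-- */
		0u, 0u,           /* uid, gid */
		1u,               /* nlink */
		0u,               /* mtime */
		filesize,
		0u, 0u, 0u, 0u,   /* dev major/minor, rdev major/minor */
		(unsigned)strlen(name) + 1,   /* namesize, including NUL */
		0u);              /* check field, zero for magic 070701 */
}
```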

> 3) is of course what we should try to avoid :-)

Well the data structure is still yet to be defined. The
question you raised is how to pass it.

> So far, I also think that using an initramfs, or at least
> something that looks like one, even if not normally used as such,
> is the thing to try first.

Something like that. I have yet to see even a proof of concept
of the idea of passing device information to clean up probes.
Nor am I quite certain that it is really useful. But when it
happens I am sure we can cope.

Eric

2005-02-14 05:51:38

by Werner Almesberger

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

Eric W. Biederman wrote:
> For detecting devices especially in the case that takes
> a while that isn't something we need to do early
> in the boot process.

Yes, but I'd rather have a generic mechanism that works in all
reasonable cases. Things have a tendency of growing in the oddest
directions. E.g. when introducing the boot command line, all I
had in mind was to have a way to boot single-user mode :-)

> Well the data structure is still yet to be defined. The
> question you raised is how to pass it.

Err yes, that's what I wanted to say :) Some new mechanism to
pass the data, or a weird data structure instead of (as opposed
to being on) initrd/initramfs.

> Something like that. I have yet to see a even a proof of concept
> of the idea of passing device information, to clean up probes.

Yes, the kexec-based boot loader first, then this. For a
kexec-based boot loader, passing device scan results will be
very useful, plus it's a good environment for experimenting
with such a feature.

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://www.almesberger.net/____________________________________________/

2005-02-14 06:11:54

by Adam Sulmicki

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

On Fri, 21 Jan 2005, Catalin(ux aka Dino) BOIE wrote:

> I really suggest to push this limit to 4k. My reason is that under UML I need
> to put a lot of stuff in command line and uml crash if I not extend this
> limit. Can we make it depend on arhitecture?

Another nice feature would be the kernel ignoring any "\n" (newline) in the
command line. Currently, if you accidentally pass a "\n" in the command
line, the weirdest things happen.

For example, type the following:

mkelfImage /boot/vmlinuz-2.6.11-rc2-mm1 /boot/vmlinuz-2.6.11-rc2-mm1.elf \
--command-line="console=ttyS0,19200 root=/dev/nfs nfsroot=/ ip=any
init=/usr/src/cm/files/init.kexec.sh"

and watch the kernel say that it does not get any DHCP replies, while the
real problem is that there's a "\n" before the init= part.
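
A minimal sketch of the kind of clean-up meant here, replacing line breaks with spaces before the command line is parsed. The function name is invented and this is not existing kernel code.

```c
/* Replace CR and LF in a command line with spaces, in place.
 * Returns how many characters were replaced. */
static int cmdline_strip_newlines(char *s)
{
	int replaced = 0;

	for (; *s; s++) {
		if (*s == '\n' || *s == '\r') {
			*s = ' ';
			replaced++;
		}
	}
	return replaced;
}
```

With this applied, the example above would be seen as `... ip=any init=/usr/src/cm/files/init.kexec.sh` on one line.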

2005-02-14 07:39:33

by Eric W. Biederman

Subject: Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6

Werner Almesberger <[email protected]> writes:

> Eric W. Biederman wrote:
> > Something like that. I have yet to see a even a proof of concept
> > of the idea of passing device information, to clean up probes.
>
> Yes, the kexec-based boot loader first, then this. For a
> kexec-based boot loader, passing device scan results will be
> very useful, plus it's a good environment for experimenting
> with such a feature.

And from another perspective, what drives things are practical
requirements. Boot speed, while nice, does not yet seem to be a
driver, and that is all I have seen proposed with passing
the list of hardware. What is currently a driver in
the kexec scenario is booting a kernel without firmware
calls, and in the kexec-on-panic case booting a kernel
where the hardware is in a known messed-up state.

So far I have seen nothing that even resembles an architecture
independent solution to avoiding firmware calls. And right
now I'm not even certain I expect to see it become
architecture independent. At the very least we need some
clean architecture specific support first, so we can have
a clue what needs to be generalized. ia64 and ppc are coming...

At any rate I see the problem of which hardware devices
are present as a subset of the problem of booting without firmware.
So I suspect we are going to get some pretty weird architecture
specific implementations at least in the first go round.

Eric