2007-11-08 00:33:33

by werner

[permalink] [raw]
Subject: Fwd: same problem with 2.6.24-rc2

On 7/Nov/2007 20:10 werner wrote ..
> With 2.6.23-rc2 the problem is the same: it crashed at the beginning of boot: EIP 060 c03fdea4
> EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200
> Again, during compilation the build complained that <source-dir>/arch/x86/Makefile.o
> cannot be found, and certain dependencies on it were not made. No such file is
> present in the source code (present are, e.g., Makefile_32 and Makefile_64), nor
> was it generated automatically during compilation. I think this is incorrect and is the
> reason for the problems.
>
> wl
> [email protected]
> =============================================================================
> On 7/Nov/2007 16:14 Andrew Morton wrote ..
> > > On Wed, 07 Nov 2007 15:55:12 -0300 (GFT) "werner" <[email protected]> wrote:
> > > I really don't know what's happening. I don't understand anything about the kernel
> > > error reporting system. Because of this, whenever there is a problem, I report
> > > it via e-mail to [email protected] . I don't know what people there
> > > do with my messages.
> >
> >
> > It went like this:
> >
> > 1: you sent an email to linux-kernel
> >
> > 2: I sent a reply to you and linux-kernel
> >
> > 3: you sent a reply to me, but NOT linux-kernel!
> >
> > In other words, you did "reply", not "reply to all", thus you removed three
> > thousand people from the discussion. One of those people is the person who
> > created the bug which you're hitting, and that person no longer knows
> > what's happening.
> >
> >
> > So please go back and resend all those emails, and retain ALL Cc:'s. Don't
> > just send them only to me. Keep all individuals and all mailing lists on
> > the email Cc: list.
> ==============================================================================
> *** http://www.copaya.yi.org / http://www.monkey.is-a-geek.net ***
> The only community server in French Guiana. Located on site, fast, immune
> to wars / disasters in Europe. Non-commercial and free service for: http
> (forum, web page), irc (chat), ftp (download), name (subdomain) .


2007-11-08 01:07:18

by Randy Dunlap

[permalink] [raw]
Subject: Re: Fwd: same problem with 2.6.24-rc2

On Wed, 07 Nov 2007 21:32:43 -0300 (GFT) werner wrote:

> On 7/Nov/2007 20:10 werner wrote ..
> > With 2.6.23-rc2 the problem is the same: it crashed at the beginning of boot: EIP 060 c03fdea4
> > EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200
> > Again, during compilation the build complained that <source-dir>/arch/x86/Makefile.o
> > cannot be found, and certain dependencies on it were not made. No such file is
> > present in the source code (present are, e.g., Makefile_32 and Makefile_64), nor
> > was it generated automatically during compilation. I think this is incorrect and is the
> > reason for the problems.

Hi,

Please provide the complete build log (with V=1 if possible) for the
missing Makefile.o problem.

E.g.:

make V=1 all >build.log 2>&1

Make sure that build.log contains the error message and then send
the complete build.log file to us at [email protected] .
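
A minimal sketch of that check, assuming the failure shows up in the log as
gcc's usual "No such file or directory" message for Makefile.o. The function
name log_has_makefile_o_error is hypothetical, chosen only for illustration:

```shell
# Hypothetical helper: succeeds (exit status 0) if the given log file
# mentions the missing Makefile.o, and fails otherwise.
# Assumption: the error text is gcc's standard "No such file or directory".
log_has_makefile_o_error() {
    grep -q 'Makefile\.o: No such file or directory' "$1"
}

# Intended use, after capturing the log as described above:
#   make V=1 all >build.log 2>&1
#   log_has_makefile_o_error build.log && echo "error captured in build.log"
```

If the function fails, the log was probably taken from a run that did not hit
the problem, and sending it would not help.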


> > wl
> > [email protected]
> > =============================================================================
> > On 7/Nov/2007 16:14 Andrew Morton wrote ..
> > > > On Wed, 07 Nov 2007 15:55:12 -0300 (GFT) "werner" <[email protected]> wrote:
> > > > I really don't know what's happening. I don't understand anything about the kernel
> > > > error reporting system. Because of this, whenever there is a problem, I report
> > > > it via e-mail to [email protected] . I don't know what people there
> > > > do with my messages.
> > >
> > >
> > > It went like this:
> > >
> > > 1: you sent an email to linux-kernel
> > >
> > > 2: I sent a reply to you and linux-kernel
> > >
> > > 3: you sent a reply to me, but NOT linux-kernel!
> > >
> > > In other words, you did "reply", not "reply to all", thus you removed three
> > > thousand people from the discussion. One of those people is the person who
> > > created the bug which you're hitting, and that person no longer knows
> > > what's happening.
> > >
> > >
> > > So please go back and resend all those emails, and retain ALL Cc:'s. Don't
> > > just send them only to me. Keep all individuals and all mailing lists on
> > > the email Cc: list.



---
~Randy

2007-11-08 07:07:27

by Randy Dunlap

[permalink] [raw]
Subject: Re: Fwd: same problem with 2.6.24-rc2


[adding linux-kernel again]

werner wrote:
> The compilation is finished. For some reason the log you suggested wasn't generated.
> However, the 3 compiling/linking logs that my kernel-build-script normally generates
> were. They are attached here. It's the same: after booting, the kernel crashes immediately
> with an EIP error. And the build process complains about a missing Makefile.o in /<source-dir>/arch/x86.
>

OK, first show us (that is, the mailing list "[email protected]", not
just me) what your "kernel-build-script" looks like.

The beginning of the log files that you sent to me (at end of this email)
is very suspicious looking. It looks like you are not using the expected kernel
build procedures.

The crash problem (snippet below) is a fault in xor_sse_2() in the function
that tries to choose the best (fastest) xor method. I would expect other
people to be having a similar problem. I don't suspect that it's related to
the build problem (Makefile.o), but we need to have you building kernels
correctly before we try to find out why they break when you boot them.

>
> =================================================================================
> On 7/Nov/2007 22:06 Randy Dunlap wrote ..
>> On Wed, 07 Nov 2007 21:32:43 -0300 (GFT) werner wrote:
>>
>>> On 7/Nov/2007 20:10 werner wrote ..
>>>> With 2.6.23-rc2 the problem is the same: it crashed at the beginning of boot: EIP 060 c03fdea4
>>>> EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200
>>>> Again, during compilation the build complained that <source-dir>/arch/x86/Makefile.o
>>>> cannot be found, and certain dependencies on it were not made. No such file is
>>>> present in the source code (present are, e.g., Makefile_32 and Makefile_64), nor
>>>> was it generated automatically during compilation. I think this is incorrect and is the
>>>> reason for the problems.
>> Hi,
>>
>> Please provide the complete build log (with V=1 if possible) for the
>> missing Makefile.o problem.
>>
>> E.g.:
>>
>> make V=1 all >build.log 2>&1
>>
>> Make sure that build.log contains the error message and then send
>> the complete build.log file to us at [email protected] .
>>
>>
>>>> wl
>>>> [email protected]
>>>> =============================================================================
>>>> On 7/Nov/2007 16:14 Andrew Morton wrote ..
>>>>>> On Wed, 07 Nov 2007 15:55:12 -0300 (GFT) "werner" <[email protected]> wrote:
>>>>>> I really don't know what's happening. I don't understand anything about the kernel
>>>>>> error reporting system. Because of this, whenever there is a problem, I report
>>>>>> it via e-mail to [email protected] . I don't know what people there
>>>>>> do with my messages.
>>>>>
>>>>>
>>>>> It went like this:
>>>>>
>>>>> 1: you sent an email to linux-kernel
>>>>>
>>>>> 2: I sent a reply to you and linux-kernel
>>>>>
>>>>> 3: you sent a reply to me, but NOT linux-kernel!
>>>>>
>>>>> In other words, you did "reply", not "reply to all", thus you removed three
>>>>> thousand people from the discussion. One of those people is the person who
>>>>> created the bug which you're hitting, and that person no longer knows
>>>>> what's happening.
>>>>>
>>>>>
>>>>> So please go back and resend all those emails, and retain ALL Cc:'s. Don't
>>>>> just send them only to me. Keep all individuals and all mailing lists on
>>>>> the email Cc: list.



> gcc -m32 -m elf_i386 /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o -o /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile
> gcc: /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o: No such file or directory
> gcc: no input files
> make: [/usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile] Error 1 (ignored)



--
~Randy

2007-11-09 05:32:33

by Randy Dunlap

[permalink] [raw]
Subject: Re: Fwd: same problem with 2.6.24-rc2

On Wed, 07 Nov 2007 23:05:32 -0800 Randy Dunlap wrote:

Hi Sam,

This is somewhat of a build regression... a confusing one to me.
Maybe you will know what it's up to.


There's also a kernel boot regression: something in
crypto/xor.c::calibrate_xor_blocks() finds a null pointer.
I can't reproduce it. [more below]

Werner, for the machine that crashes during boot, please send us
the contents of /proc/cpuinfo. Thanks.
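
Since the fault is in xor_sse_2(), one thing worth looking for in that output
is presumably whether the CPU advertises the "sse" flag. A rough sketch,
assuming the usual x86 "flags : ..." line format of /proc/cpuinfo;
cpu_has_flag is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: succeeds if the first "flags" line of a cpuinfo-style
# file lists the given flag as a whole word.
#   $1 = flag name (e.g. sse)
#   $2 = file to read (defaults to /proc/cpuinfo)
cpu_has_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" |
        tr -s ' \t' '\n' |     # one token per line
        grep -qxF "$1"         # exact, whole-line match on the flag name
}

# Intended use on the crashing machine:
#   cpu_has_flag sse && echo "CPU reports sse" || echo "no sse flag listed"
```

This only summarizes one field; the complete /proc/cpuinfo contents are still
what should be sent to the list.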

(BTW, for anyone reading along, vger sees Werner's emails as spam,
so you may receive mail directly from him instead of seeing it on
lkml.)


Werner's kernel-build-script is a large multi-purpose script with
package building capability. He reported the following build output:


> gcc -m32 -m elf_i386 /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o -o /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile
> gcc: /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o: No such file or directory
> gcc: no input files
> make: [/usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile] Error 1 (ignored)

which I can easily reproduce by doing (at kernel top-level dir):

make defconfig
make -B

This does not happen in 2.6.23. Instead, that sequence just loops
forever. This is what I get:

make -f /tester/linsrc/linux-2623-pv/Makefile silentoldconfig
make -f scripts/Makefile.build obj=scripts/basic
mkdir -p include/linux include/config
make -f scripts/Makefile.build obj=scripts/kconfig silentoldconfig
cat scripts/kconfig/zconf.tab.c_shipped > scripts/kconfig/zconf.tab.c
cat scripts/kconfig/lex.zconf.c_shipped > scripts/kconfig/lex.zconf.c
cat scripts/kconfig/zconf.hash.c_shipped > scripts/kconfig/zconf.hash.c
gcc -Wp,-MD,scripts/kconfig/.zconf.tab.o.d -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -DCURSES_LOC="<ncurses.h>" -DLOCALE -Iscripts/kconfig -c -o scripts/kconfig/zconf.tab.o scripts/kconfig/zconf.tab.c
gcc -o scripts/kconfig/conf scripts/kconfig/conf.o scripts/kconfig/zconf.tab.o -lncursesw
scripts/kconfig/conf -s arch/x86_64/Kconfig
make -f /tester/linsrc/linux-2623-pv/Makefile silentoldconfig
make -f scripts/Makefile.build obj=scripts/basic
mkdir -p include/linux include/config
make -f scripts/Makefile.build obj=scripts/kconfig silentoldconfig
cat scripts/kconfig/zconf.tab.c_shipped > scripts/kconfig/zconf.tab.c
cat scripts/kconfig/lex.zconf.c_shipped > scripts/kconfig/lex.zconf.c
cat scripts/kconfig/zconf.hash.c_shipped > scripts/kconfig/zconf.hash.c
gcc -Wp,-MD,scripts/kconfig/.zconf.tab.o.d -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -DCURSES_LOC="<ncurses.h>" -DLOCALE -Iscripts/kconfig -c -o scripts/kconfig/zconf.tab.o scripts/kconfig/zconf.tab.c
gcc -o scripts/kconfig/conf scripts/kconfig/conf.o scripts/kconfig/zconf.tab.o -lncursesw
scripts/kconfig/conf -s arch/x86_64/Kconfig
make -f /tester/linsrc/linux-2623-pv/Makefile silentoldconfig
make -f scripts/Makefile.build obj=scripts/basic
mkdir -p include/linux include/config

...

I suppose we could argue that the 2.6.24-rcN handling is better than
the 2.6.23 handling. Werner does not report any problems like this
with 2.6.23, so he's not reporting what I am seeing.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Pasted from earlier email from Werner (typed in):

2.6.24-rc1-git10
EIP 0600: <c0407284> EFLAGS 00010212 CPU 0
EIP is at xor_sse_2+0x34/0x200
EAX: 10 EBX fffedb22 ECX c183f000 EDX c183c000 ESS 8005003b EDI c0929614 EBP c183f000 ESP c1823ef0
DS 7b ES 7b FS d8 GS 0 SS 68
Process swapper pid 1 ti: c182200 task c1820000 task.ti c=1822000
Stack: 8x 0 8x 0 fffedb22 0 c04067b3 10 c0849b62 c1030780 c183f000 c183c000
call trace
c0 4067b3 do_xor_speed+0x53/0xd0
9a9582 calibrate_xor_blocks 0xe2/0x100 (or 1a0 ?)
191594 register_filesystem =0X44/0X70
991565 kernel_init+0x125/0x2f0
10420a ret_from_fork +0x6/0x1c (or 0xb ...)
991440 kernel_init+0x0/0x2f0
" again
c0104edf kernel_thread_helper+0x7/0x18
code 08 89 74 24 44 0f 20 cf 0f 06 (or 0b) 0f 11 04 24 0f 11 4c 34 10 0f 11 54 24 20 0f 11 5c 24 30 0f 18 82 00
01 00 00 0f 18 82 20 01 00 00 <00> 20x 0
EIP c0407284 xor_sse_2+0x34/0x200 SS ESP 068: c1823ef0
kernel panic


and later:

> With 2.6.23-rc2 the problem is the same: it crashed at the beginning of boot: EIP 060 c03fdea4
> EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200
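
For anyone reading along who is not used to the notation: a location such as
xor_sse_2+0x34/0x200 means the faulting instruction is 0x34 bytes into
xor_sse_2, which is 0x200 bytes long in total. A small sketch that splits such
a string and prints the numbers in decimal (parse_oops_location is a
hypothetical name used only for this illustration):

```shell
# Split an oops location of the form "symbol+0xOFFSET/0xSIZE" and print
# "symbol offset size", with offset and size converted to decimal.
parse_oops_location() {
    loc=$1
    sym=${loc%%+*}     # text before the first '+'   -> symbol name
    rest=${loc#*+}     # text after the first '+'    -> "0xOFFSET/0xSIZE"
    off=${rest%%/*}    # text before the '/'         -> "0xOFFSET"
    size=${rest#*/}    # text after the '/'          -> "0xSIZE"
    printf '%s %d %d\n' "$sym" "$off" "$size"
}

parse_oops_location 'xor_sse_2+0x34/0x200'   # prints: xor_sse_2 52 512
```

Both reports above give the same offset (0x34) within xor_sse_2 even though the
absolute EIP values differ between the two kernels.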

---
~Randy