2002-08-24 15:09:19

by Dave Gilbert (Home)

Subject: Of hanging menuconfig [cause found]

Hi,
make menuconfig will hang just after the parsing in the
activate_menu loop in the case where the file scripts/lxdialog/lxdialog
won't execute. Some error codes in this case are caught; but the case
where the file scripts/lxdialog/lxdialog is a binary for the wrong
architecture (case 126) is not caught. This is quite easy to trip if
you are swapping between native and cross building - you get a couple of
errors when you try and build make menuconfig for the first time about
wrong binaries; in my case I just deleted those binaries and did the
make again; however this failure is silent - it just hangs.

A make mrproper is probably the best thing to do when switching - but
the error case needs catching, and I'm sure there are other similar
cases.

Dave

P.S. This was on 2.4.18-rmk7 but I believe it is general.
P.P.S. Is it a good idea to keep binaries in the scripts subdirectory?

---------------- Have a happy GNU millennium! ----------------------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM, SPARC and HP-PA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/


2002-08-24 20:45:25

by Sam Ravnborg

Subject: Re: Of hanging menuconfig [cause found]

On Sat, Aug 24, 2002 at 04:13:29PM +0100, Dr. David Alan Gilbert wrote:
> Hi,
> make menuconfig will hang just after the parsing in the
> activate_menu loop in the case where the file scripts/lxdialog/lxdialog
> won't execute. Some error codes in this case are caught; but the case
> where the file scripts/lxdialog/lxdialog is a binary for the wrong
> architecture (case 126) is not caught. This is quite easy to trip if
> you are swapping between native and cross building - you get a couple of
> errors when you try and build make menuconfig for the first time about
> wrong binaries; in my case I just deleted those binaries and did the
> make again; however this failure is silent - it just hangs.
This does not make sense...
lxdialog is compiled using HOSTCC, and HOSTCC always points to gcc.
So as long as you keep the native gcc in PATH and cross-compile with:
$> make CROSS_COMPILE=arm all
the above scenario should not be possible.
In other words
$> which gcc
should always point to the gcc used for the native architecture. Cross
compiling is done by specifying another gcc using the above syntax.
[I've only looked at the 2.5 sources by the way; 2.4 may differ here].

Another point is that the current kbuild is too weak when architecture
is changed. Changing architecture should require a make mrproper.

Sam

2002-08-24 22:32:00

by Dave Gilbert (Home)

Subject: Re: Of hanging menuconfig [cause found]

* Sam Ravnborg ([email protected]) wrote:
> On Sat, Aug 24, 2002 at 04:13:29PM +0100, I wrote:

<in short, menuconfig hangs if lxdialog is built for wrong
architecture>

> This does not make sense...
> lxdialog are compiled utilising HOSTCC, and HOSTCC always points to gcc.
> So unless you fail to keep gcc for native in PATH and use:
> $> make CROSS_COMPILE=arm all
> to do cross-compile the above scenario should not be possible.

No, much simpler scenario; your kernel source is on an NFS partition.
You cross compile it; many days later you compile the same code natively
on the target from the same directory (for example if you suspect
instability is caused by cross compilation).
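
A stale helper left over from the other build can be spotted before the
menu loop starts by trying to run it once and looking only at the exit
status. The following is a minimal sketch, not code from
scripts/Menuconfig: the function name can_execute is made up for
illustration, and it assumes that starting the helper with no arguments
is harmless.

```shell
#!/bin/sh
# Hypothetical helper, not part of scripts/Menuconfig.
# Returns 0 if the shell managed to start "$1", and 1 if it could not:
#   126 = file found but not executable, 127 = file not found.
# Caveat: a program that itself exits with 126 or 127 would be
# misreported as not runnable.
can_execute() {
	status=0
	# Only the exit status matters; discard all input and output.
	"$1" </dev/null >/dev/null 2>&1 || status=$?
	case "$status" in
	126|127) return 1 ;;
	*)       return 0 ;;
	esac
}

# Possible use (assumed path, taken from the report above):
#   can_execute scripts/lxdialog/lxdialog || echo "rebuild lxdialog first"
```

The check deliberately ignores every other exit status, since a helper
that starts and then complains about missing arguments is still a helper
built for this machine.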

Dave

2002-08-25 08:03:59

by Richard Z

Subject: Re: Of hanging menuconfig [cause found]

On Sat, Aug 24, 2002 at 04:13:29PM +0100, Dr. David Alan Gilbert wrote:
> Hi,
> make menuconfig will hang just after the parsing in the
> activate_menu loop in the case where the file scripts/lxdialog/lxdialog
> won't execute. Some error codes in this case are caught; but the case
> where the file scripts/lxdialog/lxdialog is a binary for the wrong
> architecture (case 126) is not caught. This is quite easy to trip if
> you are swapping between native and cross building - you get a couple of
> errors when you try and build make menuconfig for the first time about
> wrong binaries; in my case I just deleted those binaries and did the
> make again; however this failure is silent - it just hangs.
>
> A make mrproper is probably the best thing to do when switching - but
> the error case needs catching, and I'm sure there are other similar
> cases.

Look at dmesg and add an
alias binfmt-xxxx off
to /etc/modules.conf so similar problems get caught properly - unless
you want to actually use an emulator for this architecture of course :)

This is one of the cases where I wish kmod would do something more
intelligent by default than loop endlessly. Would it be a good idea
to attempt loading emulator modules only for formats that were
previously registered somehow, plus a few well-known ones like aout,
misc and elf?

Looking at exec.c, why isn't the result of request_module() tested?

Richard


2002-08-25 09:07:01

by Russell King

Subject: Re: Of hanging menuconfig [cause found]

On Sat, Aug 24, 2002 at 09:21:44PM +0200, Richard Zidlicky wrote:
> look at dmesg and add an
> alias binfmt-xxxx off
> to /etc/modules.conf so similar problems get caught properly - unless
> you want to actually use an emulator for this architecture of course :)

I believe Dave was saying that the scripts try to run lxdialog, which
returns an error code to the program that called it. The caller does
not check this error code and just tries again. So the execve() system
call is already failing correctly.

The problem lies in activate_menu in scripts/Menuconfig; it contains a
loop that calls a small script which then calls lxdialog. It only tests
for a limited range of return codes:

0 1 2 3 4 5 6 139 255

The important thing here is the missing two codes (from bash's man page):

If a command is not found, the child process created to
execute it returns a status of 127. If a command is found
but is not executable, the return status is 126.

We currently handle neither; we just loop and try again.
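
Both statuses are easy to reproduce outside the kernel tree. This is
only a sketch using throwaway temporary files; the names
no-such-lxdialog and fake-lxdialog are made up for illustration. A file
without execute permission stands in for the wrong-architecture binary,
since the report above gives 126 for that case as well.

```shell
#!/bin/sh
# Sketch: reproduce the two exit statuses quoted from the bash man page.
# All paths are temporaries created here, not kernel source paths.

tmpdir=$(mktemp -d)

# Case 1: the command does not exist, so the shell reports 127.
status_missing=0
"$tmpdir/no-such-lxdialog" 2>/dev/null || status_missing=$?

# Case 2: the file exists but has no execute permission, so the shell
# reports 126.
printf 'not a real binary\n' > "$tmpdir/fake-lxdialog"
chmod 644 "$tmpdir/fake-lxdialog"
status_noexec=0
"$tmpdir/fake-lxdialog" 2>/dev/null || status_noexec=$?

rm -rf "$tmpdir"
echo "missing=$status_missing noexec=$status_noexec"
```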

> Looking at exec.c, why isn't the result of request_module() tested?

That doesn't really tell you anything; at most it tells you that something
succeeded. It doesn't tell you that you now have (say) the a.out binary
format loaded; that's why we rescan the formats list afterwards.

Anyway, here's a patch that should solve the problem. Dave - thanks for
finding this odd behaviour.

--- orig/scripts/Menuconfig Tue Mar 5 19:56:45 2002
+++ linux/scripts/Menuconfig Sun Aug 25 10:08:44 2002
@@ -905,6 +905,26 @@
cleanup
exit 139
;;
+ 126|127)
+ stty sane
+ clear
+ cat << EOM
+
+There seems to be a problem with the lxdialog companion utility which is
+built prior to running Menuconfig. lxdialog could not be found, or could
+not be executed. This can be caused when lxdialog has been built for a
+different architecture.
+
+You should rebuild lxdialog. This can be done by moving to the
+/usr/src/linux/scripts/lxdialog directory and issuing the "make clean all"
+command.
+
+If the problem persists, you may email the maintainer <[email protected]> or
+post a message to <[email protected]> for additional assistance.
+
+EOM
+ cleanup
+ exit 1
esac
done
}
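
For the "other similar cases" mentioned at the start of the thread, one
option would be a catch-all arm, so that any status the loop does not
recognise stops it instead of retrying. The sketch below is hypothetical
and is not part of the patch: classify_status and the labels it prints
are invented for illustration, and only the list of tested codes comes
from the discussion above.

```shell
#!/bin/sh
# Hypothetical helper, not part of scripts/Menuconfig: map an lxdialog
# exit status to the kind of action the menu loop would take.
classify_status() {
	case "$1" in
	0|1|2|3|4|5|6|255)
		echo handled ;;      # codes the existing loop already tests for
	139)
		echo crashed ;;      # existing arm: cleanup, then exit 139
	126|127)
		echo cannot-run ;;   # the arm added by the patch above
	*)
		echo unexpected ;;   # assumed catch-all for everything else
	esac
}

# Example: print the classification for a few statuses.
for code in 0 126 127 139 42; do
	echo "$code -> $(classify_status "$code")"
done
```

Whether an unknown status should abort or merely warn is a judgment call;
the sketch only shows where such an arm would sit.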

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html