2006-10-11 11:26:00

by Andrew Walrond

Subject: 2.6.18 intermittent parallel build failure

When compiling the kernel on a Sun T1000 (Niagara, 6 cores/24
threads) with

make -j12

I occasionally see failures like:

CC drivers/net/pppox.mod.o
CC drivers/net/r8169.mod.o
CC drivers/net/sk98lin/sk98lin.mod.o
CC drivers/net/skge.mod.o
CC drivers/net/slhc.mod.o
gcc: no input files
make[1]: *** [drivers/net/slhc.mod.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [modules] Error 2

Restarting the make command completes the build successfully.

I am using the latest make 3.81 (which had a scary list of BACKWARDS
COMPATIBILITY warnings in the NEWS file) in case that might be
relevant.

Hope that's useful

Andrew Walrond


2006-10-12 07:52:38

by Andrew Walrond

Subject: Re: 2.6.18 intermittent parallel build failure

On Wed, Oct 11, 2006 at 11:25:58AM +0000, Andrew Walrond wrote:
> When compiling the kernel on a Sun T1000 (Niagara, 6 cores/24
> threads) with
>
> make -j12
>
> I occasionally see failures like:
>

This was due to a serious bug in GNU make; see
https://savannah.gnu.org/bugs/?14853

This affects most of the recent make releases, and is _not_
sun/sparc/solaris specific, so don't stop reading the bug report at
the first line ;)

I've posted a patch against make-3.81 which fixes it for me.

Symptoms: Random crashing of make worker sub-processes at high -j#.
More pronounced when threads are waiting on I/O: make enables SIGCHLD
interrupts for short periods, which can interrupt read(2) in the worker
threads, and the resulting EINTR isn't handled.
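
For anyone unfamiliar with the EINTR problem, here is a minimal sketch of
the usual way to make a read(2) call tolerate being interrupted by a
signal such as SIGCHLD. This is NOT the patch posted to the bug tracker
and not code from make itself; read_retry is a hypothetical helper name
used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from make or the posted patch):
 * call read(2) again whenever it fails only because a signal
 * handler ran before any data was transferred (errno == EINTR).
 * Any other error, a short read, or end-of-file is returned
 * to the caller unchanged.
 */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    return n;
}
```

Code that calls read(2) directly and treats every -1 as fatal will fail
at random if a signal arrives at the wrong moment, which matches the
intermittent failures described above. Installing the handler with
sigaction(2) and SA_RESTART is another common approach, though whether
it suits make's design is not something I can state here.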

This will likely affect anyone using large -j# on larger
multi-core/multi-processor machines, so BEWARE!

Andrew Walrond