2000-11-20 14:23:44

by Charles Turner, Ph.D.

Subject: Defective Red Hat Distribution poorly represents Linux


I tried to help a friend this weekend convert to Linux.
He lives in Upstate New York, so it was a long trip from
Cambridge, Massachusetts.

He has a Dual Pentium III, 600 MHz TYAN "Thunderbolt".
It has a built-in Adaptec SCSI controller and Intel
100-base-T Ethernet controller. It also has 1/2 Gbytes
of RAM. It's a superb machine.

It had been running Windows 2000 "Professional". Several months
ago, he purchased Red Hat "DELUXE" version 6.2. He was unable to
install it. I convinced him that installation was easy.

I was terribly wrong. This Red Hat version is irrevocably defective.

(1) It will not create a bootable disk because it forgets
to load scsi_mod.o, and sd_mod.o before it loads
aic7xxx.o
/etc/conf.modules was found to contain only aic7xxx
as an alias for scsi_hostadapter.

(2) Once I made a bootable diskette from a working machine
at home (200 miles round-trip distance), I was able to
install its supplied kernel version 2.2.14-5.0, and
2.2.14-5.0smp into the LILO boot sequence.

(3) It "sort of" worked. However, network daemons kept
dropping core. X would eventually crash, leaving the
terminal in an unusable state, etc.

(4) It is impossible to build a known working kernel on the
machine because the linker, `ld` crashes:

Script started on Sun Nov 19 19:11:55 2000
[[email protected] linux-2.2.17]# make dep
gcc -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -o scripts/mkdep scripts/mkdep.c
collect2: ld terminated with signal 11 [Segmentation fault], core dumped
make: *** [scripts/mkdep] Error 1
[[email protected] linux-2.2.17]# ld --version
GNU ld 2.9.5
Copyright 1997 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License. This program has absolutely no warranty.
Supported emulations:
elf_i386
i386linux
[[email protected] linux-2.2.17]# cd scripts
[[email protected] scripts]# gcc -o mkdep.o mkdep.c
collect2: ld terminated with signal 11 [Segmentation fault], core dumped
[[email protected] scripts]# gcc -c -o mkdep.o mkdep.c
[[email protected] scripts]# ld -o mkdep mkdep.o
Segmentation fault (core dumped)

[[email protected] scripts]# strace ld -o mkdep mkdep.o
execve("/usr/bin/ld", ["ld", "-o", "mkdep", "mkdep.o"], [/* 20 vars */]) = 0
brk(0) = 0x807bb84
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=16290, ...}) = 0
old_mmap(NULL, 16290, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015000
close(3) = 0
open("/usr/lib/libbfd-2.9.5.0.22.so", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=314936, ...}) = 0
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\242"..., 4096) = 4096
old_mmap(NULL, 279260, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40019000
mprotect(0x40059000, 17116, PROT_NONE) = 0
old_mmap(0x40059000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x3f000) = 0x40059000
old_mmap(0x4005d000, 732, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4005d000
close(3) = 0
open("/lib/libdl.so.2", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=75131, ...}) = 0
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\34"..., 4096) = 4096
old_mmap(NULL, 12428, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4005e000
mprotect(0x40060000, 4236, PROT_NONE) = 0
old_mmap(0x40060000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x1000) = 0x40060000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=4101324, ...}) = 0
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\210\212"..., 4096) = 4096
old_mmap(NULL, 1001564, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40062000
mprotect(0x4014f000, 30812, PROT_NONE) = 0
old_mmap(0x4014f000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xec000) = 0x4014f000
old_mmap(0x40153000, 14428, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40153000
close(3) = 0
mprotect(0x40062000, 970752, PROT_READ|PROT_WRITE) = 0
mprotect(0x40062000, 970752, PROT_READ|PROT_EXEC) = 0
munmap(0x40015000, 16290) = 0
personality(PER_LINUX) = 0
getpid() = 763
getrusage(RUSAGE_SELF, {ru_utime={0, 0}, ru_stime={0, 0}, ...}) = 0
brk(0) = 0x807bb84
brk(0x807bbbc) = 0x807bbbc
brk(0x807c000) = 0x807c000
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
fstat64(0x3, 0xbfffb84c) = -1 ENOSYS (Function not implemented)
fstat(3, {st_mode=S_IFREG|0644, st_size=2265, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000
read(3, "# Locale name alias data base.\n#"..., 4096) = 2265
brk(0x807d000) = 0x807d000
read(3, "", 4096) = 0
close(3) = 0
munmap(0x40015000, 4096) = 0
open("/usr/share/i18n/locale.alias", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US/LC_MESSAGES", O_RDONLY) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
close(3) = 0
open("/usr/share/locale/en_US/LC_MESSAGES/SYS_LC_MESSAGES", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=44, ...}) = 0
old_mmap(NULL, 44, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015000
close(3) = 0
stat("/usrusr/lib/ldscripts", 0xbffffa7c) = -1 ENOENT (No such file or directory)
brk(0x807f000) = 0x807f000
brk(0x8080000) = 0x8080000
brk(0x8081000) = 0x8081000
brk(0x8082000) = 0x8082000
brk(0x8083000) = 0x8083000
unlink("mkdep") = -1 ENOENT (No such file or directory)
open("mkdep", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
brk(0x8084000) = 0x8084000
brk(0x8088000) = 0x8088000
brk(0x8089000) = 0x8089000
brk(0x808a000) = 0x808a000
open("mkdep.o", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=9716, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40016000
_llseek(4, 0, [0], SEEK_SET) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\1\0\3\0\1\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(4, 4096, [4096], SEEK_SET) = 0
_llseek(4, 4096, [4096], SEEK_SET) = 0
_llseek(4, 4096, [4096], SEEK_SET) = 0
_llseek(4, 4096, [4096], SEEK_SET) = 0
_llseek(4, 4096, [4096], SEEK_SET) = 0
_llseek(4, 4096, [4096], SEEK_SET) = 0
_llseek(4, 4096, [4096], SEEK_SET) = 0
_llseek(4, 4096, [4096], SEEK_SET) = 0
_llseek(4, 4096, [4096], SEEK_SET) = 0
_llseek(4, 4096, [4096], SEEK_SET) = 0
read(4, "E\364\215\24E\0\0\0\0\241\0\0\0\0f\213\24\20\203\342\10"..., 4096) = 4096
_llseek(4, 8192, [8192], SEEK_SET) = 0
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++
[[email protected] scripts]# exit
exit

Script done on Sun Nov 19 19:14:35 2000

I can even see obvious bugs in the trace, e.g.:

stat("/usrusr/lib/ldscripts", 0xbffffa7c) = -1 ENOENT (No such file or directory)
----------^^^

Although this was not the reason for the seg-fault.


(5) I returned home (200 mile round trip), removed my
SCSI disks from my home machine, and then returned
and installed them in my friend's machine. The
machine worked perfectly with Linux version 2.2.17,
and gcc-2.7.2.3, Binutils-2.8.1.0, etc., the standard
stuff.

This shows that the problems are not because of a
defective machine.

I cloned my disks to his disks, and he's running.
However, I don't have all the GUI stuff installed that
he likes (needs). My disks had been built up over
several years by Richard Johnson, a frequent
contributor to Linux.

(6) I have been told that I could get a statically-linked
version of `ld`, plus another 'C' compiler statically-
linked so that these don't use the possibly defective
dynamic libraries. I could then build a decent working
system using sources available on the Internet.

From a customer's perspective, this is absurd. When
a customer purchases a "shrink-wrapped" operating
system he expects it to work.

(7) This fiasco is an example of why Linux is in big trouble.
Once Linux obtained visibility, it became necessary for
the most visible distributors and VARs to provide a high
quality product.

Red Hat provably does not. Red Hat employees dominate
Linux kernel development, even moderating (read controlling)
what will be and what will not be done within this operating
system.

As repeatedly demonstrated on this list, Red Hat employees
spend much time denigrating others at the expense of providing
a useful product.

(8) Now I'm pretty sure that this email will generate a lot of
flames. So be it. Linux, as an operating system to replace
Windows, has gone to hell because of at least one distributor's
selling of garbage. There may be more such defective distributions
out there. I certainly don't know what to purchase for my
next attempt at a "shrink-wrap" installation.
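The failed calls in the strace transcript under item (4) can be pulled out mechanically rather than by eye. What follows is only a minimal sketch, assuming the trace lines keep the `name(args) = -1 ERRNO (message)` shape seen above; the function name `failed_calls` and the `SAMPLE` list are made up for illustration, and the sample lines are copied from the transcript.

```python
import re

# Matches a failed call as strace prints it: name(args) = -1 ERRNO (message)
FAILED_CALL = re.compile(r"^(\w+)\((.*)\)\s*=\s*-1\s+(\w+)\s+\((.*)\)$")


def failed_calls(trace_lines):
    """Return (syscall, errno name, arguments) for every call that returned -1."""
    failures = []
    for line in trace_lines:
        match = FAILED_CALL.match(line.strip())
        if match:
            name, args, errno_name, _message = match.groups()
            failures.append((name, errno_name, args))
    return failures


# A few lines copied from the transcript in item (4).
SAMPLE = [
    'open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)',
    'open("/etc/ld.so.cache", O_RDONLY) = 3',
    'fstat64(0x3, 0xbfffb84c) = -1 ENOSYS (Function not implemented)',
    'stat("/usrusr/lib/ldscripts", 0xbffffa7c) = -1 ENOENT (No such file or directory)',
    'open("mkdep.o", O_RDONLY) = 4',
]

for name, errno_name, args in failed_calls(SAMPLE):
    print(f"{name:8} {errno_name:7} {args}")
```

A listing like this surfaces the doubled "/usr" path quickly, but it cannot explain the crash by itself: in the transcript the last call before SIGSEGV is a successful _llseek, not a failed one.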


Very Truly Yours,

Charles Turner

Member(s) IEEE, IEEE Computer Society, AIAA

I speak only for myself, which is enough of a problem.




2000-11-20 14:31:15

by Mohammad A. Haque

Subject: Re: Defective Red Hat Distribution poorly represents Linux

You're complaining on the wrong list.



--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-11-20 14:36:15

by Andreas Jaeger

Subject: Re: Defective Red Hat Distribution poorly represents Linux

>>>>> Charles Turner, Ph D writes:

> I tried to help a friend this weekend convert to Linux.
> He lives in Upstate New York, so it was a long trip from
> Cambridge, Massachusetts.

> I was terribly wrong. This Red Hat version is irrevocably defective.

This list is about problems with the Linux kernel and not with
specific distributions. Please report this directly to the
distribution maker, it's totally off-topic here.

Andreas
--
Andreas Jaeger
SuSE Labs [email protected]
private [email protected]
http://www.suse.de/~aj

2000-11-20 14:39:25

by Tigran Aivazian

Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, 20 Nov 2000, Charles Turner, Ph.D. wrote:
> I certainly don't know what to purchase for my
> next attempt at a "shrink-wrap" installation.

Try Red Hat 7.0 -- it is certainly better. True, no distribution is
perfect but over the years I've developed my own CD image upgrade.iso
which goes on directly after installing the latest Red Hat distribution. It is
full of things like BRS, dict(1), 'alias md="mkdir -p"' or "set
editing-mode vi" which should be installed by default but for some reason
aren't. upgrade.iso is about 120M which means there is only 120M of bits
that separate "plain red hat" from "perfect Linux workstation" (assuming
my configuration is perfect :) Red Hat 7.0 installed just fine on a range
of my home machines from 486/66MHz/16M RAM to PIII laptop to 2way PIII
desktop to 4way Xeon server -- some with SCSI, some without -- no glitches
whatsoever.

So, instead of blaming some old obsolete versions of Red Hat, try the
latest, especially when recommending to a friend.

The only obvious bug present even in the latest Red Hat 7.0 (which I keep
forgetting to report to them) is that it won't boot if you install on a
disk with lots of foreign partitions (I have UnixWare 7 and FreeBSD 4.x
there) because the installation kernel doesn't support them and the final
kernel does which creates a mismatch in the partition numbering.

The above bug is not critical as Andries Brouwer long ago fixed it
in the 2.4 kernel (i.e. made DOS physical partitions enumerated first
so no foreign partitions can mix anything up) so the workaround is -- boot
your system somehow (be a man, find _some_ way of booting your system even
if there is no way ;) and then install 2.4 kernel and from then on
everything will be okay.
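The numbering mismatch described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the disk layout, the labels, and the rule that extra minors are handed out from 5 upward in encounter order are simplifications invented for illustration, not taken from kernel source. The function `enumerate_partitions` and its flags are hypothetical.

```python
# Toy model of partition numbering; the layout and rules are illustrative
# assumptions, not taken from kernel source.

def enumerate_partitions(primaries, foreign_support, dos_first):
    """Return {label: minor number} for one disk.

    primaries: primary-table entries in slot order, each a tuple
        (label, kind, children). kind is "plain", "extended" (children
        are logical partitions) or "foreign" (children are sub-partitions
        of a BSD/UnixWare-style slice).
    foreign_support: whether this kernel parses foreign sub-partitions.
    dos_first: number every DOS logical partition before any foreign one.
    """
    numbering = {}
    next_minor = 5  # minors 1-4 are reserved for the primary slots
    deferred = []   # foreign sub-partitions held back when dos_first is set

    for slot, (label, kind, children) in enumerate(primaries, start=1):
        numbering[label] = slot
        if kind == "extended":
            for child in children:
                numbering[child] = next_minor
                next_minor += 1
        elif kind == "foreign" and foreign_support:
            if dos_first:
                deferred.extend(children)
            else:
                for child in children:
                    numbering[child] = next_minor
                    next_minor += 1

    for child in deferred:
        numbering[child] = next_minor
        next_minor += 1
    return numbering


# Hypothetical disk: a foreign slice ahead of an extended partition.
DISK = [
    ("freebsd-slice", "foreign", ["bsd-a", "bsd-b"]),
    ("extended", "extended", ["linux-root", "linux-swap"]),
]

installer = enumerate_partitions(DISK, foreign_support=False, dos_first=False)
final_old = enumerate_partitions(DISK, foreign_support=True, dos_first=False)
final_new = enumerate_partitions(DISK, foreign_support=True, dos_first=True)

print(installer["linux-root"], final_old["linux-root"], final_new["linux-root"])
```

In this model the root partition is number 5 for the installer kernel, 7 for a final kernel that numbers foreign sub-partitions as it meets them, and 5 again once DOS partitions are enumerated first, which is the kind of mismatch and fix the message describes.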

Regards,
Tigran

2000-11-20 14:46:45

by Gregory Maxwell

Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, Nov 20, 2000 at 08:53:19AM -0500, Charles Turner, Ph.D. wrote:
[snip]
> I was terribly wrong. This Red Hat version is irrevocably defective.
[snip]
> (3) It "sort of" worked. However, network daemons kept
> dropping core. X would eventually crash, leaving the
> terminal in an unusable state, etc.
>
> (4) It is impossible to build a known working kernel on the
> machine because the linker, `ld` crashes:
[snip]
> This shows that the problems are not because of a
> defective machine.
[snip]
> I speak only for myself, which is enough of a problem.

The only thing defective I can see here is you:

1. You posted this to a totally inappropriate mailing list.
2. You posted with a tone that shows you are totally uninterested in getting
help with your problems.
3. You have failed to use the proper support channels.
4. Your troubleshooting skills are defective:
If you think that a Linux distribution that works just fine for tens of
thousands of people fails because of a buggy linker then you are a fool.

Just because it works with other software doesn't mean that the
memory is good. If you have a single bad bit, then you are very
sensitive to alignment; a different piece of software may have no
issues.
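The point about a single bad bit can be made concrete with a small simulation. This is a sketch only, assuming a stuck-at-zero fault in one byte; the class `FaultyRam`, the helper `round_trip_ok`, and the chosen address are hypothetical and model no particular hardware.

```python
class FaultyRam:
    """Simulated RAM in which one bit of one byte is stuck at zero."""

    def __init__(self, size, bad_addr, bad_bit):
        self.cells = bytearray(size)
        self.bad_addr = bad_addr
        self.mask = 0xFF & ~(1 << bad_bit)

    def write(self, addr, value):
        if addr == self.bad_addr:
            value &= self.mask  # the stuck bit silently drops to 0
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


def round_trip_ok(ram, start, data):
    """Write data at start, read it back, and report whether it survived."""
    for i, byte in enumerate(data):
        ram.write(start + i, byte)
    return all(ram.read(start + i) == byte for i, byte in enumerate(data))


ram = FaultyRam(size=4096, bad_addr=2049, bad_bit=3)
ones = bytes([0xFF] * 16)
zeros = bytes([0x00] * 16)

print(round_trip_ok(ram, 0, ones))      # region far from the bad cell
print(round_trip_ok(ram, 2040, ones))   # region covering the bad cell
print(round_trip_ok(ram, 2040, zeros))  # same region, but the bad bit is unused
```

Only the second check fails: data is corrupted when it both lands on the faulty cell and needs the stuck bit set, so software with a different memory layout or different data can run without ever noticing the fault.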

But why should I expect anything reasonable from a poster at
analogic.com? (apologies to those there who have improved!) :)

2000-11-20 15:21:12

by Jes Sorensen

Subject: Whiner spams linux-kernel (Re: Defective Red Hat Distribution poorly represents Linux)

>>>>> "Charles" == Charles Turner, Ph D <[email protected]> writes:

Charles> It had been running Windows 2000 "Professional". Several
Charles> months ago, he purchased Red Hat "DELUXE" version 6.2. He was
Charles> unable to install it. I convinced him that installation was
Charles> easy.

Charles> I was terribly wrong. This Red Hat version is irrevocably
Charles> defective.

Great for you, since you know all these details one would have
expected you to know what the linux-kernel list is meant for as
well. Guess what, it's not a list for whining about distributions,
it's something you can complain about to the distributors.

Please come back once you have something on topic to discuss.

Jes

2000-11-20 16:19:28

by Bernhard Rosenkraenzer

Subject: Re: Defective Red Hat Distribution poorly represents Linux

Wrong list, but this needs to be set straight. Please send any further
problem reports about Red Hat Linux to http://bugzilla.redhat.com/bugzilla

> I was terribly wrong. This Red Hat version is irrevocably defective.

With the exception that it works for everyone else.

> (1) It will not create a bootable disk because it forgets
> to load scsi_mod.o, and sd_mod.o before it loads
> aic7xxx.o

This doesn't happen here. It's supposed to use modprobe, which
automatically finds these dependencies.
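Dependency-first loading of this kind can be sketched in a few lines. This is an illustration only, not modprobe's implementation: the `MODULE_DEPS` table is a hypothetical stand-in for a module dependency file, with entries assumed from the load order described in the original report, and `load_order` is a made-up helper with no cycle handling.

```python
# Hypothetical dependency table; the entries are assumptions based on the
# load order described in the original report (scsi_mod before the others).
MODULE_DEPS = {
    "scsi_mod": [],
    "sd_mod": ["scsi_mod"],
    "aic7xxx": ["scsi_mod"],
}


def load_order(module, deps=MODULE_DEPS, seen=None):
    """Return the modules to load, dependencies first (no cycle handling)."""
    if seen is None:
        seen = []
    for dep in deps[module]:
        load_order(dep, deps, seen)
    if module not in seen:
        seen.append(module)
    return seen


print(load_order("aic7xxx"))  # ['scsi_mod', 'aic7xxx']
print(load_order("sd_mod"))   # ['scsi_mod', 'sd_mod']
```

With a resolver like this, an alias that names only aic7xxx is enough, because the prerequisite is pulled in before the requested module.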

> /etc/conf.modules was found to contain only aic7xxx
> as an alias for scsi_hostadapter.

How did you install that?
From a relatively fresh 6.2 install (this box doesn't have any SCSI
controllers or soundcards):

# cat /etc/conf.modules
alias eth0 3c90x
alias parport_lowlevel parport_pc

> (3) It "sort of" worked. However, network daemons kept
> dropping core. X would eventually crash, leaving the
> terminal in an unusable state, etc.

Are you sure the hardware is ok? Applications that usually work well
dumping core is usually a sign of bad memory or overheated CPUs. See
http://www.bitwizard.nl/sig11/ for more detailed information.
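One rough way to tell the two cases apart is to repeat the failing command and see whether it fails consistently. The sketch below assumes the usual rule of thumb that flaky hardware tends to fail intermittently while a software bug fails the same way every time. The helpers `run_repeatedly` and `classify` are hypothetical, the classification is deliberately coarse, and the demo command is a harmless placeholder rather than the failing link step.

```python
import subprocess
import sys


def run_repeatedly(cmd, runs):
    """Run cmd several times and collect the return codes.

    On POSIX, subprocess reports a negative code when the child was
    killed by a signal (for example -11 for SIGSEGV).
    """
    codes = []
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        codes.append(result.returncode)
    return codes


def classify(codes):
    """Coarse verdict based only on how consistent the return codes are."""
    if all(code == 0 for code in codes):
        return "always succeeds"
    if len(set(codes)) == 1:
        return "fails the same way every time"
    return "intermittent"


# Placeholder command; substitute the command that actually crashes.
demo = run_repeatedly([sys.executable, "-c", "pass"], runs=3)
print(demo, classify(demo))
```

A consistent failure does not prove the hardware is fine, and an intermittent one does not prove it is bad, so the verdict is a hint for where to look next.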

It's either this, or you've added customizations that don't work, or
you've used a CD someone has tampered with.

We know of _many_ servers running Red Hat Linux 6.2 that have been up ever
since they were first installed.

> (4) It is impossible to build a known working kernel on the
> machine because the linker, `ld` crashes:

Same as (3).
I've been using 6.2 until 7 was released, I usually compile about 25
packages a day, and I've never seen ld crashing.

2000-11-20 16:43:57

by Bernhard Rosenkraenzer

Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, 20 Nov 2000, Tigran Aivazian wrote:

> Try Red Hat 7.0 -- it is certainly better. True, no distribution is
> perfect but over the years I've developed my own CD image upgrade.iso
> which goes directly after installing latest Red Hat distribution. It is
> full of things like BRS, dict(1), 'alias md="mkdir -p"' or "set
> editing-mode vi" which should be installed by default but for some reason
> aren't.

Is this thing available for download somewhere? I'd definitely like to see
what we should be doing differently.

"set editing-mode vi" definitely won't make it into the base distribution
though, it's impossible for total newbies to handle (and people who like
it usually know how to turn it on themselves).

LLaP
bero


2000-11-20 17:05:50

by Jeff V. Merkey

Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, Nov 20, 2000 at 08:53:19AM -0500, Charles Turner, Ph.D. wrote:

Charles,

6.2 is one of the better distributions. You should also go talk to Red Hat
directly.

Jeff

>
> I tried to help a friend this weekend convert to Linux.
> He lives in Upstate New York, so it was a long trip from
> Cambridge, Massachusetts.
>
> He has a Dual Pentium III, 600 MHz TYAN "Thunderbolt".
> It has a built-in Adaptec SCSI controller and Intel
> 100-base-T Ethernet controller. It also has 1/2 Gbytes
> of RAM. It's a superb machine.
>
> It had been running Windows 2000 "Professional". Several months
> ago, he purchased Red Hat "DELUXE" version 6.2. He was unable to
> install it. I convinced him that installation was easy.
>
> I was terribly wrong. This Red Hat version is irrevocably defective.
>
> (1) It will not create a bootable disk because it forgets
> to load scsi_mod.o, and sd_mod.o before it loads
> aic7xxx.o
> /etc/conf.modules was found to contain only aic7xxx
> as an alias for scsi_hostadapter.
>
> (2) Once I made a bootable diskette from a working machine
> at home (200 miles round-trip distance), I was able to
> install its supplied kernel version 2.2.14-5.0, and
> 2.2.14-5.0smp into the LILO boot sequence.
>
> (3) It "sort of" worked. However, network daemons kept
> dropping core. X would eventually crash, leaving the
> terminal in an unusable state, etc.
>
> (4) It is impossible to build a known working kernel on the
> machine because the linker, `ld` crashes:
>
> Script started on Sun Nov 19 19:11:55 2000
> [[email protected] linux-2.2.17]# make dep
> gcc -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -o scripts/mkdep scripts/mkdep.c
> collect2: ld terminated with signal 11 [Segmentation fault], core dumped
> make: *** [scripts/mkdep] Error 1
> [[email protected] linux-2.2.17]# ld --version
> GNU ld 2.9.5
> Copyright 1997 Free Software Foundation, Inc.
> This program is free software; you may redistribute it under the terms of
> the GNU General Public License. This program has absolutely no warranty.
> Supported emulations:
> elf_i386
> i386linux
> [[email protected] linux-2.2.17]# cd scripts
> [[email protected] scripts]# gcc -o mkdep.o mkdep.c
> collect2: ld terminated with signal 11 [Segmentation fault], core dumped
> [[email protected] scripts]# gcc -c -o mkdep.o mkdep.c
> [[email protected] scripts]# ld -o mkdep mkdep.o
> Segmentation fault (core dumped)
>
> [[email protected] scripts]# strace ld -o mkdep mkdep.c
> execve("/usr/bin/ld", ["ld", "-o", "mkdep", "mkdep.o"], [/* 20 vars */]) = 0
> brk(0) = 0x807bb84
> old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
> open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
> open("/etc/ld.so.cache", O_RDONLY) = 3
> fstat(3, {st_mode=S_IFREG|0644, st_size=16290, ...}) = 0
> old_mmap(NULL, 16290, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015000
> close(3) = 0
> open("/usr/lib/libbfd-2.9.5.0.22.so", O_RDONLY) = 3
> fstat(3, {st_mode=S_IFREG|0755, st_size=314936, ...}) = 0
> read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\242"..., 4096) = 4096
> old_mmap(NULL, 279260, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40019000
> mprotect(0x40059000, 17116, PROT_NONE) = 0
> old_mmap(0x40059000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x3f000) = 0x40059000
> old_mmap(0x4005d000, 732, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4005d000
> close(3) = 0
> open("/lib/libdl.so.2", O_RDONLY) = 3
> fstat(3, {st_mode=S_IFREG|0755, st_size=75131, ...}) = 0
> read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\34"..., 4096) = 4096
> old_mmap(NULL, 12428, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4005e000
> mprotect(0x40060000, 4236, PROT_NONE) = 0
> old_mmap(0x40060000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x1000) = 0x40060000
> close(3) = 0
> open("/lib/libc.so.6", O_RDONLY) = 3
> fstat(3, {st_mode=S_IFREG|0755, st_size=4101324, ...}) = 0
> read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\210\212"..., 4096) = 4096
> old_mmap(NULL, 1001564, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40062000
> mprotect(0x4014f000, 30812, PROT_NONE) = 0
> old_mmap(0x4014f000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xec000) = 0x4014f000
> old_mmap(0x40153000, 14428, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40153000
> close(3) = 0
> mprotect(0x40062000, 970752, PROT_READ|PROT_WRITE) = 0
> mprotect(0x40062000, 970752, PROT_READ|PROT_EXEC) = 0
> munmap(0x40015000, 16290) = 0
> personality(PER_LINUX) = 0
> getpid() = 763
> getrusage(RUSAGE_SELF, {ru_utime={0, 0}, ru_stime={0, 0}, ...}) = 0
> brk(0) = 0x807bb84
> brk(0x807bbbc) = 0x807bbbc
> brk(0x807c000) = 0x807c000
> open("/usr/share/locale/locale.alias", O_RDONLY) = 3
> fstat64(0x3, 0xbfffb84c) = -1 ENOSYS (Function not implemented)
> fstat(3, {st_mode=S_IFREG|0644, st_size=2265, ...}) = 0
> old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000
> read(3, "# Locale name alias data base.\n#"..., 4096) = 2265
> brk(0x807d000) = 0x807d000
> read(3, "", 4096) = 0
> close(3) = 0
> munmap(0x40015000, 4096) = 0
> open("/usr/share/i18n/locale.alias", O_RDONLY) = -1 ENOENT (No such file or directory)
> open("/usr/share/locale/en_US/LC_MESSAGES", O_RDONLY) = 3
> fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> close(3) = 0
> open("/usr/share/locale/en_US/LC_MESSAGES/SYS_LC_MESSAGES", O_RDONLY) = 3
> fstat(3, {st_mode=S_IFREG|0644, st_size=44, ...}) = 0
> old_mmap(NULL, 44, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015000
> close(3) = 0
> stat("/usrusr/lib/ldscripts", 0xbffffa7c) = -1 ENOENT (No such file or directory)
> brk(0x807f000) = 0x807f000
> brk(0x8080000) = 0x8080000
> brk(0x8081000) = 0x8081000
> brk(0x8082000) = 0x8082000
> brk(0x8083000) = 0x8083000
> unlink("mkdep") = -1 ENOENT (No such file or directory)
> open("mkdep", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
> brk(0x8084000) = 0x8084000
> brk(0x8088000) = 0x8088000
> brk(0x8089000) = 0x8089000
> brk(0x808a000) = 0x808a000
> open("mkdep.o", O_RDONLY) = 4
> fstat(4, {st_mode=S_IFREG|0644, st_size=9716, ...}) = 0
> old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40016000
> _llseek(4, 0, [0], SEEK_SET) = 0
> read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\1\0\3\0\1\0\0\0\0\0\0\0"..., 4096) = 4096
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> _llseek(4, 4096, [4096], SEEK_SET) = 0
> read(4, "E\364\215\24E\0\0\0\0\241\0\0\0\0f\213\24\20\203\342\10"..., 4096) = 4096
> _llseek(4, 8192, [8192], SEEK_SET) = 0
> --- SIGSEGV (Segmentation fault) ---
> +++ killed by SIGSEGV +++
> [[email protected] scripts]# exit
> exit
>
> Script done on Sun Nov 19 19:14:35 2000
>
> I can even see obvious bugs in the trace, e.g.:
>
> stat("/usrusr/lib/ldscripts", 0xbffffa7c) = -1 ENOENT (No such file or directory)
> ----------^^^
>
> Although this was not the reason for the seg-fault.
>
>
> (5) I returned home (200 mile round trip), removed my
> SCSI disks from my home machine, and then returned
> and installed them in my friend's machine. The
> machine worked perfectly with Linux version 2.2.17,
> and gcc-2.7.2.3, Binutils-2.8.1.0, etc., the standard
> stuff.
>
> This shows that the problems are not because of a
> defective machine.
>
> I cloned my disks to his disks, and he's running.
> However, I don't have all the GUI stuff installed that
> he likes (needs). My disks had been built up over
> several years by Richard Johnson, a frequent
> contributor to Linux.
>
> (6) I have been told that I could get a statically-linked
> version of `ld`, plus another 'C' compiler statically-
> linked so that these don't use the possibly defective
> dynamic libraries. I could then build a decent working
> system using sources available on the Internet.
>
> From a customer's perspective, this is absurd. When
> a customer purchases a "shrink-wrapped" operating
> system he expects it to work.
>
> (7) This fiasco is an example of why Linux is in big trouble.
> Once Linux obtained visibility, it became necessary for
> the most visible distributors and VARs to provide a high
> quality product.
>
> Red Hat provably does not. Red Hat employees dominate
> Linux kernel development, even moderating (read controlling)
> what will be and what will not be done within this operating
> system.
>
> As repeatedly demonstrated on this list, Red Hat employees
> spend much time denigrating others at the expense of providing
> a useful product.
>
> (8) Now I'm pretty sure that this email will generate a lot of
> flames. So be it. Linux, as an operating system to replace
> windows, has gone to hell because of at least one distributor's
> selling of garbage. There may be more such defective distributions
> out there. I certainly don't know what to purchase for my
> next attempt at a "shrink-wrap" installation.
>
>
> Very Truly Yours,
>
> Charles Turner
>
> Member(s) IEEE, IEEE Computer Society, AIAA
>
> I speak only for myself, which is enough of a problem.
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> Please read the FAQ at http://www.tux.org/lkml/

2000-11-20 17:31:23

by Werner Almesberger

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

Charles Turner, Ph.D. wrote:
> I can even see obvious bugs in the trace, i.e., :
> stat("/usrusr/lib/ldscripts", 0xbffffa7c) = -1 ENOENT (No such file or directory)

Probably only a cosmetic problem. A regular run (RedHat binutils-2.9.5.0.22-6)
yields:

stat("/usrusr/lib/ldscripts", 0xbffff5c4) = -1 ENOENT (No such file or directory)
stat("/usr/bin/ldscripts", 0xbffff5c4) = -1 ENOENT (No such file or directory)
stat("/usr/bin/../lib/ldscripts", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0

So it's not perfect, but it works.
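
A minimal Python sketch of the fallback behaviour this trace suggests: each
candidate directory is tried in order and the first one that exists wins, so
the malformed "/usrusr" entry only costs one failed stat(). The function name
find_ldscripts is hypothetical and this is not a claim about how GNU ld is
implemented; only the three paths come from the trace.

```python
import os

# Candidate order copied from the strace output above.
CANDIDATES = [
    "/usrusr/lib/ldscripts",      # malformed prefix, fails with ENOENT
    "/usr/bin/ldscripts",
    "/usr/bin/../lib/ldscripts",  # the one that exists in the regular run
]

def find_ldscripts(candidates=CANDIDATES):
    """Return the first candidate that is an existing directory, else None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None
```

On a machine laid out like the one in the trace, find_ldscripts() would return
the third entry; on any other machine it may well return None.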

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, ICA, EPFL, CH [email protected] /
/_IN_N_032__Tel_+41_21_693_6621__Fax_+41_21_693_6610_____________________/

2000-11-20 18:08:43

by Tigran Aivazian

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, 20 Nov 2000, Bernhard Rosenkraenzer wrote:

> On Mon, 20 Nov 2000, Tigran Aivazian wrote:
>
> > Try Red Hat 7.0 -- it is certainly better. True, no distribution is
> > perfect but over the years I've developed my own CD image upgrade.iso
> > which goes directly after installing latest Red Hat distribution. It is
> > full of things like BRS, dict(1), 'alias md="mkdir -p"' or "set
> > editing-mode vi" which should be installed by default but for some reason
> > aren't.
>
> Is this thing available for download somewhere? I'd definitely like to see
> what we should be doing differently.

I can put it somewhere, but it won't contain things which are hard to turn
into files, e.g.:

1) after install go through the /etc/rc.d/rc3.d and turn S -> s, for
these:

s05kudzu s08ipchains s18autofs s35identd s45pcmcia s60lpd
s85httpd s99linuxconf
s06reconfig s16apmd s25netfs s40atd s55sshd s80isdn
s97rhnsd

2) edit /etc/sysctl.conf to _allow_ sysrq!

3) edit /etc/ftpusers to allow root ftp

4) edit /etc/pam.d/login and /etc/pam.d/rlogin to comment out securetty
PAM module (so we can telnet as root on _any_ tty)

5) edit /etc/inittab to have --noclear in front of the first getty
(actually, either SuSE or Mandrake, can't remember which, guessed that it
is the sane thing to do)

6) edit /etc/inittab to get rid of update, it is not needed (on 2.4)

7) edit /etc/rc.d/init.d/halt to get rid of acct and quotaoff lines
(who uses accounting and quota on a desktop/laptop? Yes, I know I am very
subjective :) but, seriously, those who need them know how to turn them on)
8) edit /etc/rc.d/init.d/nfs and allow NFS v3 (also make a symlink in
/etc/rc.d/rc3.d -- there is one for nfslock but not for nfs)

9) edit /etc/rc.d/init.d/nfslock and get rid of the obsolete rpc.lockd
thingy -- it's a 2.2.x monster, not needed anymore.

10) edit /etc/rc.d/rc.local and (that's an important one, I always forget
to mail you about it!) make sure that each terminal shows which tty it is,
i.e. /etc/issue.net should be generated into something like this:

Red Hat Linux 7.0 (Guinness)
Kernel %r %m (%t on %h)

and /etc/issue into something like this:

Red Hat Linux 7.0 (Guinness)
Kernel \r \m (\l)

(yes, I know it is not perfect for a serial console but whoever knows of
an /etc/issue that can satisfy both, let me know). Also, note that my
version is so much more compact than yours (think of those network packets
over telnet!) and yet conveys more information (apart from how many CPU-s
but I can enhance it to do it, I _know_ how many cpus each of my machines
has, but I do _not_ know what tty I am on, unless it shows me).

11) edit /etc/rc.d/rc.local to add setterm -blank 0. It is so annoying to
have a kernel panic and you can't even unblank the console to see what it
was.

12) edit /etc/rc.d/rc.sysinit to get rid of all those unneeded lines (like
swapping to files; if people swap to files they can uncomment them. Also
forcing the SCSI tape module -- are you sure it's still needed? etc.). My
/etc/rc.d/rc.sysinit is tiny and yet has all I ever needed.

13) add packages: a2ps, acroread, bonnie, brs, ddd, cscope, dictd, dictdb
(last two I have in rpm form, if you need them, the rest are from your
powertools cd). psutils, timidity, xv, xanim, bvi, libdes (I don't care
about laws -- I use that which I know and is comfortable and des(1) fits
both), xruskb (ok, not many people need russian stuff so you can drop this
one :)

14) install from source: util-linux (because the way it is compiled in
redhat by default is wrong), memtest86, unixbench, modutils.

these are just from memory, there must be lots of other things.
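
Step 1 in the list above (turning S into s in /etc/rc.d/rc3.d) is the kind of
thing that can be scripted. What follows is only a sketch: the service names
are copied from the list, while the function name disable_services, the
dry_run flag, and the assumption that start links are named S<two
digits><service> are illustrative choices, not part of the original post.

```python
import os

# Service names copied from step 1 of the list above.
DISABLE = {
    "kudzu", "ipchains", "autofs", "identd", "pcmcia", "lpd", "httpd",
    "linuxconf", "reconfig", "apmd", "netfs", "atd", "sshd", "isdn", "rhnsd",
}

def disable_services(rc_dir, names=DISABLE, dry_run=True):
    """Rename S<NN><name> to s<NN><name> for each listed service.

    Returns the (old, new) pairs; with dry_run=True nothing is renamed.
    """
    renamed = []
    for entry in sorted(os.listdir(rc_dir)):
        # Assumed naming scheme: 'S', two digits, then the service name.
        if (len(entry) > 3 and entry[0] == "S"
                and entry[1:3].isdigit() and entry[3:] in names):
            target = "s" + entry[1:]
            if not dry_run:
                os.rename(os.path.join(rc_dir, entry),
                          os.path.join(rc_dir, target))
            renamed.append((entry, target))
    return renamed
```

Running it first with dry_run=True on a scratch copy of the directory shows
what would change before anything is touched.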

regards,
Tigran

2000-11-20 18:34:18

by Charles Turner, Ph.D.

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, 20 Nov 2000, Tigran Aivazian wrote:

> On Mon, 20 Nov 2000, Charles Turner, Ph.D. wrote:
> > I certainly don't know what to purchase for my
> > next attempt at a "shrink-wrap" installation.
>
> Try Red Hat 7.0 -- it is certainly better. True, no distribution is
[SNIPPED...]

I just got in after trying to recover from the 500++ mile
trips yesterday.

I will answer all with this short response. Only one will be
forwarded to linux-kernel.

(1) Most nasty-grams were from those who didn't even read the subject.
And yes, it should be of great concern to those on the linux-
kernel development list. The most visible advocate of Linux
is Red Hat. When they drop the ball, it's a concern for all
the developers.

(2) I got about 32 private responses from folks who wanted to help.
Thank you to all of them.

(3) One Red Hat employee stated that the distribution must have
been hacked. I think it's a bit hard to rewrite Distribution
CD-ROMS. He also didn't know that the boot occurs with initrd,
requiring the proper modules to be loaded from the RAM disk
before the SCSI hard disk could be accessed.

(4) For those who think the hardware is broken: the hardware worked
for six months using Windows/2000. It has an NT core.

The distribution was reinstalled with only one CPU installed.
When that failed, I changed to the other CPU and tried again.

Then I installed only one 'stick' of RAM (128 meg). Then
I tried to install the distribution again.

I did this 4 times for each of the four sticks of RAM.

In every case, the distribution failed to make a bootable
system. However, in each case I booted it on a 2.2.17
rescue disk and it worked.

(5) Again, the system works fine when a 'homemade' distribution
using the current glibc, gcc compiler, and linux-2.2.17
are used. I have kept all the tools listed in
linux/Documentation/Changes current on this hard disk.

(6) One of my co-workers pointed out that the distribution
kernel does a test for MMX speed upon startup. It then
will use MMX for copies, etc., if it finds it's fast.
He pointed out that this was not very mature around the
time this distribution was made. It probably was not well
tested and may be the reason for network daemons dying.

This distribution was purchased in July of this year.
If a 4 month old distribution is "obsolete", as one
respondent said, then we had all better give up.
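
The module-ordering issue in point (3) can be pictured as a small dependency
problem. The sketch below is only an illustration: the constraint that
scsi_mod and sd_mod come before aic7xxx is taken from the original report,
the sd_mod-after-scsi_mod edge is an added assumption, and the names
LOAD_AFTER and load_order are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each module maps to the set of modules that must be loaded before it.
LOAD_AFTER = {
    "scsi_mod": set(),
    "sd_mod": {"scsi_mod"},              # assumption, not from the report
    "aic7xxx": {"scsi_mod", "sd_mod"},   # ordering stated in the report
}

def load_order(constraints=LOAD_AFTER):
    """Return one load order that satisfies every 'load after' constraint."""
    return list(TopologicalSorter(constraints).static_order())
```

With these constraints the only valid order is scsi_mod, sd_mod, aic7xxx,
which is the sequence the initrd would need to follow before the SCSI disk
becomes reachable.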


Very Truly Yours,

Charles Turner

Member(s) IEEE, IEEE Computer Society, AIAA

I speak only for myself, which is enough of a problem.



2000-11-20 18:50:02

by Mohammad A. Haque

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

At the risk of being flamed for a distribution type discussion...

Security nuts are probably rolling on the floor laughing at you for
these two. I can think of some situations where these would be useful
though.

On Mon, 20 Nov 2000, Tigran Aivazian wrote:

> 3) edit /etc/ftpusers to allow root ftp
>
> 4) edit /etc/pam.d/login and /etc/pam.d/rlogin to comment out securetty
> PAM module (so we can telnet as root on _any_ tty)
>

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-11-20 18:53:23

by John Jasen

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, 20 Nov 2000, Charles Turner, Ph.D. wrote:

> (4) For those who think the hardware is broken; The hardware worked
> for six months using Windows/2000. It has a NT core.

On this note, I recall a time that I 'appropriated' a workstation for
linux.

It was pulled out of the student labs, where it had worked for 3 months
running NT 4.0, but the RH install kept on crashing out.

I could even reinstall NT 4.0.

*shrug*

Eventually traced it down to memory, and had our hardware hacks replace
it.

Sometimes hardware problems can be subtle.

--
-- John E. Jasen ([email protected])
-- Some elections you just can't buy. For others, there's GORE 2000

2000-11-20 19:15:06

by Mohammad A. Haque

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

Same experience here....

Boxes ran perfectly fine with Windows (95/98/NT) but barfed with linux.
RAM replacement fixed it. Now whenever I see a signal 11 with gcc memory
is the first thing I go after.

On Mon, 20 Nov 2000, John Jasen wrote:

> On Mon, 20 Nov 2000, Charles Turner, Ph.D. wrote:
> On this note, I recall a time that I 'appropriated' a workstation for
> linux.
>
> It was pulled out of the student labs, where it had worked for 3 months
> running NT 4.0, but the RH install kept on crashing out.
>
> I could even reinstall NT 4.0.
>
> *shrug*
>
> Eventually traced it down to memory, and had our hardware hacks replace
> it.
>
> Sometimes hardware problems can be subtle.

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-11-20 19:38:29

by Igmar Palsenberg

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, 20 Nov 2000, Charles Turner, Ph.D. wrote:

<snip bullshit story>

These are hardware problems, not software. Programs like gcc and ld
segfaulting like this is NOT a software problem.

Please don't turn up with some 'hey, it worked with my disk'; that's no
clue that the distrib is bad. The same goes for the 'it works with
Windows' argument.

Stick with RH 6.2 if you want something stable.


Igmar

2000-11-20 20:08:09

by Mohammad A. Haque

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

To Charles:
I see your intentions but you really want to take this up with Red Hat
and some Linux advocacy groups. linux-kernel really doesn't need to deal
with things like gcc being broken and such (which I don't think is your
case; check your hardware -- my reason? I've deployed RH 6.2 on 20 or
so servers, all SCSI, without a hitch. Also, signal 11 points to memory
problems).

To list:
Could we please come across to users a little more politely? I know it's
frustrating when people come barging in complaining about something that
really should be directed somewhere else. But it's a heck of a lot better
if we don't have a user who later thinks 'Damn, linux developers are
meanies' and is then afraid to ask us questions which would
probably be useful for us. I'm not saying I was right by simply blowing
off Charles by telling him this is the wrong list. I should have pointed
him in the right direction and given quick pointers.

On Mon, 20 Nov 2000, Charles Turner, Ph.D. wrote:
> (1) Most nasty-grams were from those who didn't even read the subject.
> And yes, it should be of great concern to those on the linux-
> kernel development list. The most visible advocate of Linux
> is Red Hat. When they drop the ball, it's a concern for all
> the developers.

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-11-20 20:13:50

by Andre Hedrick

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux


Can everyone lay off this guy, he made a mistake and the heat is not cool.
This is no way for the general masses to get a taste of Linux, cool?
Please just let it die or offline the chap.

Regards,

Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development

2000-11-20 20:25:55

by Paul Fulghum

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

> it's a heck of a lot better if we don't have a user that later
> thinks 'Damn, linux developers are meanies'...
>
> Mohammad A. Haque http://www.haque.net/

When in fact according to this linux-kernel post:
http://www.uwsg.iu.edu/hypermail/linux/kernel/9912.1/0653.html
they are goats that eat fermented potatoes.

Paul Fulghum [email protected]
Microgate Corporation http://www.microgate.com


2000-11-20 20:41:41

by Frank van Maarseveen

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, Nov 20, 2000 at 08:53:19AM -0500, Charles Turner, Ph.D. wrote:
> [[email protected] linux-2.2.17]# make dep
> gcc -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -o scripts/mkdep scripts/mkdep.c
> collect2: ld terminated with signal 11 [Segmentation fault], core dumped
> make: *** [scripts/mkdep] Error 1
[...]
> [[email protected] linux-2.2.17]# cd scripts
> [[email protected] scripts]# gcc -o mkdep.o mkdep.c
> collect2: ld terminated with signal 11 [Segmentation fault], core dumped
> [[email protected] scripts]# gcc -c -o mkdep.o mkdep.c
> [[email protected] scripts]# ld -o mkdep mkdep.o
> Segmentation fault (core dumped)
This _is_ a hardware problem.

>
> (5) I returned home (200 mile round trip), removed my
> SCSI disks from my home machine, and then returned
> and installed them in my friend's machine. The
> machine worked perfectly with Linux version 2.2.17,
> and gcc-2.7.2.3, Binutils-2.8.1.0, etc., the standard
> stuff.
>
> This shows that the problems are not because of a
> defective machine.
Wrong.
One cannot do statistics on one case. But you can on 10000+ of other
cases where the above just works (actually, even one case where it works
proves enough). You should give the mainboard a good massage to make it
behave more deterministically.

dust, dirt, aging, bad connectors, broken lines on the mainboard
incidentally making contact due to mechanical forces, thermal effects,
who knows what it is. It could be anything. It really is faulty hardware.

--
Frank

2000-11-20 21:04:44

by Horst H. von Brand

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

John Jasen <[email protected]> said:

[...]

> On this note, I recall a time that I 'appropriated' a workstation for
> linux.
>
> It was pulled out of the student labs, where it had worked for 3 months
> running NT 4.0, but the RH install kept on crashing out.

So what? My former machine ran fine with Win95/WinNT. Linux wouldn't even
end booting the kernel. Reason: P/100 was running at 120Mhz. Fixed that, no
trouble for years. Not the only case of WinXX running (apparently?) fine
on broken/misconfigured hardware I've seen, mind you.
--
Dr. Horst H. von Brand mailto:[email protected]
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2000-11-20 21:42:19

by Ben Ford

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

Tigran Aivazian wrote:

<snip>

>
> 3) edit /etc/ftpusers to allow root ftp
>
> 4) edit /etc/pam.d/login and /etc/pam.d/rlogin to comment out securetty
> PAM module (so we can telnet as root on _any_ tty)

Not into security are you?

-b


2000-11-20 21:46:40

by Ben Ford

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

Ya, I also had a system that ran many OS's great, including Linux, Win98,
Win2k, etc. However when I went to install NT on it, the CPU overheated
every time. Ya, I know, doesn't make sense, but that's how it was.

-b


John Jasen wrote:

> On Mon, 20 Nov 2000, Charles Turner, Ph.D. wrote:
>
> > (4) For those who think the hardware is broken; The hardware worked
> > for six months using Windows/2000. It has a NT core.
>
> On this note, I recall a time that I 'appropriated' a workstation for
> linux.
>
> It was pulled out of the student labs, where it had worked for 3 months
> running NT 4.0, but the RH install kept on crashing out.
>
> I could even reinstall NT 4.0.
>
> *shrug*
>
> Eventually traced it down to memory, and had our hardware hacks replace
> it.
>
> Sometimes hardware problems can be subtle.
>
> --
> -- John E. Jasen ([email protected])
> -- Some elections you just can't buy. For others, there's GORE 2000
>

2000-11-20 23:31:54

by spam

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, 20 Nov 2000, Paul Fulghum wrote:

> When in fact according to this linux-kernel post:
> http://www.uwsg.iu.edu/hypermail/linux/kernel/9912.1/0653.html
> they are goats that eat fermented potatoes.

Hahaha, gotta love flame wars =)
pavel

--
Bask in the glow of the digital silence
http://www.vancouver.yi.org

2000-11-21 00:13:17

by FORT David

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

Ben Ford wrote:

> Ya, I also had a system that ran many OS's great, including Linux, Win98,
> Win2k, etc. However when I went to install NT on it, the CPU overheated
> every time. Ya, I know, doesn't make sense, but that's how it was.
>
> -b
>
>

It makes sense to me, as win2000 is always 5°C hotter than linux, and on
both CPUs.....

--
%-------------------------------------------------------------------------%
% FORT David, %
% 7 avenue de la morvandi?re 0240726275 %
% 44470 Thouare, France [email protected] %
% ICU:78064991 AIM: enlighted popo [email protected] %
%--LINUX-HTTPD-PIOGENE----------------------------------------------------%
% -datamining <-/ | .~. %
% -networking/flashed PHP3 coming soon | /V\ L I N U X %
% -opensource | // \\ >Fear the Penguin< %
% -GNOME/enlightenment/GIMP | /( )\ %
% feel enlighted.... | ^^-^^ %
% http://ibonneace.dnsalias.org/ when connected %
%-------------------------------------------------------------------------%



2000-11-21 06:24:17

by Albert D. Cahalan

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

> These are hardware problems, not software. Programs like gcc and ld
> segfaulting like this is NOT a software problem.
>
> Please don't turn up with some 'hey, it worked with my disk', that's no
> clue that the distrib is bad. The same arguments as 'it works with
> Windows'.

This could be a Linux kernel problem. (though likely not)

If one disk works and another one not, one might suspect
that the wrong DMA mode is being used in the crashing case.
The easy fix: replace the drive with a different model, and
make sure you have the most modern cables.



2000-11-21 08:47:16

by Peter Samuelson

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux


[Albert D. Cahalan]
> If one disk works and another one not, one might suspect
> that the wrong DMA mode is being used in the crashing case.

So, what DMA mode do *you* usually set for aic7xxx? (: (:

Peter

2000-11-21 21:38:56

by David Riley

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

Horst von Brand wrote:
>
> So what? My former machine ran fine with Win95/WinNT. Linux wouldn't even
> end booting the kernel. Reason: P/100 was running at 120Mhz. Fixed that, no
> trouble for years. Not the only case of WinXX running (apparently?) fine
> on broken/misconfigured hardware I've seen, mind you.

This is something I've noticed as well...

Windoze is not the only OS to handle bad hardware better than Linux. On
my Mac, I had a bad DIMM that worked fine on the MacOS side, but kept
causing random bus-type errors in Linux. Same as when I accidentally
(long story) overclocked the bus on the CPU. I think that more
tolerance for faulty hardware (more than just poorly programmed BIOS or
chipsets with known bugs) is something that might be worth looking into.
I'm sure it would solve problems like this (which I clearly identify as
a hardware problem, because the same thing happened with the bad DIMM,
the overclocked bus, and two different overclocked processors (AMD 5x86
and AMD K6-2 500) and went away when I remedied the offending problem).
Additionally, overclockers (I myself am a reformed one) might appreciate
more tolerance for such things.

My two cents/pence/centavos/local tiny currency denomination,
David

2000-11-21 21:50:52

by David Lang

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

David, usually when it turns out that Linux finds hardware problems the
underlying cause is that linux makes more effective use of the component,
and as such something that was marginal under windows fails under linux as
the correct timing is used.

David Lang

On Tue, 21 Nov 2000, David Riley wrote:

> Date: Tue, 21 Nov 2000 16:08:26 -0500
> From: David Riley <[email protected]>
> To: unlisted-recipients: ;
> Cc: [email protected]
> Subject: Re: Defective Red Hat Distribution poorly represents Linux
>
> Horst von Brand wrote:
> >
> > So what? My former machine ran fine with Win95/WinNT. Linux wouldn't even
> > end booting the kernel. Reason: P/100 was running at 120Mhz. Fixed that, no
> > trouble for years. Not the only case of WinXX running (apparently?) fine
> > on broken/misconfigured hardware I've seen, mind you.
>
> This is something I've noticed as well...
>
> Windoze is not the only OS to handle bad hardware better than Linux. On
> my Mac, I had a bad DIMM that worked fine on the MacOS side, but kept
> causing random bus-type errors in Linux. Same as when I accidentally
> (long story) overclocked the bus on the CPU. I think that more
> tolerance for faulty hardware (more than just poorly programmed BIOS or
> chipsets with known bugs) is something that might be worth looking into.
> I'm sure it would solve problems like this (which I clearly identify as
> a hardware problem, because the same thing happened with the bad DIMM,
> the overclocked bus, and two different overclocked processors (AMD 5x86
> and AMD K6-2 500) and went away when I remedied the offending problem).
> Additionally, overclockers (I myself am a reformed one) might appreciate
> more tolerance for such things.
>
> My two cents/pence/centavos/local tiny currency denomination,
> David

2000-11-21 22:04:22

by David Riley

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

David Lang wrote:
>
> David, usually when it turns out that Linux finds hardware problems the
> underlying cause is that linux makes more effective use of the component,
> and as such something that was marginal under windows fails under linux as
> the correct timing is used.

This is true. What I suppose would be the solution is that if faulty
hardware is found, a reduction in performance should be made. This is
already the case for things like broken PCI BIOS where one can either
set the initialization to work a different way or try to make the
machine autodetect it. I certainly approve of more effective use of any
given component, but sometimes I think it's better to offer the user a
choice in the case of faulty hardware.

2000-11-21 22:07:52

by David Lang

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

The problem is that unless you recompile the kernel to add timing delays,
you cannot change the timing like this (if you put the tests in all your
fast paths to add delays you have just destroyed your performance in the
case where the hardware is good).

Also, you don't know the hardware is really working properly under windows:
how do you know if the blue screen was caused by a windows bug or a
hardware error?

David Lang

On Tue, 21 Nov 2000, David Riley wrote:

> Date: Tue, 21 Nov 2000 16:34:08 -0500
> From: David Riley <[email protected]>
> To: David Lang <[email protected]>, [email protected]
> Subject: Re: Defective Red Hat Distribution poorly represents Linux
>
> David Lang wrote:
> >
> > David, usually when it turns out that Linux finds hardware problems the
> > underlying cause is that linux makes more effective use of the component,
> > and as such something that was marginal under windows fails under linux as
> > the correct timing is used.
>
> This is true. What I suppose would be the solution is that if faulty
> hardware is found, a reduction in performance should be made. This is
> already the case for things like broken PCI BIOS where one can either
> set the initialization to work a different way or try to make the
> machine autodetect it. I certainly approve of more effective use of any
> given component, but sometimes I think it's better to offer the user a
> choice in the case of faulty hardware.
>

2000-11-21 22:14:32

by Gérard Roudier

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux



On Tue, 21 Nov 2000, David Riley wrote:

> Horst von Brand wrote:
> >
> > So what? My former machine ran fine with Win95/WinNT. Linux wouldn't even
> > end booting the kernel. Reason: P/100 was running at 120Mhz. Fixed that, no
> > trouble for years. Not the only case of WinXX running (apparently?) fine
> > on broken/misconfigured hardware I've seen, mind you.
>
> This is something I've noticed as well...
>
> Windoze is not the only OS to handle bad hardware better than Linux. On
> my Mac, I had a bad DIMM that worked fine on the MacOS side, but kept
> causing random bus-type errors in Linux. Same as when I accidentally
> (long story) overclocked the bus on the CPU. I think that more
> tolerance for faulty hardware (more than just poorly programmed BIOS or
> chipsets with known bugs) is something that might be worth looking into.
> I'm sure it would solve problems like this (which I clearly identify as
> a hardware problem, because the same thing happened with the bad DIMM,
> the overclocked bus, and two different overclocked processors (AMD 5x86
> and AMD K6-2 500) and went away when I remedied the offending problem).
> Additionally, overclockers (I myself am a reformed one) might appreciate
> more tolerance for such things.

Hmmm... The more an O/S waits stupidly for something when it could do
useful work, the less it is likely to trigger hardware problems.

Windoze is probably still far better than Linux at handling billions
of dollars. I never noticed it was good at anything else. :-)

> My two cents/pence/centavos/local tiny currency denomination,
> David

Gérard.

2000-11-21 22:16:12

by Bob Lorenzini

[permalink] [raw]
Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Tue, 21 Nov 2000, David Riley wrote:

> Horst von Brand wrote:
>
> Windoze is not the only OS to handle bad hardware better than Linux. On
> my Mac, I had a bad DIMM that worked fine on the MacOS side, but kept
> causing random bus-type errors in Linux. Same as when I accidentally

I believe Linux uses memory from the top down rather than from the bottom
up like MS, which may explain some of the reports that "it werks great in
windoze" where the faulty bit is high in the map.

Bob

2000-11-21 22:20:52

by Jeff Epler

Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Tue, Nov 21, 2000 at 04:08:26PM -0500, David Riley wrote:
> Windoze is not the only OS to handle bad hardware better than Linux. On
> my Mac, I had a bad DIMM that worked fine on the MacOS side, but kept
> causing random bus-type errors in Linux. Same as when I accidentally
> (long story) overclocked the bus on the CPU. I think that more
> tolerance for faulty hardware (more than just poorly programmed BIOS or
> chipsets with known bugs) is something that might be worth looking into.

And how do you propose to do that?

For instance, in some other operating systems having the top bit flip
in a pointer will cause silent use of incorrect data. On Linux, this
will cause a signal 11. Which do you prefer, bad results or an error
message?
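
To make the pointer example concrete, here is a minimal Python sketch (not
kernel code) of what a single flipped top bit does to a 32-bit address. The
address and the 32-bit width are illustrative assumptions. Whether the damaged
pointer faults or silently reads some other data depends on what, if anything,
is mapped at the new address.

```python
ADDRESS_BITS = 32  # assumed pointer width, typical of a PC of that era


def flip_bit(address: int, bit: int) -> int:
    """Return the address with one bit inverted, as a faulty cell might do."""
    if not 0 <= bit < ADDRESS_BITS:
        raise ValueError("bit index out of range")
    return address ^ (1 << bit)


good = 0x0804A010                       # hypothetical address, for illustration
bad = flip_bit(good, ADDRESS_BITS - 1)  # flip the top bit

print(hex(good), "->", hex(bad))
print("displacement:", abs(bad - good) // 2**20, "MiB")
```

One flipped bit moves the pointer by 2 GiB. A system that traps the access at
least reports the problem; one that has something mapped there just carries on
with the wrong data.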

Can you suggest a specific way in which Linux can react correctly to
e.g. flipped bits in RAM or cache which cannot be detected at the hardware
level? Or maybe tell me how Linux can react correctly when an overclocked
CPU starts producing incorrect results for right shifts once every few
thousand instructions?

There exists hardware specifically intended to be able to diagnose and
contain its own failures, but the number of such features on a common
home PC is probably a big fat zero.

> I'm sure it would solve problems like this (which I clearly identify as
> a hardware problem, because the same thing happened with the bad DIMM,
> the overclocked bus, and two different overclocked processors (AMD 5x86
> and AMD K6-2 500) and went away when I remedied the offending problem).

And that's what you have to do --- fix the problem. In a few situations,
you might be able to isolate and exclude the section of RAM which is
bad (in fact, there are patches for this and tools to diagnose the
problem), but what do you want Linux to do about a processor which cannot
reliably execute instructions?
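
The idea of isolating a bad section of RAM can be sketched as follows. This is
only an illustration of the bookkeeping, not how any particular patch works:
the 4 KiB page size, the function names and the sample addresses are all
assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size (4 KiB, as on i386)


def pages_to_exclude(bad_addresses, page_size=PAGE_SIZE):
    """Map faulty physical addresses to sorted, de-duplicated page frame numbers."""
    return sorted({addr // page_size for addr in bad_addresses})


def merge_ranges(frames):
    """Collapse sorted page frame numbers into inclusive (first, last) runs."""
    ranges = []
    for frame in frames:
        if ranges and frame == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], frame)   # extend the current run
        else:
            ranges.append((frame, frame))         # start a new run
    return ranges


# Made-up addresses such as a memory tester might report.
faulty = [0x01F3A004, 0x01F3A7F0, 0x01F3B010, 0x07C00000]
frames = pages_to_exclude(faulty)
print([(hex(first), hex(last)) for first, last in merge_ranges(frames)])
```

Excluding whole pages only helps when the fault is confined to RAM. It does
nothing for a CPU or bus that misbehaves.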

> Additionally, overclockers (I myself am a reformed one) might appreciate
> more tolerance for such things.

I've got a better idea: Overclockers can go to hell, and their bug reports
to the trash, until they "reform" like you and I have.

Jeff

2000-11-21 22:25:22

by Horst H. von Brand

Subject: Re: Defective Red Hat Distribution poorly represents Linux

David Riley <[email protected]> said:
> Horst von Brand wrote:
> > So what? My former machine ran fine with Win95/WinNT. Linux wouldn't even
> > end booting the kernel. Reason: P/100 was running at 120Mhz. Fixed that, no
> > trouble for years. Not the only case of WinXX running (apparently?) fine
> > on broken/misconfigured hardware I've seen, mind you.

> This is something I've noticed as well...
>
> Windoze is not the only OS to handle bad hardware better than Linux. On
> my Mac, I had a bad DIMM that worked fine on the MacOS side, but kept
> causing random bus-type errors in Linux. Same as when I accidentally
> (long story) overclocked the bus on the CPU. I think that more
> tolerance for faulty hardware (more than just poorly programmed BIOS or
> chipsets with known bugs) is something that might be worth looking into.

NO! The method they use is not to drive the hardware too hard (i.e., you
don't get what you paid for, performance-wise), or just paper over the bug
(it _will_ crash soon enough anyway, so why bother?).
--
Dr. Horst H. von Brand mailto:[email protected]
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2000-11-21 22:33:14

by Horst H. von Brand

Subject: Re: Defective Red Hat Distribution poorly represents Linux

David Riley <[email protected]> said:

[...]

> This is true. What I suppose would be the solution is that if faulty
> hardware is found, a reduction in performance should be made.

Finding out if you've got bad RAM might take a few hours running memtest86. Not
exactly what I have in mind to do each boot...
--
Dr. Horst H. von Brand mailto:[email protected]
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2000-11-21 22:43:15

by Dan Hollis

Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Tue, 21 Nov 2000, Horst von Brand wrote:
> David Riley <[email protected]> said:
> > This is true. What I suppose would be the solution is that if faulty
> > hardware is found, a reduction in performance should be made.
> Finding out if you've got bad RAM might take a few hours running memtest86. Not
> exactly what I have in mind to do each boot...

ECC RAM and an ECC-capable northbridge aren't exactly expensive...

-Dan

2000-11-21 22:55:56

by Gerd Knorr

Subject: Re: Defective Red Hat Distribution poorly represents Linux

> > This is true. What I suppose would be the solution is that if faulty
> > hardware is found, a reduction in performance should be made.
>
> Finding out if you've got bad RAM might take a few hours running memtest86. Not
> exactly what I have in mind to do each boot...

Even if memtest doesn't find anything you can't be sure the box is fine.
I've seen boxes that passed memtest just fine, but compiling kernels in
an endless loop with "make -j" still bombed after some time with a gcc sig11.
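
A looped test of this kind can be sketched as a small Python driver that
reruns a command until it fails. The function name is hypothetical, and the
command shown is a harmless placeholder. For a real test one would substitute
a heavy, deterministic workload such as a kernel build. On POSIX systems,
`subprocess` reports a child killed by signal N as return code -N, so a
sig11 would show up as -11.

```python
import subprocess
import sys


def run_until_failure(command, max_runs):
    """Run command repeatedly; return the 1-based run that failed, or None."""
    for run in range(1, max_runs + 1):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # Non-zero covers ordinary errors; negative values mean "killed by signal".
        if result.returncode != 0:
            return run
    return None


# Placeholder workload (an assumption for the sketch): a no-op Python process.
placeholder = [sys.executable, "-c", "pass"]
failed_at = run_until_failure(placeholder, max_runs=3)
print("no failure seen" if failed_at is None else f"failed on run {failed_at}")
```

A workload that succeeds on one pass and fails on another with identical
input points at the machine rather than the software.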

Gerd

--
Business informatics specialists == people who know the current share
price of every software vendor, but cannot even begin to operate any
of the products. -- Benedict Mangelsdorff

2000-11-21 22:57:16

by David Riley

Subject: Re: Defective Red Hat Distribution poorly represents Linux

Jeff Epler wrote:
>
> On Tue, Nov 21, 2000 at 04:08:26PM -0500, David Riley wrote:
> > Windoze is not the only OS to handle bad hardware better than Linux. On
> > my Mac, I had a bad DIMM that worked fine on the MacOS side, but kept
> > causing random bus-type errors in Linux. Same as when I accidentally
> > (long story) overclocked the bus on the CPU. I think that more
> > tolerance for faulty hardware (more than just poorly programmed BIOS or
> > chipsets with known bugs) is something that might be worth looking into.
>
> And how do you propose to do that?
>
> For instance, in some other operating systems having the top bit flip
> in a pointer will cause silent use of incorrect data. On Linux, this
> will cause a signal 11. Which do you prefer, bad results or an error
> message?
>
> Can you suggest a specific way in which Linux can react correctly to
> e.g. flipped bits in RAM or cache which cannot be detected at the hardware
> level? Or maybe tell me how Linux can react correctly when an overclocked
> CPU starts producing incorrect results for right shifts once every few
> thousand instructions?

Hmm... Good point. That would be hard to do. On that note, there
should be a prominent note in things like user manuals (though Linux
users shouldn't need *manuals* :-) saying that common crashes like
signal 11 or "cc: internal failure" messages are generally caused by
hardware problems. That sort of thing would keep spurious complaints
and error messages off inappropriate boards like this one and on newbie
boards where they belong (I'm not saying it was a bad complaint, but
generally questions like "Why does RH 6.2, known to be stable on
thousands of machines, not install on this machine where NT worked
before?" belong on newbie boards and not as a flame of RedHat on the
kernel board). Unfortunately, most people who get these error messages
don't read the manuals. Besides, where would you put it in a manual? I
know that error codes are a great mystery among us on the MacOS (even
those of us who have been using it for 16 years only know that Error 11
is usually hardware and [1|2|3] are software) since they aren't
clearly and understandably documented in prominent user-land documentation.

By the way, I have no idea how to implement a suggestion like I had.
That's why I posted here. If I had a clue how to do that any better
than a useless, inefficient kludge, I would have done it myself and
submitted a patch. As much as I like the "do it yourself" model of
development here, I think a lot of people might appreciate it if more
experienced coders wouldn't jump down the throats of people who suggest
a feature they can't do themselves yet. I speak for myself, but I don't
think I'll find a dearth of support for that opinion.

Thanks,
David

2000-11-21 23:01:46

by Richard Torkar

Subject: Re: Defective Red Hat Distribution poorly represents Linux

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Riley wrote:

> Jeff Epler wrote:
> >
> > On Tue, Nov 21, 2000 at 04:08:26PM -0500, David Riley wrote:
> > > Windoze is not the only OS to handle bad hardware better than Linux. On
> > > my Mac, I had a bad DIMM that worked fine on the MacOS side, but kept
> > > causing random bus-type errors in Linux. Same as when I accidentally
> > > (long story) overclocked the bus on the CPU. I think that more
> > > tolerance for faulty hardware (more than just poorly programmed BIOS or
> > > chipsets with known bugs) is something that might be worth looking into.
> >
> > And how do you propose to do that?
> >
> > For instance, in some other operating systems having the top bit flip
> > in a pointer will cause silent use of incorrect data. On Linux, this
> > will cause a signal 11. Which do you prefer, bad results or an error
> > message?
> >
> > Can you suggest a specific way in which Linux can react correctly to
> > e.g. flipped bits in RAM or cache which cannot be detected at the hardware
> > level? Or maybe tell me how Linux can react correctly when an overclocked
> > CPU starts producing incorrect results for right shifts once every few
> > thousand instructions?
>
> Hmm... Good point. That would be hard to do. On that note, there
> should be some prominent note on things like user manuals (though Linux
> users shouldn't need *manuals* :-) that notes that common crashes like
> signal 11 or "cc: internal failure" messages are generally caused by
> hardware problems.

Well David, there is such a "manual".

http://ftp.sunet.se/LDP/FAQ/faqs/GCC-SIG11-FAQ



/Richard
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE6Gve8USLExYo23RsRAtrQAJ4glySTwLB+e02mlYX0L42pf3+8BACdEssx
L2fhmp7uY+xa3wpWYt6cb+M=
=aP6d
-----END PGP SIGNATURE-----


2000-11-21 23:48:08

by David Riley

Subject: Re: Defective Red Hat Distribution poorly represents Linux

Richard Torkar wrote:
>
> Well David, there is such a "manual".
>
> http://ftp.sunet.se/LDP/FAQ/faqs/GCC-SIG11-FAQ

Yes. And if you ask the average new Linux user if they've read it, I
doubt you'll get a "yes". My question boils down to this, and this I
suppose is a personal/informational request for comments, so don't
clutter the list with responses directed at me: What (in your opinion)
is the most commonly read Linux user-land document?

2000-11-22 03:59:01

by Jeff Epler

Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Tue, Nov 21, 2000 at 06:17:48PM -0500, David Riley wrote:
> Richard Torkar wrote:
> >
> > Well David, there is such a "manual".
> >
> > http://ftp.sunet.se/LDP/FAQ/faqs/GCC-SIG11-FAQ
>
> Yes. And if you ask the average new Linux user if they've read it, I
> doubt you'll get a "yes". My question boils down to this, and this I
> suppose is a personal/informational request for comments, so don't
> clutter the list with responses directed at me: What (in your opinion)
> is the most commonly read Linux user-land document?

Well, a copy of that document *is* the first hit for a google search on
'linux signal 11 faq'
http://www.google.com/search?q=linux+signal+11+faq

In other words, someone who does the slightest bit of research will
find the answer.

Jeff

2000-11-22 11:33:07

by Richard Torkar

Subject: Re: Defective Red Hat Distribution poorly represents Linux

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Riley wrote:

> Richard Torkar wrote:
> >
> > Well David, there is such a "manual".
> >
> > http://ftp.sunet.se/LDP/FAQ/faqs/GCC-SIG11-FAQ
>
> Yes. And if you ask the average new Linux user if they've read it, I
> doubt you'll get a "yes". My question boils down to this, and this I
> suppose is a personal/informational request for comments, so don't
> clutter the list with responses directed at me: What (in your opinion)
> is the most commonly read Linux user-land document?

I would say the manual that comes with the distribution whether it is
RedHat, Debian, Slackware etc...

So yes, it would be a good idea to contact the distribution people and
ask them to point it out "clearly" in their manuals.


/Richard
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE6G6fQUSLExYo23RsRAofAAKCKvLzgDTHs/lYu6Bx0PA/F9Z7nYACgl9qs
PgbaC8JGvJalG1Sh+6KUhRU=
=UvUj
-----END PGP SIGNATURE-----


2000-11-22 18:51:36

by Anthony Liu

Subject: Re: Defective Red Hat Distribution poorly represents Linux

On Mon, Nov 20, 2000 at 08:33:25PM +0100, Frank van Maarseveen wrote:

> [...]
> > [[email protected] linux-2.2.17]# cd scripts
> > [[email protected] scripts]# gcc -o mkdep.o mkdep.c
> > collect2: ld terminated with signal 11 [Segmentation fault], core dumped
> > [[email protected] scripts]# gcc -c -o mkdep.o mkdep.c
> > [[email protected] scripts]# ld -o mkdep mkdep.o
> > Segmentation fault (core dumped)
> This _is_ a hardware problem.
>
> >
> > (5) I returned home (200 mile round trip), removed my
> > SCSI disks from my home machine, and then returned
> > and installed them in my friend's machine. The
> > machine worked perfectly with Linux version 2.2.17,
> > and gcc-2.7.2.3, Binutils-2.8.1.0, etc., the standard
> > stuff.
> >
> > This shows that the problems are not because of a
> > defective machine.
> Wrong.
> One cannot do statistics on one case. But you can on 10000+ of other
> cases where the above just works (actually, even one case where it works
> proves enough). You should give the mainboard a good massage to make it
> behave more deterministically.
>
> dust, dirt, aging, bad connectors, broken lines on the mainboard
> incidentally making contact due to mechanical forces, thermal effects,
> who knows what it is. It could be anything. It really is faulty hardware.

Yes, hardware problems can be very subtle.

Case (1)

I had just bought a new mouse and stuck it in the USB port of my ABIT BX6 2.0
board as a test; W98 ran OK. Using the USB-to-PS/2 converter, I then plugged
the mouse into the PS/2 port and the board refused to start: no RAM count,
not even a beep.

I opened the box, yanked all the cables out, pressed firmly on the PS/2
connectors on top of the board, put all the cables back and
reconnected the mouse to the PS/2 port. The board started, booted into
2.2.17, and everything worked. BTW, the ABIT board is only 1 year old!

Case (2)

One of the older machines I have is a 430VX board with the IBM 6x86 P120 chip
and 48M of EDO RAM. This one would generate sig 11 on kernel compiles
no matter what I did, even after replacing all the RAM chips. Sometimes
installing Red Hat on it would simply fail, with the install scripts
choking to death.

There are three major factors here: the board, the CPU and the RAM
chips. Replacing the RAM chips did not solve the problem, and replacing
everything is not the most economical option.

Solutions:

Make boot floppies on another machine, and compile kernels on another
machine for the proper target. Try other distributions as well, like
Mandrake, SuSE or Debian. With that, this 430VX board installed Linux and
has run it just fine as a firewall and caching proxy/DNS machine for six
months already, with no software-related problems; the only trouble is that
the CMOS battery is losing charge and the old hard disks are acting up.
This baby is five years old.

If the kernel somehow crashed on this 430VX machine, I wouldn't complain
about Linux, nor about any distribution company. The fact is that it did
generate sig 11 on kernel compiles, which doesn't happen on another machine.

If a looping kernel compile test fails, it might corrupt your hard
disk. For a "safer" test, try looping a Quake demo for hours; the bigger
the demo, the better. It does not matter whether you are running Linux or
Windows: a machine with problems will hang after just a few loops.

PS: a hardware memory tester might help, but in the end it does not test
the machine as a whole.

2000-11-22 21:07:08

by David Riley

Subject: Re: Defective Red Hat Distribution poorly represents Linux

Jeff Epler wrote:
> Well, a copy of that document *is* the first hit for a google search on
> 'linux signal 11 faq'
> http://www.google.com/search?q=linux+signal+11+faq
>
> In other words, someone who does the slightest bit of research will
> find the answer.

Perhaps, but if a new user starts using linux and his/her machine is
randomly crashing (not always showing the number 11 anywhere in the
error messages, mind you) the first thing they look for won't be "linux
signal 11 faq". They'd look for something like "random linux crashes"
or "constant linux crashes" or something to that effect. Try these on
for size...

<http://www.google.com/search?q=random+linux+crashes>
This one goes six entries before it even comes upon a similar hardware
problem (though to be fair, the report of this problem was far more
intelligent than the one that started this thread) and that is full of
stack traces and cryptic things that a newbie wouldn't even pretend to
understand. A few years ago, I would have run away screaming from that report.

<http://www.google.com/search?q=constant+linux+crashes>
The first link from this search points to a forum on linuxsucks.org.
Not what we want newbies looking at... Some of the posts on the forum
bemoan the lack of documentation for linux.

I think the "slightest bit of research" is a lot different for
experienced Linux users than for those who come from Windoze or MacOS.
Someone suggested to me that one could put such info on the default page
of the browser in the distribution (the one on the local disk in the case of
RedHat, etc), perhaps in the "troubleshooting" section. That sounds
like a good idea to me.

It also occurs to me that a discussion of documentation belongs on
another list unless it pertains to kernel documentation. I'll try to
make this my last post.

2000-11-23 01:04:46

by Richard.Reynolds

Subject: Re: Was:Defective Red Hat Distribution poorly represents Linux, running with failed hardware?

To change the topic a bit.

Just an interesting thought; I realize that for every pro there is a
con. But what about implementing some kind of background "process" (for
lack of a better word right now), probably in a duplicate copy
of the current kernel, that checks system memory and redundantly
processes operations known to be problematic? I have not been keeping up
to date on the kernel, but I imagine there is a memory map in which bad
areas either are or could be marked as bad.

While testing this at startup would not seem to be advisable, for many
reasons, including that some PCs are not restarted often enough to catch
such errors in time, I think the table could be saved and reloaded on
startup. There would also be the need to maintain this table, but I don't
see that as being unreasonable.
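
As a rough user-space illustration of that idea (a background check whose
results are saved and reloaded), here is a minimal Python sketch. It is not a
kernel mechanism: the function names, the JSON file and the buffer size are
all assumptions, and a Python buffer is not tied to any particular physical
memory, so on a healthy machine the test simply finds nothing.

```python
import json
import os
import tempfile


def find_bad_offsets(buffer, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern over the buffer, read it back, collect mismatches."""
    bad = set()
    for pattern in patterns:
        for i in range(len(buffer)):
            buffer[i] = pattern
        bad.update(i for i, value in enumerate(buffer) if value != pattern)
    return sorted(bad)


def save_table(path, offsets):
    """Persist the table of bad offsets so it survives a restart."""
    with open(path, "w") as handle:
        json.dump(sorted(offsets), handle)


def load_table(path):
    """Reload a previously saved table; an absent file means nothing known."""
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        return json.load(handle)


# Hypothetical table location, placed in a fresh temporary directory.
table_path = os.path.join(tempfile.mkdtemp(), "badmem_table.json")
known_bad = set(load_table(table_path))               # "reload on startup"
known_bad.update(find_bad_offsets(bytearray(4096)))   # "background check"
save_table(table_path, known_bad)                     # "maintain the table"
print("known bad offsets:", load_table(table_path))
```

A table like this can only grow from errors the check happens to catch, so it
covers repeatable bad cells and not intermittent faults.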

It only interests me in that I enjoy the stability of my Linux box
vs. any of my Win98/NT/2k boxes, and as someone who uses Linux I would
think such a kernel would be of interest in more quasi-mission-critical
installations.

Just my 2 cents
Richard Reynolds
[email protected]


On Tue, 21 Nov 2000, David Riley wrote:

> Jeff Epler wrote:
> >
> > On Tue, Nov 21, 2000 at 04:08:26PM -0500, David Riley wrote:
> > > Windoze is not the only OS to handle bad hardware better than Linux. On
> > > my Mac, I had a bad DIMM that worked fine on the MacOS side, but kept
> > > causing random bus-type errors in Linux. Same as when I accidentally
> > > (long story) overclocked the bus on the CPU. I think that more
> > > tolerance for faulty hardware (more than just poorly programmed BIOS or
> > > chipsets with known bugs) is something that might be worth looking into.
> >
> > And how do you propose to do that?
> >
> > For instance, in some other operating systems having the top bit flip
> > in a pointer will cause silent use of incorrect data. On Linux, this
> > will cause a signal 11. Which do you prefer, bad results or an error
> > message?
> >
> > Can you suggest a specific way in which Linux can react correctly to
> > e.g. flipped bits in RAM or cache which cannot be detected at the hardware
> > level? Or maybe tell me how Linux can react correctly when an overclocked
> > CPU starts producing incorrect results for right shifts once every few
> > thousand instructions?
>
> Hmm... Good point. That would be hard to do. On that note, there
> should be some prominent note on things like user manuals (though Linux
> users shouldn't need *manuals* :-) that notes that common crashes like
> signal 11 or "cc: internal failure" messages are generally caused by
> hardware problems. That sort of thing would keep spurious complaints
> and error messages from inappropriate boards like this and on newbie
> boards where they belong (I'm not saying it was a bad complaint, but
> generally questions like "Why does RH 6.2, known to be stable on
> thousands of machines, not install of this machine where NT worked
> before?" belong on newbie boards and not as a flame of RedHat on the
> kernel board). Unfortunately, most people who get these error messages
> don't read the manuals. Besides, where would you put it in a manual? I
> know that error codes are a great mystery among us on the MacOS (even
> those of us that have been using it for 16 years only know that Error 11
> is usually hardware and [1|2|3] are software) since they aren't really
> clearly and understandably documented in prominent user-land documentation.
>
> By the way, I have no idea how to implement a suggestion like I had.
> That's why I posted here. If I had a clue how to do that any better
> than a useless, inefficient kludge, I would have done it myself and
> submitted a patch. As much as I like the "do it yourself" model of
> development here, I think a lot of people might appreciate it if more
> experienced coders wouldn't jump down the throats of people who suggest
> a feature they can't do themselves yet. I speak for myself, but I don't
> think I'll find a dearth of support for that opinion.
>
> Thanks,
> David