2000-12-23 12:01:22

by Andreas Franck

Subject: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

Hello,

I hope I am not doing something particularly stupid here, but as Linus
encouraged curious people to try compiling the kernel with the
latest gcc snapshots, I have tried, as I did several weeks ago, but again
in vain.

Ever since I started trying, the same error has bitten me on early boot
(just after "Starting kswapd v1.8" appears on the screen) whenever I
compiled the kernel with a recent gcc snapshot. This has been the case since
at least 2.4.0-test11, with gcc snapshots from 2 months ago until yesterday.

The ksymoops output is attached here, and I hope it will help. I tried
to narrow it down by myself a bit, and ended in kernel/sched.c:
__wake_up_common, where my understanding of the code came to a sudden
end, so I hope some gurus here will be able to figure out what's wrong.

All (?) relevant output should be found below. If anything important
is missing, I am willing to provide any further information later on.

I don't know whether this also happens when the kernel is compiled for
something less than a Pentium II; that is the only target I have tried
(the system is a PII-266 with 160MB RAM on an Intel 430LX motherboard).

With gcc version 2.95.2 20000220 (Debian GNU/Linux) everything works
perfectly fine.

Thanks for any advice and happy hacking!
Andreas

Here comes all important info:
---snip---

ksymoops 2.3.5 on i686 2.4.0-test12. Options used
-V (default)
-K (specified)
-l /proc/modules (default)
-o /lib/modules/2.4.0-test13-pre4/ (specified)
-m /usr/src/linux/System.map (specified)

No modules in ksyms, skipping objects
No ksyms, skipping lsmod
Unable to handle kernel paging request at virtual address fffffe4c
c0114e9d
*pde = 00001063
Oops: 0000
CPU: 0
EIP: 0010:[<c0114e9d>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010097
eax: c40effb8 ebx: c3585a59 ecx: fffffe4c edx: 00000000
esi: c0107b0c edi: fff9ffff ebp: c12b9fc8 esp: c12b9fa4
ds: 0018 es: 0018 ss: 0018
Process kupdate (pid 6, stackpage=c12b900)
Stack: 00000000 00000246 00000000 c40effb8 00000001 00000003 c12b8000 ffffffff
fff9ffff c12b8000 c0107b38 c40effac c12b8550 00000000 c01f896f 00010f00
c40eff74 00105000 0008e000 c0107486 c40effac c0137900 00000000
Call Trace: [<fff9ffff>] [<c0107b38>] [<c01f896f>] [<c0105000>] [<c0107486>]
[<c0137900>]
Code: 8b 01 85 45 f0 74 ec 8b 7d dc 85 ff 74 79 8b 45 ec 8b 16 21

>>EIP; c0114e9d <__wake_up+5d/140> <=====
Trace; fff9ffff <END_OF_CODE+3fcfcb6f/????>
Trace; c0107b38 <__up_wakeup+8/c>
Trace; c01f896f <stext_lock+447/4258>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c0107486 <kernel_thread+26/30>
Trace; c0137900 <kupdate+0/f0>
Code; c0114e9d <__wake_up+5d/140>
00000000 <_EIP>:
Code; c0114e9d <__wake_up+5d/140> <=====
0: 8b 01 mov (%ecx),%eax <=====
Code; c0114e9f <__wake_up+5f/140>
2: 85 45 f0 test %eax,0xfffffff0(%ebp)
Code; c0114ea2 <__wake_up+62/140>
5: 74 ec je fffffff3 <_EIP+0xfffffff3> c0114e90 <__wake_up+50/140>
Code; c0114ea4 <__wake_up+64/140>
7: 8b 7d dc mov 0xffffffdc(%ebp),%edi
Code; c0114ea7 <__wake_up+67/140>
a: 85 ff test %edi,%edi
Code; c0114ea9 <__wake_up+69/140>
c: 74 79 je 87 <_EIP+0x87> c0114f24 <__wake_up+e4/140>
Code; c0114eab <__wake_up+6b/140>
e: 8b 45 ec mov 0xffffffec(%ebp),%eax
Code; c0114eae <__wake_up+6e/140>
11: 8b 16 mov (%esi),%edx
Code; c0114eb0 <__wake_up+70/140>
13: 21 00 and %eax,(%eax)
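
As a reading aid for the trace above (not part of the original report), here
is a minimal sketch of two pieces of arithmetic it invites: reading the
faulting address in ecx as a signed 32-bit offset, and recovering the start of
__wake_up from the "symbol+offset" notation. The helper names signed_offset32
and symbol_base are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a 32-bit address as a signed offset from zero (two's complement). */
int64_t signed_offset32(uint32_t addr)
{
    if (addr >= UINT32_C(0x80000000))
        return (int64_t)addr - INT64_C(0x100000000);
    return (int64_t)addr;
}

/* Recover a function's start address from the "symbol+offset" form
 * that ksymoops prints, e.g. <__wake_up+5d/140>. */
uint32_t symbol_base(uint32_t eip, uint32_t offset)
{
    assert(offset <= eip);
    return eip - offset;
}
```

With the values from the oops, signed_offset32(0xfffffe4c) is -436 (0x1b4
below zero) and symbol_base(0xc0114e9d, 0x5d) is 0xc0114e40. A small negative
address like this may hint at a pointer derived from a null or bogus base,
though nothing in the report confirms that.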

gcc snapshot version:

Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/2.97/specs
Configured with: ../gcc/configure --prefix=/usr --enable-shared
--enable-threads
gcc version 2.97 20001222 (experimental)


My .config:

#
# Automatically generated by make menuconfig: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
CONFIG_M686=y
# CONFIG_M686FXSR is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_TOSHIBA is not set
CONFIG_MICROCODE=m
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_SMP is not set
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y

#
# General setup
#
CONFIG_NET=y
# CONFIG_VISWS is not set
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
CONFIG_HOTPLUG=y

#
# PCMCIA/CardBus support
#
# CONFIG_PCMCIA is not set
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
# CONFIG_KCORE_AOUT is not set
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=m
CONFIG_PM=y
CONFIG_ACPI=y
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
CONFIG_APM_DO_ENABLE=y
CONFIG_APM_CPU_IDLE=y
# CONFIG_APM_DISPLAY_BLANK is not set
CONFIG_APM_RTC_IS_GMT=y
# CONFIG_APM_ALLOW_INTS is not set
# CONFIG_APM_REAL_MODE_POWER_OFF is not set

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_PC_FIFO=y
CONFIG_PARPORT_PC_SUPERIO=y
# CONFIG_PARPORT_AMIGA is not set
# CONFIG_PARPORT_MFC3 is not set
# CONFIG_PARPORT_ATARI is not set
# CONFIG_PARPORT_SUNBPP is not set
# CONFIG_PARPORT_OTHER is not set
CONFIG_PARPORT_1284=y

#
# Plug and Play configuration
#
CONFIG_PNP=m
CONFIG_ISAPNP=m

#
# Block devices
#
CONFIG_BLK_DEV_FD=m
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_RAM=m
CONFIG_BLK_DEV_RAM_SIZE=4096
# CONFIG_BLK_DEV_INITRD is not set

#
# Multi-device support (RAID and LVM)
#
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_LINEAR=y
CONFIG_MD_RAID0=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID5=y
CONFIG_MD_BOOT=y
CONFIG_AUTODETECT_RAID=y
CONFIG_BLK_DEV_LVM=y
CONFIG_LVM_PROC_FS=y

#
# Networking options
#
CONFIG_PACKET=m
CONFIG_PACKET_MMAP=y
CONFIG_NETLINK=y
CONFIG_RTNETLINK=y
CONFIG_NETLINK_DEV=m
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
# CONFIG_FILTER is not set
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_INET_ECN is not set
CONFIG_SYN_COOKIES=y

#
# IP: Netfilter Configuration
#
# CONFIG_IP_NF_CONNTRACK is not set
# CONFIG_IP_NF_QUEUE is not set
# CONFIG_IP_NF_IPTABLES is not set
# CONFIG_IP_NF_COMPAT_IPCHAINS is not set
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
# CONFIG_IPV6 is not set
CONFIG_KHTTPD=m
# CONFIG_ATM is not set
CONFIG_IPX=m
CONFIG_IPX_INTERN=y
CONFIG_ATALK=m
# CONFIG_DECNET is not set
# CONFIG_BRIDGE is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_LLC is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_FASTROUTE is not set
# CONFIG_NET_HW_FLOWCONTROL is not set

#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set
# CONFIG_PHONE_IXJ is not set

#
# ATA/IDE/MFM/RLL support
#
CONFIG_IDE=y

#
# IDE, ATA and ATAPI Block devices
#
CONFIG_BLK_DEV_IDE=y
# CONFIG_BLK_DEV_HD_IDE is not set
# CONFIG_BLK_DEV_HD is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
# CONFIG_BLK_DEV_IDEDISK_VENDOR is not set
# CONFIG_BLK_DEV_IDEDISK_FUJITSU is not set
# CONFIG_BLK_DEV_IDEDISK_IBM is not set
# CONFIG_BLK_DEV_IDEDISK_MAXTOR is not set
# CONFIG_BLK_DEV_IDEDISK_QUANTUM is not set
# CONFIG_BLK_DEV_IDEDISK_SEAGATE is not set
# CONFIG_BLK_DEV_IDEDISK_WD is not set
# CONFIG_BLK_DEV_COMMERIAL is not set
# CONFIG_BLK_DEV_TIVO is not set
# CONFIG_BLK_DEV_IDECS is not set
CONFIG_BLK_DEV_IDECD=m
CONFIG_BLK_DEV_IDETAPE=m
CONFIG_BLK_DEV_IDEFLOPPY=m
CONFIG_BLK_DEV_IDESCSI=m
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
# CONFIG_BLK_DEV_ISAPNP is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_IDEDMA_PCI_AUTO=y
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_PCI_WIP is not set
# CONFIG_IDEDMA_NEW_DRIVE_LISTINGS is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_AEC62XX_TUNING is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_WDC_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD7409 is not set
# CONFIG_AMD7409_OVERRIDE is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_HPT34X_AUTODMA is not set
# CONFIG_BLK_DEV_HPT366 is not set
CONFIG_BLK_DEV_PIIX=y
CONFIG_PIIX_TUNING=y
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_PDC202XX is not set
# CONFIG_PDC202XX_BURST is not set
# CONFIG_BLK_DEV_OSB4 is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_IDEDMA_AUTO=y
CONFIG_IDEDMA_IVB=y
# CONFIG_DMA_NONPCI is not set
CONFIG_BLK_DEV_IDE_MODES=y

#
# SCSI support
#
CONFIG_SCSI=m
CONFIG_BLK_DEV_SD=m
CONFIG_SD_EXTRA_DEVS=40
CONFIG_CHR_DEV_ST=m
CONFIG_BLK_DEV_SR=m
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_SR_EXTRA_DEVS=2
CONFIG_CHR_DEV_SG=m
CONFIG_SCSI_DEBUG_QUEUES=y
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y

#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
CONFIG_SCSI_AHA152X=m
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AHA1740 is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_MEGARAID is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_CPQFCTS is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_DMA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_NCR53C7xx is not set
# CONFIG_SCSI_NCR53C8XX is not set
# CONFIG_SCSI_SYM53C8XX is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PCI2000 is not set
# CONFIG_SCSI_PCI2220I is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_SEAGATE is not set
# CONFIG_SCSI_SIM710 is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_DEBUG is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set

#
# I2O device support
#
# CONFIG_I2O is not set
# CONFIG_I2O_PCI is not set
# CONFIG_I2O_BLOCK is not set
# CONFIG_I2O_LAN is not set
# CONFIG_I2O_SCSI is not set
# CONFIG_I2O_PROC is not set

#
# Network device support
#
CONFIG_NETDEVICES=y

#
# ARCnet devices
#
# CONFIG_ARCNET is not set

#
# Appletalk devices
#
# CONFIG_APPLETALK is not set
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_ETHERTAP is not set
# CONFIG_NET_SB1000 is not set

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_LANCE is not set
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_NET_VENDOR_RACAL is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
# CONFIG_NET_ISA is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
# CONFIG_APRICOT is not set
# CONFIG_CS89x0 is not set
# CONFIG_TULIP is not set
# CONFIG_DE4X5 is not set
# CONFIG_DGRS is not set
# CONFIG_DM9102 is not set
# CONFIG_EEPRO100 is not set
# CONFIG_EEPRO100_PM is not set
# CONFIG_LNE390 is not set
# CONFIG_NATSEMI is not set
CONFIG_NE2K_PCI=m
# CONFIG_NE3210 is not set
# CONFIG_ES3210 is not set
# CONFIG_8139TOO is not set
# CONFIG_RTL8129 is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_WINBOND_840 is not set
# CONFIG_HAPPYMEAL is not set
# CONFIG_NET_POCKET is not set

#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_SK98LIN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PLIP=m
CONFIG_PPP=m
# CONFIG_PPP_MULTILINK is not set
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPPOE=m
CONFIG_SLIP=m
CONFIG_SLIP_COMPRESSED=y
# CONFIG_SLIP_SMART is not set
# CONFIG_SLIP_MODE_SLIP6 is not set

#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set

#
# Token Ring devices
#
# CONFIG_TR is not set
# CONFIG_NET_FC is not set
# CONFIG_RCPCI is not set
# CONFIG_SHAPER is not set

#
# Wan interfaces
#
# CONFIG_WAN is not set

#
# Amateur Radio support
#
CONFIG_HAMRADIO=y
CONFIG_AX25=m
CONFIG_AX25_DAMA_SLAVE=y
CONFIG_NETROM=m
CONFIG_ROSE=m

#
# AX.25 network device drivers
#
CONFIG_MKISS=m
CONFIG_6PACK=m
CONFIG_BPQETHER=m
CONFIG_DMASCC=m
CONFIG_SCC=m
# CONFIG_SCC_DELAY is not set
# CONFIG_SCC_TRXECHO is not set
CONFIG_BAYCOM_SER_FDX=m
CONFIG_BAYCOM_SER_HDX=m
CONFIG_BAYCOM_PAR=m
CONFIG_BAYCOM_EPP=m
CONFIG_SOUNDMODEM=m
CONFIG_SOUNDMODEM_SBC=y
CONFIG_SOUNDMODEM_WSS=y
CONFIG_SOUNDMODEM_AFSK1200=y
CONFIG_SOUNDMODEM_AFSK2400_7=y
CONFIG_SOUNDMODEM_AFSK2400_8=y
CONFIG_SOUNDMODEM_AFSK2666=y
CONFIG_SOUNDMODEM_HAPN4800=y
CONFIG_SOUNDMODEM_PSK4800=y
CONFIG_SOUNDMODEM_FSK9600=y
CONFIG_YAM=m

#
# IrDA (infrared) support
#
# CONFIG_IRDA is not set

#
# ISDN subsystem
#
# CONFIG_ISDN is not set

#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set

#
# Input core support
#
CONFIG_INPUT=m
CONFIG_INPUT_KEYBDEV=m
CONFIG_INPUT_MOUSEDEV=m
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
CONFIG_INPUT_EVDEV=m

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=m
# CONFIG_SERIAL_EXTENDED is not set
# CONFIG_SERIAL_NONSTANDARD is not set
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256
CONFIG_PRINTER=m
CONFIG_LP_CONSOLE=y
CONFIG_PPDEV=m

#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_ALGOBIT=m
# CONFIG_I2C_PHILIPSPAR is not set
# CONFIG_I2C_ELV is not set
# CONFIG_I2C_VELLEMAN is not set
# CONFIG_I2C_ALGOPCF is not set
CONFIG_I2C_CHARDEV=m

#
# Mice
#
# CONFIG_BUSMOUSE is not set
CONFIG_MOUSE=m
CONFIG_PSMOUSE=y
# CONFIG_82C710_MOUSE is not set
# CONFIG_PC110_PAD is not set

#
# Joysticks
#
# CONFIG_JOYSTICK is not set
# CONFIG_QIC02_TAPE is not set

#
# Watchdog Cards
#
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
CONFIG_SOFT_WATCHDOG=m
CONFIG_WDT=m
CONFIG_WDTPCI=m
CONFIG_WDT_501=y
CONFIG_WDT_501_FAN=y
CONFIG_PCWATCHDOG=m
CONFIG_ACQUIRE_WDT=m
CONFIG_60XX_WDT=m
CONFIG_MIXCOMWD=m
CONFIG_I810_TCO=m
# CONFIG_INTEL_RNG is not set
CONFIG_NVRAM=m
CONFIG_RTC=m
CONFIG_DTLK=m
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set

#
# Ftape, the floppy tape device driver
#
CONFIG_FTAPE=m
CONFIG_ZFTAPE=m
CONFIG_ZFT_DFLT_BLK_SZ=10240
CONFIG_ZFT_COMPRESSOR=m
CONFIG_FT_NR_BUFFERS=3
CONFIG_FT_PROC_FS=y
CONFIG_FT_NORMAL_DEBUG=y
# CONFIG_FT_FULL_DEBUG is not set
# CONFIG_FT_NO_TRACE is not set
# CONFIG_FT_NO_TRACE_AT_ALL is not set
CONFIG_FT_STD_FDC=y
# CONFIG_FT_MACH2 is not set
# CONFIG_FT_PROBE_FC10 is not set
# CONFIG_FT_ALT_FDC is not set
CONFIG_FT_FDC_THR=8
CONFIG_FT_FDC_MAX_RATE=2000
CONFIG_FT_ALPHA_CLOCK=0
CONFIG_AGP=m
CONFIG_AGP_INTEL=y
CONFIG_AGP_I810=y
CONFIG_AGP_VIA=y
CONFIG_AGP_AMD=y
CONFIG_AGP_SIS=y
CONFIG_AGP_ALI=y
CONFIG_DRM=y
CONFIG_DRM_TDFX=m
CONFIG_DRM_GAMMA=m
CONFIG_DRM_R128=m
CONFIG_DRM_I810=m
CONFIG_DRM_MGA=m

#
# Multimedia devices
#
CONFIG_VIDEO_DEV=m

#
# Video For Linux
#
CONFIG_VIDEO_PROC_FS=y
# CONFIG_I2C_PARPORT is not set
CONFIG_VIDEO_BT848=m
# CONFIG_VIDEO_PMS is not set
# CONFIG_VIDEO_BWQCAM is not set
# CONFIG_VIDEO_CQCAM is not set
# CONFIG_VIDEO_CPIA is not set
# CONFIG_VIDEO_SAA5249 is not set
# CONFIG_TUNER_3036 is not set
# CONFIG_VIDEO_STRADIS is not set
# CONFIG_VIDEO_ZORAN is not set
# CONFIG_VIDEO_BUZ is not set
# CONFIG_VIDEO_ZR36120 is not set

#
# Radio Adapters
#
# CONFIG_RADIO_CADET is not set
# CONFIG_RADIO_RTRACK is not set
# CONFIG_RADIO_RTRACK2 is not set
# CONFIG_RADIO_AZTECH is not set
# CONFIG_RADIO_GEMTEK is not set
# CONFIG_RADIO_MAESTRO is not set
# CONFIG_RADIO_MIROPCM20 is not set
# CONFIG_RADIO_SF16FMI is not set
# CONFIG_RADIO_TERRATEC is not set
# CONFIG_RADIO_TRUST is not set
# CONFIG_RADIO_TYPHOON is not set
# CONFIG_RADIO_ZOLTRIX is not set

#
# File systems
#
CONFIG_QUOTA=y
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=m
CONFIG_ADFS_FS=m
# CONFIG_ADFS_FS_RW is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_BFS_FS is not set
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_UMSDOS_FS=m
CONFIG_VFAT_FS=m
# CONFIG_EFS_FS is not set
# CONFIG_JFFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_RAMFS is not set
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_MINIX_FS=m
# CONFIG_NTFS_FS is not set
# CONFIG_NTFS_RW is not set
# CONFIG_HPFS_FS is not set
CONFIG_PROC_FS=y
CONFIG_DEVFS_FS=y
# CONFIG_DEVFS_MOUNT is not set
# CONFIG_DEVFS_DEBUG is not set
CONFIG_DEVPTS_FS=y
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX4FS_RW is not set
# CONFIG_ROMFS_FS is not set
CONFIG_EXT2_FS=y
# CONFIG_SYSV_FS is not set
# CONFIG_SYSV_FS_WRITE is not set
CONFIG_UDF_FS=m
# CONFIG_UDF_RW is not set
# CONFIG_UFS_FS is not set
# CONFIG_UFS_FS_WRITE is not set

#
# Network File Systems
#
CONFIG_CODA_FS=m
CONFIG_NFS_FS=m
# CONFIG_NFS_V3 is not set
# CONFIG_ROOT_NFS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_SUNRPC=m
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_SMB_FS=m
# CONFIG_SMB_NLS_DEFAULT is not set
CONFIG_NCP_FS=m
CONFIG_NCPFS_PACKET_SIGNING=y
# CONFIG_NCPFS_IOCTL_LOCKING is not set
CONFIG_NCPFS_STRONG=y
CONFIG_NCPFS_NFS_NS=y
CONFIG_NCPFS_OS2_NS=y
# CONFIG_NCPFS_SMALLDOS is not set
CONFIG_NCPFS_NLS=y
CONFIG_NCPFS_EXTRAS=y

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
# CONFIG_BSD_DISKLABEL is not set
# CONFIG_SOLARIS_X86_PARTITION is not set
# CONFIG_UNIXWARE_DISKLABEL is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
CONFIG_NLS=y

#
# Native Language Support
#
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_UTF8=m

#
# Console drivers
#
CONFIG_VGA_CONSOLE=y
CONFIG_VIDEO_SELECT=y
# CONFIG_MDA_CONSOLE is not set

#
# Frame-buffer support
#
CONFIG_FB=y
CONFIG_DUMMY_CONSOLE=y
# CONFIG_FB_RIVA is not set
# CONFIG_FB_CLGEN is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_VESA is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_HGA is not set
CONFIG_VIDEO_SELECT=y
CONFIG_FB_MATROX=y
# CONFIG_FB_MATROX_MILLENIUM is not set
# CONFIG_FB_MATROX_MYSTIQUE is not set
CONFIG_FB_MATROX_G100=y
CONFIG_FB_MATROX_I2C=m
CONFIG_FB_MATROX_MAVEN=m
# CONFIG_FB_MATROX_G450 is not set
# CONFIG_FB_MATROX_MULTIHEAD is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FBCON_ADVANCED is not set
CONFIG_FBCON_CFB8=y
CONFIG_FBCON_CFB16=y
CONFIG_FBCON_CFB24=y
CONFIG_FBCON_CFB32=y
# CONFIG_FBCON_FONTWIDTH8_ONLY is not set
# CONFIG_FBCON_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y

#
# Sound
#
CONFIG_SOUND=m
# CONFIG_SOUND_CMPCI is not set
# CONFIG_SOUND_EMU10K1 is not set
# CONFIG_SOUND_FUSION is not set
# CONFIG_SOUND_CS4281 is not set
# CONFIG_SOUND_ES1370 is not set
# CONFIG_SOUND_ES1371 is not set
# CONFIG_SOUND_ESSSOLO1 is not set
# CONFIG_SOUND_MAESTRO is not set
# CONFIG_SOUND_SONICVIBES is not set
# CONFIG_SOUND_TRIDENT is not set
# CONFIG_SOUND_MSNDCLAS is not set
# CONFIG_SOUND_MSNDPIN is not set
# CONFIG_SOUND_VIA82CXXX is not set
CONFIG_SOUND_OSS=m
# CONFIG_SOUND_TRACEINIT is not set
CONFIG_SOUND_DMAP=y
# CONFIG_SOUND_AD1816 is not set
# CONFIG_SOUND_SGALAXY is not set
# CONFIG_SOUND_ADLIB is not set
# CONFIG_SOUND_ACI_MIXER is not set
# CONFIG_SOUND_CS4232 is not set
# CONFIG_SOUND_SSCAPE is not set
# CONFIG_SOUND_GUS is not set
# CONFIG_SOUND_ICH is not set
# CONFIG_SOUND_VMIDI is not set
# CONFIG_SOUND_TRIX is not set
# CONFIG_SOUND_MSS is not set
# CONFIG_SOUND_MPU401 is not set
# CONFIG_SOUND_NM256 is not set
# CONFIG_SOUND_MAD16 is not set
# CONFIG_SOUND_PAS is not set
# CONFIG_PAS_JOYSTICK is not set
# CONFIG_SOUND_PSS is not set
CONFIG_SOUND_SB=m
# CONFIG_SOUND_AWE32_SYNTH is not set
# CONFIG_SOUND_WAVEFRONT is not set
# CONFIG_SOUND_MAUI is not set
CONFIG_SOUND_YM3812=m
# CONFIG_SOUND_OPL3SA1 is not set
# CONFIG_SOUND_OPL3SA2 is not set
# CONFIG_SOUND_YMPCI is not set
# CONFIG_SOUND_YMFPCI is not set
# CONFIG_SOUND_UART6850 is not set
# CONFIG_SOUND_AEDSP16 is not set
CONFIG_SOUND_TVMIXER=m

#
# USB support
#
# CONFIG_USB is not set

#
# Kernel hacking
#
CONFIG_MAGIC_SYSRQ=y


ver_linux output (from running kernel, for sure):

Versions installed: (if some fields are empty or look unusual then possibly
you have very old versions)
Linux dg1kfa.ampr.org 2.4.0-test12 #1 Sat Dec 16 02:39:07 CET 2000 i686
unknown
Kernel modules 2.3.21
Gnu C 2.97
Gnu Make 3.78.1
Binutils 2.10.0.26
Linux C Library > libc.2.2
Dynamic linker ldd (GNU libc) 2.2
Linux C++ Library 3.0.0
Procps 2.0.6
Mount 2.10f
Net-tools 2.05
Console-tools 0.2.3
Sh-utils 2.0
Modules Loaded snd-mixer-oss snd-mixer snd soundcore mga autofs4 rtc
floppy serial isa-pnp agpgart sr_mod cdrom ide-scsi scsi_mod ide-floppy

---snip---

--
->>>----------------------- Andreas Franck --------<<<-
---<<<---- [email protected] --->>>---
->>>---- Keep smiling! ----------------------------<<<-


2000-12-23 18:04:07

by Mike Galbraith

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

On Sat, 23 Dec 2000, Andreas Franck wrote:

> Hello,
>
> I hope I am not doing something particularly stupid here, but as Linus
> encouraged curious people to try compiling the kernel with the
> latest gcc snapshots, I have tried - as several weeks before, but again
> in vain.
>
> Since I have tried, the same following error on early boot (just after
> "Starting kswapd v1.8" appears on the screen) has bitten me, when I
> compiled the kernel with a recent gcc snapshot. This was for at least
> 2.4.0-test11 with gcc snapshots from 2 months ago till yesterday.

Hi,

I had the same, with the last few snapshots I tried, but 20001218 seems
to work ok.
dmesg|head -1
Linux version 2.4.0-test13ikd (root@el-kaboom) (gcc version gcc-2.97 20001218 (experimental)) #18 Sat Dec 23 17:43:29 CET 2000

-Mike

2000-12-23 19:41:52

by Andreas Franck

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

Hi Mike, hello linux-kernel audience,

> I had the same, with the last few snapshots I tried, but 20001218 seems
> to work ok.
> dmesg|head -1
> Linux version 2.4.0-test13ikd (root@el-kaboom) (gcc version gcc-2.97
> 20001218 (experimental)) #18 Sat Dec 23 17:43:29 CET 2000

Hmm, would have been nice, but it crashes here with 20001222, nevertheless.
For which CPU do you have your kernel configured? It might be a CPU specific
issue, I'll try to compile for Pentium I and 486, now, and report my results.

It would also be nice to know if this is a gcc issue or a kernel issue - if I
knew which precise file was responsible for the crash, I could compare the
assembly output for stable and snapshot GCC. My suspect is kernel/sched.c,
but this might be wrong, as the story begins on the launch of kupdate in
fs/buffer.c.

But now I have almost no clue what really goes wrong.

Greetings and a merry Christmas to everybody!
Andreas

--
->>>----------------------- Andreas Franck --------<<<-
---<<<---- [email protected] --->>>---
->>>---- Keep smiling! ----------------------------<<<-

2000-12-24 00:48:38

by Andreas Franck

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

The story continues, citing myself:

> Hmm, would have been nice, but it crashes here with 20001222, nevertheless.
> For which CPU do you have your kernel configured? It might be a CPU
> specific issue, I'll try to compile for Pentium I and 486, now, and report
> my results.

It does not seem CPU specific, breaks for both 486 and Pentium with the same
error.

> It would also be nice to know if this is a gcc issue or a kernel issue - if
> I knew which precise file was responsible for the crash, I could compare
> the assembly output for stable and snapshot GCC. My suspect is
> kernel/sched.c, but this might be wrong, as the story begins on the launch
> of kupdate in fs/buffer.c.

And this is where everything seems to go wrong: When I compile buffer.c with
2.95.2, and link everything together, the kernel magically boots without any
complaints; later on something starts crashing badly, but this might be other
issues that can be investigated later on.

> But now I have almost no clue what really goes wrong
... and now I have a bit more, and the suspicion that something broke the
way in which the kernel_thread function (arch/i386/kernel/process.c) wants to
start the kernel threads, here bdflush and kupdate. I don't understand all
the issues completely, but something seems to have changed.

Attached are the relevant (?) portions of the assembly output for buffer.c:
kupdate, bdflush and bdflush_init, compiled with 2.95.2 and 2.97,
respectively. Perhaps someone could look over it?

Thanks and happy hacking,
Andreas

--
->>>----------------------- Andreas Franck --------<<<-
---<<<---- [email protected] --->>>---
->>>---- Keep smiling! ----------------------------<<<-


Attachments:
buffer-2.95.2.S (4.05 kB)
buffer-2.97.S (4.17 kB)

2000-12-24 16:46:54

by Mike Galbraith

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

On Sat, 23 Dec 2000, Andreas Franck wrote:

> Hi Mike, hello linux-kernel audience,
>
> > I had the same, with the last few snapshots I tried, but 20001218 seems
> > to work ok.
> > dmesg|head -1
> > Linux version 2.4.0-test13ikd (root@el-kaboom) (gcc version gcc-2.97
> > 20001218 (experimental)) #18 Sat Dec 23 17:43:29 CET 2000
>
> Hmm, would have been nice, but it crashes here with 20001222, nevertheless.
> For which CPU do you have your kernel configured? It might be a CPU specific
> issue, I'll try to compile for Pentium I and 486, now, and report my results.

Yes, hmm indeed. Try these two things.

1. make DECLARE_MUTEX_LOCKED(sem) in bdflush_init() static.
2. compile with frame pointers. (normal case for IKD)

My IKD tree works with either option, but not with neither. I haven't
figured out why yet.
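
For readers without the kernel source at hand, here is a user-space sketch of
the kind of startup handshake option 1 touches. It is an analogue under stated
assumptions, not the kernel code: POSIX threads and sem_t stand in for
kernel_thread() and the kernel semaphore, and worker and start_worker are
hypothetical names. The only thing the use_static flag changes is where the
semaphore lives (on the caller's stack or in static storage), which is the
distinction the suggestion above is about.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel thread such as bdflush or kupdate. */
static void *worker(void *arg)
{
    sem_t *startup = arg;

    assert(startup != NULL);
    /* ... per-thread initialisation would happen here ... */
    sem_post(startup);          /* analogue of up(): tell the parent we run */
    return NULL;
}

/* Start one worker and block until it has signalled.
 * use_static != 0 keeps the semaphore in static storage,
 * use_static == 0 keeps it on this function's stack. */
int start_worker(int use_static)
{
    static sem_t static_sem;
    sem_t stack_sem;
    sem_t *sem = use_static ? &static_sem : &stack_sem;
    pthread_t tid;

    if (sem_init(sem, 0, 0) != 0)       /* count 0: created "locked" */
        return -1;
    if (pthread_create(&tid, NULL, worker, sem) != 0) {
        sem_destroy(sem);
        return -1;
    }
    /* analogue of down(): sleep until the worker posts; retry on EINTR */
    while (sem_wait(sem) == -1 && errno == EINTR)
        continue;
    pthread_join(tid, NULL);
    sem_destroy(sem);
    return 0;
}
```

In user space both variants behave the same, so this sketch cannot reproduce
the oops; it only makes concrete which object changes storage class.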

-Mike

2000-12-24 23:15:55

by Andreas Franck

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

Hello Mike, hello linux-kernel hackers,

Mike Galbraith wrote:

> Yes, hmm indeed. Try these two things.
>
> 1. make DECLARE_MUTEX_LOCKED(sem) in bdflush_init() static.
> 2. compile with frame pointers. (normal case for IKD)
>
> My IKD tree works with either option, but not with neither. I haven't
> figured out why yet.

1 worked for me, too - with the same effect as compiling buffer.c with
2.95.2, thus meaning successful boot and heavy crashing later on.
I haven't tried to boot 2 yet, but this looks seriously fishy to me. It would
be nice if we could make a simpler testcase to reproduce it, as it's much
work to boot the kernel over and over again.

I have now printed out the buffer.c:bdflush_init assembly for all four cases
(2.95.2, 2.97 without patch, 2.97 with static DECLARE... and 2.97 with frame
pointer), and will try to figure out what's going wrong. It would still be
nice to know whether it's a gcc problem or whether some kernel assumption
about GCC behaviour triggered this bug, which seems equally likely, as
kernel_thread and the mutex/semaphore stuff involve some nontrivial (at least
for beginners like me...) hand-made assembly code.

A nice evening and still merry christmas to the people westward of Europe :-)

Andreas

--
->>>----------------------- Andreas Franck --------<<<-
---<<<---- [email protected] --->>>---
->>>---- Keep smiling! ----------------------------<<<-

2000-12-25 05:42:41

by Mike Galbraith

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

On Sun, 24 Dec 2000, Andreas Franck wrote:

> Hello Mike, hello linux-kernel hackers,
>
> Mike Galbraith wrote:
>
> > Yes, hmm indeed. Try these two things.
> >
> > 1. make DECLARE_MUTEX_LOCKED(sem) in bdflush_init() static.
> > 2. compile with frame pointers. (normal case for IKD)
> >
> > My IKD tree works with either option, but not with neither. I haven't
> > figured out why yet.
>
> 1 worked for me, too - with the same effect as compiling buffer.c with
> 2.95.2, thus meaning successful boot and heavy crashing later on.
> I haven't tried to boot 2 yet, but this looks seriously fishy to me. It would
> be nice if we could make a simpler testcase to reproduce it, as it's much
> work to boot the kernel over and over again.

I wouldn't (not going to here;) spend a lot of time on it. The compiler
has problems. It won't build glibc-2.2, and chokes horribly on ipchains.

int ipt_register_table(struct ipt_table *table)
{
        int ret;
        struct ipt_table_info *newinfo;
        static struct ipt_table_info bootstrap
                = { 0, 0, { 0 }, { 0 }, { } };
^
ip_tables.c:1361: Internal compiler error in array_size_for_constructor, at varasm.c:4456
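
A reduced test case for this kind of initializer might look like the sketch
below. The struct layout is an assumption inferred only from the shape of the
initializer (two scalars, two arrays, and a trailing zero-length array); it is
not copied from the kernel headers, and demo_table_info, DEMO_NUMHOOKS and the
accessor functions are hypothetical names. The empty braces and the
zero-length array are GNU C extensions. Whether this particular reduction
triggers the same internal compiler error in the 2.97 snapshots is untested.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUMHOOKS 5     /* hypothetical array length */

struct demo_table_info {
    unsigned int size;
    unsigned int number;
    unsigned int hook_entry[DEMO_NUMHOOKS];
    unsigned int underflow[DEMO_NUMHOOKS];
    char entries[0];        /* GNU C zero-length trailing array */
};

/* Same initializer shape as in the report: the last member gets "{ }". */
static struct demo_table_info bootstrap = { 0, 0, { 0 }, { 0 }, { } };

/* Size of the trailing array as seen by the compiler. */
size_t demo_entries_size(void)
{
    return sizeof bootstrap.entries;
}

/* Sum of the explicitly initialised scalar members. */
unsigned int demo_scalar_sum(void)
{
    assert(bootstrap.hook_entry[0] == 0);
    return bootstrap.size + bootstrap.number;
}
```

A compiler that handles the construct should accept this file without
complaint; feeding it to the snapshot compiler would show whether the failure
is tied to the empty initializer of the zero-length array.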

-Mike

2000-12-25 16:39:29

by Andreas Franck

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

Hello Mike, hello linux-kernel hackers,

Mike Galbraith wrote:
> I wouldn't (not going to here;) spend a lot of time on it. The compiler
> has problems. It won't build glibc-2.2, and chokes horribly on ipchains.

Maybe, but you were lucky getting an ICE, and not silently failing code :-)

After having spent several hours debugging now, I think it was
worth it (at least for my understanding of lower-level kernel issues and of
the (rather nice and almost readable) assembly code gcc generates). There
seems to be something going wrong in the down(sem) path after the
kernel_thread call.

I'm not sure whether down() succeeds instantly when the kernel is compiled
with 2.95.2, but it seems to fail for 2.97. I found this out by sprinkling
some printk's around bdflush_init, which made the bug magically disappear, due
to the looser timing. Something similar might be happening when compiling with
frame pointers or with the semaphore declared static.

The bdflush_init function itself does not seem to be responsible, which
matches the assembly: it looks fine and should give the same results in all
compiled cases.

It seems that, for whatever reason, the cause of this failure is actually the
down(sem) call on a not yet up()'ed semaphore, and this is where it starts to
get ugly.

down() then calls __down_failed, which ends up in __down(). __down() does
some waitqueue handling, which I don't understand, and then calls __wake_up.
Up to that point everything seems fine. __wake_up is where my search has ended
for now; I think something goes wrong in this context, but the complexity of
this code exceeds my knowledge by orders of magnitude, so I can't continue
searching there without going mad :-)
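
To make the fast path and the slow path mentioned above a bit more concrete,
here is a toy model in plain C. It is only a sketch under assumptions: it
assumes a counting semaphore whose down() decrements a counter and falls into a
slow path when the result is negative, and whose up() increments it and wakes a
waiter when the result is not positive. The names toy_sem, toy_down, toy_up and
parent_takes_slow_path are made up for illustration and are not kernel
interfaces; no waitqueues or real sleeping are modelled.

```c
#include <assert.h>

/* Toy counting semaphore; NOT the kernel's struct semaphore. */
struct toy_sem {
    int count;  /* 1 = free, 0 = held, negative = held with a waiter */
};

/* Model of a semaphore that starts out locked (count of zero). */
static void toy_sem_init_locked(struct toy_sem *s)
{
    s->count = 0;
}

/* Returns 1 if the caller would have to enter the slow path and sleep. */
static int toy_down(struct toy_sem *s)
{
    s->count--;
    return s->count < 0;
}

/* Returns 1 if a sleeping waiter would have to be woken up. */
static int toy_up(struct toy_sem *s)
{
    s->count++;
    return s->count <= 0;
}

/*
 * Replays one startup handshake. If child_first is nonzero, the child's
 * up() runs before the parent's down(); otherwise the parent gets there
 * first. Returns 1 if the parent had to take the slow path.
 */
static int parent_takes_slow_path(int child_first)
{
    struct toy_sem sem;
    int slow;

    toy_sem_init_locked(&sem);
    if (child_first) {
        toy_up(&sem);
        slow = toy_down(&sem);
    } else {
        slow = toy_down(&sem);
        toy_up(&sem);
    }
    assert(sem.count == 0);  /* the semaphore ends up held either way */
    return slow;
}
```

Calling parent_takes_slow_path(1) models the child signalling first, so the
parent never sleeps; parent_takes_slow_path(0) models the parent arriving
first, which is the only ordering that reaches the slow path at all. That
would fit the observation that extra printk's, which change the timing, hide
the problem.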

It would be nice if someone else could look from there on, now I've narrowed
the case down to rather low-level functions.

Greetings,
Andreas

--
->>>----------------------- Andreas Franck --------<<<-
---<<<---- [email protected] --->>>---
->>>---- Keep smiling! ----------------------------<<<-

2000-12-25 16:39:29

by Andreas Franck

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

Hello Mike, hello linux-kernel hackers,

Mike Galbraith wrote:
> I wouldn't (not going to here;) spend a lot of time on it. The compiler
> has problems. It won't build glibc-2.2, and chokes horribly on ipchains.

Maybe, but after having spent several hours debugging now, I think it was
worth it: I am almost sure this is not a gcc bug, but a nasty race condition
involving the semaphore handling in bdflush_init.

I found this out by sprinkling some printk's around bdflush_init, which made
the bug magically disappear. That wasn't what I intended, but it gave me
a clearer impression of what's going on.

It seems that, for whatever reason, the cause of this failure is actually the
down(sem) call on a not yet up()'ed semaphore, and this is where it starts to
get ugly.


--
->>>----------------------- Andreas Franck --------<<<-
---<<<---- [email protected] --->>>---
->>>---- Keep smiling! ----------------------------<<<-

2000-12-25 18:16:52

by Mike Galbraith

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

On Mon, 25 Dec 2000, Andreas Franck wrote:

> Hello Mike, hello linux-kernel hackers,
>
> Mike Galbraith wrote:
> > I wouldn't (not going to here;) spend a lot of time on it. The compiler
> > has problems. It won't build glibc-2.2, and chokes horribly on ipchains.
>
> Maybe, but after having spent several hours debugging now, I think it was
> worth it: I am almost sure this is not a gcc bug, but a nasty race condition
> involving the semaphore handling in bdflush_init.
>
> I found this out by sprinkling some printk's around bdflush_init, which made
> the bug magically disappear. That wasn't what I intended, but it gave me
> a clearer impression of what's going on.

Oh? Can you show me (offline) what you did exactly that made it go away?
(that's kinda scary.. _much_ prefer 'compiler has rough edges' option;)

-Mike

2000-12-25 18:33:16

by Mike Galbraith

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

On Mon, 25 Dec 2000, Andreas Franck wrote:

> Hello Mike, hello linux-kernel hackers,
>
> Mike Galbraith wrote:
> > I wouldn't (not going to here;) spend a lot of time on it. The compiler
> > has problems. It won't build glibc-2.2, and chokes horribly on ipchains.
>
> Maybe, but you were lucky getting an ICE, and not silently failing code :-)

You bet.

> After having spent several hours debugging now, I think it was
> worth it (at least for my understanding of lower-level kernel issues and of
> the (rather nice and almost readable) assembly code gcc generates). There

Don't get me wrong, chasing things like this is never a waste of time.
In the case of gcc in particular. Our next 'stable' kernel compiler
is going to come from the gcc development tree just as the next 'stable'
kernel is coming out of the kernel development tree.

-Mike

2000-12-25 21:12:10

by Thorsten Kranzkowski

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

On Mon, Dec 25, 2000 at 06:09:35AM +0100, Mike Galbraith wrote:
> I wouldn't (not going to here;) spend a lot of time on it. The compiler
> has problems. It won't build glibc-2.2, and chokes horribly on ipchains.
>
> int ipt_register_table(struct ipt_table *table)
> {
> int ret;
> struct ipt_table_info *newinfo;
> static struct ipt_table_info bootstrap
> = { 0, 0, { 0 }, { 0 }, { } };
> ^
> ip_tables.c:1361: Internal compiler error in array_size_for_constructor, at varasm.c:4456


Well, I 'fixed' this by changing above line to:
= { 0, 0, { 0 }, { 0 }, };
and repeating this change (deleting the braces) about 15 times in 2 or 3 other
files of iptables. (patch available on request)
Of course gcc shouldn't die but issue a useful message if/when syntax rules
may have changed.

Apart from that and a hand-edited arch/alpha/vmlinux.lds that got some
newlines wrong, the kernel compiled fine and is up for over a day now.
Though this is not intel but alpha (ev4 / AXPpci33).

Marvin:~$ uname -a
Linux Marvin 2.4.0-test13pre4-ac2 #13 Sun Dec 24 15:26:57 UTC 2000 alpha unknown
Marvin:~$ uptime
8:19pm up 1 day, 4:28, 4 users, load average: 0.00, 0.00, 0.00
Marvin:~$ gcc -v
Reading specs from /usr/lib/gcc-lib/alpha-unknown-linux-gnu/2.97/specs
Configured with: ../gcc-20001211/configure --enable-threads --enable-shared --prefix=/usr --enable-languages=c,c++
gcc version 2.97 20001211 (experimental)


I use iptables for masquerading my local ethernet and that works as expected
so far.

Thorsten.



--
| Thorsten Kranzkowski Internet: [email protected] |
| Mobile: ++49 170 1876134 Snail: Niemannsweg 30, 49201 Dissen, Germany |
| Ampr: dl8bcu@db0lj.#rpl.deu.eu, [email protected] [44.130.8.19] |

2000-12-26 07:34:59

by Paul Laufer

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

On Mon, Dec 25, 2000 at 08:40:50PM +0000 or thereabouts, Thorsten Kranzkowski wrote:
> On Mon, Dec 25, 2000 at 06:09:35AM +0100, Mike Galbraith wrote:
> > I wouldn't (not going to here;) spend a lot of time on it. The compiler
> > has problems. It won't build glibc-2.2, and chokes horribly on ipchains.
> >
> > int ipt_register_table(struct ipt_table *table)
> > {
> > int ret;
> > struct ipt_table_info *newinfo;
> > static struct ipt_table_info bootstrap
> > = { 0, 0, { 0 }, { 0 }, { } };
> > ^
> > ip_tables.c:1361: Internal compiler error in array_size_for_constructor, at varasm.c:4456
>
>
> Well, I 'fixed' this by changing above line to:
> = { 0, 0, { 0 }, { 0 }, };
> and repeating this change (deleting the braces) about 15 times in 2 or 3 other
> files of iptables. (patch available on request)
> Of course gcc shouldn't die but issue a useful message if/when syntax rules
> may have changed.
>
> Apart from that and a hand-edited arch/alpha/vmlinux.lds that got some
> newlines wrong, the kernel compiled fine and is up for over a day now.
> Though this is not intel but alpha (ev4 / AXPpci33).
>
> Marvin:~$ uname -a
> Linux Marvin 2.4.0-test13pre4-ac2 #13 Sun Dec 24 15:26:57 UTC 2000 alpha unknown
> Marvin:~$ uptime
> 8:19pm up 1 day, 4:28, 4 users, load average: 0.00, 0.00, 0.00
> Marvin:~$ gcc -v
> Reading specs from /usr/lib/gcc-lib/alpha-unknown-linux-gnu/2.97/specs
> Configured with: ../gcc-20001211/configure --enable-threads --enable-shared --prefix=/usr --enable-languages=c,c++
> gcc version 2.97 20001211 (experimental)
>
>
> I use iptables for masquerading my local ethernet and that works as expected
> so far.
>
> Thorsten.

It's a problem with initializing a zero-length array. This is something
that gcc has never previously been documented to do, but it has worked
in the past (most of the time). Recently it has been decided (according
to traffic on the gcc-bugs and gcc-patches lists) that gcc will handle
zero-length arrays as flexible array members per the ISO C99 standard.
AFAIK, that means that if they are to be initialized, zero-length arrays
can only exist as the last element of a structure, and that the
structure must not be embedded within another structure.
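
For readers unfamiliar with the C99 construct, here is a minimal sketch of a
flexible array member used as the last element of a structure. The type and
function names (struct table, table_alloc, empty_table) are hypothetical and
only loosely inspired by a header followed by a variable number of entries;
this is not the netfilter code. The static object is initialized without
naming the trailing array at all, which is the portable way to get an empty
one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header followed by a variable number of entries. */
struct table {
    unsigned int size;   /* bytes used by entries[] */
    unsigned int count;  /* number of entries */
    int entries[];       /* C99 flexible array member: must come last */
};

/* A static instance with no entries: the trailing array is simply omitted. */
static struct table empty_table = { 0, 0 };

/*
 * Allocates a zero-filled table with room for `count` entries.
 * Returns NULL on allocation failure; the caller frees it with free().
 */
static struct table *table_alloc(unsigned int count)
{
    size_t bytes = (size_t)count * sizeof(int);
    struct table *t = calloc(1, sizeof(*t) + bytes);

    if (t == NULL)
        return NULL;
    t->size = (unsigned int)bytes;
    t->count = count;
    return t;
}
```

sizeof(struct table) does not include any entries, so the storage for them
has to be added explicitly at allocation time, as table_alloc does.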

The empty brackets that Thorsten removed were initializing the zero-length
array to empty, but gcc currently has this bit of code in varasm.c
(around line 4460):

    /* ??? I'm fairly certain if there were no elements, we shouldn't have
       created the constructor in the first place. */
    if (max_index == NULL_TREE)
      abort ();

This abort() resulted in the "Internal compiler error" that Mike noticed
earlier. Removing the empty brackets prevents gcc from trying to
initialize the zero length array and avoids this problem. However, this
can result in warning messages about missing initializers depending upon
the warning flags given to gcc, and seems like the wrong thing to do.

The best solution (IMHO) for this situation is to change gcc/varasm.c to
accept empty initializers, something like:

    /* ??? I'm fairly certain if there were no elements, we shouldn't have
       created the constructor in the first place. */
    /* No, it can be useful to initialize the zero-length array with an
       empty initializer. */
    if (max_index == NULL_TREE)
      return 0;

The rest of netfilter will still not compile because in several other C
files the initialized zero-length arrays are nested several structures
deep. If we can convince the gcc folks to drop some of the ISO C99
restrictions on the use of zero-length arrays then all will be back to
normal (as Ulrich Drepper pointed out, the ISO committee in their
infinite wisdom does not always come up with a standard that is the best
solution in the real world). But I am not sure if that is the best
solution. Perhaps it would be better to change the netfilter code. In
any event, the gcc documentation does not say anything about not being
able to initialize zero-length arrays to empty, so this is a bug and I'm
going to talk with the gcc folks.

-Paul Laufer

2000-12-28 00:53:05

by Andreas Franck

Subject: Re: Fatal Oops on boot with 2.4.0testX and recent GCC snapshots

Hello Mike, hello Linus,

Some minutes ago, I wrote:
> I think I have found the reason for our bugs. It seems GCC really
> miscompiles buffer.c:bdflush_init without frame pointers. I'll try harder
> now to understand what exactly is going on, but it seems it is smashing
> its local stack space by decrementing its stack pointer too early, then
> calling an assembler function (__down_failed). It might be that GCC is
> confused by this.

[...]

> Any comments on this? I'll now try to split up the stack space operation in
> two parts, the first after call kernel_thread: addl $12, %esp (as in the
> first call), and an additional addl $64, %esp just before leaving (before
> popl %ebx). And I'll report what happened, later - but I have a good
> feeling that I have caught the bug.

... and my good feeling was right. Changing the bogus assembly code made the
bug go away. I'll try to prepare a simpler testcase for the GCC maintainers
tomorrow. In short, this is what happens: GCC frees the stack frame for its
local variables far too early. It then calls __down_failed(), which
pushes some things on the stack - thereby corrupting the semaphore pointer!
So __down() works on a random memory location instead of the semaphore, which
is guaranteed to fail badly.
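
The effect can be replayed with a toy stack in plain C. This is only a
simulation under assumptions, not the generated assembly: the stack is an
array that grows downwards, the frame is four words, and the callee pushes
three words, all of which are arbitrary choices for illustration. The names
toy_stack, toy_reserve, toy_release, toy_push and run_frame are made up.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 16

/* Toy downward-growing stack; sp is the index of the lowest word in use. */
struct toy_stack {
    unsigned long words[STACK_WORDS];
    size_t sp;
};

/* Reserve n words for locals (think: subtracting from the stack pointer). */
static void toy_reserve(struct toy_stack *s, size_t n) { s->sp -= n; }

/* Give n words back (think: adding to the stack pointer). */
static void toy_release(struct toy_stack *s, size_t n) { s->sp += n; }

/* Push one word, as a callee saving a value would. */
static void toy_push(struct toy_stack *s, unsigned long v)
{
    s->words[--s->sp] = v;
}

/*
 * Stores a marker in a local slot, lets a "callee" push three words, and
 * returns what the local slot holds afterwards. With release_early set,
 * the frame is freed before the call, which is the bug being modelled.
 */
static unsigned long run_frame(int release_early)
{
    struct toy_stack s;
    size_t local;

    s.sp = STACK_WORDS;           /* empty stack */
    toy_reserve(&s, 4);           /* frame for the locals */
    local = s.sp + 3;             /* the local sits at the top of the frame */
    s.words[local] = 0xC0DEUL;    /* stands in for the semaphore data */

    if (release_early)
        toy_release(&s, 4);       /* frame freed while the local is live */

    toy_push(&s, 1);              /* callee pushes three words ... */
    toy_push(&s, 2);
    toy_push(&s, 3);
    toy_release(&s, 3);           /* ... and pops them again */

    if (!release_early)
        toy_release(&s, 4);       /* correct place to free the frame */

    return s.words[local];
}
```

run_frame(0) returns the marker untouched, while run_frame(1) returns the
first pushed word instead, because the push lands exactly on the slot that
was released too early.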

I've added linux-kernel as CC again, so everybody can now hear that this is
definitely a GCC bug, and not a kernel issue.

Greetings,
Andreas

--
->>>----------------------- Andreas Franck --------<<<-
---<<<---- [email protected] --->>>---
->>>---- Keep smiling! ----------------------------<<<-