2002-03-01 17:34:41

by Bryon Roche

Subject: OOPS: Multipath routing 2.4.17

I am running a mailserver on linux 2.4.17 with equal-cost multi-path
routing to 2 local routers, and I am able to OOPS the machine under
moderate load with the multipath route installed. Attached is a decoded
OOPS log as well as my .config.

These are my log messages immediately before the OOPS:

impossible 888
divide error: 0000

OOPS:
ksymoops 2.4.3 on i686 2.4.17. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.17/ (default)
-m /boot/System.map-2.4.17 (default)

Warning (compare_maps): mismatch on symbol partition_name , ksyms_base says c0203d60, System.map says c01547f0. Ignoring ksyms_base entry
CPU: 0
EIP: 0010:[<c024c5ea>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 1a0412d2 ebx: e930fe50 ecx: f75be780 edx: 00000000
esi: e930fe44 edi: 00000000 ebp: e930fe44 esp: e930fe0c
ds: 0018 es: 0018 ss: 0018
Process smtp (pid: 24272, stackpage=e930f000)
Stack: e930fe50 e930fe44 00000000 e930feb8 c02232e8 e930fe50 e930fe44 00000000
e930feb8 c038dc1c e930feb4 00000000 00000001 00000000 00010100 f75be780
c02e8640 5696edc2 00000000 00000001 00000000 00000000 37360000 c02237ac
Call Trace: [<c02232e8>] [<c02237ac>] [<c023b6ac>] [<c024883a>] [<c021111b>]
[<C02100D5>] [<c013613c>] [<c0210d70>] [<c0211aa0>] [<c010713b>]
Code: f7 71 4c 8b 41 48 89 d3 31 f6 8d 51 50 39 c6 7d 20 89 c7 8d

>>EIP; c024c5ea <fib_select_multipath+5a/a0> <=====
Trace; c02232e8 <ip_route_output_slow+318/670>
Trace; c02237ac <ip_route_output_key+16c/180>
Trace; c023b6ac <tcp_v4_connect+cc/370>
Trace; c024883a <inet_stream_connect+13a/2a0>
Trace; c021111a <sys_connect+5a/80>
Trace; c02100d4 <sock_map_fd+124/1c0>
Trace; c013613c <fput+cc/f0>
Trace; c0210d70 <sys_socket+30/60>
Trace; c0211aa0 <sys_socketcall+90/200>
Trace; c010713a <system_call+32/38>
Code; c024c5ea <fib_select_multipath+5a/a0>
00000000 <_EIP>:
Code; c024c5ea <fib_select_multipath+5a/a0> <=====
0: f7 71 4c divl 0x4c(%ecx) <=====
Code; c024c5ec <fib_select_multipath+5c/a0>
3: 8b 41 48 mov 0x48(%ecx),%eax
Code; c024c5f0 <fib_select_multipath+60/a0>
6: 89 d3 mov %edx,%ebx
Code; c024c5f2 <fib_select_multipath+62/a0>
8: 31 f6 xor %esi,%esi
Code; c024c5f4 <fib_select_multipath+64/a0>
a: 8d 51 50 lea 0x50(%ecx),%edx
Code; c024c5f6 <fib_select_multipath+66/a0>
d: 39 c6 cmp %eax,%esi
Code; c024c5f8 <fib_select_multipath+68/a0>
f: 7d 20 jge 31 <_EIP+0x31> c024c61a <fib_select_multipath+8a/a0>
Code; c024c5fa <fib_select_multipath+6a/a0>
11: 89 c7 mov %eax,%edi
Code; c024c5fc <fib_select_multipath+6c/a0>
13: 8d 00 lea (%eax),%eax

CPU: 0
EIP: 0010:[<c024c5ea>] Not tainted
EFLAGS: 00010246
eax: 1a21c700 ebx: 00000001 ecx: f75be780 edx: 00000000
esi: 00000002 edi: 00000000 ebp: dc833d60 esp: dc833d28
ds: 0018 es: 0018 ss: 0018
Process named (pid: 28713, stackpage=dc833000)
Stack: dc833d6c dc833d60 00000000 dc833dd8 c02232e8 dc833d6c dc833d60 00000000
dc833dd8 dc833df0 dc833dd4 00000000 00000001 00000000 00010100 f75be780
c02e8640 0e00eed8 00000000 00000001 00000000 00000000 00000000 c02237ac
Call Trace: [<c02232e8>] [<c02237ac>] [<c02426cf>] [<c0248c79>] [<c0210389>]
[<c0211773>] [<c0106d29>] [<c0107003>] [<c013e493>] [<c0211bec>] [<c010713b>]
Code: f7 71 4c 8b 41 48 89 d3 31 f6 8d 51 50 39 c6 7d 20 89 c7 8d

>>EIP; c024c5ea <fib_select_multipath+5a/a0> <=====
Trace; c02232e8 <ip_route_output_slow+318/670>
Trace; c02237ac <ip_route_output_key+16c/180>
Trace; c02426ce <udp_sendmsg+29e/410>
Trace; c0248c78 <inet_sendmsg+38/40>
Trace; c0210388 <sock_sendmsg+68/90>
Trace; c0211772 <sys_sendmsg+192/1f0>
Trace; c0106d28 <handle_signal+78/100>
Trace; c0107002 <do_signal+252/2a0>
Trace; c013e492 <pipe_write+212/270>
Trace; c0211bec <sys_socketcall+1dc/200>
Trace; c010713a <system_call+32/38>
Code; c024c5ea <fib_select_multipath+5a/a0>
00000000 <_EIP>:
Code; c024c5ea <fib_select_multipath+5a/a0> <=====
0: f7 71 4c divl 0x4c(%ecx) <=====
Code; c024c5ec <fib_select_multipath+5c/a0>
3: 8b 41 48 mov 0x48(%ecx),%eax
Code; c024c5f0 <fib_select_multipath+60/a0>
6: 89 d3 mov %edx,%ebx
Code; c024c5f2 <fib_select_multipath+62/a0>
8: 31 f6 xor %esi,%esi
Code; c024c5f4 <fib_select_multipath+64/a0>
a: 8d 51 50 lea 0x50(%ecx),%edx
Code; c024c5f6 <fib_select_multipath+66/a0>
d: 39 c6 cmp %eax,%esi
Code; c024c5f8 <fib_select_multipath+68/a0>
f: 7d 20 jge 31 <_EIP+0x31> c024c61a <fib_select_multipath+8a/a0>
Code; c024c5fa <fib_select_multipath+6a/a0>
11: 89 c7 mov %eax,%edi
Code; c024c5fc <fib_select_multipath+6c/a0>
13: 8d 00 lea (%eax),%eax


2 warnings issued. Results may not be reliable.
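
A note on reading the decoded Code lines above: the faulting instruction in both dumps is `divl 0x4c(%ecx)`, and both register dumps show edx as 00000000. On x86, a 32-bit DIV raises a divide error only when the divisor is zero or the quotient does not fit in 32 bits. The sketch below is a userspace model of that rule, not kernel code; the function names are hypothetical. It suggests that with edx equal to 0 the only way to fault is a zero value at 0x4c(%ecx), which the replies in this thread identify as fi->fib_power.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the fault condition of the 32-bit x86 DIV instruction
 * ("divl" in the decoded Code lines). The dividend is edx:eax; a divide
 * error is raised when the divisor is zero or the quotient needs more
 * than 32 bits. Hypothetical helper, for illustration only.
 */
int divl_would_fault(uint32_t edx, uint32_t eax, uint32_t divisor)
{
    uint64_t dividend = ((uint64_t)edx << 32) | eax;

    if (divisor == 0)
        return 1;
    return (dividend / divisor) > UINT32_MAX;
}

/*
 * With edx == 0 the dividend already fits in 32 bits, so even the
 * smallest non-zero divisor (1) cannot overflow the quotient.
 * Returns 1 to confirm that a zero divisor does fault for this eax.
 */
int zero_divisor_is_only_cause(uint32_t eax)
{
    assert(!divl_would_fault(0, eax, 1));
    return divl_would_fault(0, eax, 0);
}
```

Feeding it the eax values from the two dumps (1a0412d2 and 1a21c700) gives the same answer in both cases: the divisor must have been zero.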

.config:
#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_USE_3DNOW=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
CONFIG_MICROCODE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_SMP=y
# CONFIG_MULTIQUAD is not set
CONFIG_HAVE_DEC_LOCK=y

#
# General setup
#
CONFIG_NET=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_HOTPLUG is not set
# CONFIG_PCMCIA is not set
# CONFIG_HOTPLUG_PCI is not set
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
# CONFIG_KCORE_AOUT is not set
# CONFIG_BINFMT_AOUT is not set
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=m
# CONFIG_PM is not set
# CONFIG_ACPI is not set
# CONFIG_APM is not set

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
# CONFIG_PARPORT is not set

#
# Plug and Play configuration
#
CONFIG_PNP=y
# CONFIG_ISAPNP is not set

#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=8192
CONFIG_BLK_DEV_INITRD=y

#
# Multi-device support (RAID and LVM)
#
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
# CONFIG_MD_LINEAR is not set
# CONFIG_MD_RAID0 is not set
# CONFIG_MD_RAID1 is not set
CONFIG_MD_RAID5=y
CONFIG_MD_MULTIPATH=y
CONFIG_BLK_DEV_LVM=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
# CONFIG_NETLINK_DEV is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_FILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_FWMARK=y
CONFIG_IP_ROUTE_NAT=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_TOS=y
# CONFIG_IP_ROUTE_VERBOSE is not set
# CONFIG_IP_ROUTE_LARGE_TABLES is not set
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
CONFIG_INET_ECN=y
CONFIG_SYN_COOKIES=y

#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_FTP=m
# CONFIG_IP_NF_IRC is not set
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_UNCLEAN=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_MIRROR=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_NAT_SNMP_BASIC=m
CONFIG_IP_NF_NAT_FTP=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
# CONFIG_IP_NF_COMPAT_IPCHAINS is not set
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
# CONFIG_IPV6 is not set
# CONFIG_KHTTPD is not set
# CONFIG_ATM is not set
# CONFIG_VLAN_8021Q is not set

#
#
#
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_DECNET is not set
# CONFIG_BRIDGE is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_LLC is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_FASTROUTE is not set
# CONFIG_NET_HW_FLOWCONTROL is not set

#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set
# CONFIG_PHONE_IXJ is not set
# CONFIG_PHONE_IXJ_PCMCIA is not set

#
# ATA/IDE/MFM/RLL support
#
# CONFIG_IDE is not set
# CONFIG_BLK_DEV_IDE_MODES is not set
# CONFIG_BLK_DEV_HD is not set

#
# SCSI support
#
CONFIG_SCSI=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
CONFIG_SD_EXTRA_DEVS=40
CONFIG_CHR_DEV_ST=y
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=y
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_SR_EXTRA_DEVS=2
CONFIG_CHR_DEV_SG=y

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_DEBUG_QUEUES=y
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y

#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AHA1740 is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_MEGARAID is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_CPQFCTS is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_DMA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_NCR53C7xx is not set
CONFIG_SCSI_SYM53C8XX_2=y
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=2
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=96
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=256
# CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PCI2000 is not set
# CONFIG_SCSI_PCI2220I is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_SEAGATE is not set
# CONFIG_SCSI_SIM710 is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_DEBUG is not set

#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
# CONFIG_FUSION_BOOT is not set
# CONFIG_FUSION_ISENSE is not set
# CONFIG_FUSION_CTL is not set
# CONFIG_FUSION_LAN is not set

#
# IEEE 1394 (FireWire) support (EXPERIMENTAL)
#
# CONFIG_IEEE1394 is not set

#
# I2O device support
#
# CONFIG_I2O is not set
# CONFIG_I2O_PCI is not set
# CONFIG_I2O_BLOCK is not set
# CONFIG_I2O_LAN is not set
# CONFIG_I2O_SCSI is not set
# CONFIG_I2O_PROC is not set

#
# Network device support
#
CONFIG_NETDEVICES=y

#
# ARCnet devices
#
# CONFIG_ARCNET is not set
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
CONFIG_TUN=m
# CONFIG_ETHERTAP is not set

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
# CONFIG_SUNLANCE is not set
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNBMAC is not set
# CONFIG_SUNQE is not set
# CONFIG_SUNLANCE is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_LANCE is not set
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_NET_VENDOR_RACAL is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
CONFIG_NET_ISA=y
# CONFIG_E2100 is not set
# CONFIG_EWRK3 is not set
# CONFIG_EEXPRESS is not set
# CONFIG_EEXPRESS_PRO is not set
# CONFIG_HPLAN_PLUS is not set
# CONFIG_HPLAN is not set
# CONFIG_LP486E is not set
# CONFIG_ETH16I is not set
# CONFIG_NE2000 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
# CONFIG_APRICOT is not set
# CONFIG_CS89x0 is not set
CONFIG_TULIP=m
CONFIG_TULIP_MWI=y
CONFIG_TULIP_MMIO=y
# CONFIG_DE4X5 is not set
# CONFIG_DGRS is not set
# CONFIG_DM9102 is not set
# CONFIG_EEPRO100 is not set
# CONFIG_LNE390 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_NE3210 is not set
# CONFIG_ES3210 is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_8139TOO_PIO is not set
# CONFIG_8139TOO_TUNE_TWISTER is not set
# CONFIG_8139TOO_8129 is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_VIA_RHINE_MMIO is not set
# CONFIG_WINBOND_840 is not set
# CONFIG_NET_POCKET is not set

#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_MYRI_SBUS is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_SK98LIN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set

#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set

#
# Token Ring devices
#
# CONFIG_TR is not set
# CONFIG_NET_FC is not set
# CONFIG_RCPCI is not set
# CONFIG_SHAPER is not set

#
# Wan interfaces
#
# CONFIG_WAN is not set

#
# Amateur Radio support
#
# CONFIG_HAMRADIO is not set

#
# IrDA (infrared) support
#
# CONFIG_IRDA is not set

#
# ISDN subsystem
#
# CONFIG_ISDN is not set

#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set

#
# Input core support
#
# CONFIG_INPUT is not set
# CONFIG_INPUT_KEYBDEV is not set
# CONFIG_INPUT_MOUSEDEV is not set
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_EVDEV is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
# CONFIG_SERIAL is not set
# CONFIG_SERIAL_EXTENDED is not set
# CONFIG_SERIAL_NONSTANDARD is not set
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=2048

#
# I2C support
#
# CONFIG_I2C is not set

#
# Mice
#
# CONFIG_BUSMOUSE is not set
CONFIG_MOUSE=y
CONFIG_PSMOUSE=y
# CONFIG_82C710_MOUSE is not set
# CONFIG_PC110_PAD is not set

#
# Joysticks
#
# CONFIG_INPUT_GAMEPORT is not set

#
# Input core support is needed for gameports
#

#
# Input core support is needed for joysticks
#
# CONFIG_QIC02_TAPE is not set

#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_INTEL_RNG is not set
CONFIG_NVRAM=y
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set

#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
# CONFIG_AGP is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set

#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set

#
# File systems
#
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=y
CONFIG_REISERFS_FS=y
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
# CONFIG_ADFS_FS is not set
# CONFIG_ADFS_FS_RW is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EXT3_FS is not set
# CONFIG_JBD is not set
# CONFIG_JBD_DEBUG is not set
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
# CONFIG_UMSDOS_FS is not set
CONFIG_VFAT_FS=m
# CONFIG_EFS_FS is not set
# CONFIG_JFFS_FS is not set
# CONFIG_JFFS2_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_TMPFS is not set
CONFIG_RAMFS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
# CONFIG_MINIX_FS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_NTFS_FS is not set
# CONFIG_NTFS_RW is not set
# CONFIG_HPFS_FS is not set
CONFIG_PROC_FS=y
CONFIG_DEVFS_FS=y
CONFIG_DEVFS_MOUNT=y
# CONFIG_DEVFS_DEBUG is not set
CONFIG_DEVPTS_FS=y
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX4FS_RW is not set
CONFIG_ROMFS_FS=y
CONFIG_EXT2_FS=y
# CONFIG_SYSV_FS is not set
CONFIG_UDF_FS=m
# CONFIG_UDF_RW is not set
# CONFIG_UFS_FS is not set
# CONFIG_UFS_FS_WRITE is not set

#
# Network File Systems
#
# CONFIG_CODA_FS is not set
# CONFIG_INTERMEZZO_FS is not set
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_ROOT_NFS is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
# CONFIG_SMB_FS is not set
# CONFIG_NCP_FS is not set
# CONFIG_NCPFS_PACKET_SIGNING is not set
# CONFIG_NCPFS_IOCTL_LOCKING is not set
# CONFIG_NCPFS_STRONG is not set
# CONFIG_NCPFS_NFS_NS is not set
# CONFIG_NCPFS_OS2_NS is not set
# CONFIG_NCPFS_SMALLDOS is not set
# CONFIG_NCPFS_NLS is not set
# CONFIG_NCPFS_EXTRAS is not set
CONFIG_ZISOFS_FS=y
CONFIG_ZLIB_FS_INFLATE=y

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
# CONFIG_SMB_NLS is not set
CONFIG_NLS=y

#
# Native Language Support
#
CONFIG_NLS_DEFAULT="iso8859-1"
# CONFIG_NLS_CODEPAGE_437 is not set
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ISO8859_1 is not set
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set

#
# Console drivers
#
CONFIG_VGA_CONSOLE=y
# CONFIG_VIDEO_SELECT is not set
# CONFIG_MDA_CONSOLE is not set

#
# Frame-buffer support
#
# CONFIG_FB is not set

#
# Sound
#
# CONFIG_SOUND is not set

#
# USB support
#
# CONFIG_USB is not set

#
# USB Controllers
#
# CONFIG_USB_UHCI is not set
# CONFIG_USB_UHCI_ALT is not set
# CONFIG_USB_OHCI is not set

#
# USB Device Class drivers
#
# CONFIG_USB_AUDIO is not set
# CONFIG_USB_BLUETOOTH is not set
# CONFIG_USB_STORAGE is not set
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
# CONFIG_USB_STORAGE_FREECOM is not set
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_DPCM is not set
# CONFIG_USB_STORAGE_HP8200e is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
# CONFIG_USB_STORAGE_JUMPSHOT is not set
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set

#
# USB Human Interface Devices (HID)
#

#
# Input core support is needed for USB HID
#

#
# USB Imaging devices
#
# CONFIG_USB_DC2XX is not set
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_SCANNER is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USB_HPUSBSCSI is not set

#
# USB Multimedia devices
#

#
# Video4Linux support is needed for USB Multimedia device support
#

#
# USB Network adaptors
#
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_CATC is not set
# CONFIG_USB_CDCETHER is not set
# CONFIG_USB_USBNET is not set

#
# USB port drivers
#
# CONFIG_USB_USS720 is not set

#
# USB Serial Converter support
#
# CONFIG_USB_SERIAL is not set
# CONFIG_USB_SERIAL_GENERIC is not set
# CONFIG_USB_SERIAL_BELKIN is not set
# CONFIG_USB_SERIAL_WHITEHEAT is not set
# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set
# CONFIG_USB_SERIAL_EMPEG is not set
# CONFIG_USB_SERIAL_FTDI_SIO is not set
# CONFIG_USB_SERIAL_VISOR is not set
# CONFIG_USB_SERIAL_IR is not set
# CONFIG_USB_SERIAL_EDGEPORT is not set
# CONFIG_USB_SERIAL_KEYSPAN_PDA is not set
# CONFIG_USB_SERIAL_KEYSPAN is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA28 is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA28X is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA28XA is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA28XB is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA19 is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA18X is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA19W is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA49W is not set
# CONFIG_USB_SERIAL_MCT_U232 is not set
# CONFIG_USB_SERIAL_PL2303 is not set
# CONFIG_USB_SERIAL_CYBERJACK is not set
# CONFIG_USB_SERIAL_XIRCOM is not set
# CONFIG_USB_SERIAL_OMNINET is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_RIO500 is not set

#
# Bluetooth support
#
# CONFIG_BLUEZ is not set

#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_HIGHMEM is not set
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_IOVIRT is not set
CONFIG_MAGIC_SYSRQ=y
# CONFIG_DEBUG_SPINLOCK is not set
CONFIG_DEBUG_BUGVERBOSE=y
--
Sex is physics, Love is chemistry, but it takes Engineering to be kinky.
**
FUD Technician
Bryon Roche, Kain <[email protected]>
<[email protected]>


2002-03-01 18:12:54

by Andi Kleen

Subject: Re: OOPS: Multipath routing 2.4.17

Kain <[email protected]> writes:

> I am running a mailserver on linux 2.4.17 with equal-cost multi-path
> routing to 2 local routers, and I am able to OOPS the machine under
> moderate load with the multipath route installed. Attached is a decoded
> OOPS log as well as my .config.
>
> These are my log messages immediately before the OOPS:
>
> impossible 888
> divide error: 0000

They should be after, not before the oops.

What compiler are you using?

-Andi

2002-03-01 19:05:33

by Bryon Roche

Subject: Re: OOPS: Multipath routing 2.4.17

On Fri, Mar 01, 2002 at 07:12:12PM +0100, Andi Kleen wrote:
> Kain <[email protected]> writes:
>
> > I am running a mailserver on linux 2.4.17 with equal-cost multi-path
> > routing to 2 local routers, and I am able to OOPS the machine under
> > moderate load with the multipath route installed. Attached is a decoded
> > OOPS log as well as my .config.
> >
> > These are my log messages immediately before the OOPS:
> >
> > impossible 888
> > divide error: 0000
>
> They should be after, not before the oops.
>
> What compiler are you using?

I am compiling with debian sid gcc:
Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.4/specs
gcc version 2.95.4 20011006 (Debian prerelease)

--
Assassins do it from behind.
**
Professional
Bryon Roche, Kain <[email protected]>
<[email protected]>



2002-03-01 22:28:13

by Julian Anastasov

Subject: Re: OOPS: Multipath routing 2.4.17


Hello,

Kain wrote:

> impossible 888
> divide error: 0000

> >>EIP; c024c5ea <fib_select_multipath+5a/a0> <=====
> Trace; c02232e8 <ip_route_output_slow+318/670>

There is no write locking in fib_select_multipath; combine that
with a high rate of route resolutions and ... boom,
fi->fib_power is 0:

w = jiffies % fi->fib_power;

What about a different algorithm to apply weighted
round robin (idea mostly from LVS), something like
this code (entirely untested) where fi->fib_power is not used
and where fib_sync_up and fib_sync_down don't need to play
with nh_power on nh_flags change:

void fib_select_multipath(const struct rt_key *key, struct fib_result *res)
{
	struct fib_info *fi = res->fi;
	int w = -1, sel = 0;

	write_lock(&fib_info_lock);

repeat:

	change_nexthops(fi) {
		if (nh->nh_power > w && !(nh->nh_flags&RTNH_F_DEAD)) {
			w = nh->nh_power;
			sel = nhsel;
		}
	} endfor_nexthops(fi);
	if (w > 0) {
		fi->fib_nh[sel].nh_power--;
		write_unlock(&fib_info_lock);
		res->nh_sel = sel;
		return;
	}

	if (!w) {
		change_nexthops(fi) {
			if (!(nh->nh_flags&RTNH_F_DEAD)) {
				nh->nh_power = nh->nh_weight;
			}
		} endfor_nexthops(fi);
		w = -1;
		goto repeat;
	}

	write_unlock(&fib_info_lock);

#if 1
	printk(KERN_CRIT "impossible 888\n");
#endif
	return;
}

Regards

--
Julian Anastasov <[email protected]>
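
For readers who want to experiment with the selection scheme proposed in the message above outside the kernel, here is a minimal userspace sketch. It is an assumption-laden model, not the proposed patch: `struct nexthop` and `wrr_select` are hypothetical names, the fields only mirror nh_weight, nh_power and the RTNH_F_DEAD flag, and locking is left out entirely. Unlike the `goto repeat` form, the sketch refills at most once per call, so it returns -1 instead of spinning when every live hop has weight 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a next hop; not the kernel's struct fib_nh. */
struct nexthop {
    int weight;  /* configured weight (nh_weight in the mail) */
    int power;   /* credits left in the current round (nh_power) */
    int dead;    /* non-zero if the hop must be skipped (RTNH_F_DEAD) */
};

/*
 * Pick the live hop with the most remaining credits and charge it one.
 * When every live hop is at zero, refill from the weights and try once more.
 * Returns the selected index, or -1 if no live hop with weight > 0 exists.
 * No locking is done here; a real implementation would need it.
 */
int wrr_select(struct nexthop *nh, size_t n)
{
    assert(nh != NULL || n == 0);

    for (int pass = 0; pass < 2; pass++) {
        int best = -1, sel = -1;

        for (size_t i = 0; i < n; i++) {
            if (!nh[i].dead && nh[i].power > best) {
                best = nh[i].power;
                sel = (int)i;
            }
        }
        if (best > 0) {
            nh[sel].power--;
            return sel;
        }
        if (best < 0)
            return -1;              /* no live hop at all */

        /* best == 0: the round is over, refill the live hops */
        for (size_t i = 0; i < n; i++)
            if (!nh[i].dead)
                nh[i].power = nh[i].weight;
    }
    return -1;                      /* all live hops have weight 0 */
}
```

With two hops of weight 2 and 1, six consecutive calls select the first hop four times and the second twice, which is the ratio the weights ask for.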

2002-03-01 22:44:13

by Andi Kleen

Subject: Re: OOPS: Multipath routing 2.4.17

Julian Anastasov <[email protected]> writes:

> Hello,
>
> Kain wrote:
>
> > impossible 888
> > divide error: 0000
>
> > >>EIP; c024c5ea <fib_select_multipath+5a/a0> <=====
> > Trace; c02232e8 <ip_route_output_slow+318/670>
>
> There is no write locking in fib_select_multipath,
> combined with high rate of route resolutions and ... boom,
> fi->fib_power is 0:

In theory yes, but the

#if 1
		if (power <= 0) {
			printk(KERN_CRIT "impossible 777\n");
			return;
		}
#endif

should stop it, making it just not work rather than crash.
If he still gets a division by zero then something else is fishy.

-Andi

2002-03-01 23:01:54

by Julian Anastasov

Subject: Re: OOPS: Multipath routing 2.4.17


Hello,

On 1 Mar 2002, Andi Kleen wrote:

> #if 1
> if (power <= 0) {
> printk(KERN_CRIT "impossible 777\n");
> return;
> }
> #endif
>
> should stop it; making it just not work, but not crash.
> If he still gets a division by zero then something else is fishy.

How the oops is reached:

2 CPUs enter fib_select_multipath while fib_power is 1.
Both see 1 at 'if (fi->fib_power <= 0) {', so no 777. CPU1 changes
fib_power from 1 to 0 before CPU2 reaches 'w = jiffies % fi->fib_power;'

How 888 is printed:

Both CPUs see 1 in 'w = jiffies % fi->fib_power;' but the first
changes nh_power and fib_power from 1 to 0. CPU2 sees 0 everywhere
and prints 888. I assume nobody plays with DEAD.

If I understand the locking correctly (please correct me),
we can have many threads at the same time:

- many in ip_route_* calling fib_select_multipath

- one in rtnetlink playing with nh_*

> -Andi

Regards

--
Julian Anastasov <[email protected]>
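
The first interleaving described in the message above can be replayed deterministically on a single thread, which makes the window easier to see. This is only a sketch under stated assumptions: `struct shared` and the helper names are hypothetical, the two "CPUs" are just an ordering of calls, and nothing here is the actual 2.4.17 code.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared counter (fi->fib_power in the thread). */
struct shared {
    int fib_power;
};

/* The "fib_power <= 0" test at the top of the function; true means no refill. */
static int passes_check(const struct shared *s)
{
    return s->fib_power > 0;
}

/* The winner of the race consumes one credit. */
static void consume(struct shared *s)
{
    s->fib_power--;
}

/*
 * Replay the order: CPU1 check, CPU2 check, CPU1 consume, CPU2 modulo.
 * Returns the divisor CPU2 would use in 'jiffies % fi->fib_power';
 * 0 means a divide error. Returns -1 if both CPUs would have taken
 * the refill path instead.
 */
int divisor_seen_by_cpu2(int initial_power)
{
    struct shared s = { initial_power };
    int cpu1_ok = passes_check(&s);
    int cpu2_ok = passes_check(&s);

    assert(cpu1_ok == cpu2_ok);     /* both read the same value */
    if (!cpu1_ok)
        return -1;

    consume(&s);                    /* CPU1 gets there first */
    return s.fib_power;             /* what CPU2 divides by */
}
```

Starting from a power of 1, CPU2 ends up with a divisor of 0, matching the divide error in the report; starting from 2 it sees 1 and survives that particular pass.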

2002-03-01 23:01:14

by Andi Kleen

Subject: Re: OOPS: Multipath routing 2.4.17


I wrote:

>In theory yes, but the
>
>#if 1
> if (power <= 0) {
> printk(KERN_CRIT "impossible 777\n");
> return;
> }
>#endif
>
>should stop it; making it just not work, but not crash.
>If he still gets a division by zero then something else is fishy.

Ignore the objection. The division is using fi->fib_power not power,
so it is definitely racy and possible. Sorry for the brain fart.

Short term fix would be just to add a spinlock like this (untested).
I think using a new algorithm would be too risky at least for 2.4.

--- linux-work/net/ipv4/fib_semantics.c-FIBLOCK Tue Jan 15 11:05:17 2002
+++ linux-work/net/ipv4/fib_semantics.c Fri Mar 1 23:58:45 2002
@@ -35,6 +35,7 @@
 #include <linux/skbuff.h>
 #include <linux/netlink.h>
 #include <linux/init.h>
+#include <linux/spinlock.h>

 #include <net/ip.h>
 #include <net/protocol.h>
@@ -45,6 +46,8 @@

 #define FSprintk(a...)

+static spinlock_t fib_nh_lock = SPIN_LOCK_UNLOCKED;
+
 static struct fib_info *fib_info_list;
 static rwlock_t fib_info_lock = RW_LOCK_UNLOCKED;
 int fib_info_cnt;
@@ -859,6 +862,8 @@
 	if (force)
 		scope = -1;

+	spin_lock_bh(&fib_nh_lock);
+
 	for_fib_info() {
 		if (local && fi->fib_prefsrc == local) {
 			fi->fib_flags |= RTNH_F_DEAD;
@@ -885,6 +890,7 @@
 			}
 		}
 	} endfor_fib_info();
+	spin_unlock_bh(&fib_nh_lock);
 	return ret;
 }

@@ -902,6 +908,7 @@
 	if (!(dev->flags&IFF_UP))
 		return 0;

+	spin_lock_bh(&fib_nh_lock);
 	for_fib_info() {
 		int alive = 0;

@@ -924,6 +931,7 @@
 			ret++;
 		}
 	} endfor_fib_info();
+	spin_unlock_bh(&fib_nh_lock);
 	return ret;
 }

@@ -937,6 +945,7 @@
 	struct fib_info *fi = res->fi;
 	int w;

+	spin_lock_bh(&fib_nh_lock);
 	if (fi->fib_power <= 0) {
 		int power = 0;
 		change_nexthops(fi) {
@@ -949,6 +958,7 @@
 #if 1
 		if (power <= 0) {
 			printk(KERN_CRIT "impossible 777\n");
+			spin_unlock_bh(&fib_nh_lock);
 			return;
 		}
 #endif
@@ -967,10 +977,13 @@
 				nh->nh_power--;
 				fi->fib_power--;
 				res->nh_sel = nhsel;
+				spin_unlock_bh(&fib_nh_lock);
 				return;
 			}
 		}
 	} endfor_nexthops(fi);
+
+	spin_unlock_bh(&fib_nh_lock);

 #if 1
 	printk(KERN_CRIT "impossible 888\n");



-Andi

2002-03-01 23:05:54

by Andi Kleen

Subject: Re: OOPS: Multipath routing 2.4.17

On Sat, Mar 02, 2002 at 01:01:25AM +0000, Julian Anastasov wrote:
>
> How oops is reached:

[... see my other crossing mail...]
> If I understand correctly the locking (please correct me),
> we can have many threads at the same time:
>
> - many in ip_route_* calling fib_select_multipath
>
> - one in rtnetlink playing with nh_*

Yes, that's correct.

-Andi

2002-03-01 23:21:09

by Julian Anastasov

Subject: Re: OOPS: Multipath routing 2.4.17


Hello,

On 2 Mar 2002, Andi Kleen wrote:

> Short term fix would be just to add a spinlock like this (untested).

Yes, I don't see any more places. I'm just not sure
whether it should be fib_info_lock instead of fib_nh_lock; maybe not.

> I think using a new algorithm would be too risky at least for 2.4.

Yes, it seems I have to test it first.

Regards

--
Julian Anastasov <[email protected]>

2002-03-02 00:25:35

by Bryon Roche

Subject: Re: OOPS: Multipath routing 2.4.17

On Sat, Mar 02, 2002 at 12:27:40AM +0000, Julian Anastasov wrote:
> What about a different algorithm to apply weighted
> round robin (idea mostly from LVS), something like
> this code (entirely not tested) where fi->fib_power is not used
> and where fib_sync_up and fib_sync_down don't need to play
> with nh_power on nh_flags change:
>
> [function cut]

It would take me a few days I think to understand the net code enough to
code anything so I don't know if I can be of help there, but I'm willing
to put together some testcases for this, or the simple
slap-some-write-locks-in solution, and see if I can break it again.
--
"Don't dwell on reality; it will only keep you from greatness."
-- Randall McBride, Jr.
**
Reality Engineer
Bryon Roche, Kain <[email protected]>
<[email protected]>



2002-03-02 13:00:50

by Alexey Kuznetsov

Subject: Re: OOPS: Multipath routing 2.4.17

Hello!

> w = jiffies % fi->fib_power;

power = fi->fib_power;
barrier();
if (power) ...

Such things are done this way.

> write_lock(&fib_info_lock);

DO NOT DO THIS! fib_info_lock must not be acquired in this context;
it will lock up. Just add a new lock that is protected against softirqs.

Alexey
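
The read-once pattern suggested in the message above can be sketched in userspace as follows. The assumptions are loud ones: `pick_slot` is a hypothetical name, and a `volatile` read stands in for the kernel's `barrier()`, since both serve here to stop the compiler from reloading the shared value after the test. The sketch only removes the divide-by-zero window; it does nothing about two CPUs decrementing the same counters, which is why the thread also discusses a lock.

```c
#include <assert.h>

/*
 * Read the shared divisor exactly once, then test and use the local copy,
 * so the value that was checked is the value that is divided by.
 * Returns -1 if the snapshot is not positive, otherwise ticks % snapshot.
 * Hypothetical helper; 'ticks' plays the role of jiffies.
 */
long pick_slot(unsigned long ticks, const volatile int *shared_power)
{
    int power;

    assert(shared_power != 0);

    power = *shared_power;      /* single read of the shared value */
    if (power <= 0)
        return -1;              /* caller would refill or bail out */

    return (long)(ticks % (unsigned long)power);
}
```

Even if another CPU drops the shared counter to 0 right after the read, the division still uses the non-zero local copy.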

2002-03-02 13:29:08

by Julian Anastasov

Subject: Re: OOPS: Multipath routing 2.4.17


Hello,

On Sat, 2 Mar 2002 [email protected] wrote:

> > w = jiffies % fi->fib_power;
>
> power = fi->fib_power;
> barrier();
> if (power) ...
>
> Such things are done this way.

I hope you are sure about this solution for fib_select_multipath
because I'm not. IMO, the solution from Andi looks more correct
for the current scheduler.

What about the new scheduler (for 2.5?), of course, after
replacing the wrong write_lock() with spin_lock_bh(&fib_nh_powers) ?
This lock will be used only in fib_select_multipath because
fib_sync_{up,down} will not play with nh_power. It will protect
only nh_power and I hope the DEAD flag change will not make big
problems for fib_select_multipath.
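
To make the locking idea concrete, here is a userspace sketch of a credit-based weighted selection in which one lock guards only the per-nexthop power counters. It is not the function that was cut from the earlier mail. The types nh_model and route_model, the function names, and the refill policy are all assumptions, and a C11 atomic_flag spin loop stands in for spin_lock_bh(), which has no userspace equivalent.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NH_DEAD 1u

struct nh_model {
    int weight;        /* configured share of traffic */
    int power;         /* credit left in the current round */
    unsigned flags;    /* NH_DEAD when the nexthop is unusable */
};

struct route_model {
    atomic_flag lock;  /* stand-in for a softirq-safe spinlock */
    struct nh_model *nh;
    size_t nhs;
};

static void model_lock(struct route_model *r)
{
    while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
        ;  /* spin until the flag was previously clear */
}

static void model_unlock(struct route_model *r)
{
    atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Returns the index of the chosen nexthop, or -1 if none is usable. */
static int select_nexthop(struct route_model *r)
{
    int chosen = -1;

    model_lock(r);
    for (int pass = 0; pass < 2 && chosen < 0; pass++) {
        for (size_t i = 0; i < r->nhs; i++) {
            struct nh_model *nh = &r->nh[i];

            if (!(nh->flags & NH_DEAD) && nh->power > 0) {
                nh->power--;          /* spend one credit */
                chosen = (int)i;
                break;
            }
        }
        if (chosen < 0) {
            /* Round exhausted: refill credits of live nexthops only. */
            for (size_t i = 0; i < r->nhs; i++)
                r->nh[i].power =
                    (r->nh[i].flags & NH_DEAD) ? 0 : r->nh[i].weight;
        }
    }
    model_unlock(r);
    return chosen;
}
```

Because the credits are refilled inside the selector itself, code that flips the DEAD flag does not have to touch the power counters, which is the property described above.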

> Alexey

Regards

--
Julian Anastasov <[email protected]>

2002-03-02 14:37:13

by Alexey Kuznetsov

Subject: Re: OOPS: Multipath routing 2.4.17

Hello!

> What about the new scheduler (for 2.5?), of course, after
> replacing the wrong write_lock() with spin_lock_bh(&fib_nh_powers) ?

I do not see any reasons not to do this.
I remember your approach had some unacceptable issues, but 2.5 is exactly
the place to resolve them. :-)

But actually I would like to see a fix for 2.4 to begin with.
The failure with orphaned DEADs was hard bug yet. Minute...
I remember I did some work to make a minimalistic fix...

Aha! That's it. Please, look at this _carefully_. It is going
to be submitted to 2.4 and mistakes are not allowed here.
Look especially at the differences of your approach both about
medium_id and DEAD fault.

Alexey


diff -ur ../vger3-020202/linux/Documentation/networking/ip-sysctl.txt linux/Documentation/networking/ip-sysctl.txt
--- ../vger3-020202/linux/Documentation/networking/ip-sysctl.txt Sat Dec 29 22:29:46 2001
+++ linux/Documentation/networking/ip-sysctl.txt Sat Feb 2 22:58:40 2002
@@ -182,10 +188,7 @@
still did not receive an acknowledgement from connecting client.
Default value is 1024 for systems with more than 128Mb of memory,
and 128 for low memory machines. If server suffers of overload,
- try to increase this number. Warning! If you make it greater
- than 1024, it would be better to change TCP_SYNQ_HSIZE in
- include/net/tcp.h to keep TCP_SYNQ_HSIZE*16<=tcp_max_syn_backlog
- and to recompile kernel.
+ try to increase this number.

tcp_window_scaling - BOOLEAN
Enable window scaling as defined in RFC1323.
@@ -357,6 +360,17 @@
mc_forwarding - BOOLEAN
Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE
and a multicast routing daemon is required.
+
+medium_id - INTEGER
+ Integer value used to differentiate the devices by the medium they
+ are attached to. Two devices can have different id values when
+ the broadcast packets are received only on one of them.
+ The default value 0 means that the device is the only interface
+ to its medium, value of -1 means that medium is not known.
+
+ Currently, it is used to change the proxy_arp behavior:
+ the proxy_arp feature is enabled for packets forwarded between
+ two devices attached to different media.

proxy_arp - BOOLEAN
Do proxy arp.
diff -ur ../vger3-020202/linux/include/linux/inetdevice.h linux/include/linux/inetdevice.h
--- ../vger3-020202/linux/include/linux/inetdevice.h Sat Jul 28 23:03:33 2001
+++ linux/include/linux/inetdevice.h Sat Feb 2 22:58:40 2002
@@ -18,6 +18,7 @@
int mc_forwarding;
int tag;
int arp_filter;
+ int medium_id;
void *sysctl;
};

@@ -48,6 +49,7 @@
#define IN_DEV_TX_REDIRECTS(in_dev) (ipv4_devconf.send_redirects || (in_dev)->cnf.send_redirects)
#define IN_DEV_SEC_REDIRECTS(in_dev) (ipv4_devconf.secure_redirects || (in_dev)->cnf.secure_redirects)
#define IN_DEV_IDTAG(in_dev) ((in_dev)->cnf.tag)
+#define IN_DEV_MEDIUM_ID(in_dev) ((in_dev)->cnf.medium_id)

#define IN_DEV_RX_REDIRECTS(in_dev) \
((IN_DEV_FORWARD(in_dev) && \
diff -ur ../vger3-020202/linux/include/linux/sysctl.h linux/include/linux/sysctl.h
--- ../vger3-020202/linux/include/linux/sysctl.h Mon Dec 3 20:24:00 2001
+++ linux/include/linux/sysctl.h Sat Feb 2 22:58:40 2002
@@ -334,7 +336,8 @@
NET_IPV4_CONF_BOOTP_RELAY=10,
NET_IPV4_CONF_LOG_MARTIANS=11,
NET_IPV4_CONF_TAG=12,
- NET_IPV4_CONF_ARPFILTER=13
+ NET_IPV4_CONF_ARPFILTER=13,
+ NET_IPV4_CONF_MEDIUM_ID=14,
};

/* /proc/sys/net/ipv6 */
diff -ur ../vger3-020202/linux/net/ipv4/arp.c linux/net/ipv4/arp.c
--- ../vger3-020202/linux/net/ipv4/arp.c Sat Oct 13 20:56:37 2001
+++ linux/net/ipv4/arp.c Sat Feb 2 22:58:40 2002
@@ -450,6 +450,32 @@
}

/*
+ * Check if we can use proxy ARP for this path
+ */
+
+static inline int arp_fwd_proxy(struct in_device *in_dev, struct rtable *rt)
+{
+ struct in_device *out_dev;
+ int imi, omi = -1;
+
+ if (!IN_DEV_PROXY_ARP(in_dev))
+ return 0;
+
+ if ((imi = IN_DEV_MEDIUM_ID(in_dev)) == 0)
+ return 1;
+ if (imi == -1)
+ return 0;
+
+ /* place to check for proxy_arp for routes */
+
+ if ((out_dev = in_dev_get(rt->u.dst.dev)) != NULL) {
+ omi = IN_DEV_MEDIUM_ID(out_dev);
+ in_dev_put(out_dev);
+ }
+ return (omi != imi && omi != -1);
+}
+
+/*
* Interface to link layer: send routine and receive handler.
*/

@@ -768,7 +794,7 @@
} else if (IN_DEV_FORWARD(in_dev)) {
if ((rt->rt_flags&RTCF_DNAT) ||
(addr_type == RTN_UNICAST && rt->u.dst.dev != dev &&
- (IN_DEV_PROXY_ARP(in_dev) || pneigh_lookup(&arp_tbl, &tip, dev, 0)))) {
+ (arp_fwd_proxy(in_dev, rt) || pneigh_lookup(&arp_tbl, &tip, dev, 0)))) {
n = neigh_event_ns(&arp_tbl, sha, &sip, dev);
if (n)
neigh_release(n);
diff -ur ../vger3-020202/linux/net/ipv4/devinet.c linux/net/ipv4/devinet.c
--- ../vger3-020202/linux/net/ipv4/devinet.c Thu Nov 1 23:35:22 2001
+++ linux/net/ipv4/devinet.c Sat Feb 2 22:58:40 2002
@@ -1032,7 +1032,7 @@
static struct devinet_sysctl_table
{
struct ctl_table_header *sysctl_header;
- ctl_table devinet_vars[14];
+ ctl_table devinet_vars[15];
ctl_table devinet_dev[2];
ctl_table devinet_conf_dir[2];
ctl_table devinet_proto_dir[2];
@@ -1065,6 +1065,9 @@
&proc_dointvec},
{NET_IPV4_CONF_PROXY_ARP, "proxy_arp",
&ipv4_devconf.proxy_arp, sizeof(int), 0644, NULL,
+ &proc_dointvec},
+ {NET_IPV4_CONF_MEDIUM_ID, "medium_id",
+ &ipv4_devconf.medium_id, sizeof(int), 0644, NULL,
&proc_dointvec},
{NET_IPV4_CONF_BOOTP_RELAY, "bootp_relay",
&ipv4_devconf.bootp_relay, sizeof(int), 0644, NULL,
diff -ur ../vger3-020202/linux/net/ipv4/fib_frontend.c linux/net/ipv4/fib_frontend.c
--- ../vger3-020202/linux/net/ipv4/fib_frontend.c Thu Nov 1 23:35:22 2001
+++ linux/net/ipv4/fib_frontend.c Sat Feb 2 22:58:40 2002
@@ -579,6 +579,9 @@
switch (event) {
case NETDEV_UP:
fib_add_ifaddr(ifa);
+#ifdef CONFIG_IP_ROUTE_MULTIPATH
+ fib_sync_up(ifa->ifa_dev->dev);
+#endif
rt_cache_flush(-1);
break;
case NETDEV_DOWN:
diff -ur ../vger3-020202/linux/net/ipv4/fib_semantics.c linux/net/ipv4/fib_semantics.c
--- ../vger3-020202/linux/net/ipv4/fib_semantics.c Mon Jan 14 20:14:43 2002
+++ linux/net/ipv4/fib_semantics.c Sat Feb 2 22:58:40 2002
@@ -871,6 +871,10 @@
#ifdef CONFIG_IP_ROUTE_MULTIPATH
fi->fib_power -= nh->nh_power;
nh->nh_power = 0;
+ if (force && nh->nh_dev) {
+ dev_put(nh->nh_dev);
+ nh->nh_dev = NULL;
+ }
#endif
dead++;
}
@@ -905,6 +909,10 @@
if (!(nh->nh_flags&RTNH_F_DEAD)) {
alive++;
continue;
+ }
+ if (nh->nh_dev == NULL && nh->nh_oif == dev->ifindex) {
+ dev_hold(dev);
+ nh->nh_dev = dev;
}
if (nh->nh_dev == NULL || !(nh->nh_dev->flags&IFF_UP))
continue;
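
The decision made by arp_fwd_proxy() in the patch above can be restated as a small userspace function, which makes the 0 / -1 cases easy to check. This is only a model of the logic as posted: the name proxy_arp_allowed() and its plain-integer parameters are assumptions, and the caller is expected to pass -1 as out_medium_id when the output device has no IPv4 configuration, matching the initial value of omi in the patch.

```c
#include <stdbool.h>

/*
 * Model of the medium_id check:
 *   proxy_arp disabled on the receiving device -> never proxy
 *   in_medium_id == 0   -> proxy (device is the only interface to its medium)
 *   in_medium_id == -1  -> do not proxy (medium unknown)
 *   otherwise           -> proxy only if the output device sits on a
 *                          different, known medium
 */
static bool proxy_arp_allowed(bool proxy_arp_enabled,
                              int in_medium_id, int out_medium_id)
{
    if (!proxy_arp_enabled)
        return false;
    if (in_medium_id == 0)
        return true;
    if (in_medium_id == -1)
        return false;
    return out_medium_id != in_medium_id && out_medium_id != -1;
}
```

For example, two devices that both carry medium_id 1 give false (same medium), while 1 and 2 give true.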

2002-03-02 16:11:16

by Julian Anastasov

Subject: Re: OOPS: Multipath routing 2.4.17


Hello,

On Sat, 2 Mar 2002 [email protected] wrote:

> > What about the new scheduler (for 2.5?), of course, after
> > replacing the wrong write_lock() with spin_lock_bh(&fib_nh_powers) ?
>
> I do not see any reasons not to do this.
> I remember your approach had some unacceptable issues, but 2.5 is exactly
> the place to resolve them. :-)
>
> But actually I would like to see a fix for 2.4 to begin with.

OK, I'll try it soon, I assume you ack about the
new fib_select_multipath scheduler discussed in this thread.

> The failure with orphaned DEADs was hard bug yet. Minute...
> I remember I did some work to make a minimalistic fix...

Yep, I wonder why nobody complains about this problem :)
Maybe it is still not too late to fix it :)

> Aha! That's it. Please, look at this _carefully_. It is going
> to be submitted to 2.4 and mistakes are not allowed here.
> Look especially at the differences of your approach both about
> medium_id and DEAD fault.

I see, very good, you restore the nh_dev on "enable IP"
which is destroyed on "disable IP".

About medium_id, I haven't tested your variant but I see what you mean:
proxy_arp must be enabled for the receiver, distinguish 0/-1.
Sounds good, looks good, only doc changes:

- change "media" to "medium"

- should the docs also go into Documentation/filesystems/proc.txt, or is
the net part being moved out of that file?

About the FIB changes, check fib_sync_down because I see a
bad scenario, i.e. change:

if (force && nh->nh_dev) {

to

if (force && nh->nh_dev == dev && nh->nh_flags&RTNH_F_DEAD) {

and move it before endfor_nexthops because we can miss the
following sequence of events:

- enable IP
- link up
...
- link down => nh is DEAD after force=0 but with valid nh_dev!=NULL

and we come here with the new change:

- disable IP/unreg with force=1 => we miss the above event and
don't clear nh_dev

In short, nh can be already DEAD with nh_dev!=NULL and we miss it when
force is 1.

So, the new code must be something like this:

                }
+               if (force && nh->nh_dev == dev &&
+                   nh->nh_flags&RTNH_F_DEAD) {
+                       dev_put(nh->nh_dev);
+                       nh->nh_dev = NULL;
+               }
        } endfor_nexthops(fi)

There is a second variant: keep nh_dev==NULL for all DEAD nhs, but
I'm not sure about it. At least you have already added a mechanism
for the nexthops to bind again to the outdev, so this may work.
But I prefer the first solution, once fixed.
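
The missed sequence can be reproduced in a small userspace model. It assumes, as the context lines of the posted hunk suggest, that the original check sits inside the branch that handles nexthops not yet marked DEAD. All names ending in _model, the two function names, and the bare integer refcount are illustrative only; scope handling and the fib_power bookkeeping are left out.

```c
#include <stddef.h>

#define RTNH_F_DEAD 1u

struct dev_model { int refcnt; };
struct nh_model  { struct dev_model *nh_dev; unsigned nh_flags; };

static void dev_put_model(struct dev_model *d) { d->refcnt--; }

/* Check placed inside the "not yet DEAD" branch, as in the posted hunk. */
static void sync_down_inside(struct nh_model *nh, struct dev_model *dev,
                             int force)
{
    if (dev == NULL)
        return;
    if (nh->nh_flags & RTNH_F_DEAD) {
        /* already DEAD: nothing is released on this path */
    } else if (nh->nh_dev == dev) {
        nh->nh_flags |= RTNH_F_DEAD;
        if (force && nh->nh_dev) {
            dev_put_model(nh->nh_dev);
            nh->nh_dev = NULL;
        }
    }
}

/* Check moved after the if/else, with the extra conditions proposed above. */
static void sync_down_after(struct nh_model *nh, struct dev_model *dev,
                            int force)
{
    if (dev == NULL)
        return;
    if (!(nh->nh_flags & RTNH_F_DEAD) && nh->nh_dev == dev)
        nh->nh_flags |= RTNH_F_DEAD;
    if (force && nh->nh_dev == dev && (nh->nh_flags & RTNH_F_DEAD)) {
        dev_put_model(nh->nh_dev);   /* drop the reference held by the nexthop */
        nh->nh_dev = NULL;
    }
}
```

Calling either function with force=0 (link down) and then force=1 (disable IP) on the same nexthop shows the difference: the first variant leaves nh_dev set and the reference held, the second clears it.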

> Alexey

> + two devices attached to different media.

medium

Regards

--
Julian Anastasov <[email protected]>