2000-12-04 03:02:21

by Linus Torvalds

Subject: test12-pre4


Synching up with Alan and various other stuff. The most important one
being the fix to the inode dirty block list.

Linus

----

- pre4:
   - Andries Brouwer: final isofs pieces.
   - Kai Germaschewski: ISDN
   - play CD audio correctly, don't stop after 12 minutes.
   - Anton Altaparmakov: disable NTFS mmap for now, as it doesn't work.
   - Stephen Tweedie: fix inode dirty block handling
   - Bill Hartner: reschedule_idle - prefer right cpu
   - Johannes Erdfelt: USB updates
   - Alan Cox: synchronize
   - Richard Henderson: alpha updates and optimizations
   - Geert Uytterhoeven: fbdev could be fooled into crashing fix
   - Trond Myklebust: NFS filehandles in inode rather than dentry

- pre3:
   - me: more PageDirty / swapcache handling
   - Neil Brown: raid and md init fixes
   - David Brownell: pci hotplug sanitization.
   - Kanoj Sarcar: mips64 update
   - Kai Germaschewski: ISDN sync
   - Andreas Bombe: ieee1394 cleanups and fixes
   - Johannes Erdfelt: USB update
   - David Miller: Sparc and net update
   - Trond Myklebust: RPC layer SMP fixes
   - Thomas Sailer: mixed sound driver fixes
   - Tigran Aivazian: use atomic_dec_and_lock() for free_uid()

- pre2:
   - Peter Anvin: more P4 configuration parsing
   - Stephen Tweedie: O_SYNC patches. Make O_SYNC/fsync/fdatasync
     do the right thing.
   - Keith Owens: make module loading use the right struct module size
   - Boszormenyi Zoltan: get MTRR's right for the >32-bit case
   - Alan Cox: various random documentation etc
   - Dario Ballabio: EATA and u14-34f update
   - Ivan Kokshaysky: unbreak alpha ruffian
   - Richard Henderson: PCI bridge initialization on alpha
   - Zach Brown: correct locking in Maestro driver
   - Geert Uytterhoeven: more m68k updates
   - Andrey Savochkin: eepro100 update
   - Dag Brattli: irda update
   - Johannes Erdfelt: USB update

- pre1: (for ISDN synchronization _ONLY_! Not complete!)
   - Byron Stanoszek: correct decimal precision for CPU MHz in
     /proc/cpuinfo
   - Ollie Lho: SiS pirq routing.
   - Andries Brouwer: isofs cleanups
   - Matt Kraai: /proc read() on directories should return EISDIR, not EINVAL
   - me: be stricter about what we accept as a PCI bridge setup.
   - me: always set PCI interrupts to be level-triggered when we enable them.
   - me: updated PageDirty and swap cache handling
   - Peter Anvin: update A20 code to work without keyboard controller
   - Kai Germaschewski: ISDN updates
   - Russell King: ARM updates
   - Geert Uytterhoeven: m68k updates


2000-12-04 04:28:29

by Mohammad A. Haque

Subject: Re: test12-pre4

--- linux/drivers/net/dummy.c.orig Sun Dec 3 21:59:18 2000
+++ linux/drivers/net/dummy.c Sun Dec 3 22:52:13 2000
@@ -53,6 +53,8 @@

 static int __init dummy_init(struct net_device *dev)
 {
+	SET_MODULE_OWNER(dev);
+
 	/* Initialize the device structure. */
 	dev->hard_start_xmit = dummy_xmit;

@@ -100,7 +102,6 @@
 	int err;

 	dev_dummy.init = dummy_init;
-	SET_MODULE_OWNER(&dev_dummy);

 	/* Find a name for this unit */
 	err=dev_alloc_name(&dev_dummy,"dummy%d");


Attachments:
dummy-t12p4.diff (475.00 B)

2000-12-04 06:19:46

by Tom Holroyd

Subject: Re: test12-pre4

gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -mno-fp-regs -ffixed-8 -mcpu=ev6 -Wa,-mev6 -DEXPORT_SYMTAB -c pci.c
pci.c: In function `pci_read_bases':
pci.c:576: `tmp' undeclared (first use in this function)
pci.c:576: (Each undeclared identifier is reported only once
pci.c:576: for each function it appears in.)

--- drivers/pci/#pci.c Mon Dec 4 14:30:40 2000
+++ drivers/pci/pci.c Mon Dec 4 14:44:29 2000
@@ -540,7 +540,7 @@
 static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
 {
 	unsigned int pos, reg, next;
-	u32 l, sz;
+	u32 l, sz, tmp;
 	struct resource *res;

 	for(pos=0; pos<howmany; pos = next) {

2000-12-04 06:32:29

by Alexander Viro

Subject: [PATCH] inode dirty blocks Re: test12-pre4



On Sun, 3 Dec 2000, Linus Torvalds wrote:

>
> Synching up with Alan and various other stuff. The most important one
> being the fix to the inode dirty block list.

It doesn't solve the problem. If you unlink a file with dirty metadata
you have a nice chance to hit the BUG() in inode.c:83. I hope that patch
below closes all remaining holes. See analysis in previous posting
(basically, bforget() is not enough when we free the block; bh should
be removed from the inode's list regardless of the ->b_count).
Cheers,
Al

diff -urN rc12-pre4/fs/buffer.c rc12-pre4-dirty_blocks/fs/buffer.c
--- rc12-pre4/fs/buffer.c Mon Dec 4 01:01:43 2000
+++ rc12-pre4-dirty_blocks/fs/buffer.c Mon Dec 4 01:11:42 2000
@@ -1164,6 +1164,31 @@
 }

 /*
+ * Call it when you are going to free the block. The difference between
+ * that and bforget() is that we remove the thing from inode queue
+ * unconditionally.
+ */
+void bforget_inode(struct buffer_head * buf)
+{
+	/* grab the lru lock here to block bdflush. */
+	spin_lock(&lru_list_lock);
+	write_lock(&hash_table_lock);
+	remove_inode_queue(buf);
+	if (!atomic_dec_and_test(&buf->b_count) || buffer_locked(buf))
+		goto in_use;
+	__hash_unlink(buf);
+	write_unlock(&hash_table_lock);
+	__remove_from_lru_list(buf, buf->b_list);
+	spin_unlock(&lru_list_lock);
+	put_last_free(buf);
+	return;
+
+ in_use:
+	write_unlock(&hash_table_lock);
+	spin_unlock(&lru_list_lock);
+}
+
+/*
  * bread() reads a specified block and returns the buffer that contains
  * it. It returns NULL if the block was unreadable.
  */
@@ -1460,6 +1485,9 @@
 		clear_bit(BH_Mapped, &bh->b_state);
 		clear_bit(BH_Req, &bh->b_state);
 		clear_bit(BH_New, &bh->b_state);
+		spin_lock(&lru_list_lock);
+		remove_inode_queue(bh);
+		spin_unlock(&lru_list_lock);
 	}
 }

diff -urN rc12-pre4/fs/ext2/inode.c rc12-pre4-dirty_blocks/fs/ext2/inode.c
--- rc12-pre4/fs/ext2/inode.c Mon Dec 4 01:01:43 2000
+++ rc12-pre4-dirty_blocks/fs/ext2/inode.c Mon Dec 4 01:13:10 2000
@@ -416,7 +416,7 @@

 	/* Allocation failed, free what we already allocated */
 	for (i = 1; i < n; i++)
-		bforget(branch[i].bh);
+		bforget_inode(branch[i].bh);
 	for (i = 0; i < n; i++)
 		ext2_free_blocks(inode, le32_to_cpu(branch[i].key), 1);
 	return err;
@@ -484,7 +484,7 @@

 changed:
 	for (i = 1; i < num; i++)
-		bforget(where[i].bh);
+		bforget_inode(where[i].bh);
 	for (i = 0; i < num; i++)
 		ext2_free_blocks(inode, le32_to_cpu(where[i].key), 1);
 	return -EAGAIN;
@@ -854,7 +854,7 @@
 					   (u32*)bh->b_data,
 					   (u32*)bh->b_data + addr_per_block,
 					   depth);
-			bforget(bh);
+			bforget_inode(bh);
 			/* Writer: ->i_blocks */
 			inode->i_blocks -= inode->i_sb->s_blocksize / 512;
 			/* Writer: end */
diff -urN rc12-pre4/include/linux/fs.h rc12-pre4-dirty_blocks/include/linux/fs.h
--- rc12-pre4/include/linux/fs.h Mon Dec 4 01:01:47 2000
+++ rc12-pre4-dirty_blocks/include/linux/fs.h Mon Dec 4 01:12:03 2000
@@ -1201,6 +1201,7 @@
 		__brelse(buf);
 }
 extern void __bforget(struct buffer_head *);
+extern void bforget_inode(struct buffer_head *);
 static inline void bforget(struct buffer_head *buf)
 {
 	if (buf)
diff -urN rc12-pre4/kernel/ksyms.c rc12-pre4-dirty_blocks/kernel/ksyms.c
--- rc12-pre4/kernel/ksyms.c Mon Dec 4 01:01:49 2000
+++ rc12-pre4-dirty_blocks/kernel/ksyms.c Mon Dec 4 01:12:19 2000
@@ -188,6 +188,7 @@
 EXPORT_SYMBOL(breada);
 EXPORT_SYMBOL(__brelse);
 EXPORT_SYMBOL(__bforget);
+EXPORT_SYMBOL(bforget_inode);
 EXPORT_SYMBOL(ll_rw_block);
 EXPORT_SYMBOL(__wait_on_buffer);
 EXPORT_SYMBOL(___wait_on_page);
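
Note on the patch above (an illustration, not part of the original posting): the point about ->b_count can be shown with a small user-space model. The sketch assumes, as the message describes, that bforget() only detaches a buffer when it drops the last reference, while bforget_inode() takes the buffer off the inode's list first. The names buf_model, model_bforget(), model_bforget_inode() and off_list_after() are hypothetical stand-ins; locking, hashing and buffer_locked() are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a buffer_head. */
struct buf_model {
	int count;          /* models bh->b_count */
	bool on_inode_list; /* models membership of the inode's dirty list */
	bool freed;         /* models the buffer going back to the free list */
};

/* Models bforget(): the buffer leaves the inode's list only when the
 * caller held the last reference. */
static void model_bforget(struct buf_model *b)
{
	assert(b->count > 0);
	if (--b->count == 0) {
		b->on_inode_list = false;
		b->freed = true;
	}
}

/* Models bforget_inode(): leave the inode's list unconditionally,
 * then free the buffer only if nobody else holds it. */
static void model_bforget_inode(struct buf_model *b)
{
	assert(b->count > 0);
	b->on_inode_list = false;
	if (--b->count == 0)
		b->freed = true;
}

/* Returns true if the buffer is off the inode's list after "forget"
 * runs on a buffer that starts with "refs" references. */
static bool off_list_after(void (*forget)(struct buf_model *), int refs)
{
	struct buf_model b = { refs, true, false };

	forget(&b);
	return !b.on_inode_list;
}
```

In this model, off_list_after(model_bforget, 2) is false: with a second reference outstanding the buffer stays on the inode's list, which is the state a later list_empty() check on the inode would trip over. off_list_after(model_bforget_inode, 2) is true.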

2000-12-04 08:34:05

by Jeff Garzik

Subject: Re: test12-pre4



On Sun, 3 Dec 2000, Mohammad A. Haque wrote:

> Was borking on dummy.c. This seemed to fix it. Verification please?
>
> gcc -D__KERNEL__ -I/usr/src/linux-2.4.0-test11/include -Wall
> -Wstrict-prototypes -O6 -fomit-frame-pointer -fno-strict-aliasing -pipe
> -mpreferred-stack-boundary=2 -march=i686 -DMODULE -DMODVERSIONS -include
> /usr/src/linux-2.4.0-test11/include/linux/modversions.h -c -o dummy.o
> dummy.c
> dummy.c: In function `dummy_init_module':
> dummy.c:103: invalid type argument of `->'
> make[2]: *** [dummy.o] Error 1
>

No, module.h needs fixing. I guess I didn't send that one to Alan...
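
Note (an illustration, not part of the original thread): the error text is what gcc prints when `->` is applied to something that is not a pointer. One plausible cause, assuming SET_MODULE_OWNER() used its argument without parentheses at the time, is that SET_MODULE_OWNER(&dev_dummy) expanded to &dev_dummy->owner, which parses as &(dev_dummy->owner). The sketch below uses hypothetical names (module_model, netdev_model, SET_OWNER, owner_after_set) to show the parenthesized form, which accepts both a plain pointer and an address-of expression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct module and struct net_device. */
struct module_model { int unused; };
struct netdev_model { struct module_model *owner; };

static struct module_model this_module_model;
static struct netdev_model dev_model;

/*
 * Unparenthesized form, kept as a comment because it does not compile
 * when called as SET_OWNER_BAD(&dev_model): the body becomes
 * "&dev_model->owner = ...", and dev_model is a struct, not a pointer.
 *
 * #define SET_OWNER_BAD(s) do { s->owner = &this_module_model; } while (0)
 */

/* Parenthesized form: the argument is evaluated as a unit first. */
#define SET_OWNER(s) do { (s)->owner = &this_module_model; } while (0)

/* Applies the macro to an address-of expression and returns the result. */
static struct module_model *owner_after_set(void)
{
	dev_model.owner = NULL;
	SET_OWNER(&dev_model);
	assert(dev_model.owner != NULL);
	return dev_model.owner;
}
```

This would also explain why moving the call into dummy_init(), where dev is already a pointer, made the build pass without touching the header.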

2000-12-04 12:52:04

by Alan

Subject: Re: test12-pre4

> Was borking on dummy.c. This seemed to fix it. Verification please?
>
> gcc -D__KERNEL__ -I/usr/src/linux-2.4.0-test11/include -Wall
> -Wstrict-prototypes -O6 -fomit-frame-pointer -fno-strict-aliasing -pipe
> -mpreferred-stack-boundary=2 -march=i686 -DMODULE -DMODVERSIONS -include
> /usr/src/linux-2.4.0-test11/include/linux/modversions.h -c -o dummy.o
> dummy.c
> dummy.c: In function `dummy_init_module':
> dummy.c:103: invalid type argument of `->'
> make[2]: *** [dummy.o] Error 1

Can you send me your .config and I'll double check this. "It built for me
before I sent it to Linus, honest"

2000-12-04 12:55:44

by Alan

Subject: Re: test12-pre4

> > dummy.c: In function `dummy_init_module':
> > dummy.c:103: invalid type argument of `->'
> > make[2]: *** [dummy.o] Error 1
>
> No, module.h needs fixing. I guess I didn't send that one to Alan...

You did send it, you just didn't tell me the dummy patch depended on it
and that I needed to send both together 8)


2000-12-04 12:58:34

by Nikhil Goel

Subject: Re: test12-pre4

#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
CONFIG_M686FXSR=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_FXSR=y
CONFIG_X86_XMM=y
# CONFIG_TOSHIBA is not set
CONFIG_MICROCODE=m
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
CONFIG_MTRR=y
# CONFIG_SMP is not set
# CONFIG_X86_UP_IOAPIC is not set

#
# General setup
#
CONFIG_NET=y
# CONFIG_VISWS is not set
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
CONFIG_HOTPLUG=y

#
# PCMCIA/CardBus support
#
# CONFIG_PCMCIA is not set
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
# CONFIG_KCORE_AOUT is not set
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
# CONFIG_PM is not set
# CONFIG_ACPI_INTERPRETER is not set
# CONFIG_ACPI_S1_SLEEP is not set
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_RTC_IS_GMT is not set
# CONFIG_APM_ALLOW_INTS is not set
# CONFIG_APM_REAL_MODE_POWER_OFF is not set

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_AMIGA is not set
# CONFIG_PARPORT_MFC3 is not set
# CONFIG_PARPORT_ATARI is not set
# CONFIG_PARPORT_SUNBPP is not set
# CONFIG_PARPORT_OTHER is not set
# CONFIG_PARPORT_1284 is not set

#
# Plug and Play configuration
#
CONFIG_PNP=y
# CONFIG_ISAPNP is not set

#
# Block devices
#
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_LOOP is not set
CONFIG_BLK_DEV_NBD=m
# CONFIG_BLK_DEV_RAM is not set

#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_NETLINK=y
# CONFIG_RTNETLINK is not set
CONFIG_NETLINK_DEV=m
CONFIG_NETFILTER=y
CONFIG_NETFILTER_DEBUG=y
CONFIG_FILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_INET_ECN is not set
# CONFIG_SYN_COOKIES is not set

#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=y
CONFIG_IP_NF_FTP=y
CONFIG_IP_NF_QUEUE=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_UNCLEAN=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_MIRROR=m
CONFIG_IP_NF_NAT=y
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_LOG=m
# CONFIG_IPV6 is not set
CONFIG_KHTTPD=y
# CONFIG_ATM is not set

#
#
#
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_DECNET is not set
CONFIG_BRIDGE=m
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_LLC is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_FASTROUTE is not set
# CONFIG_NET_HW_FLOWCONTROL is not set

#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set

#
# ATA/IDE/MFM/RLL support
#
CONFIG_IDE=y

#
# IDE, ATA and ATAPI Block devices
#
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_HD_IDE is not set
# CONFIG_BLK_DEV_HD is not set
CONFIG_BLK_DEV_IDEDISK=y
# CONFIG_IDEDISK_MULTI_MODE is not set
# CONFIG_BLK_DEV_IDEDISK_VENDOR is not set
# CONFIG_BLK_DEV_COMMERIAL is not set
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set

#
# IDE chipset support/bugfixes
#
CONFIG_BLK_DEV_CMD640=y
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
CONFIG_BLK_DEV_RZ1000=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_OFFBOARD is not set
# CONFIG_IDEDMA_PCI_AUTO is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_PCI_WIP is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD7409 is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_PDC202XX is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_CHIPSETS is not set
# CONFIG_IDEDMA_AUTO is not set
# CONFIG_IDEDMA_IVB is not set
# CONFIG_DMA_NONPCI is not set
CONFIG_BLK_DEV_IDE_MODES=y

#
# SCSI support
#
# CONFIG_SCSI is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set

#
# I2O device support
#
# CONFIG_I2O is not set

#
# Network device support
#
CONFIG_NETDEVICES=y

#
# ARCnet devices
#
# CONFIG_ARCNET is not set
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
CONFIG_ETHERTAP=y
# CONFIG_NET_SB1000 is not set

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_NET_VENDOR_3COM=y
# CONFIG_EL1 is not set
# CONFIG_EL2 is not set
# CONFIG_ELPLUS is not set
# CONFIG_EL16 is not set
# CONFIG_EL3 is not set
# CONFIG_3C515 is not set
CONFIG_VORTEX=y
# CONFIG_LANCE is not set
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_NET_VENDOR_RACAL is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
# CONFIG_NET_ISA is not set
# CONFIG_NET_PCI is not set
# CONFIG_NET_POCKET is not set

#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_SK98LIN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
CONFIG_PPP=y
# CONFIG_PPP_MULTILINK is not set
CONFIG_PPP_ASYNC=y
CONFIG_PPP_SYNC_TTY=y
# CONFIG_PPP_DEFLATE is not set
# CONFIG_PPP_BSDCOMP is not set
# CONFIG_PPPOE is not set
# CONFIG_SLIP is not set

#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set

#
# Token Ring devices
#
# CONFIG_TR is not set
# CONFIG_NET_FC is not set
# CONFIG_RCPCI is not set
# CONFIG_SHAPER is not set

#
# Wan interfaces
#
# CONFIG_WAN is not set

#
# Amateur Radio support
#
# CONFIG_HAMRADIO is not set

#
# IrDA (infrared) support
#
# CONFIG_IRDA is not set

#
# ISDN subsystem
#
# CONFIG_ISDN is not set

#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set

#
# Input core support
#
# CONFIG_INPUT is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
# CONFIG_SERIAL_CONSOLE is not set
# CONFIG_SERIAL_EXTENDED is not set
# CONFIG_SERIAL_NONSTANDARD is not set
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256
# CONFIG_PRINTER is not set
# CONFIG_PPDEV is not set

#
# I2C support
#
# CONFIG_I2C is not set

#
# Mice
#
# CONFIG_BUSMOUSE is not set
CONFIG_MOUSE=y
CONFIG_PSMOUSE=y
# CONFIG_82C710_MOUSE is not set
# CONFIG_PC110_PAD is not set

#
# Joysticks
#

#
# Game port support
#

#
# Gameport joysticks
#

#
# Serial port support
#

#
# Serial port joysticks
#

#
# Parallel port joysticks
#
# CONFIG_QIC02_TAPE is not set

#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
CONFIG_INTEL_RNG=y
# CONFIG_NVRAM is not set
# CONFIG_RTC is not set
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set

#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=y
# CONFIG_AGP_INTEL is not set
CONFIG_AGP_I810=y
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_ALI is not set
CONFIG_DRM=y
CONFIG_DRM_TDFX=y
# CONFIG_DRM_GAMMA is not set
# CONFIG_DRM_R128 is not set
CONFIG_DRM_I810=y
# CONFIG_DRM_MGA is not set

#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set

#
# File systems
#
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=y
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_BFS_FS is not set
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=m
# CONFIG_UMSDOS_FS is not set
CONFIG_VFAT_FS=m
# CONFIG_EFS_FS is not set
CONFIG_JFFS_FS_VERBOSE=0
# CONFIG_CRAMFS is not set
# CONFIG_RAMFS is not set
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
# CONFIG_MINIX_FS is not set
CONFIG_NTFS_FS=m
# CONFIG_NTFS_RW is not set
# CONFIG_HPFS_FS is not set
CONFIG_PROC_FS=y
# CONFIG_DEVFS_FS is not set
CONFIG_DEVPTS_FS=y
# CONFIG_QNX4FS_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_EXT2_FS=y
# CONFIG_SYSV_FS is not set
# CONFIG_UDF_FS is not set
# CONFIG_UFS_FS is not set

#
# Network File Systems
#
# CONFIG_CODA_FS is not set
CONFIG_NFS_FS=y
# CONFIG_NFS_V3 is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_SMB_FS=y
# CONFIG_SMB_NLS_DEFAULT is not set
# CONFIG_NCP_FS is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y

#
# Native Language Support
#
CONFIG_NLS_DEFAULT="iso8859-1"
# CONFIG_NLS_CODEPAGE_437 is not set
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_ISO8859_1 is not set
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_UTF8 is not set

#
# Console drivers
#
CONFIG_VGA_CONSOLE=y
# CONFIG_VIDEO_SELECT is not set
# CONFIG_MDA_CONSOLE is not set

#
# Frame-buffer support
#
# CONFIG_FB is not set

#
# Sound
#
CONFIG_SOUND=y
# CONFIG_SOUND_CMPCI is not set
# CONFIG_SOUND_EMU10K1 is not set
# CONFIG_SOUND_FUSION is not set
# CONFIG_SOUND_CS4281 is not set
# CONFIG_SOUND_ES1370 is not set
CONFIG_SOUND_ES1371=y
# CONFIG_SOUND_ESSSOLO1 is not set
# CONFIG_SOUND_MAESTRO is not set
# CONFIG_SOUND_SONICVIBES is not set
# CONFIG_SOUND_TRIDENT is not set
# CONFIG_SOUND_MSNDCLAS is not set
# CONFIG_SOUND_MSNDPIN is not set
# CONFIG_SOUND_VIA82CXXX is not set
CONFIG_SOUND_OSS=y
CONFIG_SOUND_TRACEINIT=y
CONFIG_SOUND_DMAP=y
CONFIG_SOUND_AD1816=m
# CONFIG_SOUND_SGALAXY is not set
# CONFIG_SOUND_ADLIB is not set
# CONFIG_SOUND_ACI_MIXER is not set
# CONFIG_SOUND_CS4232 is not set
# CONFIG_SOUND_SSCAPE is not set
# CONFIG_SOUND_GUS is not set
CONFIG_SOUND_ICH=y
# CONFIG_SOUND_VMIDI is not set
# CONFIG_SOUND_TRIX is not set
# CONFIG_SOUND_MSS is not set
# CONFIG_SOUND_MPU401 is not set
# CONFIG_SOUND_NM256 is not set
# CONFIG_SOUND_MAD16 is not set
# CONFIG_SOUND_PAS is not set
# CONFIG_SOUND_PSS is not set
# CONFIG_SOUND_SB is not set
# CONFIG_SOUND_AWE32_SYNTH is not set
# CONFIG_SOUND_WAVEFRONT is not set
# CONFIG_SOUND_MAUI is not set
# CONFIG_SOUND_YM3812 is not set
# CONFIG_SOUND_OPL3SA1 is not set
# CONFIG_SOUND_OPL3SA2 is not set
# CONFIG_SOUND_YMPCI is not set
# CONFIG_SOUND_UART6850 is not set
# CONFIG_SOUND_AEDSP16 is not set

#
# USB support
#
CONFIG_USB=y
CONFIG_USB_DEBUG=y

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
CONFIG_USB_BANDWIDTH=y

#
# USB Controllers
#
CONFIG_USB_UHCI_ALT=y
# CONFIG_USB_OHCI is not set

#
# USB Device Class drivers
#
CONFIG_USB_AUDIO=m
# CONFIG_USB_BLUETOOTH is not set
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_FREECOM is not set
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m

#
# USB Human Interface Devices (HID)
#

#
# Input core support is needed for USB HID
#

#
# USB Imaging devices
#
# CONFIG_USB_DC2XX is not set
# CONFIG_USB_MDC800 is not set
CONFIG_USB_SCANNER=m

#
# USB Multimedia devices
#
# CONFIG_USB_DABUSB is not set

#
# USB Network adaptors
#
# CONFIG_USB_PLUSB is not set
CONFIG_USB_PEGASUS=y
# CONFIG_USB_NET1080 is not set

#
# USB port drivers
#
# CONFIG_USB_USS720 is not set

#
# USB Serial Converter support
#
CONFIG_USB_SERIAL=m
# CONFIG_USB_SERIAL_DEBUG is not set
# CONFIG_USB_SERIAL_GENERIC is not set
# CONFIG_USB_SERIAL_BELKIN is not set
# CONFIG_USB_SERIAL_WHITEHEAT is not set
# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set
# CONFIG_USB_SERIAL_EMPEG is not set
# CONFIG_USB_SERIAL_FTDI_SIO is not set
# CONFIG_USB_SERIAL_VISOR is not set
# CONFIG_USB_SERIAL_KEYSPAN_PDA is not set
# CONFIG_USB_SERIAL_KEYSPAN is not set
# CONFIG_USB_SERIAL_OMNINET is not set

#
# USB misc drivers
#
# CONFIG_USB_RIO500 is not set

#
# Kernel hacking
#
# CONFIG_MAGIC_SYSRQ is not set


Attachments:
.config (14.13 kB)

2000-12-04 13:52:31

by Andrew Morton

Subject: Re: [PATCH] inode dirty blocks Re: test12-pre4

Alexander Viro wrote:
>
> On Sun, 3 Dec 2000, Linus Torvalds wrote:
>
> >
> > Synching up with Alan and various other stuff. The most important one
> > being the fix to the inode dirty block list.
>
> It doesn't solve the problem. If you unlink a file with dirty metadata
> you have a nice chance to hit the BUG() in inode.c:83. I hope that patch
> below closes all remaining holes. See analysis in previous posting
> (basically, bforget() is not enough when we free the block; bh should
> be removed from the inode's list regardless of the ->b_count).

It's still happening.

The good news is that I have a machine which does it within 15-30
minutes. The only interesting difference is that it has 64 megs
of RAM (all the others are 256) and its IDE system is a lot slower.

I removed the UnlockPage() at line 623 of vmscan.c because that's
causing assertion failures at swap.c:271.

I changed destroy_inode() thusly:

static void destroy_inode(struct inode *inode)
{
	if (!list_empty(&inode->i_dirty_buffers)) {
		struct task_struct *tsk;

		printk("&inode->i_dirty_buffers=0x%p\n", &inode->i_dirty_buffers);
		printk("next=0x%p\n", inode->i_dirty_buffers.next);
		printk("prev=0x%p\n", inode->i_dirty_buffers.prev);

		read_lock(&tasklist_lock);
		for_each_task(tsk) {
			printk("[%s]\n", tsk->comm);
			show_stack(tsk->thread.esp);
			printk("\n\n");
		}
		read_unlock(&tasklist_lock);

		BUG();
	}
	kmem_cache_free(inode_cachep, (inode));
}

So we get a full task dump. Otherwise this is vanilla test12-pre4 plus
your bforget_inode() patch. SMP kernel on UP hardware, so this is a
good snapshot of system state.

First the BUG trace:

&inode->i_dirty_buffers=0xc1cc6c98
next=0xc03e9a78
prev=0xc03e9a78

Trace; c021b8c5 <tvecs+5a3d/1a358>
Trace; c021b9c2 <tvecs+5b3a/1a358>
Trace; c0146a86 <iput+18e/194>
Trace; c01451a6 <d_delete+66/ac>
Trace; c013df5d <vfs_unlink+18d/1c0>
Trace; c013e035 <sys_unlink+a5/118>
Trace; c0108fdf <system_call+33/38>

That's the same as before. Now some other interesting tasks.
The fourth dbench may be interesting? And look at what
kswapd is doing.


[dbench]
Trace; c012bf9b <wakeup_kswapd+b3/d0>
Trace; c012cc9e <__alloc_pages+246/2f8>
Trace; c012671c <generic_file_write+270/454>
Trace; c0144589 <dput+19/174>
Trace; c013092a <sys_write+8e/c4>

[dbench]
Trace; c012bf9b <wakeup_kswapd+b3/d0>
Trace; c012cc9e <__alloc_pages+246/2f8>
Trace; c012671c <generic_file_write+270/454>
Trace; c0144589 <dput+19/174>
Trace; c013092a <sys_write+8e/c4>

[dbench]
Trace; c01c1557 <vgacon_cursor+1df/1e8>
Trace; c018a34e <set_cursor+6e/80>
Trace; c0118851 <printk+1a1/1b0>
Trace; c0118851 <printk+1a1/1b0>
Trace; c4800000 <_end+44e2e4c/10503eac>
Trace; c0109324 <show_stack+d4/e8>
Trace; c020c2c3 <stext_lock+759f/843c>
Trace; c020c2c3 <stext_lock+759f/843c>
Trace; c0145875 <destroy_inode+75/c4>
Trace; c021b9b9 <tvecs+5b31/1a358>
Trace; c0146a86 <iput+18e/194>
Trace; c01451a6 <d_delete+66/ac>
Trace; c013df5d <vfs_unlink+18d/1c0>
Trace; c010002b <startup_32+2b/cb>

(This is `current')

[dbench]
Trace; c01317fd <__wait_on_buffer+4d/e0>
Trace; c0132cd9 <bread+45/70>
Trace; c0156d6c <ext2_read_inode+104/3ec>
Trace; c01465a8 <get_new_inode+cc/178>
Trace; c014685d <iget4+f5/100>


[kupdate]
Trace; c01156ea <schedule_timeout+7a/9c>
Trace; c0115614 <process_timeout+0/5c>
Trace; c0135304 <kupdate+a4/110>
Trace; c01074c3 <kernel_thread+23/30>

[bdflush]
Trace; c0135255 <bdflush+135/140>
Trace; c01074c3 <kernel_thread+23/30>

[kreclaimd]
Trace; c0116209 <interruptible_sleep_on+4d/80>
Trace; c012c03f <kreclaimd+5b/dc>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c01074c3 <kernel_thread+23/30>

[kswapd]
Trace; c01317fd <__wait_on_buffer+4d/e0>
Trace; c0132cd9 <bread+45/70>
Trace; c0157180 <ext2_update_inode+12c/408>
Trace; c015748e <ext2_write_inode+32/6c>
Trace; c0145d59 <sync_all_inodes+11d/168>
Trace; c0146221 <prune_icache+31/124>
Trace; c0146335 <shrink_icache_memory+21/30>
Trace; c012bd3b <do_try_to_free_pages+5b/88>
Trace; c0217b71 <tvecs+1ce9/1a358>
Trace; c012bdf6 <kswapd+8e/180>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c01074c3 <kernel_thread+23/30>
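
Note (an illustration, not part of the original message): in the dump above, next and prev are equal and differ from the address of the list head. For a circular doubly linked list of the list_head kind, that is the shape of a list holding exactly one entry. The sketch below is a user-space model with hypothetical names (list_model and its helpers), not the kernel's list.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of a circular doubly linked list node. */
struct list_model {
	struct list_model *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void list_model_init(struct list_model *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_model_empty(const struct list_model *head)
{
	return head->next == head;
}

/* Inserts "entry" right after "head". */
static void list_model_add(struct list_model *entry, struct list_model *head)
{
	assert(entry != head);
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* True when the list is non-empty and next == prev, i.e. one entry. */
static bool list_model_single(const struct list_model *head)
{
	return !list_model_empty(head) && head->next == head->prev;
}

/* Builds a list with "entries" nodes and reports whether it has the
 * one-entry shape. */
static bool single_after_adding(int entries)
{
	struct list_model head, nodes[4];
	int i;

	assert(entries >= 0 && entries <= 4);
	list_model_init(&head);
	for (i = 0; i < entries; i++)
		list_model_add(&nodes[i], &head);
	return list_model_single(&head);
}
```

Read this way, the inode being destroyed had one buffer left on its i_dirty_buffers list.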

2000-12-04 14:20:09

by Alexander Viro

Subject: Re: [PATCH] inode dirty blocks

OK, guys, I think I've got it:

static int ext2_update_inode(struct inode * inode, int do_sync)
{
	...
	mark_buffer_dirty_inode(bh, inode);
	...
}

Yes, that's right. bh of piece of inode table is put on inode's list.
Fix: in ext2/inode.c 1211s/mark_buffer_dirty_inode/mark_buffer_dirty/

HTH,
Al
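
Note (an illustration, not part of the original message): the diagnosis can be modelled in user space. A buffer that holds part of the inode table is dirtied on behalf of an inode but is not released when that inode goes away, so putting it on the inode's private list leaves the list non-empty at destroy time. The names inode_model, itable_buf, model_mark_dirty() and model_mark_dirty_inode() are hypothetical; the model tracks list membership only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an in-core inode. */
struct inode_model {
	int dirty_buffers; /* number of buffers on this inode's list */
};

/* Hypothetical stand-in for the buffer holding an inode table block. */
struct itable_buf {
	bool dirty;
	struct inode_model *queued_on; /* NULL when on no inode's list */
};

/* Models mark_buffer_dirty(): dirty the buffer, touch no inode list. */
static void model_mark_dirty(struct itable_buf *b)
{
	b->dirty = true;
}

/* Models mark_buffer_dirty_inode(): dirty the buffer and queue it on
 * the given inode's list. */
static void model_mark_dirty_inode(struct itable_buf *b,
				   struct inode_model *inode)
{
	b->dirty = true;
	if (b->queued_on == NULL) {
		b->queued_on = inode;
		inode->dirty_buffers++;
	}
}

/* Models the list_empty() check made before an inode is destroyed. */
static bool model_can_destroy(const struct inode_model *inode)
{
	assert(inode->dirty_buffers >= 0);
	return inode->dirty_buffers == 0;
}

/* Writes the inode back one way or the other, then asks whether the
 * inode could be destroyed without tripping the check. */
static bool destroy_ok_after_update(bool use_inode_list)
{
	struct inode_model inode = { 0 };
	struct itable_buf table_block = { false, NULL };

	if (use_inode_list)
		model_mark_dirty_inode(&table_block, &inode);
	else
		model_mark_dirty(&table_block);
	return model_can_destroy(&inode);
}
```

In the model, destroy_ok_after_update(true) is false and destroy_ok_after_update(false) is true, matching the one-line substitution proposed above.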

2000-12-04 18:49:41

by Stephen C. Tweedie

Subject: Re: [PATCH] inode dirty blocks Re: test12-pre4

On Mon, Dec 04, 2000 at 01:01:36AM -0500, Alexander Viro wrote:
>
> It doesn't solve the problem. If you unlink a file with dirty metadata
> you have a nice chance to hit the BUG() in inode.c:83. I hope that patch
> below closes all remaining holes. See analysis in previous posting
> (basically, bforget() is not enough when we free the block; bh should
> be removed from the inode's list regardless of the ->b_count).

Agreed. However, is there any reason to have this as a separate
function? bforget() should _always_ remove the buffer from any inode
queue. You can make that operation conditional on (bh->b_inode !=
NULL) if you want to avoid taking the lru lock unnecessarily.

--Stephen
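
Note (an illustration, not part of the original message): the suggestion is to test bh->b_inode before taking the lock, so that buffers on no inode queue skip it. A minimal user-space sketch follows, with a counter standing in for spin_lock(&lru_list_lock) and hypothetical names (queued_buf, model_remove_inode_queue, locks_taken_for). Whether an unlocked test is safe in the real code depends on locking rules this model does not capture.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the one buffer_head field that matters here. */
struct queued_buf {
	void *b_inode; /* non-NULL while the buffer sits on an inode queue */
};

static int lock_acquisitions; /* counts how often the "lock" is taken */

/* Removes the buffer from its inode queue, taking the lock only when
 * the buffer is actually queued. */
static void model_remove_inode_queue(struct queued_buf *b)
{
	if (b->b_inode == NULL)
		return;          /* fast path: nothing queued, no lock */

	lock_acquisitions++;     /* stands in for spin_lock() */
	b->b_inode = NULL;       /* stands in for unlinking from the queue */
	assert(b->b_inode == NULL);
	/* spin_unlock() would go here */
}

/* Returns how many times the lock was taken for one call on a buffer
 * that is queued (non-zero argument) or not queued (zero). */
static int locks_taken_for(int queued)
{
	int dummy_inode = 0;
	struct queued_buf b = { queued ? (void *)&dummy_inode : NULL };
	int before = lock_acquisitions;

	model_remove_inode_queue(&b);
	return lock_acquisitions - before;
}
```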

2000-12-04 20:25:34

by Alexander Viro

Subject: Re: [PATCH] inode dirty blocks Re: test12-pre4



On Mon, 4 Dec 2000, Stephen C. Tweedie wrote:

> Agreed. However, is there any reason to have this as a separate
> function? bforget() should _always_ remove the buffer from any inode
> queue. You can make that operation conditional on (bh->b_inode !=
> NULL) if you want to avoid taking the lru lock unnecessarily.

I doubt it. bforget() is called, for example, when we deal with the
changed branch in ext2_get_block() (the thing had been partially read,
but then we've noticed that it had been changed under us). And I don't
think that brelse() would be a good thing there...

2000-12-05 02:14:50

by Andrew Morton

Subject: Re: [PATCH] inode dirty blocks

Alexander Viro wrote:
>
> OK, guys, I think I've got it:

Yes, you have.

Two machines, four hours, zero failures.

This is with

- test12-pre4
- aviro bforget patch
- UnlockPage() removed from vmscan.c:623
- and


--- linux-2.4.0-test12-pre4/fs/ext2/inode.c Mon Dec 4 21:07:12 2000
+++ linux-akpm/fs/ext2/inode.c Tue Dec 5 08:46:38 2000
@@ -1208,7 +1208,7 @@
 		raw_inode->i_block[0] = cpu_to_le32(kdev_t_to_nr(inode->i_rdev));
 	else for (block = 0; block < EXT2_N_BLOCKS; block++)
 		raw_inode->i_block[block] = inode->u.ext2_i.i_data[block];
-	mark_buffer_dirty_inode(bh, inode);
+	mark_buffer_dirty(bh);
 	if (do_sync) {
 		ll_rw_block (WRITE, 1, &bh);
 		wait_on_buffer (bh);

2000-12-05 03:12:52

by Linus Torvalds

Subject: Re: [PATCH] inode dirty blocks



On Tue, 5 Dec 2000, Andrew Morton wrote:
>
> - test12-pre4
> - aviro bforget patch

Is the bforget patch really needed?

If clear_inode() gets rid of dirty buffers, I don't see how new dirty
buffers can magically appear. I may have missed part of the discussion on
all this.

I think that the second patch from Al (the inode dirty meta-data) is the
_real_ fix, and I don't see why the bforget changes should matter.

Linus

2000-12-05 03:20:12

by Mohammad A. Haque

Subject: Re: [PATCH] inode dirty blocks

Cool. Anyone have a unified patch against pre4 or should I start
digging through my mail? =)

Andrew Morton wrote:
> This is with
>
> - test12-pre4
> - aviro bforget patch
> - UnlockPage() removed from vmscan.c:623
> - and
>
> --- linux-2.4.0-test12-pre4/fs/ext2/inode.c Mon Dec 4 21:07:12 2000
> +++ linux-akpm/fs/ext2/inode.c Tue Dec 5 08:46:38 2000
> @@ -1208,7 +1208,7 @@
> raw_inode->i_block[0] = cpu_to_le32(kdev_t_to_nr(inode->i_rdev));
> else for (block = 0; block < EXT2_N_BLOCKS; block++)
> raw_inode->i_block[block] = inode->u.ext2_i.i_data[block];
> - mark_buffer_dirty_inode(bh, inode);
> + mark_buffer_dirty(bh);
> if (do_sync) {
> ll_rw_block (WRITE, 1, &bh);
> wait_on_buffer (bh);

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-12-05 04:01:51

by Alexander Viro

Subject: Re: [PATCH] inode dirty blocks



On Mon, 4 Dec 2000, Linus Torvalds wrote:

>
>
> On Tue, 5 Dec 2000, Andrew Morton wrote:
> >
> > - test12-pre4
> > - aviro bforget patch
>
> Is the bforget patch really needed?
>
> If clear_inode() gets rid of dirty buffers, I don't see how new dirty
> buffers can magically appear. I may have missed part of the discussion on
> all this.

Well, to start with you don't want random bh's floating around on the
inode's list. With the current code truncate()+fsync() can send a lot
of already freed stuff to disk. Even though we can survive that (making
clear_inode() get rid of the list will save you from corruption)...
it doesn't look like a good idea.

BTW, in the current form clear_inode() doesn't get rid of the dirty
buffers. It misses the pages that became anonymous and it misses the
metadata that got freed. We can do that, but I'd rather avoid
leaving these buffer_heads on the inode's list - stuff that got freed
has no business being accessible from the in-core inode.

> I think that the second patch from Al (the inode dirty meta-data) is the
> _real_ fix, and I don't see why the bforget changes should matter.

We can survive without them (modulo patch to clear_inode()), but...
BTW, there is another reason why we want to have separate function
for freeing the stuff: we may want to mark them clean. If they are
already under IO it will do nothing, but if they are merely dirty...
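
The truncate()+fsync() concern above can be sketched with a small user-space model. This is not kernel code: `toy_bh`, `toy_inode` and every `toy_*` function are hypothetical names invented for illustration, and the model assumes only what the thread states, namely that a plain forget leaves the buffer on the inode's list while someone else still holds a reference, whereas the proposed variant unlinks it unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a buffer_head; all names here are illustrative. */
struct toy_bh {
    int refcount;            /* models ->b_count */
    bool dirty;              /* models the dirty bit */
    bool block_freed;        /* the on-disk block has been released */
    bool on_inode_list;
    struct toy_bh *next;     /* link on the owning inode's dirty list */
};

struct toy_inode {
    struct toy_bh *dirty_list;
};

/* Mark a buffer dirty and queue it on the inode's list. */
static void toy_add_to_inode(struct toy_inode *inode, struct toy_bh *bh)
{
    bh->dirty = true;
    if (bh->on_inode_list)
        return;
    bh->next = inode->dirty_list;
    inode->dirty_list = bh;
    bh->on_inode_list = true;
}

static void toy_remove_from_inode(struct toy_inode *inode, struct toy_bh *bh)
{
    struct toy_bh **pp;

    for (pp = &inode->dirty_list; *pp; pp = &(*pp)->next) {
        if (*pp == bh) {
            *pp = bh->next;
            bh->next = NULL;
            bh->on_inode_list = false;
            return;
        }
    }
}

/* Conditional variant: unlink only when the last reference goes away. */
static void toy_forget(struct toy_inode *inode, struct toy_bh *bh)
{
    bh->block_freed = true;
    if (--bh->refcount > 0)
        return;              /* still in use: stays on the inode's list */
    toy_remove_from_inode(inode, bh);
}

/* Unconditional variant: always unlink and mark clean. */
static void toy_forget_inode(struct toy_inode *inode, struct toy_bh *bh)
{
    bh->block_freed = true;
    bh->dirty = false;
    toy_remove_from_inode(inode, bh);
    --bh->refcount;
}

/* Model of fsync(): count dirty buffers on the list whose block is gone. */
static int toy_fsync_stale_writes(const struct toy_inode *inode)
{
    const struct toy_bh *bh;
    int stale = 0;

    for (bh = inode->dirty_list; bh; bh = bh->next)
        if (bh->dirty && bh->block_freed)
            stale++;
    return stale;
}

/* Free one dirty block that has an extra reference, then "fsync". */
static int toy_stale_writes_after_free(bool unconditional)
{
    struct toy_inode inode = { NULL };
    struct toy_bh bh = { 2, false, false, false, NULL };

    toy_add_to_inode(&inode, &bh);
    if (unconditional)
        toy_forget_inode(&inode, &bh);
    else
        toy_forget(&inode, &bh);
    return toy_fsync_stale_writes(&inode);
}
```

In this model `toy_stale_writes_after_free(false)` reports one freed block still queued for writing, while `toy_stale_writes_after_free(true)` reports none.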

2000-12-05 04:23:58

by Linus Torvalds

Subject: Re: [PATCH] inode dirty blocks



On Mon, 4 Dec 2000, Alexander Viro wrote:
>
> Well, to start with you don't want random bh's floating around on the
> inode's list. With the current code truncate()+fsync() can send a lot
> of already freed stuff to disk. Even though we can survive that (making
> clear_inode() get rid of the list will save you from corruption)...
> it doesn't look like a good idea.

Now, I'll agree with that, certainly.

I just wanted to be clear on the purpose of the patches. The bforget() one
looks like "taking care of the details", but not like a bug-fix. Agreed?

(Which is not to say I won't apply it - I just want to make sure we have
the issues under control).

> BTW, in the current form clear_inode() doesn't get rid of the dirty
> buffers. It misses the pages that became anonymous and it misses the
> metadata that got freed. We can do that, but I'd rather avoid
> leaving these buffer_heads on the inode's list - stuff that got freed
> has no business being accessible from the in-core inode.

Again, I agree with you, but it looks like that is a cleanup issue rather
than a bug.

> > I think that the second patch from Al (the inode dirty meta-data) is the
> > _real_ fix, and I don't see why the bforget changes should matter.
>
> We can survive without them (modulo patch to clear_inode()), but...

The "patch to clear_inode()" is the one I already did from sct? Or are we
talking about another issue?

> BTW, there is another reason why we want to have separate function
> for freeing the stuff: we may want to mark them clean. If they are
> already under IO it will do nothing, but if they are merely dirty...

Yes. Make it so. In the meantime, does everybody agree that pre5 fixes the
bugs, even though it still has these discussion items left?

Linus
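
The "mark them clean" point can be illustrated with another small user-space sketch. Again, nothing here is kernel API: `toy_buf`, `toy_mark_clean()` and `toy_flush()` are hypothetical names, and the model assumes only what the thread says, that cleaning a buffer already under I/O changes nothing while cleaning a merely dirty one cancels a write that would otherwise be issued for a freed block.

```c
#include <assert.h>
#include <stddef.h>

/* Three states are enough for this model; names are illustrative. */
enum toy_state { TOY_CLEAN, TOY_DIRTY, TOY_UNDER_IO };

struct toy_buf {
    enum toy_state state;
};

/* Cleaning only affects a merely dirty buffer; queued I/O is untouched. */
static void toy_mark_clean(struct toy_buf *b)
{
    if (b->state == TOY_DIRTY)
        b->state = TOY_CLEAN;
}

/* Model of a flush pass: submit every dirty buffer, return writes issued. */
static int toy_flush(struct toy_buf *bufs, size_t n)
{
    int issued = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (bufs[i].state == TOY_DIRTY) {
            bufs[i].state = TOY_UNDER_IO;
            issued++;
        }
    }
    return issued;
}

/* Free three buffers (two dirty, one already under I/O), then flush. */
static int toy_new_writes_after_free(int mark_clean_on_free)
{
    struct toy_buf freed[] = { { TOY_DIRTY }, { TOY_UNDER_IO }, { TOY_DIRTY } };
    size_t n = sizeof(freed) / sizeof(freed[0]);
    size_t i;

    if (mark_clean_on_free)
        for (i = 0; i < n; i++)
            toy_mark_clean(&freed[i]);
    return toy_flush(freed, n);
}
```

Without the cleaning step the flush pass issues two pointless writes for freed blocks; with it, none are issued, and the buffer that was already under I/O is unaffected either way.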

2000-12-05 04:46:44

by Peter Samuelson

Subject: Re: [PATCH] inode dirty blocks

[Mohammad A. Haque]
> Cool. Anyone have a unified patch against pre4 or should I start
> digging through my mail? =)

test12pre5, I guess.

Peter

2000-12-05 04:50:34

by Alexander Viro

Subject: Re: [PATCH] inode dirty blocks



On Mon, 4 Dec 2000, Linus Torvalds wrote:

> I just wanted to be clear on the purpose of the patches. The bforget() one
> looks like "taking care of the details", but not like a bug-fix. Agreed?

Agreed - invalidate_inode_buffers() seems to be doing the right thing,
so previous objections do not apply.

> > We can survive without them (modulo patch to clear_inode()), but...
>
> The "patch to clear_inode()" is the one I already did from sct? Or are we
> talking about another issue?

No, the same one. I missed the invalidate_inode_buffers() bit.

> > BTW, there is another reason why we want to have separate function
> > for freeing the stuff: we may want to mark them clean. If they are
> > already under IO it will do nothing, but if they are merely dirty...
>
> Yes. Make it so. In the meantime, does everybody agree that pre5 fixes the
> bugs, even though it still has these discussion items left?

With respect to dirty blocks - hopefully yes.
Cheers,
Al
PS: bforget patch (with mark_buffer_clean() added) follows. And yes, it's
optimization and not a bug-fix.

diff -urN rc12-pre5/fs/buffer.c rc12-pre5-dirty_blocks/fs/buffer.c
--- rc12-pre5/fs/buffer.c Tue Dec 5 02:03:14 2000
+++ rc12-pre5-dirty_blocks/fs/buffer.c Tue Dec 5 02:40:16 2000
@@ -1164,6 +1164,32 @@
 }

 /*
+ * Call it when you are going to free the block. The difference between
+ * that and bforget() is that we remove the thing from inode queue
+ * unconditionally and mark it clean.
+ */
+void bforget_inode(struct buffer_head * buf)
+{
+	mark_buffer_clean(buf);
+	/* grab the lru lock here to block bdflush. */
+	spin_lock(&lru_list_lock);
+	write_lock(&hash_table_lock);
+	remove_inode_queue(buf);
+	if (!atomic_dec_and_test(&buf->b_count) || buffer_locked(buf))
+		goto in_use;
+	__hash_unlink(buf);
+	write_unlock(&hash_table_lock);
+	__remove_from_lru_list(buf, buf->b_list);
+	spin_unlock(&lru_list_lock);
+	put_last_free(buf);
+	return;
+
+ in_use:
+	write_unlock(&hash_table_lock);
+	spin_unlock(&lru_list_lock);
+}
+
+/*
  * bread() reads a specified block and returns the buffer that contains
  * it. It returns NULL if the block was unreadable.
  */
@@ -1460,6 +1486,9 @@
 		clear_bit(BH_Mapped, &bh->b_state);
 		clear_bit(BH_Req, &bh->b_state);
 		clear_bit(BH_New, &bh->b_state);
+		spin_lock(&lru_list_lock);
+		remove_inode_queue(bh);
+		spin_unlock(&lru_list_lock);
 	}
 }

diff -urN rc12-pre5/fs/ext2/inode.c rc12-pre5-dirty_blocks/fs/ext2/inode.c
--- rc12-pre5/fs/ext2/inode.c Tue Dec 5 02:03:14 2000
+++ rc12-pre5-dirty_blocks/fs/ext2/inode.c Tue Dec 5 02:37:59 2000
@@ -416,7 +416,7 @@

 	/* Allocation failed, free what we already allocated */
 	for (i = 1; i < n; i++)
-		bforget(branch[i].bh);
+		bforget_inode(branch[i].bh);
 	for (i = 0; i < n; i++)
 		ext2_free_blocks(inode, le32_to_cpu(branch[i].key), 1);
 	return err;
@@ -484,7 +484,7 @@

 changed:
 	for (i = 1; i < num; i++)
-		bforget(where[i].bh);
+		bforget_inode(where[i].bh);
 	for (i = 0; i < num; i++)
 		ext2_free_blocks(inode, le32_to_cpu(where[i].key), 1);
 	return -EAGAIN;
@@ -854,7 +854,7 @@
 					   (u32*)bh->b_data,
 					   (u32*)bh->b_data + addr_per_block,
 					   depth);
-			bforget(bh);
+			bforget_inode(bh);
 			/* Writer: ->i_blocks */
 			inode->i_blocks -= inode->i_sb->s_blocksize / 512;
 			/* Writer: end */
diff -urN rc12-pre5/include/linux/fs.h rc12-pre5-dirty_blocks/include/linux/fs.h
--- rc12-pre5/include/linux/fs.h Tue Dec 5 02:03:19 2000
+++ rc12-pre5-dirty_blocks/include/linux/fs.h Tue Dec 5 02:37:59 2000
@@ -1201,6 +1201,7 @@
 		__brelse(buf);
 }
 extern void __bforget(struct buffer_head *);
+extern void bforget_inode(struct buffer_head *);
 static inline void bforget(struct buffer_head *buf)
 {
 	if (buf)
diff -urN rc12-pre5/kernel/ksyms.c rc12-pre5-dirty_blocks/kernel/ksyms.c
--- rc12-pre5/kernel/ksyms.c Tue Dec 5 02:03:20 2000
+++ rc12-pre5-dirty_blocks/kernel/ksyms.c Tue Dec 5 02:38:00 2000
@@ -188,6 +188,7 @@
 EXPORT_SYMBOL(breada);
 EXPORT_SYMBOL(__brelse);
 EXPORT_SYMBOL(__bforget);
+EXPORT_SYMBOL(bforget_inode);
 EXPORT_SYMBOL(ll_rw_block);
 EXPORT_SYMBOL(__wait_on_buffer);
 EXPORT_SYMBOL(___wait_on_page);

2000-12-05 21:03:11

by Andrew Morton

Subject: Re: [PATCH] inode dirty blocks Re: test12-pre4

Alexander Viro wrote:
>
> On Sun, 3 Dec 2000, Linus Torvalds wrote:
>
> >
> > Synching up with Alan and various other stuff. The most important one
> > being the fix to the inode dirty block list.
>
> It doesn't solve the problem. If you unlink a file with dirty metadata
> you have a nice chance to hit the BUG() in inode.c:83. I hope that patch
> below closes all remaining holes. See analysis in previous posting
> (basically, bforget() is not enough when we free the block; bh should
> be removed from the inode's list regardless of the ->b_count).
> Cheers,
> Al
>
> diff -urN rc12-pre4/fs/buffer.c rc12-pre4-dirty_blocks/fs/buffer.c
> --- rc12-pre4/fs/buffer.c Mon Dec 4 01:01:43 2000
> +++ rc12-pre4-dirty_blocks/fs/buffer.c Mon Dec 4 01:11:42 2000

That bforget-inode patch ran fine on two machines for ten hours. One
was SMP. The other was running the ATA guy's latest set of patches
including taskfile support.

The proposed FS changes are solid.

The third machine died horribly twice - recursive pagefaults. Without
IDE patch. This could be anything, including hardware. Will investigate.