2000-10-31 20:42:46

by Linus Torvalds

Subject: Linux-2.4.0-test10


Ok, test10-final is out there now. This has no _known_ bugs that I
consider show-stoppers, for what it's worth.

And when I don't know of a bug, it doesn't exist. Let us rejoice. In
traditional kernel naming tradition, this kernel hereby gets anointed as
one of the "greased weasel" kernel series, one of the final steps in a
stable release.

We're still waiting for the Vatican to officially canonize this kernel,
but trust me, that's only a matter of time. It's a little known fact, but
the Pope likes penguins too.

Linus


2000-10-31 20:48:38

by Rik van Riel

Subject: Re: Linux-2.4.0-test10

On Tue, 31 Oct 2000, Linus Torvalds wrote:

> Ok, test10-final is out there now. This has no _known_ bugs that
> I consider show-stoppers, for what it's worth.
>
> And when I don't know of a bug, it doesn't exist. Let us
> rejoice. In traditional kernel naming tradition, this kernel
> hereby gets anointed as one of the "greased weasel" kernel
> series, one of the final steps in a stable release.

Well, there's the thing with RAW IO being done into a
process' address space and the data arriving only after
the page gets unmapped from the process.

Then you have the RAW IO data in a swapcache page, but
the VM doesn't know how to swap it out and the page
becomes either unswappable or the data gets lost
(depending on at which stage the page is at that
moment).

But granted, this probably isn't a show-stopper for
most people and -since the fix has to support NFS
too, with its credentials stuff- a fix isn't even
underway yet...

> We're still waiting for the Vatican to officially canonize this
> kernel, but trust me, that's only a matter of time. It's a
> little known fact, but the Pope likes penguins too.

Let's just hope he doesn't need RAW IO ;)

cheers,

Rik
--
"What you're running that piece of shit Gnome?!?!"
-- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/ http://www.surriel.com/

2000-10-31 20:55:29

by Alan

Subject: Re: Linux-2.4.0-test10

> Ok, test10-final is out there now. This has no _known_ bugs that I
> consider show-stoppers, for what it's worth.

The fact that power management event handling is completely broken and crashes
on unfortunately timed module unloads doesn't count?

More importantly, has the bug where any user can crash machines using the
/proc/self/mem trick with read (via the svgalib stuff) been fixed?

Questions:
Has the O_SYNC stuff been fixed so that more than ext2 honours this
flag ?
What about the fact anyone can crash a box using ioctls on net
devices and waiting for an unload - was this fixed ?

Less Critical:
Does autofs4 work yet
Why haven't you merged the irda changes people have been sending for months?
Without them, irda in 2.4test doesn't work.
The changes making ramfs work do not seem to be merged.

Ok, so I'm always on the more conservative side, but the large collection of
'fix exists, isn't merged' items and those 4 or 5 other issues count, to me,
as at the very least alarm bells.

But I have to admit it seems close to 2.4.0. It's stayed up a lot better than
I expected under load once I fixed the scsi one.

Alan

2000-10-31 20:58:39

by Linus Torvalds

Subject: Re: Linux-2.4.0-test10



On Tue, 31 Oct 2000, Rik van Riel wrote:
> On Tue, 31 Oct 2000, Linus Torvalds wrote:
> >
> > Ok, test10-final is out there now. This has no _known_ bugs that
> > I consider show-stoppers, for what it's worth.
> >
> > And when I don't know of a bug, it doesn't exist. Let us
> > rejoice. In traditional kernel naming tradition, this kernel
> > hereby gets anointed as one of the "greased weasel" kernel
> > series, one of the final steps in a stable release.
>
> Well, there's the thing with RAW IO being done into a
> process' address space and the data arriving only after
> the page gets unmapped from the process.

Yes. But that doesn't count like a "show-stopper" for me, simply because
it's one of those small details that are known, and never materialize
under normal load.

Yes, it will have to be fixed before anybody starts doing RAW IO in a
major way. And I bet it will be fixed. But it's not on my list of "I
cannot release a 2.4.0 before this is done" - even if I think it will
actually be fixed for the common case before that anyway.

(Note: I suspect that we may just have to accept the fact that due to NFS
etc issues, RAW IO into a shared mapping might not really be supported at
all. I don't think any raw IO user uses it that way anyway, so I think the
big and worrisome case is actually only the swap-out case).

> > We're still waiting for the Vatican to officially canonize this
> > kernel, but trust me, that's only a matter of time. It's a
> > little known fact, but the Pope likes penguins too.
>
> Let's just hope he doesn't need RAW IO ;)

Naah, he mainly just does some browsing with netscape, and (don't tell a
soul) plays QuakeIII with the door locked.

Linus

2000-11-01 01:31:54

by Paul Jakma

Subject: Re: Linux-2.4.0-test10

On Tue, 31 Oct 2000, Alan Cox wrote:

> Less Critical:
> Does autofs4 work yet

has been apparently working fine for me for a while on 2.4test and
2.2+patch. (while==not noticed any major problems in last couple of
months)

> Alan

regards,
--
Paul Jakma [email protected]
PGP5 key: http://www.clubi.ie/jakma/publickey.txt
-------------------------------------------
Fortune:
Save energy: Drive a smaller shell.

2000-11-01 02:21:49

by Tom Rini

Subject: Re: Linux-2.4.0-test10

On Tue, Oct 31, 2000 at 12:41:55PM -0800, Linus Torvalds wrote:

> Ok, test10-final is out there now. This has no _known_ bugs that I
> consider show-stoppers, for what it's worth.

Sure, it's not a critical bug or anything but hey. One more time:
This is a very minor patch for fs/nls/Config.in, which Petr Vandrovec came up
with. The problem is that if CONFIG_INET is n, CONFIG_SMB_FS is never set
so fs/nls/Config.in assumes that the user wants to select some NLS options.
This fixes it and works on config/menuconfig/xconfig.

--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/


Attachments:
nls.patch (601.00 B)

2000-11-01 03:53:38

by Jeremy Fitzhardinge

Subject: Re: Linux-2.4.0-test10

On Tue, Oct 31, 2000 at 08:55:13PM +0000, Alan Cox wrote:
> Does autofs4 work yet

Autofs4 was fixed in 2.4.0-test10-pre6 or so. Autofs4 for 2.2.x has
been working for some time, though I just updated the 2.2 patch so it
doesn't stomp on autofs (v3).

J



2000-11-01 05:03:30

by M.H.VanLeeuwen

Subject: Re: Linux-2.4.0-test10

FYI,

My list of 2.4.0-testX problems

Further details, .config, etc...available if needed

Martin

2.4.0-test10 and earlier problem list:

Problem | UP        UP-APIC   SMP
--------|------------------------------
1       | OK        OK        HARDLOCK
2       | OK        FAILS     OK
3       | HARDLOCK  HARDLOCK  HARDLOCK
4       | BROKEN    BROKEN    BROKEN

Problem description:

1. kernel compiled w/o FB support. When attempting to switch
back to X from VC1-6 system locks hard for SMP. Nada thing
fixes this except hard reset... no Alt-SysRq-B, nothing
DRI not enabled. Video card has r128 chipset.

2. System is a NFS root machine, after a period of heavy ntwk
activity, eg. "make clean" in /usr/src/linux ETH0 no longer
works or sometimes just ntwk activity during system boot is
enough to cause the ETH activity to cease.
The only recourse is to Alt-SysRq-B the system.
NIC = NE2K ISA

3. Enabling PIIX4, kernel locks hard when printing the partition
tables for hdc. hdc has no partitions.
I think this problem is on Ted's problem list???

4. ISAPNP assigns an invalid/unusable IRQ to NE2K NIC card.
Previously reported to Linus & Ingo, they asked for an MPTABLE
dump, haven't heard back since providing said data.

2000-11-01 05:39:45

by Miles Lane

Subject: Re: Linux-2.4.0-test10


Linus,

Were there no changes between test10-pre7 and test10?
I notice you didn't send out a Changelist.

The Changelists help me focus my testing.

Thanks,
Miles

2000-11-01 05:41:15

by adrian

Subject: Re: Linux-2.4.0-test10



On Tue, 31 Oct 2000, Linus Torvalds wrote:

[snip]
> Naah, he mainly just does some browsing with netscape, and (don't tell a
> soul) plays QuakeIII with the door locked.
>
> Linus

Although he might find that 2.2.18pre18 gives better frame rates. :)

1024x768, Max detail, 32bit, 2.2.18pre18: 81fps

1024x768, Max detail, 32bit, 2.4.0-test10: 68fps


If he's really divine, he might notice a difference, unlike us mere
mortals.

Regards,
Adrian




2000-11-01 05:43:45

by Linus Torvalds

Subject: Re: Linux-2.4.0-test10



On Tue, 31 Oct 2000, Miles Lane wrote:
>
> Were there no changes between test10-pre7 and test10?
> I notice you didn't send out a Changelist.
>
> The Changelists help me focus my testing.

Sorry. Here it is..

Linus
-----
- final:
- Jeff Garzik: ISA network driver cleanup, wrapper.h fixes, 8139too
update, etc
- Mike Coleman: fix TracerPid in /proc/<n>/status
- Thomas Molina: mark NAT packet drop message KERN_DEBUG
- Marcelo Tosatti: nbd should use GFP_BUFFER, not GFP_ATOMIC
- Steve Pratt: TLB flush order fix
- David Miller: network and sparc updates
- Alan Cox: various details (NULL ptr checks in SCSI etc)
- Daniel Roesen: pretty up microcode revision printouts
- Mike Coleman: fix ptrace ambiguity issues
- Paul Mackerras: make yenta work even in the absence of ISA irqs
- me: make USB Makefile do the right thing for export-objs.
- Randy Dunlap, USB: fix race conditions, usb enumeration etc.

- pre7:
- Niels Jensen: remove no-longer-needed workarounds for old gcc versions
- Ingo Molnar & Rik v Riel: VM inactive list maintenance correction
- Randy Dunlap, USB: printer.c, usb-storage, usb identification and
memory leak fixes
- David Miller: networking updates
- David Mosberger: add AT_CLKTCK to elf information. And make AT_PAGESZ work
for static binaries too.
- oops. pcmcia broke by mistake
- Me: truncate vs page access race fix.

- pre6:
- Jeremy Fitzhardinge: autofs4 expiry fix
- David Miller: sparc driver updates, networking updates
- Mathieu Chouquet-Stringer: buffer overflow in sg_proc_dressz_write
- Ingo Molnar: wakeup race fix (admittedly the window was basically
non-existent, but still..)
- Rasmus Andersen: notice that "this_slice" is no longer used for
scheduling - delete the code that calculates it.
- ALI pirq routing update. It's even uglier than we initially thought..
- Dimitrios Michailidis: fix ipip locking bugs
- Various: face it - gcc-2.7.2.3 miscompiles structure initializers.
- Paul Cassella: locking comments on dev_base
- Trond Myklebust: NFS locking atomicity. refresh inode properly.
- Andre Hedrick: Serverworks Chipset driver, IDE-tape fix
- Paul Gortmaker: kill unused code from 8390 support.
- Andrea Arcangeli: fix nfsv3d wrong truncates over 4G
- Maciej W. Rozycki: PIIX4 needs the same USB quirk handling as PIIX3.
- me: if we cannot figure out the PCI bridge windows, just "inherit"
the window from the parent. Better than not booting.
- Ching-Ling Lee: ALI 5451 Audio core support update

- pre5:
- Mikael Pettersson: more Pentium IV cleanup.
- David Miller: non-x86 platforms missed "pte_same()".
- Russell King: NFS invalidate_inode_pages() can do bad things!
- Randy Dunlap: usb-core.c is gone - module fix
- Ben LaHaise: swapcache fixups for the new atomic pte update code
- Oleg Drokin: fix nm256_audio memory region confusion
- Randy Dunlap: USB printer fixes
- David Miller: sparc updates
- David Miller: off-by-one error in /proc socket dumper
- David Miller: restore non-local bind() behaviour.
- David Miller: wakeups on socket shutdown()
- Jeff Garzik: DEPCA net drvr fixes and CodingStyle
- Jeff Garzik: netsemi net drvr fix
- Jeff Garzik & Andrea Arcangeli: keyboard cleanup
- Jeff Garzik: VIA audio update
- Andrea Arcangeli: mxcsr initialization cleanup and fix
- Gabriel Paubert: better twd_i387_to_fxsr() emulation
- Andries Brouwer: proper error return in ext2 mkdir()

- pre4:
- disable writing to /proc/xxx/mem. Sure, it works now, but it's still
a security risk.
- IDE driver update (Victroy66 SouthBridge support)
- i810 rng driver cleanup
- fix sbus Makefile
- named initializers in module..
- pppoe: remove explicit initializer - it's done with initcalls.
- x86 WP bit detection: do it cleanly with exception handling
- Arnaldo Carvalho de Melo: memory leaks in drivers/media/video
- Bartlomiej Zolnierkiewicz: video init functions get __init
- David Miller: get rid of net/protocols.c - they get to initialize themselves
- David Miller: get rid of dev_mc_lock - we hold dev->xmit_lock anyway.
- Geert Uytterhoeven: Zorro (Amiga) bus support update
- David Miller: work around gcc-2.7.2 bug
- Geert Uytterhoeven: mark struct consw's "const".
- Jeff Garzik: network driver cleanups, ns558 joystick driver oops fix
- Tigran Aivazian: clean up __alloc_pages(), kill_super() and
notify_change()
- Tigran Aivazian: move stuff from .data to .bss
- Jeff Garzik: divert.h typename cleanups
- James Simmons: mdacon using spinlocks
- Tigran Aivazian: fix BFS free block calculation
- David Miller: sparc32 works again
- Bernd Schmidt: fix undefined C code (set/use without a sequence point)
- Mikael Pettersson: nicer Pentium IV setup handling.
- Georg Acher: usb-uhci cpia oops fix
- Kanoj Sarcar: more node_data cleanups for [non]NUMA.
- Richard Henderson: alpha update to new vmalloc setup
- Ben LaHaise: atomic pte updates (don't lose dirty bit)
- David Brownell: ohci memory debugging (== use separate slabs for allocation)

- pre3:
- update email address of Joerg Reuter
- Andries Brouwer: spelling fixes, missing atari brelse(), breada() fix
- Geert Uytterhoeven: used named initializers for "struct console".
- Carsten Paeth: ISDN capifs - iput() only once.
- Petr Vandrovec: VFAT short name generation fix
- Jeff Garzik: i810_rng cleanup, and i815 chipset added.
- Bartlomiej Zolnierkiewicz: clean up some remaining old-style Makefiles
- Dave Jones: x86 setup fixes (recognize Pentium IV etc).
- x86: do the "fast A20" setup too in setup.S
- NIIBE Yutaka: update SuperH for the global page table (vmalloc) change.
- David Miller: sparc updates (vmalloc stuff still pending)
- David Miller: CodaFS warnings and 64-bit warnings in pci_size()
- David Miller: pcnet32 - correct NULL test
- David Miller: vmlist lock -> page_table_lock clarification
- Trond Myklebust: Ouch. rpcauth_lookup_credcache() memory corruption bug
- Matthew Wilcox: file locking cleanups
- David Woodhouse: USB audio spinlock fixes
- Torben Mathiasen: tlan driver cleanups
- Randy Dunlap: Yenta: CACHE_LINE_SIZE is in dwords, not bytes.
- Randy Dunlap: more USB updates
- Kanoj Sarcar: clean up the NUMA interfaces (pg_data instead of nodes)
- "save_fpu()" was broken. Need to clear pending errors: save_init_fpu().

- pre2:
- remember to change the kernel version ;)
- isapnp.txt bugfix
- ia64 update
- sparc update
- networking update (pppoe init, frame diverter, fix tcp_sendmsg,
fix udp_recvmsg).
- Compile for WinChip must _not_ use "-march=i686". It's a i586.
- Randy Dunlap: more USB updates
- clarify the Firewire AIC-5800 situation. It's not supported yet.
- PCI-space decode size fix. This is needed for some (broken?) hardware
- /proc/self/maps off-by-one error
- 3c501, 3c507, cs89x0 network drivers drop unnecessary check_region
- Asahi Kasei AK4540: new codec ID. Yamaha: new PCI ID's.
- ne2k-pci net driver documentation update
- Paul Gortmaker: delete paranoia check in rtc_exit
- scsi_merge: memset the right amount of memory.
- sun3fb: old __initfunc() not supported any more.
- synclink: remove unnecessary task state games
- xd.c: proper casting for 64-bit architectures
- vmalloc: page table update race condition.

- pre1:
- Roger Larsson: ">=" instead of ">" to make the VM not get stuck.
- Gideon Glass: brw_kiovec() failure case oops fix
- Rik van Riel: better memory balancing and OOM killer
- Ivan Kokshaysky: alpha compile fixes
- Vojtech Pavlik: forgotten ENOUGH macro in via82cxxx ide driver
- Arnaldo Carvalho de Melo: acpi resource leak fix
- Brian Gerst: use mov's instead of xchg in kernel trap entry
- Torben Mathiasen: tlan timer being added twice bug
- Andrzej Krzysztofowicz: config file fixes
- Jean Tourrilhes: Wavelan lockup on SMP fix
- Roman Zippel: initdata must be initialized (even if it is to zero:
gcc is strange)
- Jean Tourrilhes: hp100 driver lockup at startup on SMP
- Russell King: fix silly minixfs uninitialized error bug
- (various): fix uid hashing to use "uid_t" instead of "unsigned short"
- Jaroslav Kysela: isapnp timeout fix. NULL ptr dereference fix.
- Alain Knaff: fdformat should work again.
- Randy Dunlap: USB - fix bluetooth, acm, printer, serial to work
with urb->dev changes.
- Randy Dunlap: USB whiteheat serial driver firmware update.
- Randy Dunlap: USB hub memory corruption and pegasus driver update
- Andre Hedrick: IDE Makefile cleanup

2000-11-01 08:39:07

by Andi Kleen

Subject: Re: Linux-2.4.0-test10

On Tue, Oct 31, 2000 at 08:55:13PM +0000, Alan Cox wrote:
> What about the fact anyone can crash a box using ioctls on net
> devices and waiting for an unload - was this fixed ?

The ioctls of network devices are generally unsafe on SMP, because
they run with kernel lock dropped now but are mostly not safe to do so.

-Andi

2000-11-01 10:19:18

by Tigran Aivazian

Subject: Re: Linux-2.4.0-test10

On Tue, 31 Oct 2000, Linus Torvalds wrote:

>
> Ok, test10-final is out there now. This has no _known_ bugs that I
> consider show-stoppers, for what it's worth.

Linus,

But it contains an erroneous part in microcode.c:

- if (c->x86_vendor != X86_VENDOR_INTEL || c->x86 < 6){
+ if (c->x86_vendor != X86_VENDOR_INTEL || c->x86 != 6){
printk(KERN_ERR "microcode: CPU%d not an Intel P6\n",
cpu_num);


It was not in Daniel's cleanup patch which I saw but came from elsewhere.
Are there Intel CPUs with family>6 which do not support the same mechanism
for microcode update as family=6? The manuals suggest that test for ">" is
correct, i.e. that Intel will maintain compatibility with P6 wrt microcode
update.

Perhaps Richard can clarify this?

Regards,
Tigran

2000-11-01 12:54:22

by Mikael Pettersson

Subject: Re: Linux-2.4.0-test10

On Wed, 1 Nov 2000, Tigran Aivazian wrote:

>But it contains an erroneous part in microcode.c:
>
>- if (c->x86_vendor != X86_VENDOR_INTEL || c->x86 < 6){
>+ if (c->x86_vendor != X86_VENDOR_INTEL || c->x86 != 6){
> printk(KERN_ERR "microcode: CPU%d not an Intel P6\n",
>cpu_num);
>
>
>It was not in Daniel's cleanup patch which I saw but came from elsewhere.
>Are there Intel CPUs with family>6 which do not support the same mechanism
>for microcode update as family=6? The manuals suggest that test for ">" is
>correct, i.e. that Intel will maintain compatibility with P6 wrt microcode
>update.

The patch came from me, as part of the patch kit to make the kernel
safe(r) for the Pentium IV processor. There were several places
which made unwarranted assumptions about ->x86 not exceeding 6, and
they would break since Pentium IV has family 15.

Concerning the microcode update driver, I cannot find anything in
the IA32-Vol3 manual to suggest it is not P6 specific.
The Pentium IV is not a P6, hence the stricter test.
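
A minimal sketch of the difference between the two tests under discussion. The enum and function names below are hypothetical stand-ins, not the identifiers used in microcode.c; only the two comparisons are taken from the quoted diff. It assumes, as stated above, that the Pentium IV reports family 15.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's vendor constants. */
enum sketch_vendor { SKETCH_VENDOR_INTEL, SKETCH_VENDOR_OTHER };

/* Old test: reject only families below 6, so family 15 is accepted. */
static bool accepts_old(enum sketch_vendor vendor, int family)
{
    /* The CPUID base family is a 4-bit field. */
    assert(family >= 0 && family <= 15);
    return !(vendor != SKETCH_VENDOR_INTEL || family < 6);
}

/* New test: accept family 6 (P6) only, so family 15 is rejected. */
static bool accepts_new(enum sketch_vendor vendor, int family)
{
    assert(family >= 0 && family <= 15);
    return !(vendor != SKETCH_VENDOR_INTEL || family != 6);
}
```

With these definitions, an Intel family 15 CPU passes `accepts_old()` but fails `accepts_new()`, while family 6 passes both.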

Of course, this should be reviewed once the PentiumIV-updated
IA32-Vol3 manual has been released.

You say the manuals suggest testing for family >= 6.
Exactly which manual does that, and where?

(But I should have passed the patch to you Tigran before Linus.
Mea culpa, but it seemed such an obvious bug&fix ...)

/Mikael

2000-11-01 15:02:03

by Alan

Subject: Re: Linux-2.4.0-test10

> for microcode update as family=6? The manuals suggest that test for ">" is
> correct, i.e. that Intel will maintain compatibility with P6 wrt microcode
> update.
>
> Perhaps Richard can clarify this?

Until we know what the preventium IV does on microcode behaviour it seems
wisest to test for == not >.

2000-11-01 15:20:19

by Tigran Aivazian

Subject: Re: Linux-2.4.0-test10

On Wed, 1 Nov 2000, Alan Cox wrote:

> > for microcode update as family=6? The manuals suggest that test for ">" is
> > correct, i.e. that Intel will maintain compatibility with P6 wrt microcode
> > update.
> >
> > Perhaps Richard can clarify this?
>
> Until we know what the preventium IV does on microcode behaviour it seems
> wisest to test for == not >.

Ok, true, I agree with both you and Mikael Pettersson. Also, I couldn't
find where in the volumeIII I saw references to ">", perhaps it wasn't in
the docs but just in some sample Intel implementation (or maybe not).

Regards,
Tigran



2000-11-01 16:13:27

by CRADOCK, Christopher

Subject: RE: Linux-2.4.0-test10

I have a similar hardware list and I don't observe any of these problems on
2.4.0-test10x. Is it possibly a hardware conflict somewhere?

What I do see occasionally is if X was ever heavy on the memory usage (say
I've run GIMP for a couple of hours) then the text console's font set gets
trashed until the next reboot. Console driver failing to reset something?

Chris Cradock

> -----Original Message-----
> From: M.H.VanLeeuwen [SMTP:[email protected]]
> Sent: Wednesday, November 01, 2000 6:03 AM
> To: [email protected]
> Cc: [email protected]
> Subject: Re: Linux-2.4.0-test10
>
> FYI,
>
> My list of 2.4.0-testX problems
>
> Further details, .config, etc...available if needed
>
> Martin
>
> 2.4.0-test10 and earlier problem list:
>
> Problem | UP        UP-APIC   SMP
> --------|------------------------------
> 1       | OK        OK        HARDLOCK
> 2       | OK        FAILS     OK
> 3       | HARDLOCK  HARDLOCK  HARDLOCK
> 4       | BROKEN    BROKEN    BROKEN
>
> Problem description:
>
> 1. kernel compiled w/o FB support. When attempting to switch
> back to X from VC1-6 system locks hard for SMP. Nada thing
> fixes this except hard reset... no Alt-SysRq-B, nothing
> DRI not enabled. Video card has r128 chipset.
>
> 2. System is a NFS root machine, after a period of heavy ntwk
> activity, eg. "make clean" in /usr/src/linux ETH0 no longer
> works or sometimes just ntwk activity during system boot is
> enough to cause the ETH activity to cease.
> The only recourse is to Alt-SysRq-B the system.
> NIC = NE2K ISA
>
> 3. Enabling PIIX4, kernel locks hard when printing the partition
> tables for hdc. hdc has no partitions.
> I think this problem is on Ted's problem list???
>
> 4. ISAPNP assigns an invalid/unusable IRQ to NE2K NIC card.
> Previously reported to Linus & Ingo, they asked for an MPTABLE
> dump, haven't heard back since providing said data.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> Please read the FAQ at http://www.tux.org/lkml/

2000-11-01 18:09:11

by Alexey Kuznetsov

Subject: Re: Linux-2.4.0-test10

Hello!

> On Tue, Oct 31, 2000 at 08:55:13PM +0000, Alan Cox wrote:
> > What about the fact anyone can crash a box using ioctls on net
> > devices and waiting for an unload - was this fixed ?

What do you mean?

If I understood you correctly, this has been fixed in early 2.3
and never reappeared since that time.



> The ioctls of network devices are generally unsafe on SMP, because
> they run with kernel lock dropped now but are mostly not safe to do so.

Andi, pleeeease, stop FUDing.

If you see some bug, fix it. I do not see.

Alexey

2000-11-01 19:30:38

by David Ford

Subject: Re: Linux-2.4.0-test10

"M.H.VanLeeuwen" wrote:

> 3. Enabling PIIX4, kernel locks hard when printing the partition
> tables for hdc. hdc has no partitions.
> I think this problem is on Ted's problem list???

Disable PIIXn tuning and recompile your kernel. How does it fare now?

-d

--
"The difference between 'involvement' and 'commitment' is like an
eggs-and-ham breakfast: the chicken was 'involved' - the pig was
'committed'."




2000-11-01 23:09:15

by M.H.VanLeeuwen

Subject: Re: Linux-2.4.0-test10

David Ford wrote:
>
> "M.H.VanLeeuwen" wrote:
>
> > 3. Enabling PIIX4, kernel locks hard when printing the partition
> > tables for hdc. hdc has no partitions.
> > I think this problem is on Ted's problem list???
>
> Disable PIIXn tuning and recompile your kernel. How does it fare now?
>

Yep, disabling (opposite of "enabling") does allow the kernel to boot just fine.
PIIXn tuning must be tickling something on the system so that the first time we
read from the disk, partition check block 0, the system freezes hard.

I do know that w/o PIIXn tuning the result of the first block read is all zeros,
hence the "/dev/ide/host0/bus0/target1/lun0: unknown partition table" message.

Any idea how to go about debugging this kind of lockup?
Guess I'll scrounge up a couple of disks and see if it's controller or disk related.

Weren't you also experiencing this type of problem on a laptop?


Martin


lspci -v
00:07.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
Flags: bus master, medium devsel, latency 0

00:07.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
Flags: bus master, medium devsel, latency 32
I/O ports at f000 [size=16]

00:07.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01) (prog-if 00 [UHCI])
Flags: bus master, medium devsel, latency 32, IRQ 19
I/O ports at d000 [size=32]

00:07.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
Flags: medium devsel

2000-11-01 23:44:34

by M.H.VanLeeuwen

Subject: Re: Linux-2.4.0-test10

"CRADOCK, Christopher" wrote:
>
> I have a similar hardware list and I don't observe any of these problems on
> 2.4.0-test10x. Is it possibly a hardware conflict somewhere?
>
> What I do see occasionally is if X was ever heavy on the memory usage (say
> I've run GIMP for a couple of hours) then the text console's font set gets
> trashed until the next reboot. Console driver failing to reset something?
>
> Chris Cradock
>

Hi Chris

Never had the trashed fonts before.

How about a better comparison of systems?
All I mentioned were r128, ne2k, PIIX4 and SMP, which is barely enough to claim
similar hardware, and so to conclude that these aren't real problems because you
don't see them. I can send you the gory details if you're interested.

My reason for claiming these are problems, maybe not show stoppers, are:

This system is rock solid on 2.2.X.

problem 1, shouldn't fail on 2.4 if it works just fine on 2.2. Probably a locking
issue but I'm not sure. Any idea where to put some BKL's to see if the problem
will go away?

problem 2, happens randomly, so is it a hardware problem or a software issue? being
that the system works fine SMP and UP then my guess is a software interaction when
UP-APIC is enabled, a race condition??

problem 3, new feature in 2.4, one would expect, hey, I've got this hdwr in my system,
let me enable this option... wait a minute the system doesn't boot...

problem 4, ISAPNP in the kernel is new for 2.4. I was pointing out that it can be
improved to make it better able to select IRQs that work, so that the user can just
upgrade to 2.4 without having to tweak the BIOS and/or the code. I sent a patch to
Linus but he rejected it. Yes, I realize it was a weak attempt, but it fixed my ISAPNP
problems, and no one has proposed a better solution. Shouldn't the
first release of 2.4.0 show that its new capabilities are ready for prime time?


Thanks,
Martin


> 1. kernel compiled w/o FB support. When attempting to switch
> back to X from VC1-6 system locks hard for SMP. Nada thing
> fixes this except hard reset... no Alt-SysRq-B, nothing
> DRI not enabled. Video card has r128 chipset.
>
> 2. System is a NFS root machine, after a period of heavy ntwk
> activity, eg. "make clean" in /usr/src/linux ETH0 no longer
> works or sometimes just ntwk activity during system boot is
> enough to cause the ETH activity to cease.
> The only recourse is to Alt-SysRq-B the system.
> NIC = NE2K ISA
>
> 3. Enabling PIIX4, kernel locks hard when printing the partition
> tables for hdc. hdc has no partitions.
> I think this problem is on Ted's problem list???
>
> 4. ISAPNP assigns an invalid/unusable IRQ to NE2K NIC card.
> Previously reported to Linus & Ingo, they asked for an MPTABLE
> dump, haven't heard back since providing said data.

2000-11-02 00:04:46

by Jeff Garzik

Subject: Re: Linux-2.4.0-test10

Andi Kleen wrote:
> On Tue, Oct 31, 2000 at 08:55:13PM +0000, Alan Cox wrote:
> > What about the fact anyone can crash a box using ioctls on net
> > devices and waiting for an unload - was this fixed ?

> The ioctls of network devices are generally unsafe on SMP, because
> they run with kernel lock dropped now but are mostly not safe to do so.

Wrong. The BKL is dropped in sock_ioctl, but struct netdevice::do_ioctl
is called with rtnl_lock held:

net/core/dev.c:
	rtnl_lock();
	ret = dev_ifsioc(&ifr, cmd);
	rtnl_unlock();

Therefore for 2.4.x, our concern is whether a particular net driver
needs further SMP protection internally, or if rtnl_lock (a semaphore,
not a spinlock) is sufficient.
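
The pattern described above can be sketched in user space, as a rough analogy only. A pthread mutex stands in for rtnl_lock (a sleeping lock, like a semaphore), and `struct fake_dev`, `fake_do_ioctl()`, `dev_ioctl()` and the command number are hypothetical names invented for this illustration, not kernel interfaces.

```c
#include <assert.h>
#include <pthread.h>

/* One global sleeping lock serialises all device configuration calls. */
static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;
static int cfg_lock_held;          /* debugging aid, written only under cfg_lock */

struct fake_dev {
    int mtu;
    int (*do_ioctl)(struct fake_dev *dev, int cmd, int arg);
};

/* Driver callback: may rely on the caller holding cfg_lock. */
static int fake_do_ioctl(struct fake_dev *dev, int cmd, int arg)
{
    assert(cfg_lock_held);         /* the dispatcher took the lock for us */
    if (cmd != 1)                  /* hypothetical "set MTU" command */
        return -1;
    dev->mtu = arg;
    return 0;
}

/* Dispatcher: takes the lock around every driver ioctl call. */
static int dev_ioctl(struct fake_dev *dev, int cmd, int arg)
{
    int ret;

    pthread_mutex_lock(&cfg_lock);
    cfg_lock_held = 1;
    ret = dev->do_ioctl(dev, cmd, arg);
    cfg_lock_held = 0;
    pthread_mutex_unlock(&cfg_lock);
    return ret;
}
```

The lock only serialises one ioctl against another. State that the driver also touches from other contexts (an interrupt handler, say) would still need its own protection, which is the per-driver question raised above.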

Jeff


--
Jeff Garzik | "Mind if I drive?" -Sam
Building 1024 | "Not if you don't mind me clawing at the
MandrakeSoft | dash and shrieking like a cheerleader."
| -Max

2000-11-02 00:07:56

by David Miller

Subject: Re: Linux-2.4.0-test10

Date: Wed, 01 Nov 2000 19:03:44 -0500
From: Jeff Garzik <[email protected]>

Therefore for 2.4.x, our concern is whether a particular net driver
needs further SMP protection internally, or if rtnl_lock (a semaphore,
not a spinlock) is sufficient.

Thanks Jeff, this is precisely what Alexey and myself have been trying
to beat into Andi's head for months now :-)

Later,
David S. Miller
[email protected]

2000-11-02 02:58:33

by David Ford

Subject: Re: Linux-2.4.0-test10

Yes..long standing bug, and I don't have sufficient time to get my feet wet in the IDE dept
and fix it.

-d

"M.H.VanLeeuwen" wrote:

> > Disable PIIXn tuning and recompile your kernel. How does it fare now?
>
> Yep, disabling (opposite of "enabling") does allow the kernel to boot just fine.
> PIIXn tuning must be tickling something on the system so that the first time we
> read from the disk, partition check block 0, the system freezes hard.

--
"The difference between 'involvement' and 'commitment' is like an
eggs-and-ham breakfast: the chicken was 'involved' - the pig was
'committed'."




2000-11-02 07:15:52

by Vitezslav Samel

Subject: Re: Linux-2.4.0-test10

Hi!

> My list of 2.4.0-testX problems
>
> Problem description:
>
> 1. kernel compiled w/o FB support. When attempting to switch
> back to X from VC1-6 system locks hard for SMP. Nada thing
> fixes this except hard reset... no Alt-SysRq-B, nothing
> DRI not enabled. Video card has r128 chipset.


Me Too (tm). No FB support, no DRI, lock occurs randomly during
switching back to X from VC. no keyboard, no net, no video (my
monitor switches off)

HW: Asus P2B-D (2xPIII/700)
ATI Rage Pro r128

.config or other info available



Vitezslav Samel

2000-11-02 17:21:08

by Stephen C. Tweedie

Subject: Re: Linux-2.4.0-test10

Hi,

On Tue, Oct 31, 2000 at 08:55:13PM +0000, Alan Cox wrote:
>
> Questions:
> Has the O_SYNC stuff been fixed so that more than ext2 honours this
> flag ?

Not yet.

Linus, the last patch I sent you on this didn't make it in --- is it
worth my while resending, or do we need to rethink how to do this?

2.2 O_SYNC is actually broken too --- it doesn't sync all metadata (in
particular, it doesn't update the inode), but I'd rather fix that for
2.4 rather than change 2.2, as the main users of O_SYNC, databases,
are writing to preallocated files anyway.

The patch I sent fully implements O_SYNC (actually, it implements
O_DSYNC, which is allowed to skip the inode sync if the only attribute
which has changed is the timestamps) and fdatasync. It's easy for me
to make the DSYNC selectable via sysctl for full SU compliance, and I
know of other unixes that already do this --- you really don't want
existing database applications suddenly to start seeking to the inode
block for every O_SYNC write.

There are two parts to the implementation here --- the separation of
I_DIRTY into two bits, a "fully-dirty" bit and a "timestamp-dirty"
bit, and the use of a linked list of buffer_heads against each inode
to track all dirty data. It is possible to do without the latter, but
that requires either doing a full mapping tree walk after O_SYNC to
flush indirect blocks, or doing indirect writes synchronously as we
write out the data. f[data]sync can't do the sync-indirect-write
trick, so is still required to walk the whole indirect tree on fsync,
which can get expensive on large files.

Cheers,
Stephen

2000-11-02 17:37:34

by Christoph Rohland

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

Hi Stephen,

On Thu, 2 Nov 2000, Stephen C. Tweedie wrote:
> The patch I sent fully implements O_SYNC (actually, it implements
> O_DSYNC, which is allowed to skip the inode sync if the only
> attribute which has changed is the timestamps) and fdatasync. It's
> easy for me to make the DSYNC selectable via sysctl for full SU
> compliance, and I know of other unixes that already do this --- you
> really don't want existing database applications suddenly to start
> seeking to the inode block for every O_SYNC write.

No, we definitely do not want to have that. We had big performance
problems at customer sites when another unix did change the behaviour
exactly that way between releases.

Greetings
Christoph

2000-11-02 17:52:37

by Ragnar Hojland Espinosa

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

On Wed, Nov 01, 2000 at 06:44:25PM -0600, M.H.VanLeeuwen wrote:
> "CRADOCK, Christopher" wrote:
> > I have a similar hardware list and I don't observe any of these problems on
> > 2.4.0-test10x. Is it possibly a hardware conflict somewhere?
> >
> > What I do see occasionally is if X was ever heavy on the memory usage (say
> > I've run GIMP for a couple of hours) then the text console's font set gets
> > trashed until the next reboot. Console driver failing to reset something?

> Never had the trashed fonts before.

Well, here never did until today :) With test9, I had left the box idle
downloading stuff over ppp for like 6h under X. While wget was running,
switched to a vc and with each dot wget printed, the font map got screwed
up more and more.

Not a particularly useful report, but I thought I'd mention it in case it
rings a bell somewhere .. UP instead of your SMP, VIA instead of PIIX.
--
____/| Ragnar Højland Freedom - Linux - OpenGL Fingerprint 94C4B
\ o.O| 2F0D27DE025BE2302C
=(_)= "Thou shalt not follow the NULL pointer for 104B78C56 B72F0822
U chaos and madness await thee at its end." hkp://keys.pgp.com

Handle via comment channels only.

2000-11-02 18:34:04

by Ragnar Hojland Espinosa

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

On Thu, Nov 02, 2000 at 06:57:06PM +0100, Ragnar Hojland Espinosa wrote:
> Well, here never did until today :) With test9, I had left the box idle

Just happened with test10, same circumstances .. font map got corrupted, and
noise on the screen. Switching back and forth from X to a vc fixed it, tho.

Sort of amusing that it (apparently) only happens with ppp + wget ..

--
____/| Ragnar Højland Freedom - Linux - OpenGL Fingerprint 94C4B
\ o.O| 2F0D27DE025BE2302C
=(_)= "Thou shalt not follow the NULL pointer for 104B78C56 B72F0822
U chaos and madness await thee at its end." hkp://keys.pgp.com

Handle via comment channels only.

2000-11-02 18:46:48

by CRADOCK, Christopher

[permalink] [raw]
Subject: RE: Linux-2.4.0-test10

Martin, my setup is a Gigabyte DX (i.e. BX chipset with dual Pentium III). I
have an ATI Rage 128 in it with two soundblasters and a cheapo ne2k clone.

The ISAPNP doesn't auto config completely for me so all I can say is the
plug-and-pray would still cause me some grief. Basically the ne2k and the
second sound blaster have conflicting requirements, so I manually tweaked
the ISAPNP selection until it worked. It appears the sound blasters don't
work on half the combinations of interrupt and IO ports that you could choose
and that the PNP listing claims are available.

I've never had a problem with or without the FB in place (except initially,
when aty128fb.c was flaky), although it runs X about ten times slower than
the XFree V4 direct driver, so I don't normally touch it.

The problems with the PIIX4 simply cause my HDA disk to hang up until the
reset pokes it back into action. As the root partition is on hda1 it hangs
the machine waiting for this to happen (takes about 15 minutes).

ETH0 - thinking about it, it does occasionally hang up on me too, but I thought
that was me fiddling with the settings too much. I'll try that again.

As for kernel debug points I can't say I do my debugging that way.

Chris.

> -----Original Message-----
> From: M.H.VanLeeuwen [SMTP:[email protected]]
> Sent: Thursday, November 02, 2000 12:44 AM
> To: CRADOCK, Christopher
> Cc: [email protected]; [email protected]
> Subject: Re: Linux-2.4.0-test10
>
> "CRADOCK, Christopher" wrote:
> >
> > I have a similar hardware list and I don't observe any of these problems
> on
> > 2.4.0-test10x. Is it possibly a hardware conflict somewhere?
> >
> > What I do see occasionally is if X was ever heavy on the memory usage
> (say
> > I've run GIMP for a couple of hours) then the text console's font set
> gets
> > trashed until the next reboot. Console driver failing to reset
> something?
> >
> > Chris Cradock
> >
>
> Hi Chris
>
> Never had the trashed fonts before.
>
> How about a better comparison of systems?
> All I mentioned were r128, ne2k, PIIX4 and SMP, barely enough to claim
> similar
> hardware thus these aren't real problems cause you don't see them.
> I can send you gory details if your interested.
>
> My reason for claiming these are problems, maybe not show stoppers, are:
>
> This system is rock solid on 2.2.X.
>
> problem 1, shouldn't fail on 2.4 if it works just fine on 2.2. Probably a
> locking
> issue but I'm not sure. Any idea where to put some BKL's to see if the
> problem
> will go away?
>
> problem 2, happens randomly, so is it a hardware problem or a software
> issue? being
> that the system works fine SMP and UP then my guess is a software
> interaction when
> UP-APIC is enabled, a race condition??
>
> problem 3, new feature in 2.4, one would expect, hey, I've got this hdwr
> in my system,
> let me enable this option... wait a minute the system doesn't boot...
>
> problem 4, ISAPNP in the kernel is new for 2.4, i was pointing out that it
> can be
> improved to make it better able to select IRQ's that work so that the user
> can just
> upgrade to 2.4 without having to tweak the BIOS and/or the code. I sent a
> patch to
> Linus but he rejected it, yes I realize it was a weak attempt but it fixed
> my ISAPNP
> problems, and no one has proposed a better solution. Shouldn't the
> first release of 2.4.0 show that it's new capabilities are ready for prime
> time?
>
>
> Thanks,
> Martin
>
>
> > 1. kernel compiled w/o FB support. When attempting to switch
> > back to X from VC1-6 system locks hard for SMP. Nada thing
> > fixes this except hard reset... no Alt-SysRq-B, nothing
> > DRI not enabled. Video card has r128 chipset.
> >
> > 2. System is a NFS root machine, after a period of heavy ntwk
> > activity, eg. "make clean" in /usr/src/linux ETH0 no longer
> > works or sometimes just ntwk activity during system boot is
> > enough to cause the ETH activity to cease.
> > The only recourse is to Alt-SysRq-B the system.
> > NIC = NE2K ISA
> >
> > 3. Enabling PIIX4, kernel locks hard when printing the partition
> > tables for hdc. hdc has no partitions.
> > I think this problem is on Ted's problem list???
> >
> > 4. ISAPNP assigns an invalid/unusable IRQ to NE2K NIC card.
> > Previously reported to Linus & Ingo, they asked for an MPTABLE
> > dump, haven't heard back since providing said data.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> Please read the FAQ at http://www.tux.org/lkml/

2000-11-03 00:08:52

by James Simmons

[permalink] [raw]
Subject: RE: Linux-2.4.0-test10


> I have a similar hardware list and I don't observe any of these problems on
> 2.4.0-test10x. Is it possibly a hardware conflict somewhere?
>
> What I do see occasionally is if X was ever heavy on the memory usage (say
> I've run GIMP for a couple of hours) then the text console's font set gets
> trashed until the next reboot. Console driver failing to reset something?

No! The X server resets the VGA mode, including resetting the fonts. See
xc/programs/Xserver/hw/xfree86/vgahw to see how XF4.0 switches between X
and vgacon. It seems that under heavy pressure X fails to reset the video
hardware on its own :-(

2000-11-03 06:25:53

by kernel

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

On Thu, Nov 02, 2000 at 07:38:56PM +0100, Ragnar Hojland Espinosa wrote:
> On Thu, Nov 02, 2000 at 06:57:06PM +0100, Ragnar Hojland Espinosa wrote:
> > Well, here never did until today :) With test9, I had left the box idle
>
> Just happened with test10, same circumstances .. font map got corrupted, and
> noise on the screen. Switching back and forth from X to a vc fixed it, tho.
>
> Sort of amusing that it (apparently) only happens with ppp + wget ..

You have a voodoo3 or voodoo5 with X4, and the DRI X4 module loaded.

Or am I wrong?

Zephaniah E. Hull.
>
> --
> ____/| Ragnar Højland Freedom - Linux - OpenGL Fingerprint 94C4B
> \ o.O| 2F0D27DE025BE2302C
> =(_)= "Thou shalt not follow the NULL pointer for 104B78C56 B72F0822
> U chaos and madness await thee at its end." hkp://keys.pgp.com
>
> Handle via comment channels only.

2000-11-03 10:03:14

by Ben Ford

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

I have this problem also. I am running vesafb and X4.01 w/ a voodoo3500.
Switching to a vc sometimes gives you black text (highlighting w/ the mouse fixes it) and
alternating green and red pixels across the top of the screen.

-b


[email protected] wrote:

> On Thu, Nov 02, 2000 at 07:38:56PM +0100, Ragnar Hojland Espinosa wrote:
> > On Thu, Nov 02, 2000 at 06:57:06PM +0100, Ragnar Hojland Espinosa wrote:
> > > Well, here never did until today :) With test9, I had left the box idle
> >
> > Just happened with test10, same circumstances .. font map got corrupted, and
> > noise on the screen. Switching back and forth from X to a vc fixed it, tho.
> >
> > Sort of amusing that it (apparently) only happens with ppp + wget ..
>
> You have a voodoo3 or voodoo5 with X4, and the DRI X4 module loaded.
>
> Or am I wrong?
>
> Zephaniah E. Hull.
> >
> > --
> > ____/| Ragnar Højland Freedom - Linux - OpenGL Fingerprint 94C4B
> > \ o.O| 2F0D27DE025BE2302C
> > =(_)= "Thou shalt not follow the NULL pointer for 104B78C56 B72F0822
> > U chaos and madness await thee at its end." hkp://keys.pgp.com
> >
> > Handle via comment channels only.

2000-11-03 20:59:53

by Pavel Machek

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

Hi!

> The patch I sent fully implements O_SYNC (actually, it implements
> O_DSYNC, which is allowed to skip the inode sync if the only attribute
> which has changed is the timestamps) and fdatasync. It's easy for me
> to make the DSYNC selectable via sysctl for full SU compliance, and I
> know of other unixes that already do this --- you really don't want
> existing database applications suddenly to start seeking to the inode
> block for every O_SYNC write.

It looks to me like timestamp updates are bounded above at once per second,
no? So this should not be a (big) issue.
Pavel
--
I'm [email protected]. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [email protected]

2000-11-04 20:49:50

by Marco d'Itri

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

On Nov 02, "Stephen C. Tweedie" <[email protected]> wrote:

>2.2 O_SYNC is actually broken too --- it doesn't sync all metadata (in
>particular, it doesn't update the inode), but I'd rather fix that for
>2.4 rather than change 2.2, as the main users of O_SYNC, databases,
>are writing to preallocated files anyway.
What about fsync(2)? Will it update metadata too?

--
ciao,
Marco


2000-11-05 01:45:19

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

Followup to: <[email protected]>
By author: "Marco d'Itri" <[email protected]>
In newsgroup: linux.dev.kernel
>
> On Nov 02, "Stephen C. Tweedie" <[email protected]> wrote:
>
> >2.2 O_SYNC is actually broken too --- it doesn't sync all metadata (in
> >particular, it doesn't update the inode), but I'd rather fix that for
> >2.4 rather than change 2.2, as the main users of O_SYNC, databases,
> >are writing to preallocated files anyway.
> What about fsync(2)? Will it update metadata too?
>

It better. fdatasync(), if implemented, is allowed to skip that
requirement.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt

2000-11-06 12:58:06

by Stephen C. Tweedie

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

Hi,

On Sat, Nov 04, 2000 at 07:49:37PM +0100, Marco d'Itri wrote:
> On Nov 02, "Stephen C. Tweedie" <[email protected]> wrote:
>
> >2.2 O_SYNC is actually broken too --- it doesn't sync all metadata (in
> >particular, it doesn't update the inode), but I'd rather fix that for
> >2.4 rather than change 2.2, as the main users of O_SYNC, databases,
> >are writing to preallocated files anyway.
> What about fsync(2)? Will it update metadata too?

Yes --- fsync has never had that problem.

--Stephen

2000-11-07 11:14:11

by Ragnar Hojland Espinosa

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

On Fri, Nov 03, 2000 at 01:25:10AM -0500, [email protected] wrote:
> > Just happened with test10, same circumstances .. font map got corrupted, and
> > noise on the screen. Switching back and forth from X to a vc fixed it, tho.
> >
> > Sort of amusing that it (apparently) only happens with ppp + wget ..
>
> You have a voodoo3 or voodoo5 with X4, and the DRI X4 module loaded.
>
> Or am I wrong?

v3.. bingo :)

--
____/| Ragnar Højland Freedom - Linux - OpenGL Fingerprint 94C4B
\ o.O| 2F0D27DE025BE2302C
=(_)= "Thou shalt not follow the NULL pointer for 104B78C56 B72F0822
U chaos and madness await thee at its end." hkp://keys.pgp.com

Handle via comment channels only.

2000-11-07 18:39:38

by Zephaniah E. Hull

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

On Tue, Nov 07, 2000 at 11:22:37AM +0100, Ragnar Hojland Espinosa wrote:
<snip>
> > You have a voodoo3 or voodoo5 with X4, and the DRI X4 module loaded.
> >
> > Or am I wrong?
>
> v3.. bingo :)

Comment out the 'Load "dri"' line from /etc/X11/XF86Config-4, I'm
working on debugging the problems.

Zephaniah E. Hull.
>
> --
> ____/| Ragnar Højland Freedom - Linux - OpenGL Fingerprint 94C4B
> \ o.O| 2F0D27DE025BE2302C
> =(_)= "Thou shalt not follow the NULL pointer for 104B78C56 B72F0822
> U chaos and madness await thee at its end." hkp://keys.pgp.com
>
> Handle via comment channels only.
>

--
PGP EA5198D1-Zephaniah E. Hull <[email protected]>-GPG E65A7801
Keys available at http://whitestar.soark.net/~warp/public_keys.
CCs of replies from mailing lists are encouraged.

I am an "expert". Fear me, for I will wreak untold damage upon anything
I can get my grubby hands on.
-- Matt McLeod on ASR.


Attachments:
(No filename) (0.99 kB)
(No filename) (232.00 B)

2000-11-09 16:48:38

by Stephen C. Tweedie

[permalink] [raw]
Subject: Re: Linux-2.4.0-test10

Hi,

On Sat, Nov 04, 2000 at 07:49:37PM +0100, Marco d'Itri wrote:
> On Nov 02, "Stephen C. Tweedie" <[email protected]> wrote:
>
> >2.2 O_SYNC is actually broken too --- it doesn't sync all metadata (in
> >particular, it doesn't update the inode), but I'd rather fix that for
> >2.4 rather than change 2.2, as the main users of O_SYNC, databases,
> >are writing to preallocated files anyway.
> What about fsync(2)? Will it update metadata too?

Always. fdatasync() is permitted to skip timestamp updates, but
fsync() is not.

Cheers,
Stephen

2000-12-09 07:18:23

by M.H.VanLeeuwen

[permalink] [raw]
Subject: another buffer.c:827 BUG, RAID1 reconstruction.

Hi,

I got this BUG report after test12-pre7 soft-locked on my NFS server with
all nfsd's in D state. I had to reboot, and the system was rebuilding the
IDE RAID1 arrays.

NFS client test12-pre7 was rebooted as well, root logged in, and ran ldconfig

NFS server BUG'd out

Hand-copied oops; I hope not too much of it is wrong.

Martin

heli:~$ ksymoops -K -L -O -m /boot/System.map-2.4.0-test12 < buffer.bug
ksymoops 2.3.5 on i586 2.2.17-RAID. Options used
-V (default)
-K (specified)
-L (specified)
-O (specified)
-m /boot/System.map-2.4.0-test12 (specified)

Kernel BUG at buffer.c 827
invalid operand: 0000
CPU: 0
EIP: 0010:[<c012ca53>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010082
eax: 0000001c ebx: c10837d0 ecx: 00000000 edx: 00000000
esi: c1be2e40 edi: 00000002 ebp: c1be2e88 esp: c0275ea8
ds: 0018 es: 0018 ss: 0018
Stack: c02151a5 c021545a 0000033b 0001e2ba 00000046 c13c7be0 cabe2e40 c011c226c
c1b32e40 00000001 c13c7be0 c13c7bf8 00000001 00000001 c01c2392 c13c7be0
00000001 c13c7b88 c11d3780 00000002 c0174c39 c13c7bf8 00000001 c11d3780
Call Trace: [<c0215195>] [<c021545a>] [<c01c226e>] [<c01c2399>] [<c0174c39>] [<c019db3b>] [<c01a60b4>]
[<c019f367>] [<c01a6050>] [<c010a04f>] [<c010a1ae>] [<c01071f0>] [<ffffe000>] [<c0108f20>] [<c0107180>]
[<ffffe000>] [<c0107213>] [<c0107277>] [<c0105000>] [<c0100191>]
code: 0f 06 83 c4 0c 8d 73 28 8d 43 2c 39 43 2c 74 15 b9 01 00 00

>>EIP; c012ca53 <end_buffer_io_async+c7/f4> <=====
Trace; c0215195 <tvecs+318d/19d24>
Trace; c021545a <tvecs+3452/19d24>
Trace; c01c226e <raid1_end_bh_io+7e/110>
Trace; c01c2399 <raid1_end_request+99/a0>
Trace; c0174c39 <end_that_request_first+65/c4>
Trace; c019db3b <ide_end_request+27/74>
Trace; c01a60b4 <ide_dma_intr+64/9c>
Trace; c019f367 <ide_intr+fb/150>
Trace; c01a6050 <ide_dma_intr+0/9c>
Trace; c010a04f <handle_IRQ_event+2f/58>
Trace; c010a1ae <do_IRQ+6e/b0>
Trace; c01071f0 <default_idle+0/28>
Trace; ffffe000 <END_OF_CODE+3fd2af28/????>
Trace; c0108f20 <ret_from_intr+0/20>
Trace; c0107180 <init+a4/104>
Trace; ffffe000 <END_OF_CODE+3fd2af28/????>
Trace; c0107213 <default_idle+23/28>
Trace; c0107277 <cpu_idle+3f/54>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c0100191 <L6+0/2>
Code; c012ca53 <end_buffer_io_async+c7/f4>
00000000 <_EIP>:
Code; c012ca53 <end_buffer_io_async+c7/f4> <=====
0: 0f 06 clts <=====
Code; c012ca55 <end_buffer_io_async+c9/f4>
2: 83 c4 0c add $0xc,%esp
Code; c012ca58 <end_buffer_io_async+cc/f4>
5: 8d 73 28 lea 0x28(%ebx),%esi
Code; c012ca5b <end_buffer_io_async+cf/f4>
8: 8d 43 2c lea 0x2c(%ebx),%eax
Code; c012ca5e <end_buffer_io_async+d2/f4>
b: 39 43 2c cmp %eax,0x2c(%ebx)
Code; c012ca61 <end_buffer_io_async+d5/f4>
e: 74 15 je 25 <_EIP+0x25> c012ca78 <end_buffer_io_async+ec/f4>
Code; c012ca63 <end_buffer_io_async+d7/f4>
10: b9 01 00 00 00 mov $0x1,%ecx

Aiee, killing interrupt handler
heli:~$

2000-12-09 07:41:53

by M.H.VanLeeuwen

[permalink] [raw]
Subject: Cache problems on test12-pre?

Hi,

I've noticed weird compile-time failures etc. on test12-pre7, especially
when running more than 2 simultaneous processes...

The most noticeable symptom is the time it takes to run ldconfig: after
the first run, test11 takes less than 1 second, while test12-pre7 takes
~40 seconds.

Both were run immediately after reboot on a completely idle system.
This system has almost exclusively NFS-mounted file systems.

Is this a known problem?

Martin

Here is test11 (left) vs test12-pre7 (right):

test11                                        test12-pre7
shadow:~# time ldconfig                       shadow:~# time ldconfig

real 0m35.881s                                real 0m43.979s
user 0m0.120s                                 user 0m0.090s
sys 0m0.890s                                  sys 0m0.980s
shadow:~# time ldconfig                       shadow:~# time ldconfig

real 0m0.870s             <<<<-------->>>>    real 0m38.702s
user 0m0.040s                                 user 0m0.120s
sys 0m0.230s                                  sys 0m0.980s
shadow:~# time ldconfig                       shadow:~# time ldconfig

real 0m0.345s             <<<<-------->>>>    real 0m40.181s
user 0m0.040s                                 user 0m0.130s
sys 0m0.070s                                  sys 0m0.900s
shadow:~# time ldconfig                       shadow:~# time ldconfig

real 0m0.098s             <<<<-------->>>>    real 0m39.108s
user 0m0.050s                                 user 0m0.110s
sys 0m0.050s                                  sys 0m1.180s