Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id ; Sat, 3 Feb 2001 05:29:56 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id ; Sat, 3 Feb 2001 05:29:47 -0500
Received: from parcelfarce.linux.theplanet.co.uk ([195.92.249.252]:23305
	"EHLO www.linux.org.uk") by vger.kernel.org with ESMTP
	id ; Sat, 3 Feb 2001 05:29:38 -0500
From: Russell King
Message-Id: <200102031023.f13ANlw16296@flint.arm.linux.org.uk>
Subject: Re: Version 2.4.1 has ext2 problems.
To: root@chaos.analogic.com
Date: Sat, 3 Feb 2001 10:23:47 +0000 (GMT)
Cc: linux-kernel@vger.kernel.org (Linux kernel)
In-Reply-To: from "Richard B. Johnson" at Feb 02, 2001 02:44:51 PM
X-Location: london.england.earth.mulky-way.universe
X-Mailer: ELM [version 2.5 PL3]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Richard B. Johnson writes:
> Files generated by e2fsck in lost+found cannot be removed.
> # rm *
> rm: cannot remove `#1006': Value too large for defined data type

Well, I can say that this isn't an isolated incident. I was hitting 2.4.1
hard last night on ARM, and ended up losing my /usr and /var mountpoints
and a few other files to this exact corruption.

I resorted to using debugfs to remove these entries, and re-running e2fsck.

Oh, the other interesting thing about it was that they had random modes
(eg, 1066440) - e2fsck also complained about a large number of errors on
the affected inodes (eg, various fields of the inode structure which should
be zero, d_time stuff, etc).

Sorry, don't have the e2fsck logs, and I'm reluctant to try to reproduce it.

I've been wondering if the ARMv3 implementation of insw/outsw is broken
(yes, it's running in PIO only), hence I haven't reported it until now, but
it seemed to check out last night.

Maybe this problem and my random process SEGV problem are connected in some
way.

Basically, I was trying to track down a problem with processes getting
SEGV'd when swap partitions were enabled. I ended up with init in a loop
panicking about SEGVs. It turns out that the wrong page had been paged back
into the binary, and therefore glibc's __environ pointer was corrupted.
Specifically, the page that was placed there was the immediately preceding
page.

I know that other people have been seeing weird effects on 2.4.1 with
corrupted zero pages, but I don't think this is my problem.

--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html