2005-01-14 18:16:01

by David Greaves

Subject: Re: XFS: inode with st_mode == 0


Christoph Hellwig wrote:

>>> We have applied these two patches to 2.6.10-rc2, but this
>>>does not help. A few minutes ago I've got the "?----------" file
>>>again from my test script. This time it took >4 hours (it was
>>>about an hour or so without this patch).
>>>
>>>
I'm seeing this problem occasionally too.

I'm running 2.6.10

The patch you provided below can be made to apply but I get:
LD .tmp_vmlinux1
fs/built-in.o(.text+0xa0ed3): In function `linvfs_decode_fh':
: undefined reference to `find_exported_dentry'

I assume that since you say:

> Btw, any chance you could
>try XFS CVS (which is at 2.6.9) + the patch below instead of plain 2.6.9,
>there have been various other fixes in the last months.
>
>
That not all the changes in XFS CVS have made it to 2.6.10?

Is there a 2.6.10 patch that I could apply? Or do you have any other
suggestions?

David

>
>Index: fs/xfs/xfs_vfsops.c
>===================================================================
>RCS file: /cvs/linux-2.6-xfs/fs/xfs/xfs_vfsops.c,v
>retrieving revision 1.459
>diff -u -p -r1.459 xfs_vfsops.c
>--- fs/xfs/xfs_vfsops.c 15 Dec 2004 04:56:58 -0000 1.459
>+++ fs/xfs/xfs_vfsops.c 16 Dec 2004 20:47:22 -0000
>@@ -1581,7 +1581,7 @@ xfs_syncsub(
> }
>
> /*
>- * xfs_vget - called by DMAPI to get vnode from file handle
>+ * xfs_vget - called by DMAPI and NFSD to get vnode from file handle
> */
> STATIC int
> xfs_vget(
>@@ -1623,7 +1623,7 @@ xfs_vget(
> return XFS_ERROR(EIO);
> }
>
>- if (ip->i_d.di_mode == 0 || (igen && (ip->i_d.di_gen != igen))) {
>+ if (ip->i_d.di_mode == 0 || ip->i_d.di_gen != igen) {
> xfs_iput_new(ip, XFS_ILOCK_SHARED);
> *vpp = NULL;
> return XFS_ERROR(ENOENT);
>Index: fs/xfs/linux-2.6/xfs_super.c
>===================================================================
>RCS file: /cvs/linux-2.6-xfs/fs/xfs/linux-2.6/xfs_super.c,v
>retrieving revision 1.321
>diff -u -p -r1.321 xfs_super.c
>--- fs/xfs/linux-2.6/xfs_super.c 9 Dec 2004 02:41:20 -0000 1.321
>+++ fs/xfs/linux-2.6/xfs_super.c 16 Dec 2004 20:47:23 -0000
>@@ -731,6 +731,39 @@ linvfs_get_dentry(
> return result;
> }
>
>+STATIC struct dentry *
>+linvfs_decode_fh(
>+ struct super_block *sb,
>+ __u32 *fh,
>+ int fh_len,
>+ int fileid_type,
>+ int (*acceptable)(
>+ void *context,
>+ struct dentry *de),
>+ void *context)
>+{
>+ __u32 parent[2];
>+ parent[0] = parent[1] = 0;
>+
>+ if (fh_len < 2 || fileid_type > 2)
>+ return NULL;
>+
>+ if (fileid_type == 2 && fh_len > 2) {
>+ if (fh_len == 3) {
>+ printk(KERN_WARNING
>+ "XFS: detected filehandle without "
>+ "parent inode generation information.");
>+ return ERR_PTR(-ESTALE);
>+ }
>+
>+ parent[0] = fh[2];
>+ parent[1] = fh[3];
>+ }
>+
>+ return find_exported_dentry(sb, fh, parent, acceptable, context);
>+
>+}
>+
> STATIC int
> linvfs_show_options(
> struct seq_file *m,
>@@ -893,6 +926,7 @@ linvfs_get_sb(
>
>
> STATIC struct export_operations linvfs_export_ops = {
>+ .decode_fh = linvfs_decode_fh,
> .get_parent = linvfs_get_parent,
> .get_dentry = linvfs_get_dentry,
> };
>Index: include/linux/fs.h
>===================================================================
>RCS file: /cvs/linux-2.6-xfs/include/linux/fs.h,v
>retrieving revision 1.11
>diff -u -p -r1.11 fs.h
>--- include/linux/fs.h 1 Oct 2004 15:10:15 -0000 1.11
>+++ include/linux/fs.h 16 Dec 2004 20:47:25 -0000
>@@ -1115,6 +1115,10 @@ struct export_operations {
>
> };
>
>+extern struct dentry *
>+find_exported_dentry(struct super_block *sb, void *obj, void *parent,
>+ int (*acceptable)(void *context, struct dentry *de),
>+ void *context);
>
> struct file_system_type {
> const char *name;

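A minimal userspace sketch of the two checks in the patch quoted above, for anyone trying to follow what it changes. This is a model only, not kernel code: the names classify_fh, fh_parent, fh_verdict and the inode_is_stale_* helpers are made up for illustration, and the reading of fh[2]/fh[3] as parent inode number and parent generation is an assumption based on the warning text in the patch.

```c
#include <stdint.h>

/* Hypothetical verdicts standing in for the kernel return values. */
enum fh_verdict {
    FH_REJECT,  /* the patch returns NULL */
    FH_STALE,   /* the patch returns ERR_PTR(-ESTALE) */
    FH_OK       /* the patch goes on to call find_exported_dentry() */
};

/* Assumed meaning of parent[0] and parent[1] in the patch. */
struct fh_parent {
    uint32_t ino;
    uint32_t gen;
};

/*
 * Mirrors the length and type checks of linvfs_decode_fh.
 * fh_len counts 32-bit words, as in the patch.
 */
static enum fh_verdict
classify_fh(const uint32_t *fh, int fh_len, int fileid_type,
            struct fh_parent *parent)
{
    parent->ino = parent->gen = 0;

    if (fh_len < 2 || fileid_type > 2)
        return FH_REJECT;

    if (fileid_type == 2 && fh_len > 2) {
        /* Three words: parent inode present, parent generation missing. */
        if (fh_len == 3)
            return FH_STALE;
        parent->ino = fh[2];
        parent->gen = fh[3];
    }
    return FH_OK;
}

/* Condition in xfs_vget before the patch: igen == 0 skips the gen check. */
static int
inode_is_stale_old(unsigned di_mode, uint32_t di_gen, uint32_t igen)
{
    return di_mode == 0 || (igen && di_gen != igen);
}

/* Condition in xfs_vget after the patch: the generation must always match. */
static int
inode_is_stale_new(unsigned di_mode, uint32_t di_gen, uint32_t igen)
{
    return di_mode == 0 || di_gen != igen;
}
```

With a live inode (non-zero mode) at generation 7 and a handle carrying generation 0, inode_is_stale_old() reports the inode as valid while inode_is_stale_new() reports it as stale. That is the only case where the two conditions differ.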

2005-01-14 18:23:25

by Jakob Oestergaard

Subject: Re: XFS: inode with st_mode == 0

On Fri, Jan 14, 2005 at 06:14:55PM +0000, David Greaves wrote:
>
...
> >try XFS CVS (which is at 2.6.9) + the patch below instead of plain 2.6.9,
> >there have been various other fixes in the last months.
> >
> >
> That not all the changes in XFS CVS have made it to 2.6.10?
>
> Is there a 2.6.10 patch that I could apply? Or do you have any other
> suggestions.

AFAIK the best you can do is to get the most recent XFS kernel from
SGI's CVS (this one is based on 2.6.10).

If you run that kernel, then most of the former problems will be gone;
*) I only have one undeletable directory on my system - so it seems that
this error is no longer common ;)
*) 2.6.10 apparently fixes the knfsd stale handle problem
*) I no longer see the weird directory/file/??? mode problems or random
ownership assignment

So apart from the general well known instability problems that will
occur when you actually start *using* the system, there should be no
major problems running XFS+SMP+NFS on SGI's 2.6.10 based kernel ;)

--

/ jakob

2005-01-15 02:12:23

by Nathan Scott

Subject: Re: XFS: inode with st_mode == 0

On Fri, Jan 14, 2005 at 07:23:09PM +0100, Jakob Oestergaard wrote:
> > Is there a 2.6.10 patch that I could apply? Or do you have any other
> > suggestions.
>
> AFAIK the best you can do is to get the most recent XFS kernel from
> SGI's CVS (this one is based on 2.6.10).

The -mm tree also has these fixes; we'll get them merged into
mainline soon.

> If you run that kernel, then most of the former problems will be gone;
> *) I only have one undeletable directory on my system - so it seems that
> this error is no longer common ;)

You may need to run xfs_repair to clean that up..? Or does
the problem persist after a repair?

cheers.

--
Nathan

2005-01-16 13:51:16

by Christoph Hellwig

Subject: Re: XFS: inode with st_mode == 0

On Fri, Jan 14, 2005 at 07:23:09PM +0100, Jakob Oestergaard wrote:
> So apart from the general well known instability problems that will
> occur when you actually start *using* the system, there should be no

What known instabilities?

2005-01-17 00:53:39

by Jakob Oestergaard

Subject: Re: XFS: inode with st_mode == 0

On Sat, Jan 15, 2005 at 01:09:08PM +1100, Nathan Scott wrote:
...
> > AFAIK the best you can do is to get the most recent XFS kernel from
> > SGI's CVS (this one is based on 2.6.10).
>
> The -mm tree also has these fixes; we'll get them merged into
> mainline soon.

Okeydokey - good

>
> > If you run that kernel, then most of the former problems will be gone;
> > *) I only have one undeletable directory on my system - so it seems that
> > this error is no longer common ;)
>
> You may need to run xfs_repair to clean that up..? Or does
> the problem persist after a repair?

I'm running Debian Woody - the xfs_check/xfs_repair there didn't seem to
find anything last I tried. I have not re-checked for this last problem
though.

I figured I might need to run the CVS version of xfs tools, and, well,
me being busy and all, I thought I'd just leave the 'delete_me'
directory hanging until some time I got more time on my hands ;)

--

/ jakob

2005-01-17 10:11:30

by Jakob Oestergaard

Subject: Re: XFS: inode with st_mode == 0

On Sun, Jan 16, 2005 at 01:51:12PM +0000, Christoph Hellwig wrote:
> On Fri, Jan 14, 2005 at 07:23:09PM +0100, Jakob Oestergaard wrote:
> > So apart from the general well known instability problems that will
> > occur when you actually start *using* the system, there should be no
>
> What known instabilities?

Where should I begin? ;)

Most of the following have already been posted to LKML - primarily by
Anders ([email protected]) - it seems that no one cares, but I'll repost a
summary that Anders sent me below:

-------
Scenario 1: Mailservers:
2.6.10 (~24-40 hours uptime):
Running ext3 on mailqueue:

<SNIP>
Unable to handle kernel NULL pointer dereference at virtual address 00000004
printing eip:
c018a095
*pde = 00000000
Oops: 0002 [#1]
SMP
Modules linked in: nfs e1000 iptable_nat ipt_connlimit rtc
CPU: 2
EIP: 0060:[<c018a095>] Not tainted
EFLAGS: 00010286 (2.6.8.1)
EIP is at journal_commit_transaction+0x535/0x10e5
eax: cac1e26c ebx: 00000000 ecx: f7cec400 edx: f7cec400
esi: f65f3000 edi: cac1e26c ebp: f65f3000 esp: f65f3dc0
ds: 007b es: 007b ss: 0068
Process kjournald (pid: 174, threadinfo=f65f3000 task=c2308b70)
Stack: f65f3e64 00000000 00000000 00000000 00000000 00000000 f7cec400 cda565fc
0000149a 00000004 f65f3e48 c01132d8 00000002 c202ad20 00000001 f65f3e5c
c202ad20 c202ad20 00000002 00000001 0000001e 01c1af60 f65f3e68 c0407dc0
Call Trace:
[<c01132d8>] scheduler_tick+0x468/0x470
[<c01127b5>] find_busiest_group+0x105/0x310
[<c011db8e>] del_timer_sync+0x7e/0xa0
[<c018cd4d>] kjournald+0xbd/0x230
[<c0114b10>] autoremove_wake_function+0x0/0x40
[<c0114b10>] autoremove_wake_function+0x0/0x40
[<c0103f16>] ret_from_fork+0x6/0x14
[<c018cc70>] commit_timeout+0x0/0x10
[<c018cc90>] kjournald+0x0/0x230
[<c01024bd>] kernel_thread_helper+0x5/0x18
Code: f0 ff 43 04 8b 03 83 e0 04 74 4c 8b 8c 24 b8 01 00 00 c6 81
<2>SoftDog: Initiating system reboot
</SNIP>

-------
Scenario 2: Mailservers:
Running XFS on mailqueue:

<SNIP>
Filesystem "sdb1": xfs_trans_delete_ail: attempting to delete a log item that
is not in the AIL
xfs_force_shutdown(sdb1,0x8) called from line 382 of file
fs/xfs/xfs_trans_ail.c. Return address = 0xc0216a56
@Linux version 2.6.9 ([email protected]) (gcc version 2.96 20000731 (Red
Hat Linux 7.3 2.96-113)) #1 SMP Tue Oct 19 16:04:55 CEST 2004
</SNIP>


=======
Resolution to the mailserver problem:
2.4.28 is perfectly stable on these machines.

-------
Scenario 3: Webservers:

2.6.10, 2.6.10-ac8 (~3-12 hours uptime):

<SNIP>
Unable to handle kernel paging request
<2>SoftDog: Initiating system reboot.
</SNIP>
(No more...) :(

=======
Resolution to the webserver problem:
2.4.28/2.4.29-rc2 are stable here

-------
Scenario 4: Storageservers:
2.6.8.1:
Oopses after ~5-10 hours with SMP on. - Cannot find the actual Oopses
anymore and 2.6.8+ hasn't been tested as we cannot afford any more downtime on
these servers.


=======
Resolution to the storage server problem:
2.6.8.1 UP is stable (but oopses regularly after memory allocation
failures)



Hardware on all servers: IBM x335 and x345.

Mentioned errors seen on a total of 17 servers.

--

/ jakob

2005-01-17 11:56:30

by Jan-Frode Myklebust

Subject: Re: XFS: inode with st_mode == 0

On Mon, Jan 17, 2005 at 11:07:46AM +0100, Jakob Oestergaard wrote:
>
> Where should I begin? ;)

Guess we've been struggling with much of the same problems..

> -------
> Scenario 2: Mailservers:
> Running XFS on mailqueue:

The 2.6.10-1.737_FC3 + 's/posix_lock_file/posix_lock_file_wait/' on
fs/nfs/file.c seems stable on our mailserver running XFS on
mail queue and spool (mbox). 4 days of uptime!

>
> =======
> Resolution to the storage server problem:
> 2.6.8.1 UP is stable (but oopses regularly after memory allocation
> failures)

My XFS-fileserver ran 2.6.9-rc3 stable since October 25. Got lots of
"possible deadlock in kmem_alloc (mode:0xd0)" this weekend, so I
upgraded to plain 2.6.10. Seems OK so far.

>
> Hardware on all servers: IBM x335 and x345.

Mail servers: Dell 2650, IBM ServeRAID 6M, EXP400.
File servers: IBM x330, qla2300, infortrend eonstor.

All running Whitebox/centos RHEL clone.


-jf

2005-01-17 13:48:29

by Anders Saaby

Subject: Re: XFS: inode with st_mode == 0

Hi,

On Monday 17 January 2005 12:55, Jan-Frode Myklebust wrote:
>
> Guess we've been struggling with much of the same problems..

Seems like it. :)

> > -------
> > Scenario 2: Mailservers:
> > Running XFS on mailqueue:
>
> The 2.6.10-1.737_FC3 + 's/posix_lock_file/posix_lock_file_wait/' on
> fs/nfs/file.c seems stable on our mailserver running XFS on
> mail queue and spool (mbox). 4 days of uptime!

Yes - We had those errors too:

"Kernel panic - not syncing: Attempting to free lock with active block list"

- on 2.6.10 on the webservers, which was fixed with that particular patch. But
this is a different error, as our mailservers don't act as NFS clients. All
use local XFS.

Sad thing is that the mailservers crash every 10-20 hours on 2.6.x, but I'm
not able to reproduce it in a test environment, and at the time of the original
post to LKML no one was able to do anything about it without a reproducible
testcase. :(

> > =======
> > Resolution to the storage server problem:
> > 2.6.8.1 UP is stable (but oopses regularly after memory allocation
> > failures)
>
> My XFS-fileserver ran 2.6.9-rc3 stable since October 25. Got lots of
> "possible deadlock in kmem_alloc (mode:0xd0)" this weekend, so I
> upgraded to plain 2.6.10. Seems OK so far.
>

OK, as far as I remember, we had the same messages in the kernel log when
running with SMP.

--
Med venlig hilsen - Best regards - Meilleures salutations

Anders Saaby
Systems Engineer
------------------------------------------------
Cohaesio A/S - Maglebjergvej 5D - DK-2800 Lyngby
Phone: +45 45 880 888 - Fax: +45 45 880 777
Mail: [email protected] - http://www.cohaesio.com
------------------------------------------------

2005-01-17 21:31:22

by Jeffrey Hundstad

Subject: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0

For more of this look up subjects:
Bad things happening to journaled filesystem machines
Oops in kjournald
and from author:
Anders Saaby

I also can't keep a recent 2.6 or 2.6*-ac* kernel up more than a few
hours on a machine under real load. Perhaps us folks with the problem
need to talk to the powers that be to come up with a strategy to make a
report they can use. My guess is we're not sending something that can
be used.

--
jeffrey hundstad



2005-01-17 22:43:30

by Alan

Subject: Re: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0

On Mon, 2005-01-17 at 21:31, Jeffrey Hundstad wrote:
> I also can't keep a recent 2.6 or 2.6*-ac* kernel up more than a few
> hours on a machine under real load. Perhaps us folks with the problem
> need to talk to the powers that be to come up with a strategy to make a
> report they can use. My guess is we're not sending something that can
> be used.

I need a way to reproduce it. Preferably on a hardware configuration
that is running 2.6.10-ac10 or later because of the bio and acpi fixes.
I'm not interested in any report including binary drivers and to be
honest the least complex configuration the better. I also care that the
hardware passes memtest86+ !

I also don't care about XFS although Christoph may well do.

Alan

2005-01-20 22:31:06

by Jeffrey Hundstad

Subject: Re: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0

Jeffrey Hundstad wrote:

> For more of this look up subjects:
> Bad things happening to journaled filesystem machines
> Oops in kjournald
> and from author:
> Anders Saaby
>
> I also can't keep a recent 2.6 or 2.6*-ac* kernel up more than a few
> hours on a machine under real load. Perhaps us folks with the
> problem need to talk to the powers that be to come up with a strategy
> to make a report they can use. My guess is we're not sending
> something that can be used.
>
I have found two servers in my operation that seem to do quite well on
linux-2.6.7. So I believe the brokenness is after this point and before
linux-2.6.8.1.

...so far I'm not seeing problems after two days with
linux-2.6.10-ac10. I'm still crossing my fingers and knocking on wood.

--
jeffrey hundstad

2005-01-25 12:47:59

by Stephen C. Tweedie

Subject: Re: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0

Hi,

On Mon, 2005-01-17 at 21:31, Jeffrey Hundstad wrote:
> For more of this look up subjects:
> Bad things happening to journaled filesystem machines
> Oops in kjournald

That seems to have been due to the xattr problems recently fixed in
Linus's tree. The xattr race was allowing one process to delete an
unshared xattr block while another was trying to share it, and the
journaling code was getting upset when the second process then tried to
commit the now-deleted block.

--Stephen

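The interleaving described above can be replayed in a small single-threaded C model. This is only an illustration of why "look up the block" and "take a reference" must not be separated by a window in which the last owner can free it. It is not the ext3 code and not the fix that went into Linus's tree, and every name in it (xblock, lookup_live, release, share_unchecked, share_checked, replay_race) is hypothetical.

```c
#include <stdbool.h>

/* Toy stand-in for an xattr block with a reference count. */
struct xblock {
    int  refcount;
    bool freed;
};

/* Sharer, step 1: is this block a usable candidate for sharing? */
static bool lookup_live(const struct xblock *b)
{
    return !b->freed && b->refcount > 0;
}

/* Owner: drop one reference; the last reference frees the block. */
static void release(struct xblock *b)
{
    if (--b->refcount == 0)
        b->freed = true;
}

/* Sharer, step 2 (racy): take a reference based on the earlier lookup. */
static void share_unchecked(struct xblock *b)
{
    b->refcount++;
}

/*
 * Sharer, step 2 (checked): revalidate and take the reference as one
 * step. In real code this would have to run under a lock shared with
 * release(); here the single thread stands in for that lock.
 */
static bool share_checked(struct xblock *b)
{
    if (b->freed || b->refcount == 0)
        return false;
    b->refcount++;
    return true;
}

/*
 * Replays the bad ordering: lookup, then the owner's release, then the
 * sharer's second step. Returns true if the sharer ends up holding a
 * reference to a block that has already been freed.
 */
static bool replay_race(bool recheck)
{
    struct xblock b = { .refcount = 1, .freed = false };
    bool holding = false;

    bool candidate = lookup_live(&b);   /* sharer sees a live block   */
    release(&b);                        /* owner drops the last ref   */

    if (candidate) {
        if (recheck) {
            holding = share_checked(&b);
        } else {
            share_unchecked(&b);
            holding = true;
        }
    }
    return holding && b.freed;
}
```

replay_race(false) ends with the sharer holding a freed block, which corresponds to the "commit the now-deleted block" case in the explanation. replay_race(true) ends with the share attempt refused.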

2005-01-25 15:12:21

by Jeffrey Hundstad

Subject: Re: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0

Stephen C. Tweedie wrote:

>Hi,
>
>On Mon, 2005-01-17 at 21:31, Jeffrey Hundstad wrote:
>
>
>>For more of this look up subjects:
>> Bad things happening to journaled filesystem machines
>> Oops in kjournald
>>
>>
>
>That seems to have been due to the xattr problems recently fixed in
>Linus's tree. The xattr race was allowing one process to delete an
>unshared xattr block while another was trying to share it, and the
>journaling code was getting upset when the second process then tried to
>commit the now-deleted block.
>
>

Thanks for the update.

I wonder if there are several problems. Alan Cox claimed that there was
a fix in linux-2.6.10-ac10 that might alleviate the problem.

On linux-2.6.10-ac10 I've got one machine that's been up for 6 days now
that would never last more than 1 before. On the other hand I have one
machine that did die after two days.

Does linux-2.6.11-rc2 have both the linux-2.6.10-ac10 fix and the xattr
problem fixed? If so, I'll test there.

--
Jeffrey Hundstad

2005-01-25 15:39:01

by Stephen C. Tweedie

Subject: Re: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0

Hi,

On Tue, 2005-01-25 at 15:09, Jeffrey Hundstad wrote:

> >> Bad things happening to journaled filesystem machines
> >> Oops in kjournald

> I wonder if there are several problems. Alan Cox claimed that there was
> a fix in linux-2.6.10-ac10 that might alleviate the problem.

I'm not sure --- there are a couple of bio/bh-related fixes in that
patch, but nothing against jbd/ext3 itself.

> Does linux-2.6.11-rc2 have both the linux-2.6.10-ac10 fix and the xattr
> problem fixed?

Not sure about how much of -ac went in, but it has the xattr fix.

--Stephen

2005-01-28 20:24:25

by Jeffrey Hundstad

Subject: Re: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0

Stephen C. Tweedie wrote:

>Hi,
>
>On Tue, 2005-01-25 at 15:09, Jeffrey Hundstad wrote:
>
>
>
>>>> Bad things happening to journaled filesystem machines
>>>> Oops in kjournald
>>>>
>>>>
>
>
>
>>I wonder if there are several problems. Alan Cox claimed that there was
>>a fix in linux-2.6.10-ac10 that might alleviate the problem.
>>
>>
>
>I'm not sure --- there are a couple of bio/bh-related fixes in that
>patch, but nothing against jbd/ext3 itself.
>
>
>
>>Does linux-2.6.11-rc2 have both the linux-2.6.10-ac10 fix and the xattr
>>problem fixed?
>>
>>
>
>Not sure about how much of -ac went in, but it has the xattr fix.
>
>--Stephen
>
>
>

I've had my machine that would crash daily if not hourly stay up for 10
days now. This is with the linux-2.6.10-ac10 kernel. I was wondering
if anyone else is having similar results.

2005-01-28 21:09:06

by Stephen C. Tweedie

Subject: Re: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0

Hi,

On Fri, 2005-01-28 at 20:15, Jeffrey E. Hundstad wrote:

> >>Does linux-2.6.11-rc2 have both the linux-2.6.10-ac10 fix and the xattr
> >>problem fixed?

> >Not sure about how much of -ac went in, but it has the xattr fix.

> I've had my machine that would crash daily if not hourly stay up for 10
> days now. This is with the linux-2.6.10-ac10 kernel.

Good to know. Are you using xattrs extensively (eg. for ACLs, SELinux
or Samba 4)?

--Stephen

2005-01-28 21:13:36

by Jeffrey Hundstad

Subject: Re: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0

Stephen C. Tweedie wrote:

>Hi,
>
>On Fri, 2005-01-28 at 20:15, Jeffrey E. Hundstad wrote:
>
>
>
>>>>Does linux-2.6.11-rc2 have both the linux-2.6.10-ac10 fix and the xattr
>>>>problem fixed?
>>>>
>>>>
>
>
>
>>>Not sure about how much of -ac went in, but it has the xattr fix.
>>>
>>>
>
>
>
>>I've had my machine that would crash daily if not hourly stay up for 10
>>days now. This is with the linux-2.6.10-ac10 kernel.
>>
>>
>
>Good to know. Are you using xattrs extensively (eg. for ACLs, SELinux
>or Samba 4)?
>
>--Stephen
>
>
>
On the machines that were having problems we really weren't using them
for anything. I think I may have been running into the BIO problem that
was fixed in 2.6.10-ac10.