These are patches designed to improve system responsiveness with
specific emphasis on the desktop, but configurable to any workload.
http://ck.kolivas.org/patches/2.6/2.6.8.1/2.6.8.1-ck7/patch-2.6.8.1-ck7.bz2
web:
http://kernel.kolivas.org
all patches:
http://ck.kolivas.org/patches/
Split patches and a server-specific patch are available.
Added since ck5 (last announced release):
-change_reiser4_config.diff
Make reiser4 default to off, and not configurable when 4K stacks are enabled
-lenient_uw.diff
The mapped watermark was too aggressive about freeing RAM. The balance is
about right with the watermark set just above the "high" mark used by kswapd
-s8.1_test1
-s8.1test1_test2
-s8.1_smtfix
Numerous small fixes to keep the staircase code's "burst" mechanism from
being fooled. This helps simulated hardware (wine, mame, zsnes) a lot.
-write-barriers.patch
Filesystem write barriers added for cfq2
-cfq_iosched_v2.patch
Updated Completely Fair Queueing I/O scheduler by Jens Axboe
-vesafb-tng-0.9-rc4-r2-2.6.8.1.patch
New VESA framebuffer offering more console modes and on-the-fly resolution
changes
-vesafb_change_config.diff
Make the default off for compatibility purposes.
-back_journal_clean_checkpoint_list-latency-fix.patch
A small latency hack by akpm was found to be questionable; back it out.
-2.6.8.1-ck7-version.diff
Changed:
-supermount-ng205.diff
Compile fixes for gcc 3.4+
Full patch list:
from_2.6.8.1_to_staircase8.0
schedrange.diff
schedbatch2.4.diff
schediso2.5.diff
sched-adjust-p4gain
mapped_watermark.diff
defaultcfq.diff
config_hz.diff
1g_lowmem_i386.diff
akpm-latency-fix.patch
9000-SuSE-117-writeback-lat.patch
cddvd-cmdfilter-drop.patch
cool-spinlocks-i386.diff
bio_uncopy_user-mem-leak.patch
bio_uncopy_user2.diff
ioport-latency-fix-2.6.8.1.patch
supermount-ng205.diff
fbsplash-0.9-r5-2.6.8-rc3.patch
make-tree_lock-an-rwlock.patch
invalidate_inodes-speedup.patch
2.6.8.1-mm2-reiser4.diff
change_reiser4_config.diff
s8.0_s8.1
mapped_watermark_fix.diff
sc_mw.diff
1g_change_config.diff
lenient_uw.diff
s8.1_test1
s8.1test1_test2
s8.1_smtfix
write-barriers.patch
cfq_iosched_v2.patch
vesafb-tng-0.9-rc4-r2-2.6.8.1.patch
vesafb_change_config.diff
back_journal_clean_checkpoint_list-latency-fix.patch
2.6.8.1-ck7-version.diff
Cheers,
Con Kolivas
I upgraded from 2.6.8.1-ck5.
First off - this has been a landmark improvement for me. Running an
"emerge -a world" on my system has gone from a matter of minutes to a
matter of seconds.
The performance has been !outstanding!.
[Disclosure: Using NVIDIA Binary Drivers]
The first night of running I got this:
xfs_fsr: page allocation failure. order:4, mode:0x50
[<c0131174>] __alloc_pages+0x332/0x3f6
[<c0131257>] __get_free_pages+0x1f/0x3b
[<c013404f>] kmem_getpages+0x1d/0xb3
[<c0134b05>] cache_grow+0x96/0x122
[<c0134ccc>] cache_alloc_refill+0x13b/0x1d7
[<c01350d9>] __kmalloc+0x72/0x79
[<c0221144>] kmem_alloc+0x58/0xba
[<c01c546c>] xfs_alloc_log_agf+0x58/0x5c
[<c0221253>] kmem_realloc+0x2b/0x7c
[<c020058e>] xfs_iext_realloc+0xf6/0x13e
[<c01d6967>] xfs_bmap_insert_exlist+0x35/0x8c
[<c01d3f3a>] xfs_bmap_add_extent_hole_real+0x3cd/0x73f
[<c01df64f>] xfs_bmbt_get_state+0x2f/0x3b
[<c01d0eb6>] xfs_bmap_add_extent+0x35b/0x42d
[<c02212f4>] kmem_zone_alloc+0x50/0x96
[<c022136f>] kmem_zone_zalloc+0x35/0x62
[<c01e1070>] xfs_btree_init_cursor+0x2a/0x147
[<c01d888f>] xfs_bmapi+0x66a/0x141b
[<c0214fde>] xfs_trans_tail_ail+0x12/0x28
[<c0206c9e>] xlog_assign_tail_lsn+0x19/0x33
[<c0208cc2>] xlog_state_release_iclog+0x26/0xd1
[<c02065df>] xfs_log_reserve+0xbe/0xc7
[<c0216100>] xfs_trans_iget+0x9c/0x163
[<c021f8bf>] xfs_alloc_file_space+0x425/0x695
[<c0201adf>] xfs_ichgtime+0x10a/0x10c
[<c02208a4>] xfs_change_file_space+0x1e3/0x420
[<c02064e4>] xfs_log_release_iclog+0x21/0x5e
[<c021488b>] xfs_trans_commit+0x219/0x3a7
[<c0130daa>] buffered_rmqueue+0xc8/0x160
[<c0242100>] copy_from_user+0x42/0x6e
[<c0226e6e>] xfs_ioc_space+0x9c/0xb9
[<c0226932>] xfs_ioctl+0x389/0x829
[<c013ac28>] handle_mm_fault+0xd2/0x138
[<c0113230>] do_page_fault+0x13f/0x53d
[<c013c55d>] do_mmap_pgoff+0x591/0x6ac
[<c0225916>] linvfs_ioctl+0x3b/0x47
[<c01573ca>] file_ioctl+0x6a/0x171
[<c01575a0>] sys_ioctl+0xcf/0x203
[<c0105791>] sysenter_past_esp+0x52/0x71
I shut down, ran xfs_repair, and the error has not returned.
Two days later I got this:
Badness in cfq_sort_rr_list at drivers/block/cfq-iosched.c:428
[<c02a82d6>] cfq_add_crq_rb+0x16f/0x17a
[<c02a8fe0>] cfq_enqueue+0x3b/0x6c
[<c02a9108>] cfq_insert_request+0xf7/0x12b
[<c029e4fd>] __elv_add_request+0x45/0x9e
[<c02a134f>] __make_request+0x293/0x4e5
[<c02a16aa>] generic_make_request+0x109/0x18a
[<c012fc97>] mempool_alloc+0x6f/0x11e
[<c0115690>] autoremove_wake_function+0x0/0x57
[<c02a1788>] submit_bio+0x5d/0xfb
[<c014b93b>] bio_add_page+0x34/0x38
[<c02241a6>] _pagebuf_ioapply+0x1b6/0x2a4
[<c0224319>] pagebuf_iorequest+0x85/0x153
[<c01fe1ec>] xfs_xlate_dinode_core+0x15e/0x81d
[<c0114847>] default_wake_function+0x0/0x12
[<c0114847>] default_wake_function+0x0/0x12
[<c022973f>] xfs_bdstrat_cb+0x42/0x48
[<c0223e66>] pagebuf_iostart+0x4e/0xa3
[<c0200fda>] xfs_iflush+0x1ad/0x472
[<c021eef1>] xfs_inode_flush+0x189/0x1fe
[<c022973f>] xfs_bdstrat_cb+0x42/0x48
[<c02153db>] xfs_trans_first_ail+0x16/0x27
[<c0229d89>] linvfs_write_inode+0x32/0x36
[<c01631dc>] write_inode+0x46/0x48
[<c016339c>] __sync_single_inode+0x1be/0x1d0
[<c01635fc>] generic_sync_sb_inodes+0x19d/0x2b1
[<c01637ec>] writeback_inodes+0xaa/0xac
[<c01320c1>] background_writeout+0x71/0xb1
[<c0132b03>] pdflush+0x0/0x2c
[<c0132a40>] __pdflush+0x9c/0x15f
[<c0132b2b>] pdflush+0x28/0x2c
[<c0132050>] background_writeout+0x0/0xb1
[<c0132b03>] pdflush+0x0/0x2c
[<c0126dd1>] kthread+0xa5/0xab
[<c0126d2c>] kthread+0x0/0xab
[<c0103c1d>] kernel_thread_helper+0x5/0xb
Badness in cfq_sort_rr_list at drivers/block/cfq-iosched.c:428
[<c02a82d6>] cfq_add_crq_rb+0x16f/0x17a
[<c02a8fe0>] cfq_enqueue+0x3b/0x6c
[<c02a9108>] cfq_insert_request+0xf7/0x12b
[<c029e4fd>] __elv_add_request+0x45/0x9e
[<c02a134f>] __make_request+0x293/0x4e5
[<c02a16aa>] generic_make_request+0x109/0x18a
[<c012fc97>] mempool_alloc+0x6f/0x11e
[<c0115690>] autoremove_wake_function+0x0/0x57
[<c02a1788>] submit_bio+0x5d/0xfb
[<c01145fe>] scheduler_tick+0x1f/0x268
[<c014b93b>] bio_add_page+0x34/0x38
[<c02241a6>] _pagebuf_ioapply+0x1b6/0x2a4
[<c0224319>] pagebuf_iorequest+0x85/0x153
[<c0114847>] default_wake_function+0x0/0x12
[<c0114847>] default_wake_function+0x0/0x12
[<c022973f>] xfs_bdstrat_cb+0x42/0x48
[<c02248c2>] pagebuf_daemon+0xdc/0x1ce
[<c02247e6>] pagebuf_daemon+0x0/0x1ce
[<c0103c1d>] kernel_thread_helper+0x5/0xb
==
Thread model: posix
gcc version 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)
binutils-2.14.90.0.8
If there is something I can do, please let me know. I have limited
time, but I will do my best to help.
Joshua Schmidlkofer wrote:
> I upgraded from 2.6.8.1-ck5.
>
> First off - this has been a landmark improvement for me. Running an
> "emerge -a world" on my system has gone from a matter of minutes to a
> matter of seconds.
>
> The performance has been !outstanding!. [Disclosure: Using NVIDIA
> Binary Drivers]
Great to hear. Thanks for the feedback.
Not sure about the xfs one... perhaps it's related to the cfq one.
> Badness in cfq_sort_rr_list at drivers/block/cfq-iosched.c:428
Known issue. There is a fix posted already in my ckdev directory (as
posted by Jens Axboe). The stack dump, while annoying (and I believe it
causes a stall for a couple of seconds), is harmless. Please apply the
cfq2 fix in my ckdev directory to make it go away.
http://ck.kolivas.org/patches/2.6/2.6.8.1/2.6.8.1-ckdev/
Cheers,
Con
Con,
I did not mention this before because I thought it was a fluke on my
system. Now it's affecting two systems since applying ck7.
<snip>
hda: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete
DataRequest }ide: failed opcode was 100
hda: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete
DataRequest }ide: failed opcode was 100
hda: CHECK for good STATUS
<snip>
That is happening while applying the dma settings to the hard drive.
In both cases, the drive is a Western Digital 40GB hard drive. That is
the only solid commonality. One is a P4 2.8, the other a P4 2.4.
Intel Chipset + Intel IDE in one, Intel Chipset + HighPoint chipset in
the other.
However, the code is exactly the same.
Thanks,
Joshua
Con Kolivas wrote:
> Joshua Schmidlkofer wrote:
>
>> I upgraded from 2.6.8.1-ck5.
>>
>> First off - this has been a landmark improvement for me. Running an
>> "emerge -a world" on my system has gone from a matter of minutes to a
>> matter of seconds.
>>
>> The performance has been !outstanding!. [Disclosure: Using NVIDIA
>> Binary Drivers]
>
>
> Great to hear. Thanks for feedback.
>
> Not sure about the xfs one... perhaps it's related to the cfq one.
>
>> Badness in cfq_sort_rr_list at drivers/block/cfq-iosched.c:428
>
>
> Known issue. There is a fix posted already in my ckdev directory (as
> posted by Jens Axboe). The stack dump, while annoying (and I believe it
> causes a stall for a couple of seconds), is harmless. Please apply the
> cfq2 fix in my ckdev directory to make it go away.
>
> http://ck.kolivas.org/patches/2.6/2.6.8.1/2.6.8.1-ckdev/
>
> Cheers,
> Con
On Mon, Sep 13 2004, Joshua Schmidlkofer wrote:
> Con,
>
>
> I did not mention this before because I thought it was a fluke on my
> system. Now it's affecting two systems since applying ck7.
>
>
> <snip>
> hda: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
>
> ide: failed opcode was: unknown
> hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete
> DataRequest }ide: failed opcode was 100
> hda: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
>
> ide: failed opcode was: unknown
> hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete
> DataRequest }ide: failed opcode was 100
> hda: CHECK for good STATUS
> <snip>
>
> That is happening while applying the dma settings to the hard drive.
>
> In both cases, the drive is a Western Digital 40GB hard drive. That is
> the only solid commonality. One is a P4 2.8, the other a P4 2.4.
> Intel Chipset + Intel IDE in one, Intel Chipset + HighPoint chipset in
> the other.
>
> However, the code is exactly the same.
Is your drive idle while applying dma settings? Modifying drive settings
on current 2.6 kernels isn't even close to safe, since the kernel makes
no effective attempt to serialize with ongoing commands. I have a
half-assed patch to fix that.
--
Jens Axboe
> Is your drive idle while applying dma settings? Modifying drive settings
> on current 2.6 kernels isn't even close to safe, since the kernel makes
> no effective attempt to serialize with ongoing commands. I have a
> half-assed patch to fix that.
1. Sorry about the HTML
2. How half-assed is the patch / what is it likely to break?
js
>Is your drive idle while applying dma settings? Modifying drive settings
>on current 2.6 kernels isn't even close to safe, since the kernel makes
>no effective attempt to serialize with ongoing commands. I have a
>half-assed patch to fix that.
>
>
>
No it isn't!! But I think that would be the problem. I just realized
[after you wrote this] that I turned on 'parallel' execution in my
Gentoo init scripts for both of those systems.
Tonight, I am going to change hdparm so that it runs xfs_freeze on all
filesystems just before tuning (and, of course, unfreezes them
afterwards) to see if that cures it.
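A minimal sketch of that freeze/tune/thaw wrapper might look like the
following. The mount points and hdparm flags are examples only, not the
actual configuration of these systems, and RUN="echo" makes it a harmless
dry run; on a real system you would clear RUN and run it as root.

```shell
# Sketch of the freeze/tune/thaw idea: quiesce the filesystems so no I/O
# is in flight while hdparm changes drive settings.
# RUN="echo" = dry run (just prints the commands); set RUN="" to execute.
RUN="echo"
FILESYSTEMS="/ /home"            # example XFS mount points

for fs in $FILESYSTEMS; do
    $RUN xfs_freeze -f "$fs"     # suspend new writes, flush pending I/O
done

$RUN hdparm -d1 /dev/hda         # apply drive settings while quiesced

for fs in $FILESYSTEMS; do
    $RUN xfs_freeze -u "$fs"     # thaw
done
```

Note that xfs_freeze only blocks new modifications and flushes dirty
data; reads can still be issued, so this narrows the window rather than
closing it completely.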
thanks,
Joshua
On Mon, Sep 13 2004, Joshua Schmidlkofer wrote:
>
> >Is your drive idle while applying dma settings? Modifying drive settings
> >on current 2.6 kernels isn't even close to safe, since the kernel makes
> >no effective attempt to serialize with ongoing commands. I have a
> >half-assed patch to fix that.
> >
> >
> >
>
> No it isn't!! But I think that would be the problem. I just realized
> [after you wrote this] that I turned on 'parallel' execution in my
> Gentoo init scripts for both of those systems.
That is for sure the problem.
> Tonight, I am going to change hdparm so that it runs xfs_freeze on all
> filesystems just before tuning (and, of course, unfreezes them
> afterwards) to see if that cures it.
You can try this patch; I hope it still applies. It's against the 2.6.5
SUSE kernel, but it did apply to BK around the beginning of August as
well.
The reason I say it's half-assed is that it's not the cleanest approach,
and it doesn't cover some obscure hardware cases. But it's definitely
safe to play with, and it does cover your hardware. With this applied,
you can safely tune your drive settings with hdparm while the drive is
actively doing reads/writes.
diff -urp /opt/kernel/linux-2.6.5/drivers/ide/ide.c linux-2.6.5/drivers/ide/ide.c
--- /opt/kernel/linux-2.6.5/drivers/ide/ide.c 2004-05-25 11:50:14.797583926 +0200
+++ linux-2.6.5/drivers/ide/ide.c 2004-05-25 11:52:40.367855970 +0200
@@ -1289,18 +1289,28 @@ static int set_io_32bit(ide_drive_t *dri
static int set_using_dma (ide_drive_t *drive, int arg)
{
#ifdef CONFIG_BLK_DEV_IDEDMA
+ int ret = -EPERM;
+
+ ide_pin_hwgroup(drive);
+
if (!drive->id || !(drive->id->capability & 1))
- return -EPERM;
+ goto out;
if (HWIF(drive)->ide_dma_check == NULL)
- return -EPERM;
+ goto out;
+ ret = -EIO;
if (arg) {
- if (HWIF(drive)->ide_dma_check(drive)) return -EIO;
- if (HWIF(drive)->ide_dma_on(drive)) return -EIO;
+ if (HWIF(drive)->ide_dma_check(drive))
+ goto out;
+ if (HWIF(drive)->ide_dma_on(drive))
+ goto out;
} else {
if (__ide_dma_off(drive))
- return -EIO;
+ goto out;
}
- return 0;
+ ret = 0;
+out:
+ ide_unpin_hwgroup(drive);
+ return ret;
#else
return -EPERM;
#endif
diff -urp /opt/kernel/linux-2.6.5/drivers/ide/ide-io.c linux-2.6.5/drivers/ide/ide-io.c
--- /opt/kernel/linux-2.6.5/drivers/ide/ide-io.c 2004-05-25 11:50:15.174543192 +0200
+++ linux-2.6.5/drivers/ide/ide-io.c 2004-05-25 11:53:34.606000019 +0200
@@ -881,6 +881,46 @@ void ide_stall_queue (ide_drive_t *drive
drive->sleep = timeout + jiffies;
}
+void ide_unpin_hwgroup(ide_drive_t *drive)
+{
+ ide_hwgroup_t *hwgroup = HWGROUP(drive);
+
+ if (hwgroup) {
+ spin_lock_irq(&ide_lock);
+ HWGROUP(drive)->busy = 0;
+ drive->blocked = 0;
+ do_ide_request(drive->queue);
+ spin_unlock_irq(&ide_lock);
+ }
+}
+
+void ide_pin_hwgroup(ide_drive_t *drive)
+{
+ ide_hwgroup_t *hwgroup = HWGROUP(drive);
+
+ /*
+ * should only happen very early, so not a problem
+ */
+ if (!hwgroup)
+ return;
+
+ spin_lock_irq(&ide_lock);
+ do {
+ if (!hwgroup->busy && !drive->blocked && !drive->doing_barrier)
+ break;
+ spin_unlock_irq(&ide_lock);
+ schedule_timeout(HZ/100);
+ spin_lock_irq(&ide_lock);
+ } while (hwgroup->busy || drive->blocked || drive->doing_barrier);
+
+ /*
+ * we've now secured exclusive access to this hwgroup
+ */
+ hwgroup->busy = 1;
+ drive->blocked = 1;
+ spin_unlock_irq(&ide_lock);
+}
+
EXPORT_SYMBOL(ide_stall_queue);
#define WAKEUP(drive) ((drive)->service_start + 2 * (drive)->service_time)
diff -urp /opt/kernel/linux-2.6.5/drivers/ide/ide-lib.c linux-2.6.5/drivers/ide/ide-lib.c
--- /opt/kernel/linux-2.6.5/drivers/ide/ide-lib.c 2004-05-25 11:50:15.204539951 +0200
+++ linux-2.6.5/drivers/ide/ide-lib.c 2004-05-25 11:52:40.433848845 +0200
@@ -436,13 +436,17 @@ EXPORT_SYMBOL(ide_toggle_bounce);
int ide_set_xfer_rate(ide_drive_t *drive, u8 rate)
{
+ int ret;
#ifndef CONFIG_BLK_DEV_IDEDMA
rate = min(rate, (u8) XFER_PIO_4);
#endif
- if(HWIF(drive)->speedproc)
- return HWIF(drive)->speedproc(drive, rate);
+ ide_pin_hwgroup(drive);
+ if (HWIF(drive)->speedproc)
+ ret = HWIF(drive)->speedproc(drive, rate);
else
- return -1;
+ ret = -1;
+ ide_unpin_hwgroup(drive);
+ return ret;
}
EXPORT_SYMBOL_GPL(ide_set_xfer_rate);
diff -urp /opt/kernel/linux-2.6.5/include/linux/ide.h linux-2.6.5/include/linux/ide.h
--- /opt/kernel/linux-2.6.5/include/linux/ide.h 2004-05-25 11:50:29.701973356 +0200
+++ linux-2.6.5/include/linux/ide.h 2004-05-25 11:52:40.457846254 +0200
@@ -1474,6 +1474,9 @@ extern irqreturn_t ide_intr(int irq, voi
extern void do_ide_request(request_queue_t *);
extern void ide_init_subdrivers(void);
+extern void ide_pin_hwgroup(ide_drive_t *);
+extern void ide_unpin_hwgroup(ide_drive_t *);
+
extern struct block_device_operations ide_fops[];
extern ide_proc_entry_t generic_subdriver_entries[];
--
Jens Axboe
On Mon, Sep 13 2004, Joshua Schmidlkofer wrote:
> >Is your drive idle while applying dma settings? Modifying drive settings
> >on current 2.6 kernels isn't even close to safe, since the kernel makes
> >no effective attempt to serialize with ongoing commands. I have a
> >half-assed patch to fix that.
>
> 1. Sorry about the HTML
Didn't see any?
> 2. How half-assed is the patch / what is it likely to break?
It's only half-assed in concept, because it doesn't cover some obscure
corner cases, which don't affect you. It won't break. See my previous
mail.
--
Jens Axboe
On Tuesday 14 September 2004 01:21, Joshua Schmidlkofer wrote:
> Con,
>
>
> I did not mention this before because I thought it was a fluke on my
> system. Now it's affecting two systems since applying ck7.
>
>
> <snip>
> hda: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
>
> ide: failed opcode was: unknown
> hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete
> DataRequest }ide: failed opcode was 100
> hda: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
>
> ide: failed opcode was: unknown
> hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete
> DataRequest }ide: failed opcode was 100
> hda: CHECK for good STATUS
> <snip>
>
> That is happening while applying the dma settings to the hard drive.
It's not a very informative message, but it probably just means the
kernel is complaining about what that (evil) program hdparm is trying to
do. Anyway, this kernel is _ancient_. Please move to a newer one if you
can (yes, I understand the reasons people stick to older kernels, so
don't bother explaining them - that's why I said "if you can").
Cheers,
Con