LinuxLists.cc - [PATCH] Reserved root VM + OOM killer

2000-11-22 19:29:09

Subject: [PATCH] Reserved root VM + OOM killer

WHY?
Permanent memory need by user apps makes Linux uncontrollable in OOM
(out of memory) situation when OOM killer can't kill as fast as the
memory needed (and your superb 'free memory space' monitor/actor
developed in the last 5 years was also killed and init couldn't
restart it because of OOM). In the Unix world it's a common good
practice to reserve resources for root (see e.g. disk space, network
ports, file descriptors, processes, etc). Linux doesn't reserve
virtual memory for root so if OOM happens by user apps you get this
kind of messages as root when trying to make the system work again
properly or investigate what happend,

running a command from prompt:
Memory exhausted
Segmentation fault
fork failed: resource temporarily unavailable
trying to login from console:
Unable to load interpreter /lib/ld-linux.so.2
error while loading shared libraries: libc.so.6:
cannot map zero-fill pages: Cannot allocate memory
error while loading shared libraries: libtermcap.so.2: failed
to map segment from shared object: Cannot allocate memory
xrealloc: cannot reallocate 128 bytes (0 bytes allocated)
xmalloc: cannot allocate 562 bytes (0 bytes allocated)
trying to ssh via network:
Received disconnect: Command terminated on signal 11.

WHAT?
This patch tries to reserve virtual memory for root, balance memory
usage between root and user apps if memory is overcommited and has
Rik's OOM killer that is much more clever about what to kill when OOM
happens than what's inlcuded in standard 2.2 kernels.

HOW?
No performance loss, RAM is always fully utilized (except if no swap),
the tunable reserved memory is on swap (or in caches) until it's
needed by root. There are two scenarios. When user apps don't
overcommit memory they will see only
UVM = (real virtual memory) - (reserved virtual memory for root)
If memory is overcommited then user apps will also use the reserved
memory (otherwise there would be a performance loss as I guess) but
the kernel will try hard to push them back below UVM.

IN THE PATCH:
- reserved VM for root
- Rik's OOM killer from 2.4.0-test11 with "fixes":
- PID 1 never gets killed by OOM killer
- OOM killing takes place only in do_page_fault() [no two places in
the kernel for process killing]
- niced processes are not penalized
- IPC shared mem can only be kvazi-overcommited (i.e. request is
successful only if there is enough VM at the request time)

NOTES:
- it's for 2.2 (late) kernels [tested with 2.2.18pre21, applies to
2.2.18pre22 as well]
- Intel only [page fault handling is implemented differently
in different architectures, no common hooks but easy to fix]
- SMP not tested
- GUI environment not tested
- tests were done with constant brk, mmap, zfod, cow, IPC shm fork
bombs on mostly a 64-128 RAM MB + 80 MB swap box.
- using IPC shared mem still can "kill" the box (unused mem not
freed). Use Solar Designer kernel security patch or set
/proc/sys/kernel/shmall according to your VM
- it's not for common fork bombs (use e.g. fair scheduler,
Fork Bomb Defuser, etc against them). Use ulimit -u if you want
to test the patch and don't have enough CPU power
- the reserved virtual memory can be set runtime via
/proc/sys/vm/reserved The value is in pages (4096 bytes on x86)
- On SMP you should probably increase this value in the function
of you CPU's
- if you have GB's of VM you can experience malloc() scalability
problems, use glibc 2.2, limit your VM, raise the limits
via malloc environment variables, etc.

PROBLEMS:
- if killable task is in TASK_UNINTERRUPTIBLE constantly [e.g. becasue
of network fs (smb, nfs, etc) problems] then OOM killer won't
work ... at least this is what I suspect
- schedule() doesn't always immediately schedules the killable task
- probably others I'm not aware of

Standard disclaimer applies. It worked fine for me but maybe it will
eat your whole computer and pets :) It's not perfect but seems good
enough and I definitely found it much better then what is in 2.2
kernels. Of course your experience can be completely different. Please
let me know.

Szaka

diff -urw linux-2.2.18pre21/arch/i386/mm/Makefile linux/arch/i386/mm/Makefile
--- linux-2.2.18pre21/arch/i386/mm/Makefile Fri Nov 1 04:56:43 1996
+++ linux/arch/i386/mm/Makefile Tue Nov 21 03:03:15 2000
@@ -8,6 +8,6 @@
# Note 2! The CFLAGS definition is now in the main makefile...

O_TARGET := mm.o
-O_OBJS := init.o fault.o ioremap.o extable.o
+O_OBJS := init.o fault.o ioremap.o extable.o ../../../mm/oom_kill.o

include $(TOPDIR)/Rules.make
diff -urw linux-2.2.18pre21/arch/i386/mm/fault.c linux/arch/i386/mm/fault.c
--- linux-2.2.18pre21/arch/i386/mm/fault.c Wed May 3 20:16:31 2000
+++ linux/arch/i386/mm/fault.c Tue Nov 21 05:49:36 2000
@@ -23,6 +23,7 @@
#include <asm/hardirq.h>

extern void die(const char *,struct pt_regs *,long);
+extern int oom_kill(void);

/*
* Ugly, ugly, but the goto's result in better assembly..
@@ -291,25 +292,14 @@
up(&mm->mmap_sem);
if (error_code & 4)
{
- if (tsk->oom_kill_try++ > 10 ||
- !((regs->eflags >> 12) & 3))
- {
- printk("VM: killing process %s\n", tsk->comm);
+ if (oom_kill()) {
+ printk(KERN_ERR "Out of Memory: Killed process "
+ "%d (%s)\n", tsk->pid, tsk->comm);
do_exit(SIGKILL);
- }
- else
- {
- /*
- * The task is running with privilegies and so we
- * trust it and we give it a chance to die gracefully.
- */
- printk("VM: terminating process %s\n", tsk->comm);
- force_sig(SIGTERM, current);
- if (tsk->oom_kill_try > 1)
- {
+ } else {
+ /* we must kill a different process */
tsk->policy |= SCHED_YIELD;
schedule();
- }
return;
}
}
diff -urw linux-2.2.18pre21/include/linux/sysctl.h linux/include/linux/sysctl.h
--- linux-2.2.18pre21/include/linux/sysctl.h Thu Nov 9 08:20:19 2000
+++ linux/include/linux/sysctl.h Tue Nov 21 08:43:22 2000
@@ -122,7 +122,8 @@
VM_PAGECACHE=7, /* struct: Set cache memory thresholds */
VM_PAGERDAEMON=8, /* struct: Control kswapd behaviour */
VM_PGT_CACHE=9, /* struct: Set page table cache parameters */
- VM_PAGE_CLUSTER=10 /* int: set number of pages to swap together */
+ VM_PAGE_CLUSTER=10, /* int: set number of pages to swap together */
+ VM_ROOT_RESERVED=11 /* int: number of pages reserved for root */
};

diff -urw linux-2.2.18pre21/ipc/shm.c linux/ipc/shm.c
--- linux-2.2.18pre21/ipc/shm.c Wed Jun 7 17:26:44 2000
+++ linux/ipc/shm.c Mon Nov 13 03:50:51 2000
@@ -101,8 +101,8 @@
return -ENOMEM;
}

- shp->shm_pages = (ulong *) vmalloc (numpages*sizeof(ulong));
- if (!shp->shm_pages) {
+ if (!vm_enough_memory(numpages)
+ || !(shp->shm_pages = (ulong *) vmalloc(numpages*sizeof(ulong)))) {
shm_segs[id] = (struct shmid_kernel *) IPC_UNUSED;
wake_up (&shm_lock);
kfree(shp);
diff -urw linux-2.2.18pre21/kernel/sysctl.c linux/kernel/sysctl.c
--- linux-2.2.18pre21/kernel/sysctl.c Thu Nov 9 08:20:19 2000
+++ linux/kernel/sysctl.c Tue Nov 21 08:44:11 2000
@@ -32,6 +32,7 @@
#if defined(CONFIG_SYSCTL)

/* External variables not in a header file. */
+extern int vm_reserved;
extern int panic_timeout;
extern int console_loglevel, C_A_D;
extern int bdf_prm[], bdflush_min[], bdflush_max[];
@@ -249,6 +250,8 @@
&bdflush_min, &bdflush_max},
{VM_OVERCOMMIT_MEMORY, "overcommit_memory", &sysctl_overcommit_memory,
sizeof(sysctl_overcommit_memory), 0644, NULL, &proc_dointvec},
+ {VM_ROOT_RESERVED, "reserved",
+ &vm_reserved, sizeof(int), 0644, NULL, &proc_dointvec},
{VM_BUFFERMEM, "buffermem",
&buffer_mem, sizeof(buffer_mem_t), 0644, NULL, &proc_dointvec},
{VM_PAGECACHE, "pagecache",
diff -urw linux-2.2.18pre21/mm/mmap.c linux/mm/mmap.c
--- linux-2.2.18pre21/mm/mmap.c Thu Nov 9 08:20:19 2000
+++ linux/mm/mmap.c Tue Nov 21 08:45:14 2000
@@ -40,6 +40,7 @@
kmem_cache_t *vm_area_cachep;

int sysctl_overcommit_memory;
+int vm_reserved;

/* Check that a process has enough memory to allocate a
* new virtual mapping.
@@ -67,6 +68,8 @@
free += nr_free_pages;
free += nr_swap_pages;
free -= (page_cache.min_percent + buffer_mem.min_percent + 2)*num_physpages/100;
+ if (current->uid && free < vm_reserved)
+ return 0;
return free > pages;
}

@@ -872,6 +875,23 @@

void __init vma_init(void)
{
+ struct sysinfo i;
+
+ /*
+ * Setup default reserved VM pages for root. You can tune it
+ * via /proc/sys/vm/reserved. Default value is based on RAM size
+ * - no reserved pages if RAM is less than 8MB
+ * - 5MB should be enough on boxes w/ RAM > 100 MB
+ * - otherwise reserve 5%
+ */
+ si_meminfo(&i);
+ if (i.totalram < 8 * 1024 * 1024)
+ vm_reserved = 0;
+ else if (i.totalram > 100 * 1024 * 1024)
+ vm_reserved = 5 * 1024 * 1024 >> PAGE_SHIFT;
+ else
+ vm_reserved = (i.totalram >> PAGE_SHIFT) / 20;
+
vm_area_cachep = kmem_cache_create("vm_area_struct",
sizeof(struct vm_area_struct),
0, SLAB_HWCACHE_ALIGN,
diff -urw linux-2.2.18pre21/mm/oom_kill.c linux/mm/oom_kill.c
--- linux-2.2.18pre21/mm/oom_kill.c Fri Nov 17 00:07:57 2000
+++ linux/mm/oom_kill.c Wed Nov 22 17:59:04 2000
@@ -0,0 +1,186 @@
+/*
+ * linux/mm/oom_kill.c
+ *
+ * Copyright (C) 1998,2000 Rik van Riel
+ * Thanks go out to Claus Fischer for some serious inspiration and
+ * for goading me into coding this file...
+ *
+ * The routines in this file are used to kill a process when
+ * we're seriously out of memory. This gets called from kswapd()
+ * in linux/mm/vmscan.c when we really run out of memory.
+ *
+ * Since we won't call these routines often (on a well-configured
+ * machine) this file will double as a 'coding guide' and a signpost
+ * for newbie kernel hackers. It features several pointers to major
+ * kernel subsystems and hints as to where to find out what things do.
+ */
+
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/swap.h>
+#include <linux/swapctl.h>
+#include <linux/timex.h>
+
+/* #define DEBUG */
+
+/**
+ * int_sqrt - oom_kill.c internal function, rough approximation to sqrt
+ * @x: integer of which to calculate the sqrt
+ *
+ * A rough approximation to the sqrt() function.
+ */
+static inline unsigned int int_sqrt(unsigned int x)
+{
+ unsigned int out = x;
+ while (x & ~(unsigned int)1) x >>=2, out >>=1;
+ if (x) out -= out >> 2;
+ return (out ? out : 1);
+}
+
+/**
+ * oom_badness - calculate a numeric value for how bad this task has been
+ * @p: task struct of which task we should calculate
+ *
+ * The formula used is relatively simple and documented inline in the
+ * function. The main rationale is that we want to select a good task
+ * to kill when we run out of memory.
+ *
+ * Good in this context means that:
+ * 1) we lose the minimum amount of work done
+ * 2) we recover a large amount of memory
+ * 3) we don't kill anything innocent of eating tons of memory
+ * 4) we want to kill the minimum amount of processes (one)
+ */
+
+static int badness(struct task_struct *p)
+{
+ int points, cpu_time, run_time;
+
+ if (!p->mm)
+ return 0;
+ /*
+ * The memory size of the process is the basis for the badness.
+ */
+ points = p->mm->total_vm;
+
+ /*
+ * CPU time is in seconds and run time is in minutes. There is no
+ * particular reason for this other than that it turned out to work
+ * very well in practice. This is not safe against jiffie wraps
+ * but we don't care _that_ much...
+ */
+ cpu_time = (p->times.tms_utime + p->times.tms_stime) >> (SHIFT_HZ + 3);
+ run_time = (jiffies - p->start_time) >> (SHIFT_HZ + 10);
+
+ points /= int_sqrt(cpu_time);
+ points /= int_sqrt(int_sqrt(run_time));
+
+ /*
+ * Superuser processes are usually more important, so we make it
+ * less likely that we kill those.
+ */
+ if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_ADMIN) ||
+ p->uid == 0 || p->euid == 0)
+ points /= 4;
+
+ /*
+ * We don't want to kill a process with direct hardware access.
+ * Not only could that mess up the hardware, but usually users
+ * tend to only have this flag set on applications they think
+ * of as important.
+ */
+ if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO))
+ points /= 4;
+#ifdef DEBUG
+ printk(KERN_DEBUG "OOMkill: task %d (%s) got %d points\n",
+ p->pid, p->comm, points);
+#endif
+ return points;
+}
+
+/*
+ * Simple selection loop. We chose the process with the highest
+ * number of 'points'. We need the locks to make sure that the
+ * list of task structs doesn't change while we look the other way.
+ *
+ * (not docbooked, we don't want this one cluttering up the manual)
+ */
+static struct task_struct * select_bad_process(void)
+{
+ int maxpoints = 0;
+ struct task_struct *p = NULL;
+ struct task_struct *chosen = NULL;
+
+ read_lock(&tasklist_lock);
+ for_each_task(p) {
+ if (p->pid > 1) {
+ int points = badness(p);
+ if (points > maxpoints) {
+ chosen = p;
+ maxpoints = points;
+ }
+ }
+ }
+ read_unlock(&tasklist_lock);
+ return chosen;
+}
+
+/**
+ * oom_kill - kill the "best" process when we run out of memory
+ *
+ * If we run out of memory, we have the choice between either
+ * killing a random task (bad), letting the system crash (worse)
+ * OR try to be smart about which process to kill. Note that we
+ * don't have to be perfect here, we just have to be good.
+ *
+ * We must be careful though to never send SIGKILL a process with
+ * CAP_SYS_RAW_IO set, send SIGTERM instead (but it's unlikely that
+ * we select a process with CAP_SYS_RAW_IO set).
+ */
+int oom_kill(void)
+{
+
+ struct task_struct *p = select_bad_process();
+
+ /* Found nothing?!?! Either we hang forever, or we panic. */
+ if (p == NULL)
+ panic("Out of memory and no killable processes...\n");
+
+ /*
+ * We give our sacrificial lamb high priority and access to
+ * all the memory it needs. That way it should be able to
+ * exit() and clear out its resources quickly...
+ */
+ p->counter = 5 * HZ;
+ p->flags |= PF_MEMALLOC;
+
+ if ((p->oom_kill_try++ > 10)
+ || !(cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO)))
+ {
+ if (p == current) {
+ return 1;
+ } else {
+ printk(KERN_ERR "Out of Memory: Killed process "
+ "%d (%s), saved process %d (%s)\n",
+ p->pid, p->comm, current->pid, current->comm);
+ force_sig(SIGKILL, p);
+ }
+ } else {
+ /* This process has hardware access, be more careful */
+ if (p == current) {
+ printk(KERN_ERR "Out of Memory: Terminating process "
+ "%d (%s)\n", p->pid, p->comm);
+ } else {
+ printk(KERN_ERR "Out of Memory: Terminating process "
+ "%d (%s), saved process %d (%s).\n",
+ p->pid, p->comm, current->pid, current->comm);
+ }
+ force_sig(SIGTERM, p);
+ }
+
+ if ((p != current) || (p->oom_kill_try > 1))
+ p->counter = 2 * DEF_PRIORITY;
+
+ return 0;
+}
+
diff -urw linux-2.2.18pre21/mm/vmscan.c linux/mm/vmscan.c
--- linux-2.2.18pre21/mm/vmscan.c Fri Nov 17 00:07:57 2000
+++ linux/mm/vmscan.c Wed Nov 22 18:10:40 2000
@@ -20,6 +20,8 @@

#include <asm/pgtable.h>

+extern int vm_enough_memory(long pages);
+
/*
* The swap-out functions return 1 if they successfully
* threw something out, and we got a free page. It returns
@@ -384,11 +386,16 @@
int swapcount;
int count = SWAP_CLUSTER_MAX;

+ /* Don't try for non-root if memory is less than reserved for root */
+ if (!vm_enough_memory(1))
+ return 0;
+
lock_kernel();

/* Always trim SLAB caches when memory gets low. */
kmem_cache_reap(gfp_mask);

+again:
priority = 6;
do {
while (shrink_mmap(priority, gfp_mask)) {
@@ -419,9 +426,10 @@
done:
unlock_kernel();

- if (!ret)
- printk("VM: do_try_to_free_pages failed for %s...\n",
- current->comm);
+ /* Don't flood this message */
+ if (0 && !ret) {
+ printk("VM: do_try_to_free_pages failed for %s...\n", current->comm);
+ }
/* Return success if we freed a page. */
return ret;
}

2000-11-22 21:10:48

by Rik van Riel

[permalink] [raw]

Subject: Re: [PATCH] Reserved root VM + OOM killer

On Wed, 22 Nov 2000, Szabolcs Szakacsits wrote:

> - OOM killing takes place only in do_page_fault() [no two places in
> the kernel for process killing]

... disable OOM killing for non-x86 architectures.
This doesn't seem like a smart move ;)

> diff -urw linux-2.2.18pre21/arch/i386/mm/Makefile linux/arch/i386/mm/Makefile
> --- linux-2.2.18pre21/arch/i386/mm/Makefile Fri Nov 1 04:56:43 1996
> +++ linux/arch/i386/mm/Makefile Tue Nov 21 03:03:15 2000
> @@ -8,6 +8,6 @@
> # Note 2! The CFLAGS definition is now in the main makefile...
>
> O_TARGET := mm.o
> -O_OBJS := init.o fault.o ioremap.o extable.o
> +O_OBJS := init.o fault.o ioremap.o extable.o ../../../mm/oom_kill.o
>
> include $(TOPDIR)/Rules.make

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.conectiva.com/ http://www.surriel.com/

2000-11-22 21:24:22

by Szabolcs Szakacsits

[permalink] [raw]

Subject: Re: [PATCH] Reserved root VM + OOM killer

On Wed, 22 Nov 2000, Rik van Riel wrote:

> On Wed, 22 Nov 2000, Szabolcs Szakacsits wrote:
>
> > - OOM killing takes place only in do_page_fault() [no two places in
> > the kernel for process killing]
>
> ... disable OOM killing for non-x86 architectures.
> This doesn't seem like a smart move ;)
>
> > diff -urw linux-2.2.18pre21/arch/i386/mm/Makefile linux/arch/i386/mm/Makefile
> > --- linux-2.2.18pre21/arch/i386/mm/Makefile Fri Nov 1 04:56:43 1996
^^^^^^^^^
As I wrote, the OOM killer changes are x86 only at present. Other
arch's still use the default OOM killing defined in arch/*/mm/fault.c.

Szaka

2000-11-23 23:19:47

by Pavel Machek

[permalink] [raw]

Subject: Re: [PATCH] Reserved root VM + OOM killer

Hi!

> HOW?
> No performance loss, RAM is always fully utilized (except if no swap),

Handheld machines never have any swap, and alwys have little RAM [trust me,
velo1 I'm writing this on is so tuned that 100KB les and machine is useless].
Unless reservation can be turned off, it is not acceptable. Okay, it can
be tuned. Ok, then.

[What about making default reserved space 10% of *swap* size?]

--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.

2000-11-24 12:43:02

by Szabolcs Szakacsits

[permalink] [raw]

Subject: Re: [PATCH] Reserved root VM + OOM killer

On Thu, 23 Nov 2000, Pavel Machek wrote:

> > HOW?
> > No performance loss, RAM is always fully utilized (except if no swap),
>
> Handheld machines never have any swap, and alwys have little RAM [trust me,
> velo1 I'm writing this on is so tuned that 100KB les and machine is useless].
> Unless reservation can be turned off, it is not acceptable. Okay, it can
> be tuned. Ok, then.
>
> [What about making default reserved space 10% of *swap* size?]

No. Many people uses no swap even if they have plenty of RAM. I wasn't
right when I wrote the "reserved" VM is on swap or in buffer/page
cache. I wanted to write the reserved VM is unused swap and/or it is
*used* as buffer/page cache until it's not needed by root. Left away
swap from the former sentence and you get no RAM is wasted at all ;)

Moreover the default value for boxes with less than 8MB is 0 pages (I
thought about "embedded" systems), it's 5 MB if the box has more then
100MB and 5% of the RAM but after considered it as part of the VM
between 8MB and 100MB. I found in my setup, at least 4 MB needed to be
useful if root wants to act sure. Of course this can be different in
other setups and application behaviours -- this is why it can be tuned
runtime. Using more "reserved" [this is really a stupid and not
accurate name] VM definitely helps :) BTW, apparently Solaris reserves
4 MB for root.

I also thought about making it a compile time option [for people using
Linux as embedded systmes] in that case you would have less than 25%
chance to save one page -- I would instead optimize the compiler ;)
.... but maybe embedded systems use non-overcomittable memory
handling, I didn't look how they handle OOM.

I'm afraid I was also wrong about performance, here is a typical case
how standard 2.2 kernel works if OOM happens: killing gpm, vmstat,
syslogd, tail, httpd, zsh, identd, httpd, klogd, httpd, httpd, httpd
[the main httpd, web is dead], bad_app. If there is more bad_app
[working on the same problem but e.g. they were feeded by wrong input,
etc], then you have the big chance you must hit the reset button. With
Rik's OOM killer, the "right" processes are killed but I found the
system trashes too long and because of the constant memory pressure
you still must hit the reset button. With my patch + fixes of Rik's
OOM killer, the "right" processes are killed fast [it's done only in
page fault, contrary to 2.4.0-test11 that has two OOM killer: one in
page fault and Rik's one ... pretty ugly] and you can do whatever you
want as root. It would be nice to see which one of the three cases
would finish a job first where multiply processes [not threads] work
on the same job saving the partial results and constantly producing
OOM.

Szaka

2000-11-26 07:05:07

by afei

[permalink] [raw]

Subject: difference between kernel and bios report on drive status

Hi there:

I am using PROMISE ultra100 with 2 UDMA66 disks. The PROMIS
bios reports that both disks are using UDMA 4. But during kernel boot On
Fri, 24 Nov 2000, Szabolcs Szakacsits wrote: up, the master disk is using
UDMA 2 only. The reason is that master disk's pci->dma_ultra value is only
0x40f. I hacked the kernel a bit, both hard disk using hardware configued
udma_four though. So I am wondering why not use udma_four as mode
indication? Anyone here knows or
I have to email Andre on this?

By the way, mode4 works much faster.(hdparm -tT):
disk1: udma33, mode2, disk buffer read 14MB/s
disk2: udma66, mode4, disk buffer read 24MB/s

both disks: buffer-cache read 88MB/s

The second disk is only 5400RPM, the first one is 7200RPM. Both comply > >
HOW? to UDMA66. How come the 2nd one is UDMA66, but the first one is > > >
No performance loss, RAM is always fully utilized (except if no swap),
UDMA33? And how come the 2nd is faster than the first one.

I hope this is not a kernel bug.

Fei

2000-11-26 07:33:26

by Andre Hedrick

[permalink] [raw]

Subject: Re: difference between kernel and bios report on drive status

It is a standards bug that your drives are mixed in the rev.
Adjust the IVB option so that it does not care about these bits just the
presence of either set.

Cheers,

On Sun, 26 Nov 2000 [email protected] wrote:

> Hi there:
>
> I am using PROMISE ultra100 with 2 UDMA66 disks. The PROMIS
> bios reports that both disks are using UDMA 4. But during kernel boot On
> Fri, 24 Nov 2000, Szabolcs Szakacsits wrote: up, the master disk is using
> UDMA 2 only. The reason is that master disk's pci->dma_ultra value is only
> 0x40f. I hacked the kernel a bit, both hard disk using hardware configued
> udma_four though. So I am wondering why not use udma_four as mode
> indication? Anyone here knows or
> I have to email Andre on this?
>
> By the way, mode4 works much faster.(hdparm -tT):
> disk1: udma33, mode2, disk buffer read 14MB/s
> disk2: udma66, mode4, disk buffer read 24MB/s
>
> both disks: buffer-cache read 88MB/s
>
> The second disk is only 5400RPM, the first one is 7200RPM. Both comply > >
> HOW? to UDMA66. How come the 2nd one is UDMA66, but the first one is > > >
> No performance loss, RAM is always fully utilized (except if no swap),
> UDMA33? And how come the 2nd is faster than the first one.
>
> I hope this is not a kernel bug.
>
> Fei
>

Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development