2002-11-16 03:15:36

by Con Kolivas

Subject: [BENCHMARK] 2.5.47-mm3 with contest


Here are some contest benchmarks for 2.5.47-mm3.

These include two experimental additions to contest. The first is cacherun, an
unloaded kernel compile run immediately after a previous compile (i.e. with a
warm cache). The second is dbench_load, where dbench (16 * num_cpus clients) is
run in a continuous loop. The Loads value reported for dbench_load is currently
the run it was up to when terminated (i.e. a value of 1 means the first run
never finished).
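
For illustration only, here is a minimal standalone sketch of what a continuous
dbench loop like this could look like. This is a hypothetical harness, not
contest's actual code; the dbench command and the 16 * num_cpus client count are
taken from the description above, everything else is assumed.

/* Hypothetical sketch of a dbench_load-style load generator (not contest's
 * actual implementation): re-run dbench with 16 * num_cpus clients until
 * terminated and report which run it was up to. */
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t stop;

static void handle_term(int sig)
{
	(void)sig;
	stop = 1;
}

int main(void)
{
	long clients = 16 * sysconf(_SC_NPROCESSORS_ONLN);
	char arg[32];
	int run = 0;

	signal(SIGTERM, handle_term);
	snprintf(arg, sizeof(arg), "%ld", clients);

	while (!stop) {
		pid_t pid;

		run++;			/* run currently in progress */
		pid = fork();
		if (pid == 0) {
			execlp("dbench", "dbench", arg, (char *)NULL);
			_exit(127);	/* exec failed */
		}
		if (pid < 0)
			break;
		waitpid(pid, NULL, 0);	/* wait for this run to finish */
	}

	/* a value of 1 means the first run never completed */
	printf("dbench_load: terminated during run %d\n", run);
	return 0;
}

In contest itself the load is presumably killed once the timed kernel compile
completes, which is why the reported value is simply the run in progress at
termination.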

Note that the config of the 2.5 kernels includes preempt.

Config of 2.5.47-mm3 includes shared 3rd level pagetables.

noload:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [5] 71.7 93 0 0 1.00
2.4.19 [5] 69.0 97 0 0 0.97
2.4.20-rc1 [3] 72.2 93 0 0 1.01
2.5.47 [3] 73.5 93 0 0 1.03
2.5.47-mm1 [5] 73.6 93 0 0 1.03
2.5.47-mm3 [2] 73.7 93 0 0 1.03

cacherun:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [2] 66.6 99 0 0 0.93
2.4.19 [2] 68.0 99 0 0 0.95
2.4.20-rc1 [3] 67.2 99 0 0 0.94
2.5.47 [3] 68.3 99 0 0 0.96
2.5.47-mm1 [5] 68.4 99 0 0 0.96
2.5.47-mm3 [2] 68.3 99 0 0 0.96

process_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [3] 109.5 57 119 44 1.53
2.4.19 [3] 106.5 59 112 43 1.49
2.4.20-rc1 [3] 110.7 58 119 43 1.55
2.5.47 [3] 83.4 82 22 21 1.17
2.5.47-mm1 [5] 83.0 83 21 20 1.16
2.5.47-mm3 [2] 84.2 82 22 21 1.18

dbench_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [1] 346.6 20 1 57 4.85
2.4.19 [1] 342.6 20 1 62 4.80
2.4.20-rc1 [1] 309.8 23 2 50 4.34
2.5.47 [2] 224.2 33 1 44 3.14
2.5.47-mm3 [2] 201.6 38 1 39 2.82

ctar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [3] 117.4 63 1 7 1.64
2.4.19 [2] 106.5 70 1 8 1.49
2.4.20-rc1 [3] 102.1 72 1 7 1.43
2.5.47 [3] 93.9 80 1 5 1.32
2.5.47-mm1 [5] 94.0 81 1 5 1.32
2.5.47-mm3 [2] 94.0 81 1 6 1.32

xtar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [3] 150.8 49 2 8 2.11
2.4.19 [1] 132.4 55 2 9 1.85
2.4.20-rc1 [3] 180.7 40 3 8 2.53
2.5.47 [3] 167.1 45 2 7 2.34
2.5.47-mm1 [5] 118.5 64 1 7 1.66
2.5.47-mm3 [2] 211.3 38 2 6 2.96

io_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [3] 474.1 15 36 10 6.64
2.4.19 [3] 492.6 14 38 10 6.90
2.4.20-rc1 [2] 1142.2 6 90 10 16.00
2.5.47 [3] 165.9 46 9 9 2.32
2.5.47-mm1 [5] 126.3 61 5 8 1.77
2.5.47-mm3 [2] 117.1 65 4 8 1.64

read_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [3] 102.3 70 6 3 1.43
2.4.19 [2] 134.1 54 14 5 1.88
2.4.20-rc1 [3] 173.2 43 20 5 2.43
2.5.47 [3] 103.4 74 6 4 1.45
2.5.47-mm1 [5] 100.6 76 7 4 1.41
2.5.47-mm3 [2] 218.5 34 10 2 3.06

list_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [3] 90.2 76 1 17 1.26
2.4.19 [1] 89.8 77 1 20 1.26
2.4.20-rc1 [3] 88.8 77 0 12 1.24
2.5.47 [3] 100.2 71 1 20 1.40
2.5.47-mm1 [5] 102.4 69 1 19 1.43
2.5.47-mm3 [2] 101.2 71 1 21 1.42

mem_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.4.18 [3] 103.3 70 32 3 1.45
2.4.19 [3] 100.0 72 33 3 1.40
2.4.20-rc1 [3] 105.9 69 32 2 1.48
2.5.47 [3] 151.1 49 35 2 2.12
2.5.47-mm1 [5] 127.0 58 29 2 1.78
2.5.47-mm3 [2] 243.8 31 39 1 3.41

Note the significant discrepancy between mm1 and mm3. This reminds me of what
happened last time I enabled shared 3rd level pagetables - Andrew do you want
me to do a set of numbers with this disabled?

Con


2002-11-16 03:23:47

by Andrew Morton

Subject: Re: [BENCHMARK] 2.5.47-mm3 with contest

Con Kolivas wrote:
>
> Note the significant discrepancy between mm1 and mm3. This reminds me of what
> happened last time I enabled shared 3rd level pagetables - Andrew do you want
> me to do a set of numbers with this disabled?

That certainly couldn't hurt. But your tests are, in general, testing
the IO scheduler. And the IO scheduler has changed radically in each
of the recent -mm's.

So testing with rbtree-iosched reverted would really be the only way
to draw comparisons on how the rest of the code is behaving.

2002-11-16 04:17:12

by Nick Piggin

Subject: Re: [BENCHMARK] 2.5.47-mm3 with contest

--- linux-2.5/drivers/block/deadline-iosched.c.orig 2002-11-15 23:22:19.000000000 +1100
+++ linux-2.5/drivers/block/deadline-iosched.c 2002-11-16 13:01:01.000000000 +1100
@@ -407,20 +407,46 @@
 	if (rbnext)
 		dd->last_drq[data_dir] = rb_entry_drq(rbnext);
 
 	/*
 	 * take it off the sort and fifo list, move
 	 * to dispatch queue
 	 */
 	deadline_move_to_dispatch(dd, drq);
 }
 
+static int expire_batch = 8;
+/*
+ * move a batch of entries to dispatch queue
+ */
+static inline void deadline_move_batch(struct deadline_data *dd, struct deadline_rq *drq)
+{
+	const int data_dir = rq_data_dir(drq->request);
+	struct rb_node *rbnext;
+	int i;
+
+	for (i = 0; i < expire_batch; i++) {
+		/* remember the node after drq so last_drq can resume from it */
+		rbnext = rb_next(&drq->rb_node);
+
+		deadline_move_to_dispatch(dd, drq);
+
+		if (!rbnext)
+			break;
+		drq = rb_entry_drq(rbnext);
+	}
+
+	dd->last_drq[data_dir] = NULL;
+	if (rbnext)
+		dd->last_drq[data_dir] = drq;
+}
+
 /*
  * returns 0 if there are no expired reads on the fifo, 1 otherwise
  */
 #define list_entry_fifo(ptr) list_entry((ptr), struct deadline_rq, fifo)
 static inline int deadline_check_fifo(struct deadline_data *dd, int ddir)
 {
 	if (!list_empty(&dd->rw_fifo[ddir])) {
 		struct deadline_rq *drq = list_entry_fifo(dd->rw_fifo[ddir].next);
 
 		/*
@@ -439,56 +465,74 @@
 	struct deadline_rq *drq;
 
 	/*
 	 * if we have expired entries on the fifo list, move some to dispatch
 	 */
 	if (deadline_check_fifo(dd, READ)) {
 		if (writes && (dd->starved++ >= dd->writes_starved))
 			goto dispatch_writes;
 
 		drq = list_entry_fifo(dd->rw_fifo[READ].next);
-dispatch_requests:
-		deadline_move_request(dd, drq);
-		return 1;
+
+		goto dispatch_batch;
 	}
 
 	if (!list_empty(&dd->rw_fifo[READ])) {
 		if (writes && (dd->starved++ >= dd->writes_starved))
 			goto dispatch_writes;
 
 		BUG_ON(RB_EMPTY(&dd->rb_list[READ]));
 
 		drq = dd->last_drq[READ];
 		if (!drq)
 			drq = list_entry_fifo(dd->rw_fifo[READ].next);
 
-		goto dispatch_requests;
+		goto dispatch_request;
 	}
 
 	/*
 	 * either there are no reads expired or on sort list, or the reads
 	 * have starved writes for too long. dispatch some writes
 	 */
+
 	if (writes) {
 dispatch_writes:
 		BUG_ON(RB_EMPTY(&dd->rb_list[WRITE]));
 
 		dd->starved = 0;
+
+		if (deadline_check_fifo(dd, WRITE)) {
+			drq = list_entry_fifo(dd->rw_fifo[WRITE].next);
+
+			goto dispatch_batch;
+		}
 
 		drq = dd->last_drq[WRITE];
-		if (!drq || deadline_check_fifo(dd, WRITE))
+		if (!drq)
 			drq = list_entry_fifo(dd->rw_fifo[WRITE].next);
 
-		goto dispatch_requests;
+		goto dispatch_request;
 	}
 
 	return 0;
+
+dispatch_request:
+	deadline_move_request(dd, drq);
+	return 1;
+
+dispatch_batch:
+	/* dispatch in batches to prevent a seek storm if the disk
+	 * can't keep up with the queue size and all entries end up
+	 * being expired and submitted fifo.
+	 */
+	deadline_move_batch(dd, drq);
+	return 1;
 }
 
 static struct request *deadline_next_request(request_queue_t *q)
 {
 	struct deadline_data *dd = q->elevator.elevator_data;
 	struct request *rq;
 
 	/*
 	 * if there are still requests on the dispatch queue, grab the first one
 	 */


Attachments:
deadline.patch (3.08 kB)

2002-11-16 06:16:51

by Con Kolivas

Subject: Re: [BENCHMARK] 2.5.47-mm3 with contest



Andrew Morton wrote:
>Con Kolivas wrote:
>> Note the significant discrepancy between mm1 and mm3. This reminds me of
>> what happened last time I enabled shared 3rd level pagetables - Andrew do
>> you want me to do a set of numbers with this disabled?
>
>That certainly couldn't hurt. But your tests are, in general, testing
>the IO scheduler. And the IO scheduler has changed radically in each
>of the recent -mm's.
>
>So testing with rbtree-iosched reverted would really be the only way
>to draw comparisons on how the rest of the code is behaving.

Ok. I tested with shared pagetables disabled (2.5.47-mm3ns) and the results are
very similar. These were the only significant differences:

read_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.47-mm3 [2] 218.5 34 10 2 3.06
2.5.47-mm3ns [2] 257.9 29 11 2 3.61

xtar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.47-mm3 [2] 211.3 38 2 6 2.96
2.5.47-mm3ns [2] 152.8 49 2 7 2.14

Con.

2002-11-16 11:06:44

by Con Kolivas

Subject: Re: [BENCHMARK] 2.5.47-mm3 with contest



Nick Piggin wrote:
>Andrew Morton wrote:
>>Con Kolivas wrote:
>>>Note the significant discrepancy between mm1 and mm3. This reminds me of
>>> what happened last time I enabled shared 3rd level pagetables - Andrew do
>>> you want me to do a set of numbers with this disabled?
>>
>>That certainly couldn't hurt. But your tests are, in general, testing
>>the IO scheduler. And the IO scheduler has changed radically in each
>>of the recent -mm's.
>>
>>So testing with rbtree-iosched reverted would really be the only way
>>to draw comparisons on how the rest of the code is behaving.
>
>Andrew there is, in fact a problem with the io scheduler in mm3 as far
>as I can see. Jens is away 'till Monday so he hasn't confirmed this yet.
>Basically, if the device can't get through the entire queue within the
>read|write_expire timeout, requests will start being submitted in fifo order,
>slowing down the device further (probably) and contributing to the problem.
>It may be causing the bad numbers in contest. Here is a patch which
>relieves the problem for loads I am testing (bench.c, tiobench).
>
>Con, it would be nice if you could try this; if you value your disk,
>maybe you could wait for Jens to get back!

I gave it a quick run (with expire_batch set to 16 to emulate fifo_batch=16)
and found that only these results differed:

2.5.47-mm3ns is 2.5.47-mm3 with no shared 3rd level pagetables
2.5.47-mm3u is the same as 2.5.47-mm3ns but with your update

io_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.47-mm3ns [1] 121.8 60 5 7 1.71
2.5.47-mm3u [1] 161.1 47 9 9 2.26

read_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.47-mm3ns [2] 257.9 29 11 2 3.61
2.5.47-mm3u [1] 283.4 27 12 2 3.97

mem_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.47-mm3ns [1] 237.5 32 41 1 3.33
2.5.47-mm3u [1] 218.8 34 39 1 3.06

To me this does not appear to be the cause of the prolonged kernel compile
time under IO load in 2.5.47-mm3.

Con