The noop elevator is still too fat for db transaction processing workload.
Since the db application already merged all blocks before sending it down,
the I/O presented to the elevator are actually not merge-able anymore. Since
I/O are also random, we don't want to sort them either. However the noop
elevator is still doing a linear search on the entire list of requests in
the queue. A noop elevator after all isn't really noop.
We are proposing a true no-op elevator algorithm, no merge, no nothing. Just
do first in and first out list management for the I/O request. The best name
I can come up with is "FIFO". I also piggy backed the code onto noop-iosched.c.
I can easily pull those code into a separate file if people object. Though, I
hope Jens is OK with it.
--- linux-2.6.11/drivers/block/noop-iosched.c.orig 2005-03-28 16:37:30.000000000 -0800
+++ linux-2.6.11/drivers/block/noop-iosched.c 2005-03-28 16:43:57.000000000 -0800
@@ -74,6 +74,21 @@ static struct request *elevator_noop_nex
return NULL;
}
+static void elevator_fifo_add_request(request_queue_t *q, struct request *rq,
+ int where)
+{
+ list_add_tail(&rq->queuelist, &q->queue_head);
+}
+
+static struct elevator_type elevator_fifo = {
+ .ops = {
+ .elevator_next_req_fn = elevator_noop_next_request,
+ .elevator_add_req_fn = elevator_fifo_add_request,
+ },
+ .elevator_name = "fifo",
+ .elevator_owner = THIS_MODULE,
+};
+
static struct elevator_type elevator_noop = {
.ops = {
.elevator_merge_fn = elevator_noop_merge,
@@ -87,12 +102,14 @@ static struct elevator_type elevator_noo
static int __init noop_init(void)
{
- return elv_register(&elevator_noop);
+ return (elv_register(&elevator_noop) ||
+ elv_register(&elevator_fifo));
}
static void __exit noop_exit(void)
{
elv_unregister(&elevator_noop);
+ elv_unregister(&elevator_fifo);
}
module_init(noop_init);
On Mon, Mar 28 2005, Chen, Kenneth W wrote:
> The noop elevator is still too fat for db transaction processing
> workload. Since the db application already merged all blocks before
> sending it down, the I/O presented to the elevator are actually not
> merge-able anymore. Since I/O are also random, we don't want to sort
> them either. However the noop elevator is still doing a linear search
> on the entire list of requests in the queue. A noop elevator after
> all isn't really noop.
>
> We are proposing a true no-op elevator algorithm, no merge, no
> nothing. Just do first in and first out list management for the I/O
> request. The best name I can come up with is "FIFO". I also piggy
> backed the code onto noop-iosched.c. I can easily pull those code
> into a separate file if people object. Though, I hope Jens is OK with
> it.
It's not quite ok, because you don't honor the insertion point in
fifo_add_request. The only 'fat' part of the noop io scheduler is the
merge stuff, the original plan was to move that to a hash table lookup
instead like the other io schedulers do. So I would suggest just
changing noop to hash the request on the end point for back merges and
forget about front merges, since they are rare anyways. Hmm actually,
the last merge hint should catch most of the merges at almost zero cost.
How about this patch?
Signed-off-by: Jens Axboe <[email protected]>
===== drivers/block/noop-iosched.c 1.7 vs edited =====
--- 1.7/drivers/block/noop-iosched.c 2005-01-14 17:35:40 +01:00
+++ edited/drivers/block/noop-iosched.c 2005-03-29 10:05:21 +02:00
@@ -13,34 +13,13 @@
static int elevator_noop_merge(request_queue_t *q, struct request **req,
struct bio *bio)
{
- struct list_head *entry = &q->queue_head;
- struct request *__rq;
int ret;
- if ((ret = elv_try_last_merge(q, bio))) {
+ ret = elv_try_last_merge(q, bio);
+ if (ret != ELEVATOR_NO_MERGE)
*req = q->last_merge;
- return ret;
- }
- while ((entry = entry->prev) != &q->queue_head) {
- __rq = list_entry_rq(entry);
-
- if (__rq->flags & (REQ_SOFTBARRIER | REQ_HARDBARRIER))
- break;
- else if (__rq->flags & REQ_STARTED)
- break;
-
- if (!blk_fs_request(__rq))
- continue;
-
- if ((ret = elv_try_merge(__rq, bio))) {
- *req = __rq;
- q->last_merge = __rq;
- return ret;
- }
- }
-
- return ELEVATOR_NO_MERGE;
+ return ret;
}
static void elevator_noop_merge_requests(request_queue_t *q, struct request *req,
--
Jens Axboe
On Mon, Mar 28 2005, Chen, Kenneth W wrote:
> The noop elevator is still too fat for db transaction processing
> workload. Since the db application already merged all blocks before
> sending it down, the I/O presented to the elevator are actually not
> merge-able anymore. Since I/O are also random, we don't want to sort
> them either. However the noop elevator is still doing a linear search
> on the entire list of requests in the queue. A noop elevator after
> all isn't really noop.
>
> We are proposing a true no-op elevator algorithm, no merge, no
> nothing. Just do first in and first out list management for the I/O
> request. The best name I can come up with is "FIFO". I also piggy
> backed the code onto noop-iosched.c. I can easily pull those code
> into a separate file if people object. Though, I hope Jens is OK with
> it.
Jens Axboe wrote on Tuesday, March 29, 2005 12:06 AM
> It's not quite ok, because you don't honor the insertion point in
> fifo_add_request.
But it is FIFO! Honoring insertion point will break the promises this
elevator made to the user: first in first out.
OK, OK, I won't argue to the death of it :-). I will give this a try.
Thanks.
On Tue, Mar 29 2005, Chen, Kenneth W wrote:
> On Mon, Mar 28 2005, Chen, Kenneth W wrote:
> > The noop elevator is still too fat for db transaction processing
> > workload. Since the db application already merged all blocks before
> > sending it down, the I/O presented to the elevator are actually not
> > merge-able anymore. Since I/O are also random, we don't want to sort
> > them either. However the noop elevator is still doing a linear search
> > on the entire list of requests in the queue. A noop elevator after
> > all isn't really noop.
> >
> > We are proposing a true no-op elevator algorithm, no merge, no
> > nothing. Just do first in and first out list management for the I/O
> > request. The best name I can come up with is "FIFO". I also piggy
> > backed the code onto noop-iosched.c. I can easily pull those code
> > into a separate file if people object. Though, I hope Jens is OK with
> > it.
>
>
> Jens Axboe wrote on Tuesday, March 29, 2005 12:06 AM
> > It's not quite ok, because you don't honor the insertion point in
> > fifo_add_request.
>
> But it is FIFO! Honoring insertion point will break the promises this
> elevator made to the user: first in first out.
No such promise was ever made, noop just means it does 'basically
nothing'. It never meant FIFO in anyway, we cannot break the semantics
of block layer commands just for the hell of it.
--
Jens Axboe
Jens Axboe wrote on Tuesday, March 29, 2005 12:04 PM
> No such promise was ever made, noop just means it does 'basically
> nothing'. It never meant FIFO in anyway, we cannot break the semantics
> of block layer commands just for the hell of it.
Acknowledged and understood, will try your patch shortly.
Jens Axboe wrote:
> On Mon, Mar 28 2005, Chen, Kenneth W wrote:
>
>>The noop elevator is still too fat for db transaction processing
>>workload. Since the db application already merged all blocks before
>>sending it down, the I/O presented to the elevator are actually not
>>merge-able anymore. Since I/O are also random, we don't want to sort
>>them either. However the noop elevator is still doing a linear search
>>on the entire list of requests in the queue. A noop elevator after
>>all isn't really noop.
>>
>>We are proposing a true no-op elevator algorithm, no merge, no
>>nothing. Just do first in and first out list management for the I/O
>>request. The best name I can come up with is "FIFO". I also piggy
>>backed the code onto noop-iosched.c. I can easily pull those code
>>into a separate file if people object. Though, I hope Jens is OK with
>>it.
>
>
> It's not quite ok, because you don't honor the insertion point in
> fifo_add_request. The only 'fat' part of the noop io scheduler is the
> merge stuff, the original plan was to move that to a hash table lookup
> instead like the other io schedulers do. So I would suggest just
> changing noop to hash the request on the end point for back merges and
> forget about front merges, since they are rare anyways. Hmm actually,
> the last merge hint should catch most of the merges at almost zero cost.
Making the noop faster is clearly a good thing, but some database
software may depend on transaction order as provided by a true fifo
process. It would be nice to have both.
--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me
On Tue, Mar 29 2005, Bill Davidsen wrote:
> Jens Axboe wrote:
> >On Mon, Mar 28 2005, Chen, Kenneth W wrote:
> >
> >>The noop elevator is still too fat for db transaction processing
> >>workload. Since the db application already merged all blocks before
> >>sending it down, the I/O presented to the elevator are actually not
> >>merge-able anymore. Since I/O are also random, we don't want to sort
> >>them either. However the noop elevator is still doing a linear search
> >>on the entire list of requests in the queue. A noop elevator after
> >>all isn't really noop.
> >>
> >>We are proposing a true no-op elevator algorithm, no merge, no
> >>nothing. Just do first in and first out list management for the I/O
> >>request. The best name I can come up with is "FIFO". I also piggy
> >>backed the code onto noop-iosched.c. I can easily pull those code
> >>into a separate file if people object. Though, I hope Jens is OK with
> >>it.
> >
> >
> >It's not quite ok, because you don't honor the insertion point in
> >fifo_add_request. The only 'fat' part of the noop io scheduler is the
> >merge stuff, the original plan was to move that to a hash table lookup
> >instead like the other io schedulers do. So I would suggest just
> >changing noop to hash the request on the end point for back merges and
> >forget about front merges, since they are rare anyways. Hmm actually,
> >the last merge hint should catch most of the merges at almost zero cost.
>
> Making the noop faster is clearly a good thing, but some database
> software may depend on transaction order as provided by a true fifo
> process. It would be nice to have both.
Just look at the code. It does FIFO for any request that _isn't_
specified as ELEVATOR_INSERT_FRONT - which means any fs request, or any
plain pc request. There is no specific reordering going on.
Drivers expect to be able to add a request back at the head, for eg
retrying it after a QUEUE_BUSY or similar condition.
--
Jens Axboe
On Tue, Mar 29 2005, Chen, Kenneth W wrote:
> Jens Axboe wrote on Tuesday, March 29, 2005 12:04 PM
> > No such promise was ever made, noop just means it does 'basically
> > nothing'. It never meant FIFO in anyway, we cannot break the semantics
> > of block layer commands just for the hell of it.
>
> Acknowledged and understood, will try your patch shortly.
Did you test it?
--
Jens Axboe
Jens Axboe wrote on Tuesday, April 05, 2005 7:54 AM
> On Tue, Mar 29 2005, Chen, Kenneth W wrote:
> > Jens Axboe wrote on Tuesday, March 29, 2005 12:04 PM
> > > No such promise was ever made, noop just means it does 'basically
> > > nothing'. It never meant FIFO in anyway, we cannot break the semantics
> > > of block layer commands just for the hell of it.
> >
> > Acknowledged and understood, will try your patch shortly.
>
> Did you test it?
Experiment is in the queue, should have a result in a day or two.
Chen, Kenneth W wrote on Tuesday, April 05, 2005 5:13 PM
> Jens Axboe wrote on Tuesday, April 05, 2005 7:54 AM
> > On Tue, Mar 29 2005, Chen, Kenneth W wrote:
> > > Jens Axboe wrote on Tuesday, March 29, 2005 12:04 PM
> > > > No such promise was ever made, noop just means it does 'basically
> > > > nothing'. It never meant FIFO in anyway, we cannot break the semantics
> > > > of block layer commands just for the hell of it.
> > >
> > > Acknowledged and understood, will try your patch shortly.
> >
> > Did you test it?
>
> Experiment is in the queue, should have a result in a day or two.
Jens, your patch works! We are seeing a little bit increase in indirect
branch calls with your patch where our patch tries to remove elevator_merge_fn()
completely. But the difference is all within noise range.
If there is no other issues (I don't see any), we would like to see this patch
merged upstream. Thanks.
- Ken
On Tue, Apr 12 2005, Chen, Kenneth W wrote:
> Chen, Kenneth W wrote on Tuesday, April 05, 2005 5:13 PM
> > Jens Axboe wrote on Tuesday, April 05, 2005 7:54 AM
> > > On Tue, Mar 29 2005, Chen, Kenneth W wrote:
> > > > Jens Axboe wrote on Tuesday, March 29, 2005 12:04 PM
> > > > > No such promise was ever made, noop just means it does 'basically
> > > > > nothing'. It never meant FIFO in anyway, we cannot break the semantics
> > > > > of block layer commands just for the hell of it.
> > > >
> > > > Acknowledged and understood, will try your patch shortly.
> > >
> > > Did you test it?
> >
> > Experiment is in the queue, should have a result in a day or two.
>
>
> Jens, your patch works! We are seeing a little bit increase in
Super.
> indirect branch calls with your patch where our patch tries to remove
> elevator_merge_fn() completely. But the difference is all within
> noise range.
Yeah that is expected. Thanks for testing!
> If there is no other issues (I don't see any), we would like to see
> this patch merged upstream. Thanks.
I will pass it along.
--
Jens Axboe