2022-04-19 22:22:33

by Jens Axboe

Subject: Re: [PATCH v2] fs-writeback: writeback_sb_inodes: Recalculate 'wrote' according skipped pages

On 4/18/22 4:01 PM, Linus Torvalds wrote:
> On Mon, Apr 18, 2022 at 2:16 PM Jens Axboe <[email protected]> wrote:
>>
>> So as far as I can tell, we really have two options:
>>
>> 1) Don't preempt a task that has a plug active
>> 2) Flush for any schedule out, not just going to sleep
>>
>> 1 may not be feasible if we're queueing lots of IO, which then leaves 2.
>> Linus, do you remember what your original patch here was motivated by?
>> I'm assuming it was an efficiency thing, but do we really have a lot of
>> cases of IO submissions being preempted a lot and hence making the plug
>> less efficient than it should be at merging IO? Seems unlikely, but I
>> could be wrong.
>
> No, it goes all the way back to 2011, my memory for those kinds of
> details doesn't go that far back.
>
> That said, it clearly is about preemption, and I wonder if we had an
> actual bug there.
>
> IOW, it might well not just be the "gather up more IO for bigger
> requests" thing, but about "the IO plug is per-thread and doesn't have
> locking because of that".
>
> So doing plug flushing from a preemptible kernel context might race
> with it all being set up.

Hmm yes. But doesn't preemption imply a full barrier? As long as we
assign the plug at the end, we should be fine. And just now looking that
up, there's even already a comment to that effect in blk_start_plug().
So barring any weirdness with that, maybe that's the solution.
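
A minimal userspace sketch of that ordering argument, not the kernel
code: the plug is fully initialized first and only then made visible
through the per-task pointer, so a flush that runs on any schedule-out
(option 2 above) either sees no plug or a complete one. All names here
(struct plug, current_plug, start_plug, flush_on_schedule_out and so on)
are hypothetical stand-ins, and the explicit compiler barrier is an
assumption added for the userspace model rather than something taken
from blk_start_plug().

```c
#include <assert.h>
#include <stddef.h>

#define PLUG_MAX 8

/* Hypothetical stand-in for a per-task plug: a small batch of pending IO. */
struct plug {
    int pending[PLUG_MAX];
    size_t count;
    int initialized;
};

/* Hypothetical stand-in for the per-task plug pointer. */
static struct plug *current_plug;
static int flushed_total;

/* Compiler-only barrier (GCC/Clang syntax); an assumption of this sketch. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static void start_plug(struct plug *p)
{
    /* Set up every field before anyone can find the plug. */
    p->count = 0;
    p->initialized = 1;
    compiler_barrier();
    /* Publish last: a preempting flush sees NULL or a complete plug. */
    current_plug = p;
}

static void queue_request(int req)
{
    struct plug *p = current_plug;

    if (p && p->count < PLUG_MAX)
        p->pending[p->count++] = req;
}

/* Models a flush done on any schedule-out, not only when going to sleep. */
static void flush_on_schedule_out(void)
{
    struct plug *p = current_plug;

    if (!p)
        return;                 /* no plug published yet, nothing to do */
    assert(p->initialized);     /* holds because the pointer is set last */
    flushed_total += (int)p->count;
    p->count = 0;
}

static void finish_plug(void)
{
    flush_on_schedule_out();
    current_plug = NULL;
}
```

If the pointer were assigned before the fields were set, a flush landing
in between could walk a half-initialized plug, which is the race being
discussed.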

Your comment did jog my memory a bit though, and I do in fact think it
was something related to that that made us change it. I'll dig through
some old emails and see if I can find it.

> Explicit io_schedule() etc obviously doesn't have that issue.

Right

--
Jens Axboe