A flush request isn't queueable: while one is running, no other request
can run. We can use this knowledge to optimize flush performance.
In my test, I got about a 20% performance boost.
v2->v3: mainly adds more comments explaining the optimization and moves
the code so that the optimization is enabled for SATA.