2024-02-23 15:59:37

by Qais Yousef

[permalink] [raw]
Subject: [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different

The logic in blk_mq_complete_need_ipi() assumes SMP systems where all
CPUs have equal compute capacities and only LLC cache can make
a different on perceived performance. But this assumption falls apart on
HMP systems where LLC is shared, but the CPUs have different capacities.
Staying local then can have a big performance impact if the IO request
was done from a CPU with higher capacity but the interrupt is serviced
on a lower capacity CPU.

Use the new cpus_equal_capacity() function to check if we need to send
an IPI.

Without the patch I see the BLOCK softirq always running on little cores
(where the hardirq is serviced). With it I can see it running on all
cores.

This was noticed after the topology change [1] where now on a big.LITTLE
we truly get that the LLC is shared between all cores where as in the
past it was being misrepresented for historical reasons. The logic
exposed a missing dependency on capacities for such systems where there
can be a big performance difference between the CPUs.

This of course introduced a noticeable change in behavior depending on
how the topology is presented. Leading to regressions in some workloads
as the performance of the BLOCK softirq on littles can be noticeably
worse on some platforms.

Worth noting that we could have checked for capacities being greater
than or equal instead for equality. This will lead to favouring higher
performance always. But opted for equality instead to match the
performance of the requester without making an assumption that can lead
to power trade-offs which these systems tend to be sensitive about. If
the requester would like to run faster, it's better to rely on the
scheduler to give the IO requester via some facility to run on a faster
core; and then if the interrupt triggered on a CPU with different
capacity we'll make sure to match the performance the requester is
supposed to run at.

[1] https://lpc.events/event/16/contributions/1342/attachments/962/1883/LPC-2022-Android-MC-Phantom-Domains.pdf

Signed-off-by: Qais Yousef <[email protected]>
---
block/blk-mq.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 2dc01551e27c..ea69047e12f7 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1167,10 +1167,11 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
if (force_irqthreads())
return false;

- /* same CPU or cache domain? Complete locally */
+ /* same CPU or cache domain and capacity? Complete locally */
if (cpu == rq->mq_ctx->cpu ||
(!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
- cpus_share_cache(cpu, rq->mq_ctx->cpu)))
+ cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
+ cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
return false;

/* don't try to IPI to an offline CPU */
--
2.34.1



2024-02-23 16:20:27

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different

On 2/23/24 07:57, Qais Yousef wrote:
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 2dc01551e27c..ea69047e12f7 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1167,10 +1167,11 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
> if (force_irqthreads())
> return false;
>
> - /* same CPU or cache domain? Complete locally */
> + /* same CPU or cache domain and capacity? Complete locally */
> if (cpu == rq->mq_ctx->cpu ||
> (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
> - cpus_share_cache(cpu, rq->mq_ctx->cpu)))
> + cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
> + cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
> return false;
>
> /* don't try to IPI to an offline CPU */

I think it's worth mentioning that this change is intended for storage
controllers that only support a single completion interrupt. Anyway:

Reviewed-by: Bart Van Assche <[email protected]>

2024-02-24 23:06:46

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different

On 02/23/24 08:09, Bart Van Assche wrote:
> On 2/23/24 07:57, Qais Yousef wrote:
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 2dc01551e27c..ea69047e12f7 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1167,10 +1167,11 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
> > if (force_irqthreads())
> > return false;
> > - /* same CPU or cache domain? Complete locally */
> > + /* same CPU or cache domain and capacity? Complete locally */
> > if (cpu == rq->mq_ctx->cpu ||
> > (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
> > - cpus_share_cache(cpu, rq->mq_ctx->cpu)))
> > + cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
> > + cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
> > return false;
> > /* don't try to IPI to an offline CPU */
>
> I think it's worth mentioning that this change is intended for storage
> controllers that only support a single completion interrupt. Anyway:

Sorry I didn't realize it is only applied for this scenario.

>
> Reviewed-by: Bart Van Assche <[email protected]>

Thanks for the reviews!