Due to relaxed ordering requirements on multiple architectures,
drivers are required to use wmb/rmb/mb combinations when they
need to guarantee observability between the memory and the HW.
The mpt3sas driver is already using wmb() for this purpose.
However, it issues a writel following wmb(). writel() function
on arm/arm64 arhictectures have an embedded wmb() call inside.
This results in unnecessary performance loss and code duplication.
The kernel has been updated to support relaxed read/write
API to be supported across all architectures now.
The right thing was to either call __raw_writel/__raw_readl or
write_relaxed/read_relaxed for multi-arch compatibility.
Signed-off-by: Sinan Kaya <[email protected]>
---
drivers/scsi/mpt3sas/mpt3sas_base.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 5b7aec5..6e42036 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -1026,8 +1026,8 @@ static int mpt3sas_remove_dead_ioc_func(void *arg)
ioc->reply_free[ioc->reply_free_host_index] =
cpu_to_le32(reply);
wmb();
- writel(ioc->reply_free_host_index,
- &ioc->chip->ReplyFreeHostIndex);
+ writel_relaxed(ioc->reply_free_host_index,
+ &ioc->chip->ReplyFreeHostIndex);
}
}
@@ -1076,8 +1076,8 @@ static int mpt3sas_remove_dead_ioc_func(void *arg)
wmb();
if (ioc->is_warpdrive) {
- writel(reply_q->reply_post_host_index,
- ioc->reply_post_host_index[msix_index]);
+ writel_relaxed(reply_q->reply_post_host_index,
+ ioc->reply_post_host_index[msix_index]);
atomic_dec(&reply_q->busy);
return IRQ_HANDLED;
}
@@ -1098,13 +1098,14 @@ static int mpt3sas_remove_dead_ioc_func(void *arg)
* value in MSIxIndex field.
*/
if (ioc->combined_reply_queue)
- writel(reply_q->reply_post_host_index | ((msix_index & 7) <<
- MPI2_RPHI_MSIX_INDEX_SHIFT),
- ioc->replyPostRegisterIndex[msix_index/8]);
+ writel_relaxed(reply_q->reply_post_host_index |
+ ((msix_index & 7) <<
+ MPI2_RPHI_MSIX_INDEX_SHIFT),
+ ioc->replyPostRegisterIndex[msix_index/8]);
else
- writel(reply_q->reply_post_host_index | (msix_index <<
- MPI2_RPHI_MSIX_INDEX_SHIFT),
- &ioc->chip->ReplyPostHostIndex);
+ writel_relaxed(reply_q->reply_post_host_index |
+ (msix_index << MPI2_RPHI_MSIX_INDEX_SHIFT),
+ &ioc->chip->ReplyPostHostIndex);
atomic_dec(&reply_q->busy);
return IRQ_HANDLED;
}
--
1.9.1
On 4/7/2017 12:41 PM, Sinan Kaya wrote:
> The right thing was to either call __raw_writel/__raw_readl or
> write_relaxed/read_relaxed for multi-arch compatibility.
One can also argue to get rid of wmb(). I can go either way based
on the recommendation.
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
On Fri, 2017-04-07 at 12:41 -0400, Sinan Kaya wrote:
> Due to relaxed ordering requirements on multiple architectures,
> drivers are required to use wmb/rmb/mb combinations when they
> need to guarantee observability between the memory and the HW.
>
> The mpt3sas driver is already using wmb() for this purpose.
> However, it issues a writel following wmb(). writel() function
> on arm/arm64 arhictectures have an embedded wmb() call inside.
>
> This results in unnecessary performance loss and code duplication.
>
> The kernel has been updated to support relaxed read/write
> API to be supported across all architectures now.
>
> The right thing was to either call __raw_writel/__raw_readl or
> write_relaxed/read_relaxed for multi-arch compatibility.
writeX_relaxed and thus your patch is definitely wrong. The reason is
that we have two ordering domains: the CPU and the Bus. wmb forces
ordering in the CPU domain but not the bus domain. writeX originally
forced ordering in the bus domain but not the CPU domain, but since the
raw primitives I think it now orders in both and writeX_relaxed orders
in neither domain, so your patch would currently eliminate the bus
ordering.
James
On 4/7/2017 1:25 PM, James Bottomley wrote:
>> The right thing was to either call __raw_writel/__raw_readl or
>> write_relaxed/read_relaxed for multi-arch compatibility.
> writeX_relaxed and thus your patch is definitely wrong. The reason is
> that we have two ordering domains: the CPU and the Bus. wmb forces
> ordering in the CPU domain but not the bus domain. writeX originally
> forced ordering in the bus domain but not the CPU domain, but since the
> raw primitives I think it now orders in both and writeX_relaxed orders
> in neither domain, so your patch would currently eliminate the bus
> ordering.
Yeah, that's why I recommended to remove the wmb() with a follow up
instead of using the relaxed with a follow up.
writel already guarantees ordering for both cpu and bus. we don't need
additional wmb()
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.