The rtt-upiu packets precede any data-out upiu packets, thus
synchronizing the data input to the device: this mostly applies to write
operations, but there are other operations that require rtt as well.
There are several rules binding this rtt - data-out dialog; specifically,
there can be at most bMaxNumOfRTT such packets outstanding. This might
have an effect on write performance (sequential write in particular), as
each data-out upiu must wait for its rtt sibling.
UFSHCI expects bMaxNumOfRTT to be min(bDeviceRTTCap, NORTT). However,
as of today, no one appears to set it: neither the
host controller nor the driver. It wasn't an issue up to now:
bMaxNumOfRTT is set to 2 after manufacturing, and wasn't limiting the
write performance.
UFS4.0, and specifically gear 5, changes this and requires the device to
be more attentive. This doesn't come free - the device has to allocate
more resources to that end, but the sequential write performance
improvement is significant. Early measurements show a 25% gain when
moving from rtt 2 to 9. Therefore, set bMaxNumOfRTT to be
min(bDeviceRTTCap, NORTT) as UFSHCI expects.
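For readers who want to see the negotiation in isolation, here is a minimal user-space sketch of the min(bDeviceRTTCap, NORTT) rule. It is not the driver code: hc_nortt() and negotiate_max_rtt() are hypothetical helper names, and the only facts taken from the series are the NORTT mask value (0x0000FF00) and the "+ 1" adjustment for the 0-based field.

```c
#include <stdint.h>

/* NORTT field of the controller capabilities register, bits 15:8
 * (same value as MASK_NUMBER_OUTSTANDING_RTT in this series). */
#define CAP_NORTT_MASK  0x0000FF00u
#define CAP_NORTT_SHIFT 8

/* The field is 0-based, so the usable count is the raw value plus one. */
static unsigned int hc_nortt(uint32_t capabilities)
{
	return ((capabilities & CAP_NORTT_MASK) >> CAP_NORTT_SHIFT) + 1;
}

/* Hypothetical helper: the value that would be written to bMaxNumOfRTT. */
static unsigned int negotiate_max_rtt(uint8_t dev_rtt_cap, uint32_t capabilities)
{
	unsigned int nortt = hc_nortt(capabilities);

	return dev_rtt_cap < nortt ? dev_rtt_cap : nortt;
}
```

With a device reporting bDeviceRTTCap = 9, a controller whose raw NORTT field is 0xFF leaves the device as the limiting side, while a controller whose raw field is 1 limits the result to 2.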
v5 -> v6:
Use blk_mq_<un>freeze_queue to drain the queues (Bart)
Replace the rtt_set() vop by a max_num_rtt constant (Christoph/Bart)
v4 -> v5:
Quiesce the queues before writing bMaxNumOfRTT (Bart)
Make bDeviceRTTCap available in ufshcd_device_params_init() (Bart)
v3 -> v4:
Allow bMaxNumOfRTT to be configured via sysfs (Bart)
v2 -> v3:
Allow platform vendors to take precedence having their own rtt
negotiation mechanism (Peter)
v1 -> v2:
bMaxNumOfRTT is a Persistent attribute - do not override if it was
written (Bean)
Avri Altman (3):
scsi: ufs: Allow RTT negotiation
scsi: ufs: Maximum RTT supported by the host driver
scsi: ufs: sysfs: Make max_number_of_rtt read-write
Documentation/ABI/testing/sysfs-driver-ufs | 14 +++--
drivers/ufs/core/ufs-sysfs.c | 68 +++++++++++++++++++++-
drivers/ufs/core/ufshcd-priv.h | 24 ++++++++
drivers/ufs/core/ufshcd.c | 42 +++++++++++++
drivers/ufs/host/ufs-mediatek.c | 1 +
drivers/ufs/host/ufs-mediatek.h | 3 +
include/ufs/ufs.h | 2 +
include/ufs/ufshcd.h | 4 ++
include/ufs/ufshci.h | 1 +
9 files changed, 152 insertions(+), 7 deletions(-)
--
2.34.1
The rtt-upiu packets precede any data-out upiu packets, thus
synchronizing the data input to the device: this mostly applies to write
operations, but there are other operations that require rtt as well.
There are several rules binding this rtt - data-out dialog; specifically,
there can be at most bMaxNumOfRTT such packets outstanding. This might
have an effect on write performance (sequential write in particular), as
each data-out upiu must wait for its rtt sibling.
UFSHCI expects bMaxNumOfRTT to be min(bDeviceRTTCap, NORTT). However,
as of today, no one appears to set it: neither the
host controller nor the driver. It wasn't an issue up to now:
bMaxNumOfRTT is set to 2 after manufacturing, and wasn't limiting the
write performance.
UFS4.0, and specifically gear 5, changes this and requires the device to
be more attentive. This doesn't come free - the device has to allocate
more resources to that end, but the sequential write performance
improvement is significant. Early measurements show a 25% gain when
moving from rtt 2 to 9. Therefore, set bMaxNumOfRTT to be
min(bDeviceRTTCap, NORTT) as UFSHCI expects.
Signed-off-by: Avri Altman <[email protected]>
Reviewed-by: Bean Huo <[email protected]>
---
drivers/ufs/core/ufshcd.c | 38 ++++++++++++++++++++++++++++++++++++++
include/ufs/ufs.h | 2 ++
include/ufs/ufshcd.h | 2 ++
include/ufs/ufshci.h | 1 +
4 files changed, 43 insertions(+)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 0819ddafe7a6..7df8bcacbe7e 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -102,6 +102,9 @@
/* Default RTC update every 10 seconds */
#define UFS_RTC_UPDATE_INTERVAL_MS (10 * MSEC_PER_SEC)
+/* bMaxNumOfRTT is equal to two after device manufacturing */
+#define DEFAULT_MAX_NUM_RTT 2
+
/* UFSHC 4.0 compliant HC support this mode. */
static bool use_mcq_mode = true;
@@ -2405,6 +2408,8 @@ static inline int ufshcd_hba_capabilities(struct ufs_hba *hba)
((hba->capabilities & MASK_TASK_MANAGEMENT_REQUEST_SLOTS) >> 16) + 1;
hba->reserved_slot = hba->nutrs - 1;
+ hba->nortt = FIELD_GET(MASK_NUMBER_OUTSTANDING_RTT, hba->capabilities) + 1;
+
/* Read crypto capabilities */
err = ufshcd_hba_init_crypto_capabilities(hba);
if (err) {
@@ -8119,6 +8124,35 @@ static void ufshcd_ext_iid_probe(struct ufs_hba *hba, u8 *desc_buf)
dev_info->b_ext_iid_en = ext_iid_en;
}
+static void ufshcd_set_rtt(struct ufs_hba *hba)
+{
+ struct ufs_dev_info *dev_info = &hba->dev_info;
+ u32 rtt = 0;
+ u32 dev_rtt = 0;
+
+ /* RTT override makes sense only for UFS-4.0 and above */
+ if (dev_info->wspecversion < 0x400)
+ return;
+
+ if (ufshcd_query_attr_retry(hba, UPIU_QUERY_OPCODE_READ_ATTR,
+ QUERY_ATTR_IDN_MAX_NUM_OF_RTT, 0, 0, &dev_rtt)) {
+ dev_err(hba->dev, "failed reading bMaxNumOfRTT\n");
+ return;
+ }
+
+ /* do not override if it was already written */
+ if (dev_rtt != DEFAULT_MAX_NUM_RTT)
+ return;
+
+ rtt = min_t(int, dev_info->rtt_cap, hba->nortt);
+ if (rtt == dev_rtt)
+ return;
+
+ if (ufshcd_query_attr_retry(hba, UPIU_QUERY_OPCODE_WRITE_ATTR,
+ QUERY_ATTR_IDN_MAX_NUM_OF_RTT, 0, 0, &rtt))
+ dev_err(hba->dev, "failed writing bMaxNumOfRTT\n");
+}
+
void ufshcd_fixup_dev_quirks(struct ufs_hba *hba,
const struct ufs_dev_quirk *fixups)
{
@@ -8254,6 +8288,8 @@ static int ufs_get_device_desc(struct ufs_hba *hba)
desc_buf[DEVICE_DESC_PARAM_SPEC_VER + 1];
dev_info->bqueuedepth = desc_buf[DEVICE_DESC_PARAM_Q_DPTH];
+ dev_info->rtt_cap = desc_buf[DEVICE_DESC_PARAM_RTT_CAP];
+
model_index = desc_buf[DEVICE_DESC_PARAM_PRDCT_NAME];
err = ufshcd_read_string_desc(hba, model_index,
@@ -8506,6 +8542,8 @@ static int ufshcd_device_params_init(struct ufs_hba *hba)
goto out;
}
+ ufshcd_set_rtt(hba);
+
ufshcd_get_ref_clk_gating_wait(hba);
if (!ufshcd_query_flag_retry(hba, UPIU_QUERY_OPCODE_READ_FLAG,
diff --git a/include/ufs/ufs.h b/include/ufs/ufs.h
index b6003749bc83..853e95957c31 100644
--- a/include/ufs/ufs.h
+++ b/include/ufs/ufs.h
@@ -592,6 +592,8 @@ struct ufs_dev_info {
enum ufs_rtc_time rtc_type;
time64_t rtc_time_baseline;
u32 rtc_update_period;
+
+ u8 rtt_cap; /* bDeviceRTTCap */
};
/*
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index bad88bd91995..d74bd2d67b06 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -819,6 +819,7 @@ enum ufshcd_mcq_opr {
* @capabilities: UFS Controller Capabilities
* @mcq_capabilities: UFS Multi Circular Queue capabilities
* @nutrs: Transfer Request Queue depth supported by controller
+ * @nortt: Max outstanding RTTs supported by controller
* @nutmrs: Task Management Queue depth supported by controller
* @reserved_slot: Used to submit device commands. Protected by @dev_cmd.lock.
* @ufs_version: UFS Version to which controller complies
@@ -957,6 +958,7 @@ struct ufs_hba {
u32 capabilities;
int nutrs;
+ int nortt;
u32 mcq_capabilities;
int nutmrs;
u32 reserved_slot;
diff --git a/include/ufs/ufshci.h b/include/ufs/ufshci.h
index 385e1c6b8d60..c50f92bf2e1d 100644
--- a/include/ufs/ufshci.h
+++ b/include/ufs/ufshci.h
@@ -68,6 +68,7 @@ enum {
/* Controller capability masks */
enum {
MASK_TRANSFER_REQUESTS_SLOTS = 0x0000001F,
+ MASK_NUMBER_OUTSTANDING_RTT = 0x0000FF00,
MASK_TASK_MANAGEMENT_REQUEST_SLOTS = 0x00070000,
MASK_EHSLUTRD_SUPPORTED = 0x00400000,
MASK_AUTO_HIBERN8_SUPPORT = 0x00800000,
--
2.34.1
Allow platform vendors to take precedence by supplying their own max rtt
value. This makes sense because the host controller's nortt
characteristic may vary among vendors.
While at it, set this value for Mediatek, as requested by Peter -
https://lore.kernel.org/all/[email protected]/
Signed-off-by: Avri Altman <[email protected]>
---
drivers/ufs/core/ufshcd.c | 6 +++++-
drivers/ufs/host/ufs-mediatek.c | 1 +
drivers/ufs/host/ufs-mediatek.h | 3 +++
include/ufs/ufshcd.h | 2 ++
4 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 7df8bcacbe7e..b62023a6c306 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -8144,7 +8144,11 @@ static void ufshcd_set_rtt(struct ufs_hba *hba)
if (dev_rtt != DEFAULT_MAX_NUM_RTT)
return;
- rtt = min_t(int, dev_info->rtt_cap, hba->nortt);
+ if (hba->vops && hba->vops->max_num_rtt)
+ rtt = hba->vops->max_num_rtt;
+ else
+ rtt = min_t(int, dev_info->rtt_cap, hba->nortt);
+
if (rtt == dev_rtt)
return;
diff --git a/drivers/ufs/host/ufs-mediatek.c b/drivers/ufs/host/ufs-mediatek.c
index c4f997196c57..c7a0ab9b1f59 100644
--- a/drivers/ufs/host/ufs-mediatek.c
+++ b/drivers/ufs/host/ufs-mediatek.c
@@ -1785,6 +1785,7 @@ static int ufs_mtk_config_esi(struct ufs_hba *hba)
*/
static const struct ufs_hba_variant_ops ufs_hba_mtk_vops = {
.name = "mediatek.ufshci",
+ .max_num_rtt = MTK_MAX_NUM_RTT,
.init = ufs_mtk_init,
.get_ufs_hci_version = ufs_mtk_get_ufs_hci_version,
.setup_clocks = ufs_mtk_setup_clocks,
diff --git a/drivers/ufs/host/ufs-mediatek.h b/drivers/ufs/host/ufs-mediatek.h
index 3ff17e95afab..05d76a6bd772 100644
--- a/drivers/ufs/host/ufs-mediatek.h
+++ b/drivers/ufs/host/ufs-mediatek.h
@@ -189,4 +189,7 @@ struct ufs_mtk_host {
/* MTK delay of autosuspend: 500 ms */
#define MTK_RPM_AUTOSUSPEND_DELAY_MS 500
+/* MTK RTT support number */
+#define MTK_MAX_NUM_RTT 2
+
#endif /* !_UFS_MEDIATEK_H */
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index d74bd2d67b06..ef04ec8aad69 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -295,6 +295,7 @@ struct ufs_pwr_mode_info {
/**
* struct ufs_hba_variant_ops - variant specific callbacks
* @name: variant name
+ * @max_num_rtt: maximum RTT supported by the host
* @init: called when the driver is initialized
* @exit: called to cleanup everything done in init
* @get_ufs_hci_version: called to get UFS HCI version
@@ -332,6 +333,7 @@ struct ufs_pwr_mode_info {
*/
struct ufs_hba_variant_ops {
const char *name;
+ int max_num_rtt;
int (*init)(struct ufs_hba *);
void (*exit)(struct ufs_hba *);
u32 (*get_ufs_hci_version)(struct ufs_hba *);
--
2.34.1
Given the importance of the RTT parameter, we want to be able to
configure it via sysfs. This is because UFS users should be discouraged
from changing UFS device parameters without the UFSHCI driver being aware
of these changes.
Signed-off-by: Avri Altman <[email protected]>
---
Documentation/ABI/testing/sysfs-driver-ufs | 14 +++--
drivers/ufs/core/ufs-sysfs.c | 68 +++++++++++++++++++++-
drivers/ufs/core/ufshcd-priv.h | 24 ++++++++
3 files changed, 99 insertions(+), 7 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index 5bf7073b4f75..fe943ce76c60 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -920,14 +920,16 @@ Description: This file shows whether the configuration descriptor is locked.
What: /sys/bus/platform/drivers/ufshcd/*/attributes/max_number_of_rtt
What: /sys/bus/platform/devices/*.ufs/attributes/max_number_of_rtt
-Date: February 2018
-Contact: Stanislav Nijnikov <[email protected]>
+Date: May 2024
+Contact: Avri Altman <[email protected]>
Description: This file provides the maximum current number of
- outstanding RTTs in device that is allowed. The full
- information about the attribute could be found at
- UFS specifications 2.1.
+ outstanding RTTs in device that is allowed. bMaxNumOfRTT is a
+ read-write persistent attribute and is equal to two after device
+ manufacturing. It shall not be set to a value greater than
+ bDeviceRTTCap value, and it may be set only when the hw queues are
+ empty.
- The file is read only.
+ The file is read write.
What: /sys/bus/platform/drivers/ufshcd/*/attributes/exception_event_control
What: /sys/bus/platform/devices/*.ufs/attributes/exception_event_control
diff --git a/drivers/ufs/core/ufs-sysfs.c b/drivers/ufs/core/ufs-sysfs.c
index 3d049967f6bc..48ac708b8795 100644
--- a/drivers/ufs/core/ufs-sysfs.c
+++ b/drivers/ufs/core/ufs-sysfs.c
@@ -1340,6 +1340,73 @@ static const struct attribute_group ufs_sysfs_flags_group = {
.attrs = ufs_sysfs_device_flags,
};
+static ssize_t max_number_of_rtt_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct ufs_hba *hba = dev_get_drvdata(dev);
+ u32 rtt;
+ int ret;
+
+ down(&hba->host_sem);
+ if (!ufshcd_is_user_access_allowed(hba)) {
+ up(&hba->host_sem);
+ return -EBUSY;
+ }
+
+ ufshcd_rpm_get_sync(hba);
+ ret = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR,
+ QUERY_ATTR_IDN_MAX_NUM_OF_RTT, 0, 0, &rtt);
+ ufshcd_rpm_put_sync(hba);
+
+ if (ret)
+ goto out;
+
+ ret = sysfs_emit(buf, "0x%08X\n", rtt);
+
+out:
+ up(&hba->host_sem);
+ return ret;
+}
+
+static ssize_t max_number_of_rtt_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct ufs_hba *hba = dev_get_drvdata(dev);
+ struct ufs_dev_info *dev_info = &hba->dev_info;
+ unsigned int rtt;
+ int ret;
+
+ if (kstrtouint(buf, 0, &rtt))
+ return -EINVAL;
+
+ if (rtt > dev_info->rtt_cap) {
+ dev_err(dev, "rtt can be at most bDeviceRTTCap\n");
+ return -EINVAL;
+ }
+
+ down(&hba->host_sem);
+ if (!ufshcd_is_user_access_allowed(hba)) {
+ ret = -EBUSY;
+ goto out;
+ }
+
+ ufshcd_rpm_get_sync(hba);
+ ufshcd_freez_hw_queues(hba);
+
+ ret = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_WRITE_ATTR,
+ QUERY_ATTR_IDN_MAX_NUM_OF_RTT, 0, 0, &rtt);
+
+ ufshcd_unfreez_hw_queues(hba);
+ ufshcd_rpm_put_sync(hba);
+
+out:
+ up(&hba->host_sem);
+ return ret < 0 ? ret : count;
+}
+
+static DEVICE_ATTR_RW(max_number_of_rtt);
+
static inline bool ufshcd_is_wb_attrs(enum attr_idn idn)
{
return idn >= QUERY_ATTR_IDN_WB_FLUSH_STATUS &&
@@ -1387,7 +1454,6 @@ UFS_ATTRIBUTE(max_data_in_size, _MAX_DATA_IN);
UFS_ATTRIBUTE(max_data_out_size, _MAX_DATA_OUT);
UFS_ATTRIBUTE(reference_clock_frequency, _REF_CLK_FREQ);
UFS_ATTRIBUTE(configuration_descriptor_lock, _CONF_DESC_LOCK);
-UFS_ATTRIBUTE(max_number_of_rtt, _MAX_NUM_OF_RTT);
UFS_ATTRIBUTE(exception_event_control, _EE_CONTROL);
UFS_ATTRIBUTE(exception_event_status, _EE_STATUS);
UFS_ATTRIBUTE(ffu_status, _FFU_STATUS);
diff --git a/drivers/ufs/core/ufshcd-priv.h b/drivers/ufs/core/ufshcd-priv.h
index f42d99ce5bf1..2cdbe6b48d96 100644
--- a/drivers/ufs/core/ufshcd-priv.h
+++ b/drivers/ufs/core/ufshcd-priv.h
@@ -32,6 +32,30 @@ static inline bool ufshcd_is_wb_buf_flush_allowed(struct ufs_hba *hba)
!(hba->quirks & UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL);
}
+static inline void ufshcd_freez_hw_queues(struct ufs_hba *hba)
+{
+ struct scsi_device *sdev;
+
+ shost_for_each_device(sdev, hba->host) {
+ if (sdev == hba->ufs_device_wlun)
+ continue;
+ blk_mq_freeze_queue(sdev->request_queue);
+ blk_mq_quiesce_queue(sdev->request_queue);
+ }
+}
+
+static inline void ufshcd_unfreez_hw_queues(struct ufs_hba *hba)
+{
+ struct scsi_device *sdev;
+
+ shost_for_each_device(sdev, hba->host) {
+ if (sdev == hba->ufs_device_wlun)
+ continue;
+ blk_mq_unquiesce_queue(sdev->request_queue);
+ blk_mq_unfreeze_queue(sdev->request_queue);
+ }
+}
+
#ifdef CONFIG_SCSI_UFS_HWMON
void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask);
void ufs_hwmon_remove(struct ufs_hba *hba);
--
2.34.1
On Sun, 2024-05-26 at 11:16 +0300, Avri Altman wrote:
>
> Allow platform vendors to take precedence by supplying their own max rtt
> value. This makes sense because the host controller's nortt
> characteristic may vary among vendors.
>
> While at it, set this value for Mediatek, as requested by Peter -
>
https://lore.kernel.org/all/[email protected]/
>
> Signed-off-by: Avri Altman <[email protected]>
> ---
> drivers/ufs/core/ufshcd.c | 6 +++++-
> drivers/ufs/host/ufs-mediatek.c | 1 +
> drivers/ufs/host/ufs-mediatek.h | 3 +++
> include/ufs/ufshcd.h | 2 ++
> 4 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 7df8bcacbe7e..b62023a6c306 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -8144,7 +8144,11 @@ static void ufshcd_set_rtt(struct ufs_hba
> *hba)
> if (dev_rtt != DEFAULT_MAX_NUM_RTT)
> return;
>
> - rtt = min_t(int, dev_info->rtt_cap, hba->nortt);
> + if (hba->vops && hba->vops->max_num_rtt)
> + rtt = hba->vops->max_num_rtt;
> + else
> + rtt = min_t(int, dev_info->rtt_cap, hba->nortt);
> +
> if (rtt == dev_rtt)
> return;
>
> diff --git a/drivers/ufs/host/ufs-mediatek.c b/drivers/ufs/host/ufs-
> mediatek.c
> index c4f997196c57..c7a0ab9b1f59 100644
> --- a/drivers/ufs/host/ufs-mediatek.c
> +++ b/drivers/ufs/host/ufs-mediatek.c
> @@ -1785,6 +1785,7 @@ static int ufs_mtk_config_esi(struct ufs_hba
> *hba)
> */
> static const struct ufs_hba_variant_ops ufs_hba_mtk_vops = {
> .name = "mediatek.ufshci",
> + .max_num_rtt = MTK_MAX_NUM_RTT,
> .init = ufs_mtk_init,
> .get_ufs_hci_version = ufs_mtk_get_ufs_hci_version,
> .setup_clocks = ufs_mtk_setup_clocks,
> diff --git a/drivers/ufs/host/ufs-mediatek.h b/drivers/ufs/host/ufs-
> mediatek.h
> index 3ff17e95afab..05d76a6bd772 100644
> --- a/drivers/ufs/host/ufs-mediatek.h
> +++ b/drivers/ufs/host/ufs-mediatek.h
> @@ -189,4 +189,7 @@ struct ufs_mtk_host {
> /* MTK delay of autosuspend: 500 ms */
> #define MTK_RPM_AUTOSUSPEND_DELAY_MS 500
>
> +/* MTK RTT support number */
> +#define MTK_MAX_NUM_RTT 2
> +
> #endif /* !_UFS_MEDIATEK_H */
> diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
> index d74bd2d67b06..ef04ec8aad69 100644
> --- a/include/ufs/ufshcd.h
> +++ b/include/ufs/ufshcd.h
> @@ -295,6 +295,7 @@ struct ufs_pwr_mode_info {
> /**
> * struct ufs_hba_variant_ops - variant specific callbacks
> * @name: variant name
> + * @max_num_rtt: maximum RTT supported by the host
> * @init: called when the driver is initialized
> * @exit: called to cleanup everything done in init
> * @get_ufs_hci_version: called to get UFS HCI version
> @@ -332,6 +333,7 @@ struct ufs_pwr_mode_info {
> */
> struct ufs_hba_variant_ops {
> const char *name;
> + int max_num_rtt;
> int (*init)(struct ufs_hba *);
> void (*exit)(struct ufs_hba *);
> u32 (*get_ufs_hci_version)(struct ufs_hba *);
> --
> 2.34.1
Reviewed-by: Peter Wang <[email protected]>
On 5/26/24 01:16, Avri Altman wrote:
> +static inline void ufshcd_freez_hw_queues(struct ufs_hba *hba)
> +{
> + struct scsi_device *sdev;
> +
> + shost_for_each_device(sdev, hba->host) {
> + if (sdev == hba->ufs_device_wlun)
> + continue;
> + blk_mq_freeze_queue(sdev->request_queue);
> + blk_mq_quiesce_queue(sdev->request_queue);
> + }
> +}
> +
> +static inline void ufshcd_unfreez_hw_queues(struct ufs_hba *hba)
> +{
> + struct scsi_device *sdev;
> +
> + shost_for_each_device(sdev, hba->host) {
> + if (sdev == hba->ufs_device_wlun)
> + continue;
> + blk_mq_unquiesce_queue(sdev->request_queue);
> + blk_mq_unfreeze_queue(sdev->request_queue);
> + }
> +}
Why have these functions been declared inline? blk_mq_freeze_queue()
may sleep and hence performance is not an argument to inline these
functions. Additionally, the WLUN should not be skipped when freezing
or unfreezing request queues. The blk_mq_quiesce_queue() and
blk_mq_unquiesce_queue() calls are not necessary in the above code.
Please remove these.
Thanks,
Bart.
On 5/26/24 01:16, Avri Altman wrote:
> The rtt-upiu packets precede any data-out upiu packets, thus
> synchronizing the data input to the device: this mostly applies to write
> operations, but there are other operations that require rtt as well.
>
> There are several rules binding this rtt - data-out dialog; specifically,
> there can be at most bMaxNumOfRTT such packets outstanding. This might
> have an effect on write performance (sequential write in particular), as
> each data-out upiu must wait for its rtt sibling.
>
> UFSHCI expects bMaxNumOfRTT to be min(bDeviceRTTCap, NORTT). However,
> as of today, no one appears to set it: neither the
> host controller nor the driver. It wasn't an issue up to now:
> bMaxNumOfRTT is set to 2 after manufacturing, and wasn't limiting the
> write performance.
>
> UFS4.0, and specifically gear 5, changes this and requires the device to
> be more attentive. This doesn't come free - the device has to allocate
> more resources to that end, but the sequential write performance
> improvement is significant. Early measurements show a 25% gain when
> moving from rtt 2 to 9. Therefore, set bMaxNumOfRTT to be
> min(bDeviceRTTCap, NORTT) as UFSHCI expects.
Reviewed-by: Bart Van Assche <[email protected]>
On 5/26/24 01:16, Avri Altman wrote:
> - rtt = min_t(int, dev_info->rtt_cap, hba->nortt);
> + if (hba->vops && hba->vops->max_num_rtt)
> + rtt = hba->vops->max_num_rtt;
> + else
> + rtt = min_t(int, dev_info->rtt_cap, hba->nortt);
> +
Shouldn't what the controller supports be compared with what the device supports,
e.g. as follows?
min_t(int, dev_info->rtt_cap, hba->vops->max_num_rtt ? : hba->nortt);
> struct ufs_hba_variant_ops {
> const char *name;
> + int max_num_rtt;
Hmm ... why 'int' instead of an unsigned type? If the type would be changed
into 'u8' (the type of rtt_cap) then the above min_t() can be changed into
min().
Thanks,
Bart.
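To make the difference between the two variants concrete, here is a small user-space sketch. It is only an illustration: rtt_override() and rtt_clamped() are hypothetical names, their int parameters stand in for hba->vops->max_num_rtt, hba->nortt and dev_info->rtt_cap, and a vendor value of 0 is assumed to mean "not set". The suggestion above uses the GNU "?:" shorthand; the sketch spells it out in standard C.

```c
/* Variant from the patch: a non-zero vendor value replaces the whole
 * negotiation, so the device capability is never consulted. */
static int rtt_override(int vendor_max, int nortt, int dev_cap)
{
	if (vendor_max)
		return vendor_max;
	return dev_cap < nortt ? dev_cap : nortt;
}

/* Suggested variant: the vendor value only replaces the host-side limit;
 * the result is still capped by what the device supports. */
static int rtt_clamped(int vendor_max, int nortt, int dev_cap)
{
	int host_max = vendor_max ? vendor_max : nortt;

	return dev_cap < host_max ? dev_cap : host_max;
}
```

With a vendor value of 16 and a device capability of 9, the first variant yields 16 (above what the device advertises), while the second yields 9. Both give the same result when no vendor value is set.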
>
> On 5/26/24 01:16, Avri Altman wrote:
> > - rtt = min_t(int, dev_info->rtt_cap, hba->nortt);
> > + if (hba->vops && hba->vops->max_num_rtt)
> > + rtt = hba->vops->max_num_rtt;
> > + else
> > + rtt = min_t(int, dev_info->rtt_cap, hba->nortt);
> > +
>
> Shouldn't what the controller supports be compared with what the device
> supports, e.g. as follows?
>
> min_t(int, dev_info->rtt_cap, hba->vops->max_num_rtt ? : hba->nortt);
Yes, this is an option too.
The one that I proposed allows the negotiation to be overridden entirely.
I think your suggestion is better.
Will change.
>
> > struct ufs_hba_variant_ops {
> > const char *name;
> > + int max_num_rtt;
>
> Hmm ... why 'int' instead of an unsigned type? If the type would be changed
> into 'u8' (the type of rtt_cap) then the above min_t() can be changed into
> min().
NORTT is 0-based, meaning it can be as high as 256, which some of the platforms on the market do use, so u8 is not enough.
Thanks,
Avri
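A quick user-space illustration of the width argument, as a sketch only: nortt_as_int() and nortt_as_u8() are hypothetical helpers that apply the "+ 1" adjustment to the raw 8-bit capability field and store the result in an int and in an 8-bit type respectively.

```c
#include <stdint.h>

/* Raw field is 0-based, so the count ranges from 1 to 256. */
static int nortt_as_int(uint8_t raw_field)
{
	return (int)raw_field + 1;
}

/* Same computation kept in 8 bits: the top value wraps around to 0. */
static uint8_t nortt_as_u8(uint8_t raw_field)
{
	return (uint8_t)(raw_field + 1);
}
```

For a raw field of 0xFF the int version returns 256, whereas the 8-bit version returns 0, which would then look like "no value" to any code that tests it for zero.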
>
> Thanks,
>
> Bart.
> On 5/26/24 01:16, Avri Altman wrote:
> > +static inline void ufshcd_freez_hw_queues(struct ufs_hba *hba) {
> > + struct scsi_device *sdev;
> > +
> > + shost_for_each_device(sdev, hba->host) {
> > + if (sdev == hba->ufs_device_wlun)
> > + continue;
> > + blk_mq_freeze_queue(sdev->request_queue);
> > + blk_mq_quiesce_queue(sdev->request_queue);
> > + }
> > +}
> > +
> > +static inline void ufshcd_unfreez_hw_queues(struct ufs_hba *hba) {
> > + struct scsi_device *sdev;
> > +
> > + shost_for_each_device(sdev, hba->host) {
> > + if (sdev == hba->ufs_device_wlun)
> > + continue;
> > + blk_mq_unquiesce_queue(sdev->request_queue);
> > + blk_mq_unfreeze_queue(sdev->request_queue);
> > + }
> > +}
>
> Why have these functions been declared inline? blk_mq_freeze_queue() may
> sleep and hence performance is not an argument to inline these functions.
> Additionally, the WLUN should not be skipped when freezing or unfreezing
> request queues. The blk_mq_quiesce_queue() and
> blk_mq_unquiesce_queue() calls are not necessary in the above code.
> Please remove these.
OK.
Thanks,
Avri
>
> Thanks,
>
> Bart.
> On 5/26/24 01:16, Avri Altman wrote:
> > The rtt-upiu packets precede any data-out upiu packets, thus
> > synchronizing the data input to the device: this mostly applies to
> > write operations, but there are other operations that require rtt as well.
> >
> > There are several rules binding this rtt - data-out dialog;
> > specifically, there can be at most bMaxNumOfRTT such packets
> > outstanding. This might have an effect on write performance (sequential
> > write in particular), as each data-out upiu must wait for its rtt sibling.
> >
> > UFSHCI expects bMaxNumOfRTT to be min(bDeviceRTTCap, NORTT). However,
> > as of today, no one appears to set it: neither the
> > host controller nor the driver. It wasn't an issue up to now:
> > bMaxNumOfRTT is set to 2 after manufacturing, and wasn't limiting the
> > write performance.
> >
> > UFS4.0, and specifically gear 5, changes this and requires the device
> > to be more attentive. This doesn't come free - the device has to
> > allocate more resources to that end, but the sequential write
> > performance improvement is significant. Early measurements show a 25%
> > gain when moving from rtt 2 to 9. Therefore, set bMaxNumOfRTT to be
> > min(bDeviceRTTCap, NORTT) as UFSHCI expects.
>
> Reviewed-by: Bart Van Assche <[email protected]>
Martin,
As the other two patches in this series require some more work, and our clients are waiting for this change,
would you consider picking this one first, while I finalize the other two?
Thanks,
Avri