2023-12-13 12:44:56

by Maramaina Naresh

Subject: [PATCH V5 0/2] Add CPU latency QoS support for ufs driver

Add CPU latency QoS support to the UFS driver. This improves UFS random I/O
performance by about 15% on the sm8550 platform.

tiotest benchmark tool io performance results on sm8550 platform:

1. Without PM QoS support
Type (speed in IOPS) | Average of 18 iterations
Random Read          | 37101.3
Random Write         | 41065.13

2. With PM QoS support
Type (speed in IOPS) | Average of 18 iterations
Random Read          | 42943.4
Random Write         | 46784.9
(Improvement with PM QoS = ~15%).
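
The ~15% figure can be checked against the averages in the two tables. The helper below is only an illustrative sketch (the function names are not part of the series); it computes the relative gain per workload from the posted numbers, which works out to roughly 15.7% for random read and 13.9% for random write.

```c
#include <stdio.h>

/* Relative gain of `after` over `before`, in percent. */
double gain_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print per-workload and mean gains for the sm8550 averages posted above. */
void print_sm8550_gains(void)
{
	/* Averages of 18 iterations, copied from the two tables. */
	const double read_before = 37101.3, read_after = 42943.4;
	const double write_before = 41065.13, write_after = 46784.9;

	double read_gain = gain_percent(read_before, read_after);
	double write_gain = gain_percent(write_before, write_after);

	printf("random read gain:  %.2f%%\n", read_gain);
	printf("random write gain: %.2f%%\n", write_gain);
	printf("mean gain:         %.2f%%\n", (read_gain + write_gain) / 2.0);
}
```

Calling print_sm8550_gains() from a small main() prints the three percentages; the mean of the two workloads is what rounds to the quoted ~15%.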

This series is based on the patch below by Stanley Chu [1].
It moves the PM QoS code into ufshcd.c and makes it generic.

[1] https://lore.kernel.org/r/[email protected]

Changes from v4:
- Addressed angelogioacchino's comments on the commit text
- Addressed angelogioacchino's comments on code alignment

Changes from v3:
- Removed UFSHCD_CAP_PM_QOS capability flag from patch#2

Changes from v2:
- Addressed bvanassche and mani comments
- Provided sysfs interface to enable/disable PM QoS feature

Changes from v1:
- Addressed bvanassche comments to have the code in core ufshcd
- Design is changed from per-device PM QoS to CPU latency QoS based support
- Reverted existing PM QoS feature from MEDIATEK UFS driver
- Added PM QoS capability for both QCOM and MEDIATEK SoCs

Maramaina Naresh (2):
ufs: core: Add CPU latency QoS support for ufs driver
ufs: ufs-mediatek: Migrate to UFSHCD generic CPU latency PM QoS support

drivers/ufs/core/ufshcd.c | 125 ++++++++++++++++++++++++++++++++
drivers/ufs/host/ufs-mediatek.c | 17 -----
drivers/ufs/host/ufs-mediatek.h | 3 -
include/ufs/ufshcd.h | 6 ++
4 files changed, 131 insertions(+), 20 deletions(-)

--
2.17.1


2023-12-13 12:45:52

by Maramaina Naresh

Subject: [PATCH V5 1/2] ufs: core: Add CPU latency QoS support for ufs driver

Register the UFS driver with the CPU latency PM QoS framework to improve
UFS device random I/O performance.

PM QoS initialization inserts a new QoS request into the CPU latency
QoS list, using PM_QOS_DEFAULT_VALUE as the maximum latency value.

The UFS driver votes for performance mode on scale up and for power
save mode on scale down.

If the clock scaling feature is not enabled, voting is based on the
clock on/off state instead.

Provide a sysfs interface to enable/disable the PM QoS feature.

tiotest benchmark tool io performance results on sm8550 platform:

1. Without PM QoS support
Type (speed in IOPS) | Average of 18 iterations
Random Write         | 41065.13
Random Read          | 37101.3

2. With PM QoS support
Type (speed in IOPS) | Average of 18 iterations
Random Write         | 46784.9
Random Read          | 42943.4
(Improvement with PM QoS = ~15%).

Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
Co-developed-by: Nitin Rawat <[email protected]>
Signed-off-by: Nitin Rawat <[email protected]>
Signed-off-by: Naveen Kumar Goud Arepalli <[email protected]>
Signed-off-by: Maramaina Naresh <[email protected]>
---
drivers/ufs/core/ufshcd.c | 125 ++++++++++++++++++++++++++++++++++++++
include/ufs/ufshcd.h | 6 ++
2 files changed, 131 insertions(+)
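
The voting rules described in the commit message can be modelled in user space as follows. This is a minimal sketch, not the driver code: struct model_hba, the model_* functions and the MODEL_* constants are hypothetical stand-ins, and the value used for "no constraint" is an assumption standing in for PM_QOS_DEFAULT_VALUE. The real driver calls cpu_latency_qos_update_request() instead of storing an integer, and it only votes on the clock scaling path when scaling succeeded.

```c
#include <stdbool.h>

/* Assumed stand-in for the kernel's PM_QOS_DEFAULT_VALUE (no latency vote). */
#define MODEL_QOS_NO_CONSTRAINT   (-1)
/* Latency bound, in microseconds, requested while voting for performance. */
#define MODEL_QOS_PERF_LATENCY_US 0

/* Hypothetical, trimmed-down stand-in for struct ufs_hba. */
struct model_hba {
	bool pm_qos_enabled;       /* request is registered with the framework */
	bool clkscaling_supported; /* clock scaling feature available */
	int qos_latency_us;        /* last value voted */
};

/* Mirrors ufshcd_pm_qos_update(): a no-op unless the request is registered. */
void model_pm_qos_update(struct model_hba *hba, bool on)
{
	if (!hba->pm_qos_enabled)
		return;

	hba->qos_latency_us = on ? MODEL_QOS_PERF_LATENCY_US
				 : MODEL_QOS_NO_CONSTRAINT;
}

/* Clock scaling path: the vote follows the scaling direction. */
void model_scale_clks(struct model_hba *hba, bool scale_up)
{
	model_pm_qos_update(hba, scale_up);
}

/* Clock on/off path: votes only when clock scaling is not supported. */
void model_setup_clocks(struct model_hba *hba, bool on)
{
	if (!hba->clkscaling_supported)
		model_pm_qos_update(hba, on);
}
```

With clkscaling_supported set, only model_scale_clks() changes the vote; with it cleared, the vote tracks clock on/off, which matches the two cases in the commit message.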

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index ae9936fc6ffb..a8ee6e02e83e 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -1001,6 +1001,19 @@ static bool ufshcd_is_unipro_pa_params_tuning_req(struct ufs_hba *hba)
return ufshcd_get_local_unipro_ver(hba) < UFS_UNIPRO_VER_1_6;
}

+/**
+ * ufshcd_pm_qos_update - update PM QoS request
+ * @hba: per adapter instance
+ * @on: If True, vote for perf PM QoS mode otherwise power save mode
+ */
+static void ufshcd_pm_qos_update(struct ufs_hba *hba, bool on)
+{
+	if (!hba->pm_qos_enabled)
+		return;
+
+	cpu_latency_qos_update_request(&hba->pm_qos_req, on ? 0 : PM_QOS_DEFAULT_VALUE);
+}
+
+
/**
* ufshcd_set_clk_freq - set UFS controller clock frequencies
* @hba: per adapter instance
@@ -1147,8 +1160,11 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, unsigned long freq,
hba->devfreq->previous_freq);
else
ufshcd_set_clk_freq(hba, !scale_up);
+ goto out;
}

+ ufshcd_pm_qos_update(hba, scale_up);
+
out:
trace_ufshcd_profile_clk_scaling(dev_name(hba->dev),
(scale_up ? "up" : "down"),
@@ -8615,6 +8631,108 @@ static void ufshcd_set_timestamp_attr(struct ufs_hba *hba)
ufshcd_release(hba);
}

+/**
+ * ufshcd_pm_qos_init - initialize PM QoS request
+ * @hba: per adapter instance
+ */
+static void ufshcd_pm_qos_init(struct ufs_hba *hba)
+{
+
+	if (hba->pm_qos_enabled)
+		return;
+
+	cpu_latency_qos_add_request(&hba->pm_qos_req, PM_QOS_DEFAULT_VALUE);
+
+	if (cpu_latency_qos_request_active(&hba->pm_qos_req))
+		hba->pm_qos_enabled = true;
+}
+
+/**
+ * ufshcd_pm_qos_exit - remove request from PM QoS
+ * @hba: per adapter instance
+ */
+static void ufshcd_pm_qos_exit(struct ufs_hba *hba)
+{
+	if (!hba->pm_qos_enabled)
+		return;
+
+	cpu_latency_qos_remove_request(&hba->pm_qos_req);
+	hba->pm_qos_enabled = false;
+}
+
+/**
+ * ufshcd_pm_qos_enable_show - sysfs handler to show pm qos enable value
+ * @dev: device associated with the UFS controller
+ * @attr: sysfs attribute handle
+ * @buf: buffer for sysfs file
+ *
+ * Print 1 if PM QoS feature is enabled, 0 if disabled.
+ *
+ * Returns number of characters written to @buf.
+ */
+static ssize_t ufshcd_pm_qos_enable_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%d\n", hba->pm_qos_enabled);
+}
+
+/**
+ * ufshcd_pm_qos_enable_store - sysfs handler to store value
+ * @dev: device associated with the UFS controller
+ * @attr: sysfs attribute handle
+ * @buf: buffer for sysfs file
+ * @count: stores buffer characters count
+ *
+ * Input 0 to disable PM QoS and any non-zero positive value to enable.
+ * Default state: 1
+ *
+ * Return: number of characters written to @buf on success, < 0 upon failure.
+ */
+static ssize_t ufshcd_pm_qos_enable_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+	u32 value;
+
+	if (kstrtou32(buf, 0, &value))
+		return -EINVAL;
+
+	value = !!value;
+	if (value)
+		ufshcd_pm_qos_init(hba);
+	else
+		ufshcd_pm_qos_exit(hba);
+
+	return count;
+}
+
+/**
+ * ufshcd_init_pm_qos_sysfs - initialize PM QoS sysfs entry
+ * @hba: per adapter instance
+ */
+static void ufshcd_init_pm_qos_sysfs(struct ufs_hba *hba)
+{
+	hba->pm_qos_enable_attr.show = ufshcd_pm_qos_enable_show;
+	hba->pm_qos_enable_attr.store = ufshcd_pm_qos_enable_store;
+	sysfs_attr_init(&hba->pm_qos_enable_attr.attr);
+	hba->pm_qos_enable_attr.attr.name = "pm_qos_enable";
+	hba->pm_qos_enable_attr.attr.mode = 0644;
+	if (device_create_file(hba->dev, &hba->pm_qos_enable_attr))
+		dev_err(hba->dev, "Failed to create sysfs for pm_qos_enable\n");
+}
+
+/**
+ * ufshcd_remove_pm_qos_sysfs - remove PM QoS sysfs entry
+ * @hba: per adapter instance
+ */
+static void ufshcd_remove_pm_qos_sysfs(struct ufs_hba *hba)
+{
+	if (hba->pm_qos_enable_attr.attr.name)
+		device_remove_file(hba->dev, &hba->pm_qos_enable_attr);
+}
+
+
/**
* ufshcd_add_lus - probe and add UFS logical units
* @hba: per-adapter instance
@@ -9204,6 +9322,8 @@ static int ufshcd_setup_clocks(struct ufs_hba *hba, bool on)
if (ret)
return ret;

+ if (!ufshcd_is_clkscaling_supported(hba))
+ ufshcd_pm_qos_update(hba, on);
out:
if (ret) {
list_for_each_entry(clki, head, list) {
@@ -9381,6 +9501,8 @@ static int ufshcd_hba_init(struct ufs_hba *hba)
static void ufshcd_hba_exit(struct ufs_hba *hba)
{
if (hba->is_powered) {
+ ufshcd_remove_pm_qos_sysfs(hba);
+ ufshcd_pm_qos_exit(hba);
ufshcd_exit_clk_scaling(hba);
ufshcd_exit_clk_gating(hba);
if (hba->eh_wq)
@@ -10030,6 +10152,7 @@ static int ufshcd_suspend(struct ufs_hba *hba)
ufshcd_vreg_set_lpm(hba);
/* Put the host controller in low power mode if possible */
ufshcd_hba_vreg_set_lpm(hba);
+ ufshcd_pm_qos_update(hba, false);
return ret;
}

@@ -10576,6 +10699,8 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
ufs_sysfs_add_nodes(hba->dev);

device_enable_async_suspend(dev);
+ ufshcd_pm_qos_init(hba);
+ ufshcd_init_pm_qos_sysfs(hba);
return 0;

free_tmf_queue:
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index d862c8ddce03..fa7434a9073d 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -912,6 +912,9 @@ enum ufshcd_mcq_opr {
* @mcq_base: Multi circular queue registers base address
* @uhq: array of supported hardware queues
* @dev_cmd_queue: Queue for issuing device management commands
+ * @pm_qos_enable_attr: sysfs attribute to enable/disable pm qos
+ * @pm_qos_req: PM QoS request handle
+ * @pm_qos_enabled: flag to check if pm qos is enabled
*/
struct ufs_hba {
void __iomem *mmio_base;
@@ -1076,6 +1079,9 @@ struct ufs_hba {
struct ufs_hw_queue *uhq;
struct ufs_hw_queue *dev_cmd_queue;
struct ufshcd_mcq_opr_info_t mcq_opr[OPR_MAX];
+ struct device_attribute pm_qos_enable_attr;
+ struct pm_qos_request pm_qos_req;
+ bool pm_qos_enabled;
};

/**
--
2.17.1

2023-12-15 06:59:17

by Peter Wang (王信友)

Subject: Re: [PATCH V5 1/2] ufs: core: Add CPU latency QoS support for ufs driver

On Wed, 2023-12-13 at 18:13 +0530, Maramaina Naresh wrote:
> Register ufs driver to CPU latency PM QoS framework to improve
> ufs device random io performance.
>
> PM QoS initialization will insert new QoS request into the CPU
> latency QoS list with the maximum latency PM_QOS_DEFAULT_VALUE
> value.
>
> UFS driver will vote for performance mode on scale up and power
> save mode for scale down.
>
> If clock scaling feature is not enabled then voting will be based
> on clock on or off condition.
>
> Provided sysfs interface to enable/disable PM QoS feature.
>
> tiotest benchmark tool io performance results on sm8550 platform:
>
> 1. Without PM QoS support
> Type (speed in IOPS) | Average of 18 iterations
> Random Write         | 41065.13
> Random Read          | 37101.3
>
> 2. With PM QoS support
> Type (speed in IOPS) | Average of 18 iterations
> Random Write         | 46784.9
> Random Read          | 42943.4
> (Improvement with PM QoS = ~15%).
>

Reviewed-by: Peter Wang <[email protected]>

2023-12-15 09:07:03

by Avri Altman

Subject: RE: [PATCH V5 0/2] Add CPU latency QoS support for ufs driver

> Add CPU latency QoS support for ufs driver. This improves random io
> performance by 15% for ufs.
>
> tiotest benchmark tool io performance results on sm8550 platform:
Would it be possible to provide test results for non-UFS 4.0 platforms,
e.g. SM8250? That would help decide whether it makes sense to backport this to earlier releases.

Thanks,
Avri

2023-12-17 17:04:10

by Maramaina Naresh

Subject: Re: [PATCH V5 0/2] Add CPU latency QoS support for ufs driver

On 12/15/2023 2:35 PM, Avri Altman wrote:
>> Add CPU latency QoS support for ufs driver. This improves random io
>> performance by 15% for ufs.
>>
>> tiotest benchmark tool io performance results on sm8550 platform:
> Will it possible to provide test results for non-ufs4.0 platforms?
> e.g. for SM8250, just to know if it would make sense to backport this to earlier releases.
>

Hi Avri,

I ran the tiotest I/O performance benchmark on the SM8450 platform
and saw a good improvement there as well.

> Thanks,
> Avri

Thanks,
Naresh.

2023-12-18 21:56:08

by Bart Van Assche

Subject: Re: [PATCH V5 1/2] ufs: core: Add CPU latency QoS support for ufs driver

On 12/13/23 04:43, Maramaina Naresh wrote:
> +static ssize_t ufshcd_pm_qos_enable_store(struct device *dev,
> + struct device_attribute *attr, const char *buf, size_t count)
> +{
> + struct ufs_hba *hba = dev_get_drvdata(dev);
> + u32 value;
> +
> + if (kstrtou32(buf, 0, &value))
> + return -EINVAL;
> +
> + value = !!value;
> + if (value)
> + ufshcd_pm_qos_init(hba);
> + else
> + ufshcd_pm_qos_exit(hba);
> +
> + return count;
> +}

Please use kstrtobool() instead of kstrtou32().
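
For illustration, the shape of the handler after such a change can be sketched in user space. In the kernel the call would be kstrtobool(buf, &value), which takes the string and a bool pointer and accepts forms such as 1/0, y/n and on/off. Here model_strtobool() is a simplified stand-in written for this example, not the kernel implementation, and model_pm_qos_enable_store() is a hypothetical reduction of the store handler quoted above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified user-space stand-in for the kernel's kstrtobool().
 * It inspects only the leading characters: 1/y/Y/"on" give true,
 * 0/n/N/"off" give false, anything else is rejected.
 */
int model_strtobool(const char *s, bool *res)
{
	if (!s)
		return -EINVAL;

	switch (s[0]) {
	case '1': case 'y': case 'Y':
		*res = true;
		return 0;
	case '0': case 'n': case 'N':
		*res = false;
		return 0;
	case 'o': case 'O':
		if (s[1] == 'n' || s[1] == 'N') {
			*res = true;
			return 0;
		}
		if (s[1] == 'f' || s[1] == 'F') {
			*res = false;
			return 0;
		}
		break;
	default:
		break;
	}

	return -EINVAL;
}

/* Store handler shape: parse once into a bool, then branch on it. */
long model_pm_qos_enable_store(bool *enabled, const char *buf, size_t count)
{
	bool value;

	if (model_strtobool(buf, &value))
		return -EINVAL;

	/* The driver would call its PM QoS init or exit helper here. */
	*enabled = value;
	return (long)count;
}
```

Parsing straight into a bool removes the intermediate u32 and the `value = !!value` normalisation, and rejects inputs such as "2" that the u32 version would silently treat as "enable".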

> +static void ufshcd_init_pm_qos_sysfs(struct ufs_hba *hba)
> +{
> + hba->pm_qos_enable_attr.show = ufshcd_pm_qos_enable_show;
> + hba->pm_qos_enable_attr.store = ufshcd_pm_qos_enable_store;
> + sysfs_attr_init(&hba->pm_qos_enable_attr.attr);
> + hba->pm_qos_enable_attr.attr.name = "pm_qos_enable";
> + hba->pm_qos_enable_attr.attr.mode = 0644;
> + if (device_create_file(hba->dev, &hba->pm_qos_enable_attr))
> + dev_err(hba->dev, "Failed to create sysfs for pm_qos_enable\n");
> +}

Calling device_create_file() and device_remove_file() is not acceptable because of
the race conditions these calls introduce for udev rules. Please add this attribute
into an existing group and update the is_visible callback function of that group.
See also ufs_sysfs_groups[].
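
The group-plus-is_visible pattern suggested here can be illustrated with a small user-space model. As I understand the concern, files created with device_create_file() after the device has been announced may not exist yet when udev rules run, whereas attributes listed in a group are published in one pass and filtered by a visibility callback. Everything below (struct model_attr, struct model_group, model_is_visible(), model_publish(), the pm_qos_supported flag) is hypothetical and only mirrors the shape of the kernel's attribute groups; it is not the sysfs API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for a sysfs attribute. */
struct model_attr {
	const char *name;
	unsigned int mode;
};

/* Hypothetical per-host state consulted by the visibility callback. */
struct model_hba {
	bool pm_qos_supported;
};

/* Hypothetical stand-in for an attribute group. */
struct model_group {
	const struct model_attr *const *attrs; /* NULL-terminated table */
	/* Returns the mode to publish, or 0 to hide the attribute. */
	unsigned int (*is_visible)(const struct model_hba *hba,
				   const struct model_attr *attr);
};

static const struct model_attr attr_rpm_lvl = { "rpm_lvl", 0644 };
static const struct model_attr attr_pm_qos_enable = { "pm_qos_enable", 0644 };

/* The new attribute is simply appended to the existing static table. */
static const struct model_attr *const model_attrs[] = {
	&attr_rpm_lvl,
	&attr_pm_qos_enable,
	NULL,
};

/* Hide pm_qos_enable on hosts without PM QoS; publish the rest as-is. */
unsigned int model_is_visible(const struct model_hba *hba,
			      const struct model_attr *attr)
{
	if (attr == &attr_pm_qos_enable && !hba->pm_qos_supported)
		return 0;

	return attr->mode;
}

const struct model_group model_default_group = {
	.attrs = model_attrs,
	.is_visible = model_is_visible,
};

/* Count the attributes that one registration pass would publish. */
size_t model_publish(const struct model_group *grp, const struct model_hba *hba)
{
	size_t published = 0;

	for (size_t i = 0; grp->attrs[i]; i++)
		if (grp->is_visible(hba, grp->attrs[i]))
			published++;

	return published;
}
```

In this model the driver no longer creates or removes the file itself, so the separate init and remove helpers for the sysfs entry would become unnecessary.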

Thanks,

Bart.