Changes since v1: [1]
* Fix the root poll interval support to avoid a infinite loop condition
when the polling is faster than the ARS completion.
* Move the introduction of scrub_flags earlier in the series and
introduce ARS_POLL to fix the above issue.
[1]: https://lists.01.org/pipermail/linux-nvdimm/2019-February/019964.html
---
Here is a small pile of updates to better coordinate the Linux ARS state
machine with platform-BIOS implementations. Specifically, take advantage
of opportunities to run short-ARS whenever the ARS interface is found to
be idle at init, always run short-ARS even if no_init_ars is specified,
allow root to reset the exponential backoff polling interval for ARS
completion, and protect the kernel against the consumption of stale ARS
results.
---
Dan Williams (6):
nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot
nfit/ars: Attempt short-ARS even in the no_init_ars case
nfit/ars: Remove ars_start_flags
nfit/ars: Introduce scrub_flags
nfit/ars: Allow root to busy-poll the ARS state machine
nfit/ars: Avoid stale ARS results
drivers/acpi/nfit/core.c | 70 ++++++++++++++++++++++++++++++++--------------
drivers/acpi/nfit/nfit.h | 11 +++++--
2 files changed, 57 insertions(+), 24 deletions(-)
If query-ARS reports that ARS has stopped and requires continuation
attempt to retrieve short-ARS results before continuing the long
operation.
Reported-by: Krzysztof Rusocki <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/acpi/nfit/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index e18ade5d74e9..3d681a92ff7f 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -3012,6 +3012,7 @@ static int ars_register(struct acpi_nfit_desc *acpi_desc,
switch (acpi_nfit_query_poison(acpi_desc)) {
case 0:
+ case -ENOSPC:
case -EAGAIN:
rc = ars_start(acpi_desc, nfit_spa, ARS_REQ_SHORT);
/* shouldn't happen, try again later */
@@ -3036,7 +3037,6 @@ static int ars_register(struct acpi_nfit_desc *acpi_desc,
break;
case -EBUSY:
case -ENOMEM:
- case -ENOSPC:
/*
* BIOS was using ARS, wait for it to complete (or
* resources to become available) and then perform our
The no_init_ars option is meant to prevent long-ARS, but short-ARS
should be allowed to grab any immediate results.
Reported-by: Erwin Tsaur <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/acpi/nfit/core.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 3d681a92ff7f..934be96dc149 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -3004,11 +3004,12 @@ static int ars_register(struct acpi_nfit_desc *acpi_desc,
{
int rc;
- if (no_init_ars || test_bit(ARS_FAILED, &nfit_spa->ars_state))
+ if (test_bit(ARS_FAILED, &nfit_spa->ars_state))
return acpi_nfit_register_region(acpi_desc, nfit_spa);
set_bit(ARS_REQ_SHORT, &nfit_spa->ars_state);
- set_bit(ARS_REQ_LONG, &nfit_spa->ars_state);
+ if (!no_init_ars)
+ set_bit(ARS_REQ_LONG, &nfit_spa->ars_state);
switch (acpi_nfit_query_poison(acpi_desc)) {
case 0:
The ars_start_flags property of 'struct acpi_nfit_desc' is no longer
used since ARS_REQ_SHORT and ARS_REQ_LONG were added.
Cc: Toshi Kani <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/acpi/nfit/core.c | 10 +++++-----
drivers/acpi/nfit/nfit.h | 1 -
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 934be96dc149..9a23ae74e82b 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2651,11 +2651,11 @@ static int ars_continue(struct acpi_nfit_desc *acpi_desc)
struct nvdimm_bus_descriptor *nd_desc = &acpi_desc->nd_desc;
struct nd_cmd_ars_status *ars_status = acpi_desc->ars_status;
- memset(&ars_start, 0, sizeof(ars_start));
- ars_start.address = ars_status->restart_address;
- ars_start.length = ars_status->restart_length;
- ars_start.type = ars_status->type;
- ars_start.flags = acpi_desc->ars_start_flags;
+ ars_start = (struct nd_cmd_ars_start) {
+ .address = ars_status->restart_address,
+ .length = ars_status->restart_length,
+ .type = ars_status->type,
+ };
rc = nd_desc->ndctl(nd_desc, NULL, ND_CMD_ARS_START, &ars_start,
sizeof(ars_start), &cmd_rc);
if (rc < 0)
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index 33691aecfcee..871fb3de3b30 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -223,7 +223,6 @@ struct acpi_nfit_desc {
struct list_head idts;
struct nvdimm_bus *nvdimm_bus;
struct device *dev;
- u8 ars_start_flags;
struct nd_cmd_ars_status *ars_status;
struct nfit_spa *scrub_spa;
struct delayed_work dwork;
In preparation for introducing new flags to gate whether ARS results are
stale, or poll the completion state, convert the existing flags to an
unsigned long with enumerated values. This conversion allows the flags
to be atomically updated outside of ->init_mutex.
Signed-off-by: Dan Williams <[email protected]>
---
drivers/acpi/nfit/core.c | 30 +++++++++++++++++-------------
drivers/acpi/nfit/nfit.h | 8 ++++++--
2 files changed, 23 insertions(+), 15 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 9a23ae74e82b..90312892093e 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1317,19 +1317,23 @@ static ssize_t scrub_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct nvdimm_bus_descriptor *nd_desc;
+ struct acpi_nfit_desc *acpi_desc;
ssize_t rc = -ENXIO;
+ bool busy;
device_lock(dev);
nd_desc = dev_get_drvdata(dev);
- if (nd_desc) {
- struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);
-
- mutex_lock(&acpi_desc->init_mutex);
- rc = sprintf(buf, "%d%s", acpi_desc->scrub_count,
- acpi_desc->scrub_busy
- && !acpi_desc->cancel ? "+\n" : "\n");
- mutex_unlock(&acpi_desc->init_mutex);
+ if (!nd_desc) {
+ device_unlock(dev);
+ return rc;
}
+ acpi_desc = to_acpi_desc(nd_desc);
+
+ mutex_lock(&acpi_desc->init_mutex);
+ busy = test_bit(ARS_BUSY, &acpi_desc->scrub_flags)
+ && !test_bit(ARS_CANCEL, &acpi_desc->scrub_flags);
+ rc = sprintf(buf, "%d%s", acpi_desc->scrub_count, busy ? "+\n" : "\n");
+ mutex_unlock(&acpi_desc->init_mutex);
device_unlock(dev);
return rc;
}
@@ -3072,7 +3076,7 @@ static unsigned int __acpi_nfit_scrub(struct acpi_nfit_desc *acpi_desc,
lockdep_assert_held(&acpi_desc->init_mutex);
- if (acpi_desc->cancel)
+ if (test_bit(ARS_CANCEL, &acpi_desc->scrub_flags))
return 0;
if (query_rc == -EBUSY) {
@@ -3146,7 +3150,7 @@ static void __sched_ars(struct acpi_nfit_desc *acpi_desc, unsigned int tmo)
{
lockdep_assert_held(&acpi_desc->init_mutex);
- acpi_desc->scrub_busy = 1;
+ set_bit(ARS_BUSY, &acpi_desc->scrub_flags);
/* note this should only be set from within the workqueue */
if (tmo)
acpi_desc->scrub_tmo = tmo;
@@ -3162,7 +3166,7 @@ static void notify_ars_done(struct acpi_nfit_desc *acpi_desc)
{
lockdep_assert_held(&acpi_desc->init_mutex);
- acpi_desc->scrub_busy = 0;
+ clear_bit(ARS_BUSY, &acpi_desc->scrub_flags);
acpi_desc->scrub_count++;
if (acpi_desc->scrub_count_state)
sysfs_notify_dirent(acpi_desc->scrub_count_state);
@@ -3451,7 +3455,7 @@ int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc,
struct nfit_spa *nfit_spa;
mutex_lock(&acpi_desc->init_mutex);
- if (acpi_desc->cancel) {
+ if (test_bit(ARS_CANCEL, &acpi_desc->scrub_flags)) {
mutex_unlock(&acpi_desc->init_mutex);
return 0;
}
@@ -3530,7 +3534,7 @@ void acpi_nfit_shutdown(void *data)
mutex_unlock(&acpi_desc_lock);
mutex_lock(&acpi_desc->init_mutex);
- acpi_desc->cancel = 1;
+ set_bit(ARS_CANCEL, &acpi_desc->scrub_flags);
cancel_delayed_work_sync(&acpi_desc->dwork);
mutex_unlock(&acpi_desc->init_mutex);
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index 871fb3de3b30..897ce10192a0 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -210,6 +210,11 @@ struct nfit_mem {
int family;
};
+enum scrub_flags {
+ ARS_BUSY,
+ ARS_CANCEL,
+};
+
struct acpi_nfit_desc {
struct nvdimm_bus_descriptor nd_desc;
struct acpi_table_header acpi_header;
@@ -231,8 +236,7 @@ struct acpi_nfit_desc {
unsigned int max_ars;
unsigned int scrub_count;
unsigned int scrub_mode;
- unsigned int scrub_busy:1;
- unsigned int cancel:1;
+ unsigned long scrub_flags;
unsigned long dimm_cmd_force_en;
unsigned long bus_cmd_force_en;
unsigned long bus_nfit_cmd_force_en;
The ARS implementation implements exponential back-off on the poll
interval to prevent high-frequency access to the DIMM / platform
interface. Depending on when the ARS completes the poll interval may
exceed the completion event by minutes. Allow root to reset the timeout
each time it probes the status. A one-second timeout is still enforced,
but root can otherwise can control the poll interval.
Reported-by: Erwin Tsaur <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/acpi/nfit/core.c | 8 ++++++++
drivers/acpi/nfit/nfit.h | 1 +
2 files changed, 9 insertions(+)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 90312892093e..629cf91649d2 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1333,6 +1333,13 @@ static ssize_t scrub_show(struct device *dev,
busy = test_bit(ARS_BUSY, &acpi_desc->scrub_flags)
&& !test_bit(ARS_CANCEL, &acpi_desc->scrub_flags);
rc = sprintf(buf, "%d%s", acpi_desc->scrub_count, busy ? "+\n" : "\n");
+ /* Allow an admin to poll the busy state at a higher rate */
+ if (busy && capable(CAP_SYS_RAWIO) && !test_and_set_bit(ARS_POLL,
+ &acpi_desc->scrub_flags)) {
+ acpi_desc->scrub_tmo = 1;
+ mod_delayed_work(nfit_wq, &acpi_desc->dwork, HZ);
+ }
+
mutex_unlock(&acpi_desc->init_mutex);
device_unlock(dev);
return rc;
@@ -3187,6 +3194,7 @@ static void acpi_nfit_scrub(struct work_struct *work)
else
notify_ars_done(acpi_desc);
memset(acpi_desc->ars_status, 0, acpi_desc->max_ars);
+ clear_bit(ARS_POLL, &acpi_desc->scrub_flags);
mutex_unlock(&acpi_desc->init_mutex);
}
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index 897ce10192a0..d14bad687fb8 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -213,6 +213,7 @@ struct nfit_mem {
enum scrub_flags {
ARS_BUSY,
ARS_CANCEL,
+ ARS_POLL,
};
struct acpi_nfit_desc {
Gate ARS result consumption on whether the OS issued start-ARS since the
previous consumption. The BIOS may only clear its result buffers after a
successful start-ARS.
Reported-by: Krzysztof Rusocki <[email protected]>
Reported-by: Vishal Verma <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/acpi/nfit/core.c | 17 ++++++++++++++++-
drivers/acpi/nfit/nfit.h | 1 +
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 629cf91649d2..5c9eb8d700d3 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2652,7 +2652,10 @@ static int ars_start(struct acpi_nfit_desc *acpi_desc,
if (rc < 0)
return rc;
- return cmd_rc;
+ if (cmd_rc < 0)
+ return cmd_rc;
+ set_bit(ARS_VALID, &acpi_desc->scrub_flags);
+ return 0;
}
static int ars_continue(struct acpi_nfit_desc *acpi_desc)
@@ -2745,6 +2748,17 @@ static int ars_status_process_records(struct acpi_nfit_desc *acpi_desc)
*/
if (ars_status->out_length < 44)
return 0;
+
+ /*
+ * Ignore potentially stale results that are only refreshed
+ * after a start-ARS event.
+ */
+ if (!test_and_clear_bit(ARS_VALID, &acpi_desc->scrub_flags)) {
+ dev_dbg(acpi_desc->dev, "skip %d stale records\n",
+ ars_status->num_records);
+ return 0;
+ }
+
for (i = 0; i < ars_status->num_records; i++) {
/* only process full records */
if (ars_status->out_length
@@ -3229,6 +3243,7 @@ static int acpi_nfit_register_regions(struct acpi_nfit_desc *acpi_desc)
struct nfit_spa *nfit_spa;
int rc;
+ set_bit(ARS_VALID, &acpi_desc->scrub_flags);
list_for_each_entry(nfit_spa, &acpi_desc->spas, list) {
switch (nfit_spa_type(nfit_spa->spa)) {
case NFIT_SPA_VOLATILE:
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index d14bad687fb8..0cbe5009eb2c 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -213,6 +213,7 @@ struct nfit_mem {
enum scrub_flags {
ARS_BUSY,
ARS_CANCEL,
+ ARS_VALID,
ARS_POLL,
};
On Fri, 2019-02-15 at 11:43 -0800, Dan Williams wrote:
> Changes since v1: [1]
> * Fix the root poll interval support to avoid a infinite loop condition
> when the polling is faster than the ARS completion.
> * Move the introduction of scrub_flags earlier in the series and
> introduce ARS_POLL to fix the above issue.
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2019-February/019964.html
>
> ---
>
> Here is a small pile of updates to better coordinate the Linux ARS state
> machine with platform-BIOS implementations. Specifically, take advantage
> of opportunities to run short-ARS whenever the ARS interface is found to
> be idle at init, always run short-ARS even if no_init_ars is specified,
> allow root to reset the exponential backoff polling interval for ARS
> completion, and protect the kernel against the consumption of stale ARS
> results.
>
> ---
>
> Dan Williams (6):
> nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot
> nfit/ars: Attempt short-ARS even in the no_init_ars case
> nfit/ars: Remove ars_start_flags
> nfit/ars: Introduce scrub_flags
> nfit/ars: Allow root to busy-poll the ARS state machine
> nfit/ars: Avoid stale ARS results
>
>
> drivers/acpi/nfit/core.c | 70 ++++++++++++++++++++++++++++++++--------------
> drivers/acpi/nfit/nfit.h | 11 +++++--
> 2 files changed, 57 insertions(+), 24 deletions(-)
The changes look good to me. For the series:
Reviewed-by: Toshi Kani <[email protected]>
Thanks,
-Toshi