2022-04-22 18:34:36

by Dan Williams

[permalink] [raw]
Subject: [PATCH v3 2/8] cxl/acpi: Add root device lockdep validation

The CXL "root" device, ACPI0017, is an attach point for coordinating
platform level CXL resources and is the parent device for a CXL port
topology tree. As such it has distinct locking rules relative to other
CXL subsystem objects, but because it is an ACPI device the lock class
is established well before it is given to the cxl_acpi driver.

However, the lockdep API does support changing the lock class "live" for
situations like this. Add a device_lock_set_class() helper that a driver
can use in ->probe() to set a custom lock class, and
device_lock_reset_class() to return to the default "no validate" class
before the custom lock class key goes out of scope after ->remove().

Note the helpers are all macros to support dead code elimination in the
CONFIG_PROVE_LOCKING=n case.

Suggested-by: Peter Zijlstra <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Alison Schofield <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Ben Widawsky <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/cxl/acpi.c | 15 +++++++++++++++
include/linux/device.h | 25 +++++++++++++++++++++++++
2 files changed, 40 insertions(+)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index d15a6aec0331..e19cea27387e 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -275,6 +275,15 @@ static int add_root_nvdimm_bridge(struct device *match, void *data)
return 1;
}

+static struct lock_class_key cxl_root_key;
+
+static void cxl_acpi_lock_reset_class(void *_dev)
+{
+ struct device *dev = _dev;
+
+ device_lock_reset_class(dev);
+}
+
static int cxl_acpi_probe(struct platform_device *pdev)
{
int rc;
@@ -283,6 +292,12 @@ static int cxl_acpi_probe(struct platform_device *pdev)
struct acpi_device *adev = ACPI_COMPANION(host);
struct cxl_cfmws_context ctx;

+ device_lock_set_class(&pdev->dev, &cxl_root_key);
+ rc = devm_add_action_or_reset(&pdev->dev, cxl_acpi_lock_reset_class,
+ &pdev->dev);
+ if (rc)
+ return rc;
+
root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
if (IS_ERR(root_port))
return PTR_ERR(root_port);
diff --git a/include/linux/device.h b/include/linux/device.h
index 93459724dcde..82c9d307e7bd 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -850,6 +850,31 @@ static inline bool device_supports_offline(struct device *dev)
return dev->bus && dev->bus->offline && dev->bus->online;
}

+#define __device_lock_set_class(dev, name, key) \
+ lock_set_class(&(dev)->mutex.dep_map, name, key, 0, _THIS_IP_)
+
+/**
+ * device_lock_set_class - Specify a temporary lock class while a device
+ * is attached to a driver
+ * @dev: device to modify
+ * @key: lock class key data
+ *
+ * This must be called with the device_lock() already held, for example
+ * from driver ->probe().
+ */
+#define device_lock_set_class(dev, key) \
+ __device_lock_set_class(dev, #key, key)
+
+/**
+ * device_lock_reset_class - Return a device to the default lockdep novalidate state
+ * @dev: device to modify
+ *
+ * This must be called with the device_lock() already held, for example
+ * from driver ->remove().
+ */
+#define device_lock_reset_class(dev) \
+ device_lock_set_class(dev, &__lockdep_no_validate__)
+
void lock_device_hotplug(void);
void unlock_device_hotplug(void);
int lock_device_hotplug_sysfs(void);


2022-04-22 22:19:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v3 2/8] cxl/acpi: Add root device lockdep validation

On Thu, Apr 21, 2022 at 08:33:18AM -0700, Dan Williams wrote:
> The CXL "root" device, ACPI0017, is an attach point for coordinating
> platform level CXL resources and is the parent device for a CXL port
> topology tree. As such it has distinct locking rules relative to other
> CXL subsystem objects, but because it is an ACPI device the lock class
> is established well before it is given to the cxl_acpi driver.
>
> However, the lockdep API does support changing the lock class "live" for
> situations like this. Add a device_lock_set_class() helper that a driver
> can use in ->probe() to set a custom lock class, and
> device_lock_reset_class() to return to the default "no validate" class
> before the custom lock class key goes out of scope after ->remove().
>
> Note the helpers are all macros to support dead code elimination in the
> CONFIG_PROVE_LOCKING=n case.
>
> Suggested-by: Peter Zijlstra <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Waiman Long <[email protected]>
> Cc: Boqun Feng <[email protected]>
> Cc: Alison Schofield <[email protected]>
> Cc: Vishal Verma <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Ben Widawsky <[email protected]>
> Cc: Jonathan Cameron <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>
> ---
> drivers/cxl/acpi.c | 15 +++++++++++++++
> include/linux/device.h | 25 +++++++++++++++++++++++++
> 2 files changed, 40 insertions(+)

Much simpler, great work.

Reviewed-by: Greg Kroah-Hartman <[email protected]>

2022-04-23 00:00:22

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 2/8] cxl/acpi: Add root device lockdep validation

On Thu, Apr 21, 2022 at 08:33:18AM -0700, Dan Williams wrote:
> The CXL "root" device, ACPI0017, is an attach point for coordinating
> platform level CXL resources and is the parent device for a CXL port
> topology tree. As such it has distinct locking rules relative to other
> CXL subsystem objects, but because it is an ACPI device the lock class
> is established well before it is given to the cxl_acpi driver.

This final sentence gave me pause because it implied that the device lock class
was set to something other than no validate. But I don't see that anywhere in
the acpi code. So given that it looks to me like ACPI is just using the
default no validate class...

Reviewed-by: Ira Weiny <[email protected]>

> However, the lockdep API does support changing the lock class "live" for
> situations like this. Add a device_lock_set_class() helper that a driver
> can use in ->probe() to set a custom lock class, and
> device_lock_reset_class() to return to the default "no validate" class
> before the custom lock class key goes out of scope after ->remove().
>
> Note the helpers are all macros to support dead code elimination in the
> CONFIG_PROVE_LOCKING=n case.
>
> Suggested-by: Peter Zijlstra <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Waiman Long <[email protected]>
> Cc: Boqun Feng <[email protected]>
> Cc: Alison Schofield <[email protected]>
> Cc: Vishal Verma <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Ben Widawsky <[email protected]>
> Cc: Jonathan Cameron <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>
> ---
> drivers/cxl/acpi.c | 15 +++++++++++++++
> include/linux/device.h | 25 +++++++++++++++++++++++++
> 2 files changed, 40 insertions(+)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index d15a6aec0331..e19cea27387e 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -275,6 +275,15 @@ static int add_root_nvdimm_bridge(struct device *match, void *data)
> return 1;
> }
>
> +static struct lock_class_key cxl_root_key;
> +
> +static void cxl_acpi_lock_reset_class(void *_dev)
> +{
> + struct device *dev = _dev;
> +
> + device_lock_reset_class(dev);
> +}
> +
> static int cxl_acpi_probe(struct platform_device *pdev)
> {
> int rc;
> @@ -283,6 +292,12 @@ static int cxl_acpi_probe(struct platform_device *pdev)
> struct acpi_device *adev = ACPI_COMPANION(host);
> struct cxl_cfmws_context ctx;
>
> + device_lock_set_class(&pdev->dev, &cxl_root_key);
> + rc = devm_add_action_or_reset(&pdev->dev, cxl_acpi_lock_reset_class,
> + &pdev->dev);
> + if (rc)
> + return rc;
> +
> root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
> if (IS_ERR(root_port))
> return PTR_ERR(root_port);
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 93459724dcde..82c9d307e7bd 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -850,6 +850,31 @@ static inline bool device_supports_offline(struct device *dev)
> return dev->bus && dev->bus->offline && dev->bus->online;
> }
>
> +#define __device_lock_set_class(dev, name, key) \
> + lock_set_class(&(dev)->mutex.dep_map, name, key, 0, _THIS_IP_)
> +
> +/**
> + * device_lock_set_class - Specify a temporary lock class while a device
> + * is attached to a driver
> + * @dev: device to modify
> + * @key: lock class key data
> + *
> + * This must be called with the device_lock() already held, for example
> + * from driver ->probe().
> + */
> +#define device_lock_set_class(dev, key) \
> + __device_lock_set_class(dev, #key, key)
> +
> +/**
> + * device_lock_reset_class - Return a device to the default lockdep novalidate state
> + * @dev: device to modify
> + *
> + * This must be called with the device_lock() already held, for example
> + * from driver ->remove().
> + */
> +#define device_lock_reset_class(dev) \
> + device_lock_set_class(dev, &__lockdep_no_validate__)
> +
> void lock_device_hotplug(void);
> void unlock_device_hotplug(void);
> int lock_device_hotplug_sysfs(void);
>

2022-04-23 00:41:37

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v3 2/8] cxl/acpi: Add root device lockdep validation

On Fri, Apr 22, 2022 at 4:58 PM Ira Weiny <[email protected]> wrote:
>
> On Thu, Apr 21, 2022 at 08:33:18AM -0700, Dan Williams wrote:
> > The CXL "root" device, ACPI0017, is an attach point for coordinating
> > platform level CXL resources and is the parent device for a CXL port
> > topology tree. As such it has distinct locking rules relative to other
> > CXL subsystem objects, but because it is an ACPI device the lock class
> > is established well before it is given to the cxl_acpi driver.
>
> This final sentence gave me pause because it implied that the device lock class
> was set to something other than no validate. But I don't see that anywhere in
> the acpi code. So given that it looks to me like ACPI is just using the
> default no validate class...

Oh, good observation. *If* ACPI had set a custom lock class then
cxl_acpi would need to be careful to restore that ACPI-specific class
and not reset it to "no validate" on exit, or skip setting its own
custom class. However, I think for generic buses like ACPI that feed
devices into other subsystems it likely has little reason to set its
own class. For safety, since device_lock_set_class() is general
purpose, I'll have it emit a debug message and fail if the class is
not "no validate" on entry.

Thanks Ira!

2022-04-24 00:02:49

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v3 2/8] cxl/acpi: Add root device lockdep validation

On Fri, Apr 22, 2022 at 5:08 PM Dan Williams <[email protected]> wrote:
>
> On Fri, Apr 22, 2022 at 4:58 PM Ira Weiny <[email protected]> wrote:
> >
> > On Thu, Apr 21, 2022 at 08:33:18AM -0700, Dan Williams wrote:
> > > The CXL "root" device, ACPI0017, is an attach point for coordinating
> > > platform level CXL resources and is the parent device for a CXL port
> > > topology tree. As such it has distinct locking rules relative to other
> > > CXL subsystem objects, but because it is an ACPI device the lock class
> > > is established well before it is given to the cxl_acpi driver.
> >
> > This final sentence gave me pause because it implied that the device lock class
> > was set to something other than no validate. But I don't see that anywhere in
> > the acpi code. So given that it looks to me like ACPI is just using the
> > default no validate class...
>
> Oh, good observation. *If* ACPI had set a custom lock class then
> cxl_acpi would need to be careful to restore that ACPI-specific class
> and not reset it to "no validate" on exit, or skip setting its own
> custom class. However, I think for generic buses like ACPI that feed
> devices into other subsystems it likely has little reason to set its
> own class. For safety, since device_lock_set_class() is general
> purpose, I'll have it emit a debug message and fail if the class is
> not "no validate" on entry.
>

So this turned out way uglier than I expected:

drivers/cxl/acpi.c | 4 +++-
include/linux/device.h | 33 +++++++++++++++++++++++++--------
2 files changed, 28 insertions(+), 9 deletions(-)

...so I'm going to drop it and just add a comment about the
expectations. As Peter said there's already a multitude of ways to
cause false positive / negative results with lockdep so this is just
one more area where one needs to be careful and understand the lock
context they might be overriding.

2022-04-26 05:23:35

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v3 2/8] cxl/acpi: Add root device lockdep validation

On Mon, Apr 25, 2022 at 3:33 AM Peter Zijlstra <[email protected]> wrote:
>
> On Sat, Apr 23, 2022 at 10:27:52AM -0700, Dan Williams wrote:
>
> > ...so I'm going to drop it and just add a comment about the
> > expectations. As Peter said there's already a multitude of ways to
> > cause false positive / negative results with lockdep so this is just
> > one more area where one needs to be careful and understand the lock
> > context they might be overriding.
>
> One safe-guard might be to check the class you're overriding is indeed
> __no_validate__, and WARN if not. Then the unconditional reset is
> conistent.
>
> Then, if/when, that WARN ever triggers you can revisit all this.

Ok, that does seem to need a dummy definition of lockdep_match_class()
in the CONFIG_LOCKDEP=n case, but that seems worth it to me for the
sanity check.

2022-04-26 06:42:28

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v3 2/8] cxl/acpi: Add root device lockdep validation

On Sat, Apr 23, 2022 at 10:27:52AM -0700, Dan Williams wrote:

> ...so I'm going to drop it and just add a comment about the
> expectations. As Peter said there's already a multitude of ways to
> cause false positive / negative results with lockdep so this is just
> one more area where one needs to be careful and understand the lock
> context they might be overriding.

One safe-guard might be to check the class you're overriding is indeed
__no_validate__, and WARN if not. Then the unconditional reset is
conistent.

Then, if/when, that WARN ever triggers you can revisit all this.

2022-04-26 19:08:44

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v3 2/8] cxl/acpi: Add root device lockdep validation

On Mon, Apr 25, 2022 at 9:05 AM Dan Williams <[email protected]> wrote:
>
> On Mon, Apr 25, 2022 at 3:33 AM Peter Zijlstra <[email protected]> wrote:
> >
> > On Sat, Apr 23, 2022 at 10:27:52AM -0700, Dan Williams wrote:
> >
> > > ...so I'm going to drop it and just add a comment about the
> > > expectations. As Peter said there's already a multitude of ways to
> > > cause false positive / negative results with lockdep so this is just
> > > one more area where one needs to be careful and understand the lock
> > > context they might be overriding.
> >
> > One safe-guard might be to check the class you're overriding is indeed
> > __no_validate__, and WARN if not. Then the unconditional reset is
> > conistent.
> >
> > Then, if/when, that WARN ever triggers you can revisit all this.
>
> Ok, that does seem to need a dummy definition of lockdep_match_class()
> in the CONFIG_LOCKDEP=n case, but that seems worth it to me for the
> sanity check.

Thankfully the comment in lockdep.h to not define a
lockdep_match_class() for the CONFIG_LOCKDEP=n stopped me from going
that direction.