2021-04-01 09:03:07

by Dan Carpenter

[permalink] [raw]
Subject: [bug report] node: Add memory-side caching attributes

Hi Keith,

I've been trying to figure out ways Smatch can check for device managed
resources. Like adding rules that if we call dev_set_name(&foo->bar)
then it's device managaged and if there is a kfree(foo) without calling
device_put(&foo->bar) then that's a resource leak.

Of course one of the rules is that if you call device_register(dev) then
you can't kfree(dev), it has to released with device_put(dev) and that's
true even if the register fails. But this code here feels very
intentional so maybe there is an exception to the rule?

The patch acc02a109b04: "node: Add memory-side caching attributes"
from Mar 11, 2019, leads to the following static checker warning:

drivers/base/node.c:285 node_init_cache_dev()
error: kfree after device_register(): 'dev'

drivers/base/node.c
263 static void node_init_cache_dev(struct node *node)
264 {
265 struct device *dev;
266
267 dev = kzalloc(sizeof(*dev), GFP_KERNEL);
268 if (!dev)
269 return;
270
271 dev->parent = &node->dev;
272 dev->release = node_cache_release;
273 if (dev_set_name(dev, "memory_side_cache"))
274 goto free_dev;
275
276 if (device_register(dev))
^^^^^^^^^^^^^^^^^^^
277 goto free_name;
278
279 pm_runtime_no_callbacks(dev);
280 node->cache_dev = dev;
281 return;
282 free_name:
283 kfree_const(dev->kobj.name);
284 free_dev:
285 kfree(dev);
^^^^^^^^^^
286 }

regards,
dan carpenter


2021-04-01 17:47:53

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [bug report] node: Add memory-side caching attributes

On Thu, Apr 01, 2021 at 12:00:39PM +0300, Dan Carpenter wrote:
> Hi Keith,
>
> I've been trying to figure out ways Smatch can check for device managed
> resources. Like adding rules that if we call dev_set_name(&foo->bar)
> then it's device managaged and if there is a kfree(foo) without calling
> device_put(&foo->bar) then that's a resource leak.

It seems to be working from what I can see

Also I wasn't able to convince myself that any locking around
node->cache_attrs exist..

> Of course one of the rules is that if you call device_register(dev) then
> you can't kfree(dev), it has to released with device_put(dev) and that's
> true even if the register fails. But this code here feels very
> intentional so maybe there is an exception to the rule?

There is no exception. Open coding this:

> 282 free_name:
> 283 kfree_const(dev->kobj.name);

To avoid leaking memory from dev_set_name is a straight up layering
violation, WTF?

node_cacheinfo_release() is just kfree(), so there is no need.
Instead (please feel free to send this Dan):

diff --git a/drivers/base/node.c b/drivers/base/node.c
index f449dbb2c74666..89c28952863977 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -319,25 +319,24 @@ void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs)
return;

dev = &info->dev;
+ device_initialize(dev)
dev->parent = node->cache_dev;
dev->release = node_cacheinfo_release;
dev->groups = cache_groups;
if (dev_set_name(dev, "index%d", cache_attrs->level))
- goto free_cache;
+ goto put_device;

info->cache_attrs = *cache_attrs;
- if (device_register(dev)) {
+ if (device_add(dev)) {
dev_warn(&node->dev, "failed to add cache level:%d\n",
cache_attrs->level);
- goto free_name;
+ goto put_device
}
pm_runtime_no_callbacks(dev);
list_add_tail(&info->node, &node->cache_attrs);
return;
-free_name:
- kfree_const(dev->kobj.name);
-free_cache:
- kfree(info);
+put_device:
+ put_device(dev);
}

static void node_remove_caches(struct node *node)

2021-04-01 18:36:47

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [bug report] node: Add memory-side caching attributes

On Thu, Apr 01, 2021 at 05:06:52PM +0300, Dan Carpenter wrote:

> > diff --git a/drivers/base/node.c b/drivers/base/node.c
> > index f449dbb2c74666..89c28952863977 100644
> > +++ b/drivers/base/node.c
> > @@ -319,25 +319,24 @@ void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs)
> > return;
> >
> > dev = &info->dev;
> > + device_initialize(dev)
> > dev->parent = node->cache_dev;
> > dev->release = node_cacheinfo_release;
> > dev->groups = cache_groups;
> > if (dev_set_name(dev, "index%d", cache_attrs->level))
>
> Is calling dev_set_name() without doing a device_initialize() a bug? I
> could write a check for that.

IMHO, yes.

However, Greg may not agree as dev_set_name() with no error check
followed by device_register() is a very common pattern. If the user
omits the device_initialize() then dev_set_name() must be immediately
before only device_register() with no error unwind between them. It
must not error unwind dev_set_name() to kfree. (This is really
tricky)

Greg and I have argued on the merits of device_initialize() several
times before.

I argue the error control flow is simpler and easier to get right, he
argues the extra statement is unneeded complexity and people will get
the error unwind wrong.

Every time you find a bug like this is someone getting the complexity
around error handling and device_register() wrong, so my advices is to
stop using device_register (aka device_init_and_add) and make it
explicit so the goto unwind has logical pairing. put_device() pairs
with device_initialize().

The tricky bit is establishing the release function, as complex
release functions often have complex init sequences and you need the
setup done enough to go to release. When things become this complex I
advocate splitting alloc into a function:

/* Caller must use put_device(&foo->dev) */
struct foo *foo_allocate()
{
foo = kzalloc
allocate release freeing thing 1;
allocate release freeing thing 2;
allocate release freeing thing 3;
device_initialize();
return foo;

err:
free thing 3;
err:
free thing 2;
err:
free thing 1;
err:
kfree(foo)
return rc;
}

Thus there is a clear logical seperation between the world of 'unwind
to kfree' and the world of 'unwind to put_device'. dev_set_name() can
not be inside an alloc function.

Simple cases, like here, should just do device_initialize()
immediately after kzalloc() and never have a kfree() error unwind.

I saw Christoph had a similar opinion on 'init and add' patterns being
bad, maybe he has additional colour to share.

Jason

2021-04-01 18:52:41

by Dan Carpenter

[permalink] [raw]
Subject: Re: [bug report] node: Add memory-side caching attributes

On Thu, Apr 01, 2021 at 08:25:11AM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 01, 2021 at 12:00:39PM +0300, Dan Carpenter wrote:
> > Hi Keith,
> >
> > I've been trying to figure out ways Smatch can check for device managed
> > resources. Like adding rules that if we call dev_set_name(&foo->bar)
> > then it's device managaged and if there is a kfree(foo) without calling
> > device_put(&foo->bar) then that's a resource leak.
>
> It seems to be working from what I can see

This check is actually more simple, and older. It just looks for

device_register(dev);
...
kfree(dev);

I've written your proposed check of:

device_register(&foo->dev);
...
kfree(foo); // warning missing device_put(&foo->dev);

But I just did that earler today and it will probably take a couple
iterations to work out the kinks. Plus I'm off for a small vacation so
it will be a week before I have the results from that.

>
> Also I wasn't able to convince myself that any locking around
> node->cache_attrs exist..
>
> > Of course one of the rules is that if you call device_register(dev) then
> > you can't kfree(dev), it has to released with device_put(dev) and that's
> > true even if the register fails. But this code here feels very
> > intentional so maybe there is an exception to the rule?
>
> There is no exception. Open coding this:
>
> > 282 free_name:
> > 283 kfree_const(dev->kobj.name);
>
> To avoid leaking memory from dev_set_name is a straight up layering
> violation, WTF?
>
> node_cacheinfo_release() is just kfree(), so there is no need.
> Instead (please feel free to send this Dan):

Sure, I can send this (tomorrow).

>
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index f449dbb2c74666..89c28952863977 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -319,25 +319,24 @@ void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs)
> return;
>
> dev = &info->dev;
> + device_initialize(dev)
> dev->parent = node->cache_dev;
> dev->release = node_cacheinfo_release;
> dev->groups = cache_groups;
> if (dev_set_name(dev, "index%d", cache_attrs->level))

Is calling dev_set_name() without doing a device_initialize() a bug? I
could write a check for that.

regards,
dan carpenter

2021-04-01 21:28:45

by Keith Busch

[permalink] [raw]
Subject: Re: [bug report] node: Add memory-side caching attributes

On Thu, Apr 01, 2021 at 12:00:39PM +0300, Dan Carpenter wrote:
> Hi Keith,
>
> I've been trying to figure out ways Smatch can check for device managed
> resources. Like adding rules that if we call dev_set_name(&foo->bar)
> then it's device managaged and if there is a kfree(foo) without calling
> device_put(&foo->bar) then that's a resource leak.
>
> Of course one of the rules is that if you call device_register(dev) then
> you can't kfree(dev), it has to released with device_put(dev) and that's
> true even if the register fails. But this code here feels very
> intentional so maybe there is an exception to the rule?

Thanks for the notice. There was no intentional use of this, so I
think it was a mistake. The proposal in the follow on replies looks much
better.

> The patch acc02a109b04: "node: Add memory-side caching attributes"
> from Mar 11, 2019, leads to the following static checker warning:
>
> drivers/base/node.c:285 node_init_cache_dev()
> error: kfree after device_register(): 'dev'
>
> drivers/base/node.c
> 263 static void node_init_cache_dev(struct node *node)
> 264 {
> 265 struct device *dev;
> 266
> 267 dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> 268 if (!dev)
> 269 return;
> 270
> 271 dev->parent = &node->dev;
> 272 dev->release = node_cache_release;
> 273 if (dev_set_name(dev, "memory_side_cache"))
> 274 goto free_dev;
> 275
> 276 if (device_register(dev))
> ^^^^^^^^^^^^^^^^^^^
> 277 goto free_name;
> 278
> 279 pm_runtime_no_callbacks(dev);
> 280 node->cache_dev = dev;
> 281 return;
> 282 free_name:
> 283 kfree_const(dev->kobj.name);
> 284 free_dev:
> 285 kfree(dev);
> ^^^^^^^^^^
> 286 }
>
> regards,
> dan carpenter