2017-08-08 12:11:07

by Phil Elwell

[permalink] [raw]
Subject: [PATCH] staging: bcm2835-audio: Fix memory corruption

I'm all for fixing memory leaks, but freeing a block while it is still
being used is a recipe for hard-to-debug kernel exceptions.

1) There is already a vchi method for freeing the instance, so use it.
2) Only call it on error, and then only before initted is false.

Signed-off-by: Phil Elwell <[email protected]>
Fixes: 0adbfd4694c2 ("staging: bcm2835-audio: fix memory leak in bcm2835_audio_open_connection()")
---
drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
index 5f3d8f2..89f96f3 100644
--- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
+++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
@@ -409,6 +409,7 @@ static int bcm2835_audio_open_connection(struct bcm2835_alsa_stream *alsa_stream
LOG_ERR("%s: failed to connect VCHI instance (ret=%d)\n",
__func__, ret);

+ vchi_disconnect(vchi_instance);
ret = -EIO;
goto err_free_mem;
}
@@ -431,7 +432,6 @@ static int bcm2835_audio_open_connection(struct bcm2835_alsa_stream *alsa_stream
LOG_DBG(" success !\n");
ret = 0;
err_free_mem:
- kfree(vchi_instance);

return ret;
}
--
1.9.1


2017-08-10 10:25:04

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH] staging: bcm2835-audio: Fix memory corruption

The original patch did not go through the normal review process...

On Tue, Aug 08, 2017 at 01:05:02PM +0100, Phil Elwell wrote:
> I'm all for fixing memory leaks, but freeing a block while it is still
> being used is a recipe for hard-to-debug kernel exceptions.
>

This bug completely breaks the driver doesn't it? It's not very subtle
so it should be easy to diagnose with git bisect.

> 1) There is already a vchi method for freeing the instance, so use it.
> 2) Only call it on error, and then only before initted is false.
>
> Signed-off-by: Phil Elwell <[email protected]>
> Fixes: 0adbfd4694c2 ("staging: bcm2835-audio: fix memory leak in bcm2835_audio_open_connection()")
> ---
> drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> index 5f3d8f2..89f96f3 100644
> --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> @@ -409,6 +409,7 @@ static int bcm2835_audio_open_connection(struct bcm2835_alsa_stream *alsa_stream
> LOG_ERR("%s: failed to connect VCHI instance (ret=%d)\n",
> __func__, ret);
>
> + vchi_disconnect(vchi_instance);

This is ugly because why are we calling disconnect() if connect() fails?
These functions should be symetric so disconnect only disconnects and
we call a different function to undo vchi_initialise().

> ret = -EIO;
> goto err_free_mem;

This label name is out of date. There is a later error path where
vc_vchi_audio_init() fails and we leak on that path. Also why is
vchi_instance a static variable? Arglebargleblargleblah...

regards,
dan carpenter

2017-08-10 10:52:45

by Phil Elwell

[permalink] [raw]
Subject: Re: [PATCH] staging: bcm2835-audio: Fix memory corruption

On 10/08/2017 11:21, Dan Carpenter wrote:
> The original patch did not go through the normal review process...
>
> On Tue, Aug 08, 2017 at 01:05:02PM +0100, Phil Elwell wrote:
>> I'm all for fixing memory leaks, but freeing a block while it is still
>> being used is a recipe for hard-to-debug kernel exceptions.
>>
>
> This bug completely breaks the driver doesn't it? It's not very subtle
> so it should be easy to diagnose with git bisect.

It's more subtle than that - the failure isn't consistent, sometimes crashing
and sometimes not depending on how and when memory is reused.

>> 1) There is already a vchi method for freeing the instance, so use it.
>> 2) Only call it on error, and then only before initted is false.
>>
>> Signed-off-by: Phil Elwell <[email protected]>
>> Fixes: 0adbfd4694c2 ("staging: bcm2835-audio: fix memory leak in bcm2835_audio_open_connection()")
>> ---
>> drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
>> index 5f3d8f2..89f96f3 100644
>> --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
>> +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
>> @@ -409,6 +409,7 @@ static int bcm2835_audio_open_connection(struct bcm2835_alsa_stream *alsa_stream
>> LOG_ERR("%s: failed to connect VCHI instance (ret=%d)\n",
>> __func__, ret);
>>
>> + vchi_disconnect(vchi_instance);
>
> This is ugly because why are we calling disconnect() if connect() fails?
> These functions should be symetric so disconnect only disconnects and
> we call a different function to undo vchi_initialise().

Agreed - I'm not going to change the API, but I can add a comment.

>> ret = -EIO;
>> goto err_free_mem;
>
> This label name is out of date. There is a later error path where
> vc_vchi_audio_init() fails and we leak on that path.

Also agreed. I'll rework it.

> Also why is
> vchi_instance a static variable? Arglebargleblargleblah...

You may well think that. I couldn't possibly comment.

Phil

2017-08-10 11:06:50

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH] staging: bcm2835-audio: Fix memory corruption

On Thu, Aug 10, 2017 at 11:52:42AM +0100, Phil Elwell wrote:
> > This label name is out of date. There is a later error path where
> > vc_vchi_audio_init() fails and we leak on that path.
>
> Also agreed. I'll rework it.
>

Actually I wasn't right. That error path should probably stay how it
is, because we're re-using the vchi_instance. We allocate it in the
first call then re-use it later.

This code is really subtle and ugly.

regards,
dan carpenter

2017-08-10 11:25:52

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH] staging: bcm2835-audio: Fix memory corruption

On Thu, Aug 10, 2017 at 11:52:42AM +0100, Phil Elwell wrote:
> On 10/08/2017 11:21, Dan Carpenter wrote:
> > The original patch did not go through the normal review process...
> >
> > On Tue, Aug 08, 2017 at 01:05:02PM +0100, Phil Elwell wrote:
> >> I'm all for fixing memory leaks, but freeing a block while it is still
> >> being used is a recipe for hard-to-debug kernel exceptions.
> >>
> >
> > This bug completely breaks the driver doesn't it? It's not very subtle
> > so it should be easy to diagnose with git bisect.
>
> It's more subtle than that - the failure isn't consistent, sometimes crashing
> and sometimes not depending on how and when memory is reused.
>
> >> 1) There is already a vchi method for freeing the instance, so use it.
> >> 2) Only call it on error, and then only before initted is false.
> >>
> >> Signed-off-by: Phil Elwell <[email protected]>
> >> Fixes: 0adbfd4694c2 ("staging: bcm2835-audio: fix memory leak in bcm2835_audio_open_connection()")
> >> ---
> >> drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> >> index 5f3d8f2..89f96f3 100644
> >> --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> >> +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> >> @@ -409,6 +409,7 @@ static int bcm2835_audio_open_connection(struct bcm2835_alsa_stream *alsa_stream
> >> LOG_ERR("%s: failed to connect VCHI instance (ret=%d)\n",
> >> __func__, ret);
> >>
> >> + vchi_disconnect(vchi_instance);
> >
> > This is ugly because why are we calling disconnect() if connect() fails?
> > These functions should be symetric so disconnect only disconnects and
> > we call a different function to undo vchi_initialise().
>
> Agreed - I'm not going to change the API, but I can add a comment.
>

Nah... Please don't do that. Just create a function:

static void vchiq_free(VCHIQ_INSTANCE_T *vchi_instance)
{
kfree(vchi_instance);
}

Really vchi_initialise() is badly named and it's just a wrapper around
vchiq_initialise(). It should be deleted and vchiq_initialise() should
be renamed to allocate instead of initialize... But that's for a
separate patch.

And also we should move the kfree() out of disconnect() like I said
and instead call vchiq_free(). But again, that's for a separate patch.

Change the goto to "return -EIO." Leave the last error path as-is.
Eventually we will want to break this into two functions, one that
allocates the first vchi_instance and one that calls vc_vchi_audio_init().
But again, there are many patches needed to fix this code.

regards,
dan carpenter


2017-08-10 11:35:16

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH] staging: bcm2835-audio: Fix memory corruption

On Thu, Aug 10, 2017 at 02:24:51PM +0300, Dan Carpenter wrote:
> On Thu, Aug 10, 2017 at 11:52:42AM +0100, Phil Elwell wrote:
> > On 10/08/2017 11:21, Dan Carpenter wrote:
> > > The original patch did not go through the normal review process...
> > >
> > > On Tue, Aug 08, 2017 at 01:05:02PM +0100, Phil Elwell wrote:
> > >> I'm all for fixing memory leaks, but freeing a block while it is still
> > >> being used is a recipe for hard-to-debug kernel exceptions.
> > >>
> > >
> > > This bug completely breaks the driver doesn't it? It's not very subtle
> > > so it should be easy to diagnose with git bisect.
> >
> > It's more subtle than that - the failure isn't consistent, sometimes crashing
> > and sometimes not depending on how and when memory is reused.
> >
> > >> 1) There is already a vchi method for freeing the instance, so use it.
> > >> 2) Only call it on error, and then only before initted is false.
> > >>
> > >> Signed-off-by: Phil Elwell <[email protected]>
> > >> Fixes: 0adbfd4694c2 ("staging: bcm2835-audio: fix memory leak in bcm2835_audio_open_connection()")
> > >> ---
> > >> drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 2 +-
> > >> 1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> > >> index 5f3d8f2..89f96f3 100644
> > >> --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> > >> +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> > >> @@ -409,6 +409,7 @@ static int bcm2835_audio_open_connection(struct bcm2835_alsa_stream *alsa_stream
> > >> LOG_ERR("%s: failed to connect VCHI instance (ret=%d)\n",
> > >> __func__, ret);
> > >>
> > >> + vchi_disconnect(vchi_instance);
> > >
> > > This is ugly because why are we calling disconnect() if connect() fails?
> > > These functions should be symetric so disconnect only disconnects and
> > > we call a different function to undo vchi_initialise().
> >
> > Agreed - I'm not going to change the API, but I can add a comment.
> >
>
> Nah... Please don't do that. Just create a function:
>
> static void vchiq_free(VCHIQ_INSTANCE_T *vchi_instance)
^^^^^^
Not static. My bad.

regards,
dan carpenter

2017-08-10 13:26:01

by Phil Elwell

[permalink] [raw]
Subject: Re: [PATCH] staging: bcm2835-audio: Fix memory corruption



On 10/08/2017 12:24, Dan Carpenter wrote:
> On Thu, Aug 10, 2017 at 11:52:42AM +0100, Phil Elwell wrote:
>> On 10/08/2017 11:21, Dan Carpenter wrote:
>>> The original patch did not go through the normal review process...
>>>
>>> On Tue, Aug 08, 2017 at 01:05:02PM +0100, Phil Elwell wrote:
>>>> I'm all for fixing memory leaks, but freeing a block while it is still
>>>> being used is a recipe for hard-to-debug kernel exceptions.
>>>>
>>>
>>> This bug completely breaks the driver doesn't it? It's not very subtle
>>> so it should be easy to diagnose with git bisect.
>>
>> It's more subtle than that - the failure isn't consistent, sometimes crashing
>> and sometimes not depending on how and when memory is reused.
>>
>>>> 1) There is already a vchi method for freeing the instance, so use it.
>>>> 2) Only call it on error, and then only before initted is false.
>>>>
>>>> Signed-off-by: Phil Elwell <[email protected]>
>>>> Fixes: 0adbfd4694c2 ("staging: bcm2835-audio: fix memory leak in bcm2835_audio_open_connection()")
>>>> ---
>>>> drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
>>>> index 5f3d8f2..89f96f3 100644
>>>> --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
>>>> +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
>>>> @@ -409,6 +409,7 @@ static int bcm2835_audio_open_connection(struct bcm2835_alsa_stream *alsa_stream
>>>> LOG_ERR("%s: failed to connect VCHI instance (ret=%d)\n",
>>>> __func__, ret);
>>>>
>>>> + vchi_disconnect(vchi_instance);
>>>
>>> This is ugly because why are we calling disconnect() if connect() fails?
>>> These functions should be symetric so disconnect only disconnects and
>>> we call a different function to undo vchi_initialise().
>>
>> Agreed - I'm not going to change the API, but I can add a comment.
>>
>
> Nah... Please don't do that. Just create a function:
>
> void vchiq_free(VCHIQ_INSTANCE_T *vchi_instance)
> {
> kfree(vchi_instance);
> }
>
> Really vchi_initialise() is badly named and it's just a wrapper around
> vchiq_initialise(). It should be deleted and vchiq_initialise() should
> be renamed to allocate instead of initialize... But that's for a
> separate patch.
>
> And also we should move the kfree() out of disconnect() like I said
> and instead call vchiq_free(). But again, that's for a separate patch.
>
> Change the goto to "return -EIO." Leave the last error path as-is.
> Eventually we will want to break this into two functions, one that
> allocates the first vchi_instance and one that calls vc_vchi_audio_init().
> But again, there are many patches needed to fix this code.

[ static removed ]

Shouldn't that be:

void vchi_uninitialise(VCHI_INSTANCE_T instance_handle)
{
vchiq_shutdown((VCHIQ_INSTANCE_T)instance_handle);
}
EXPORT_SYMBOL(vchi_uninitialise);

Yes, in this case the instance is just a block of memory with no users,
but if this is a general API call then you want it to work in all cases,
and calling vchiq_shutdown is the right way to free an instance. Calling
a VCHIQ method when all the other calls are to the VCHI API is grubby,
so make it a VCHI method instead.

If instead this is just another hack - a kfree with a comment in code form -
then lets keep it static within the audio driver (which is what I thought you
were initially suggesting).

Phil

2017-08-10 14:08:49

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH] staging: bcm2835-audio: Fix memory corruption

On Thu, Aug 10, 2017 at 02:25:57PM +0100, Phil Elwell wrote:
>
>
> On 10/08/2017 12:24, Dan Carpenter wrote:
> > On Thu, Aug 10, 2017 at 11:52:42AM +0100, Phil Elwell wrote:
> >> On 10/08/2017 11:21, Dan Carpenter wrote:
> >>> The original patch did not go through the normal review process...
> >>>
> >>> On Tue, Aug 08, 2017 at 01:05:02PM +0100, Phil Elwell wrote:
> >>>> I'm all for fixing memory leaks, but freeing a block while it is still
> >>>> being used is a recipe for hard-to-debug kernel exceptions.
> >>>>
> >>>
> >>> This bug completely breaks the driver doesn't it? It's not very subtle
> >>> so it should be easy to diagnose with git bisect.
> >>
> >> It's more subtle than that - the failure isn't consistent, sometimes crashing
> >> and sometimes not depending on how and when memory is reused.
> >>
> >>>> 1) There is already a vchi method for freeing the instance, so use it.
> >>>> 2) Only call it on error, and then only before initted is false.
> >>>>
> >>>> Signed-off-by: Phil Elwell <[email protected]>
> >>>> Fixes: 0adbfd4694c2 ("staging: bcm2835-audio: fix memory leak in bcm2835_audio_open_connection()")
> >>>> ---
> >>>> drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> >>>> index 5f3d8f2..89f96f3 100644
> >>>> --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> >>>> +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
> >>>> @@ -409,6 +409,7 @@ static int bcm2835_audio_open_connection(struct bcm2835_alsa_stream *alsa_stream
> >>>> LOG_ERR("%s: failed to connect VCHI instance (ret=%d)\n",
> >>>> __func__, ret);
> >>>>
> >>>> + vchi_disconnect(vchi_instance);
> >>>
> >>> This is ugly because why are we calling disconnect() if connect() fails?
> >>> These functions should be symetric so disconnect only disconnects and
> >>> we call a different function to undo vchi_initialise().
> >>
> >> Agreed - I'm not going to change the API, but I can add a comment.
> >>
> >
> > Nah... Please don't do that. Just create a function:
> >
> > void vchiq_free(VCHIQ_INSTANCE_T *vchi_instance)
> > {
> > kfree(vchi_instance);
> > }
> >
> > Really vchi_initialise() is badly named and it's just a wrapper around
> > vchiq_initialise(). It should be deleted and vchiq_initialise() should
> > be renamed to allocate instead of initialize... But that's for a
> > separate patch.
> >
> > And also we should move the kfree() out of disconnect() like I said
> > and instead call vchiq_free(). But again, that's for a separate patch.
> >
> > Change the goto to "return -EIO." Leave the last error path as-is.
> > Eventually we will want to break this into two functions, one that
> > allocates the first vchi_instance and one that calls vc_vchi_audio_init().
> > But again, there are many patches needed to fix this code.
>
> [ static removed ]
>
> Shouldn't that be:
>
> void vchi_uninitialise(VCHI_INSTANCE_T instance_handle)

I haven't understood the difference between vchi_ and vchiq_. It looks
like vhci_ is just a shim around vhciq_. We should get rid of the shim
layer eventually...

Naming it vhci_ would make it consistent with the rest of the shim layer
but we don't care about consistency much we just want to write the code
the way it will end up eventually. In other words, there is no point
creating a shim function just to be consistent.

initialize/uninitialize are bad names because really it's an alloc/free.

Anyway, I don't have strong feelings about the name.

> {
> vchiq_shutdown((VCHIQ_INSTANCE_T)instance_handle);
> }
> EXPORT_SYMBOL(vchi_uninitialise);
>
> Yes, in this case the instance is just a block of memory with no users,
> but if this is a general API call then you want it to work in all cases,
> and calling vchiq_shutdown is the right way to free an instance. Calling
> a VCHIQ method when all the other calls are to the VCHI API is grubby,
> so make it a VCHI method instead.

The vchiq_shutdown() does call kfree() but it's a layering violation.
Every allocation function should have a mirror free function. The
vchiq_shutdown() cleans up a bunch of stuff which was not allocated in
vchi_initialise() and it seems to all be a no-op so it's not buggy but
it's not the right way to design an API.

So: Rename vchiq_shutdown() to vchiq_disconnect(). Delete the kfree().
Create a new function vchiq_shutdown() that does:

void vchiq_shutdown()
{
vchiq_disconnect();
vchiq_free();
}

But again, that is work for future patches. Every alloc has a free,
every connect has a disconnect.

regards,
dan carpenter