2014-10-28 19:09:14

by Florian Fainelli

Subject: DMA allocations from CMA and fatal_signal_pending check

Hello,

While debugging why some dma_alloc_coherent() allocations were
returning NULL on our brcmstb platform, specifically with
drivers/net/ethernet/broadcom/bcmsysport.c, I came across the
fatal_signal_pending() check in mm/page_alloc.c.

This driver calls dma_alloc_coherent(..., GFP_KERNEL) which ends up making
a coherent allocation from a CMA region on our platform. Since that
allocation is allowed to sleep, and because we are in bcm_sysport_open(),
executed from process context, a pending fatal signal makes
dma_alloc_coherent() return NULL.

There are two ways I could fix this:

- use a GFP_ATOMIC allocation, which would avoid this sensitivity to a
pending fatal signal (we suffer from the same issue in
bcm_sysport_resume)

- move the DMA coherent allocation before bcm_sysport_open(), in the
driver's probe function, but if the network interface is never used, we
would be wasting precious DMA coherent memory for nothing (it is only 4
bytes times 32, but still)

Now the general problem that I see with this fatal_signal_pending()
check is that any driver that calls dma_alloc_coherent() from process
context (network drivers frequently do this in their ndo_open callback)
and also happens to get its allocation serviced from CMA can now fail
because of a pending fatal signal, rather than only under genuine OOM
conditions.
--
Florian


2014-10-31 08:26:56

by Joonsoo Kim

Subject: Re: DMA allocations from CMA and fatal_signal_pending check

On Tue, Oct 28, 2014 at 12:08:46PM -0700, Florian Fainelli wrote:
> Hello,
>
> While debugging why some dma_alloc_coherent() allocations were
> returning NULL on our brcmstb platform, specifically with
> drivers/net/ethernet/broadcom/bcmsysport.c, I came across the
> fatal_signal_pending() check in mm/page_alloc.c.
>
> This driver calls dma_alloc_coherent(..., GFP_KERNEL) which ends up making
> a coherent allocation from a CMA region on our platform. Since that
> allocation is allowed to sleep, and because we are in bcm_sysport_open(),
> executed from process context, a pending signal makes
> dma_alloc_coherent() return NULL.

Hello, Florian.

fatal_signal_pending() means that a SIGKILL is pending on that process.
I guess that the caller of dma_alloc_coherent() will die soon.
In this case, why should the CMA allocation succeed?

>
> There are two ways I could fix this:
>
> - use a GFP_ATOMIC allocation, which would avoid this sensitivity to a
> pending signal being fatal (we suffer from the same issue in
> bcm_sysport_resume)
>
> - move the DMA coherent allocation before bcm_sysport_open(), in the
> driver's probe function, but if the network interface is never used, we
> would be wasting precious DMA coherent memory for nothing (it is only 4
> bytes times 32, but still)

I guess it is okay for bcm_sysport_open() to return -EINTR?

Thanks.

2014-10-31 20:59:27

by Florian Fainelli

Subject: Re: DMA allocations from CMA and fatal_signal_pending check

Hi Joonsoo,

On 10/31/2014 01:28 AM, Joonsoo Kim wrote:
> On Tue, Oct 28, 2014 at 12:08:46PM -0700, Florian Fainelli wrote:
>> Hello,
>>
>> While debugging why some dma_alloc_coherent() allocations were
>> returning NULL on our brcmstb platform, specifically with
>> drivers/net/ethernet/broadcom/bcmsysport.c, I came across the
>> fatal_signal_pending() check in mm/page_alloc.c.
>>
>> This driver calls dma_alloc_coherent(..., GFP_KERNEL) which ends up making
>> a coherent allocation from a CMA region on our platform. Since that
>> allocation is allowed to sleep, and because we are in bcm_sysport_open(),
>> executed from process context, a pending signal makes
>> dma_alloc_coherent() return NULL.
>
> Hello, Florian.
>
> fatal_signal_pending() means that a SIGKILL is pending on that process.
> I guess that the caller of dma_alloc_coherent() will die soon.
> In this case, why should the CMA allocation succeed?

I agree that the CMA allocation should not be allowed to succeed, but
the dma_alloc_coherent() allocation should succeed. If we look at the
sysport driver, there are kmalloc() calls to initialize private
structures; those will succeed (except under high memory pressure), so
by the same token, a driver expects DMA allocations to succeed (unless
we are under high memory pressure).

What are we trying to solve exactly with the fatal_signal_pending()
check here? Are we just optimizing for the case where a process has
allocated from a CMA region to allow this region to be returned to the
pool of free pages when it gets killed? Could there be another mechanism
used to reclaim those pages if we know the process is getting killed anyway?

>
>>
>> There are two ways I could fix this:
>>
>> - use a GFP_ATOMIC allocation, which would avoid this sensitivity to a
>> pending signal being fatal (we suffer from the same issue in
>> bcm_sysport_resume)
>>
>> - move the DMA coherent allocation before bcm_sysport_open(), in the
>> driver's probe function, but if the network interface is never used, we
>> would be wasting precious DMA coherent memory for nothing (it is only 4
>> bytes times 32, but still)
>
> I guess it is okay for bcm_sysport_open() to return -EINTR?

Well, not really. This driver is not an isolated case, there are tons of
other networking drivers that do exactly the same thing, and we do
expect these dma_alloc_* calls to succeed.

I think we would want to ignore the fatal_signal_pending() check for
allocations coming through the dma_alloc_* API, although I agree this
could be a tough one when they are done from process context.

Updating all drivers to switch to GFP_ATOMIC allocations is not a good
idea, since that would exhaust the atomic DMA coherent pool for no good
reason.

FYI, we are hitting the same problem during suspend/resume: if you are
unlucky enough that the suspending process gets interrupted, you can get
a lot of crashes from drivers that do not expect their
dma_alloc_coherent() allocations to be sensitive to signals.
--
Florian

2014-10-31 21:17:36

by Maxime Bizon

Subject: Re: DMA allocations from CMA and fatal_signal_pending check


On Fri, 2014-10-31 at 17:28 +0900, Joonsoo Kim wrote:

> I guess it is okay for bcm_sysport_open() to return -EINTR?

Actually, since the CMA allocation is hidden behind dma_alloc_coherent(),
all you get back is NULL, and the driver then returns -ENOMEM.
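
To illustrate how the reason for the failure gets lost, here is a minimal
userspace sketch. It is not kernel code: every mock_* name is a hypothetical
stand-in, malloc() replaces the real allocator, and the assumption that the
inner path can tell -EINTR from -ENOMEM is part of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for fatal_signal_pending(current). */
static bool mock_fatal_signal_pending;

/* Models the inner CMA path: it knows why it failed. */
static int mock_cma_alloc(size_t size, void **out)
{
	assert(out != NULL);

	if (mock_fatal_signal_pending)
		return -EINTR;

	*out = malloc(size);
	return *out ? 0 : -ENOMEM;
}

/* Models a pointer-or-NULL allocator: the error code is dropped here. */
static void *mock_dma_alloc_coherent(size_t size)
{
	void *buf = NULL;

	if (mock_cma_alloc(size, &buf) < 0)
		return NULL;
	return buf;
}

/* Models an open callback: a NULL result can only be reported as -ENOMEM. */
static int mock_open(void)
{
	void *ring = mock_dma_alloc_coherent(4 * 32);

	if (!ring)
		return -ENOMEM;

	free(ring);
	return 0;
}
```

With mock_fatal_signal_pending set, mock_open() returns -ENOMEM even though
the inner path failed with -EINTR.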

--
Maxime

2014-11-03 16:45:42

by Michal Nazarewicz

Subject: Re: DMA allocations from CMA and fatal_signal_pending check

On Fri, Oct 31 2014, Florian Fainelli wrote:
> I agree that the CMA allocation should not be allowed to succeed, but
> the dma_alloc_coherent() allocation should succeed. If we look at the
> sysport driver, there are kmalloc() calls to initialize private
> structures, those will succeed (except under high memory pressure), so
> by the same token, a driver expects DMA allocations to succeed (unless
> we are under high memory pressure)
>
> What are we trying to solve exactly with the fatal_signal_pending()
> check here? Are we just optimizing for the case where a process has
> allocated from a CMA region to allow this region to be returned to the
> pool of free pages when it gets killed? Could there be another mechanism
> used to reclaim those pages if we know the process is getting killed
> anyway?

We're guarding against situations where a process may hang around for an
arbitrarily long time after receiving SIGKILL. If the user does “kill -9
$pid” the usual expectation is that the $pid process will die within
seconds and anything longer is perceived by the user as a bug.

What problem are *you* trying to solve? If the user sent SIGKILL to
a process that initiated device initialisation, what is the point of
continuing initialising the device? Just recover and return -EINTR.
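
One way a caller could do that today is to look at the signal state itself
once the allocation has failed. The following is only a userspace sketch
under stated assumptions: the mock_* names and the two flags are
hypothetical, and malloc() stands in for the real allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for fatal_signal_pending(current). */
static bool mock_fatal_signal_pending;
/* Hypothetical switch to simulate a genuine out-of-memory failure. */
static bool mock_out_of_memory;

/* Models a pointer-or-NULL allocator that gives no reason for failure. */
static void *mock_dma_alloc_coherent(size_t size)
{
	if (mock_fatal_signal_pending || mock_out_of_memory)
		return NULL;
	return malloc(size);
}

/* Models an open callback that picks the error code after the fact. */
static int mock_open(void **ring_out)
{
	void *ring;

	assert(ring_out != NULL);

	ring = mock_dma_alloc_coherent(4 * 32);
	if (!ring) {
		/* Best-effort guess: the allocator did not say why it failed. */
		return mock_fatal_signal_pending ? -EINTR : -ENOMEM;
	}

	*ring_out = ring;
	return 0;
}
```

The check is only a guess: a signal can become pending after an allocation
has already failed for lack of memory, in which case -EINTR is reported for
what was really an OOM failure.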

> Well, not really. This driver is not an isolated case, there are tons of
> other networking drivers that do exactly the same thing, and we do
> expect these dma_alloc_* calls to succeed.

Again, why do you expect them to succeed? The code must handle failures
correctly anyway so why do you wish to ignore fatal signal?

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michał “mina86” Nazarewicz (o o)
ooo +--<[email protected]>--<xmpp:[email protected]>--ooO--(_)--Ooo--

2014-11-03 18:52:08

by Florian Fainelli

Subject: Re: DMA allocations from CMA and fatal_signal_pending check

On 11/03/2014 08:45 AM, Michal Nazarewicz wrote:
> On Fri, Oct 31 2014, Florian Fainelli wrote:
>> I agree that the CMA allocation should not be allowed to succeed, but
>> the dma_alloc_coherent() allocation should succeed. If we look at the
>> sysport driver, there are kmalloc() calls to initialize private
>> structures, those will succeed (except under high memory pressure), so
>> by the same token, a driver expects DMA allocations to succeed (unless
>> we are under high memory pressure)
>>
>> What are we trying to solve exactly with the fatal_signal_pending()
>> check here? Are we just optimizing for the case where a process has
>> allocated from a CMA region to allow this region to be returned to the
>> pool of free pages when it gets killed? Could there be another mechanism
>> used to reclaim those pages if we know the process is getting killed
>> anyway?
>
> We're guarding against situations where process may hang around
> arbitrarily long time after receiving SIGKILL. If user does “kill -9
> $pid” the usual expectation is that the $pid process will die within
> seconds and anything longer is perceived by user as a bug.
>
> What problem are *you* trying to solve? If user sent SIGKILL to
> a process that initiated device initialisation, what is the point of
> continuing initialising the device? Just recover and return -EINTR.

I have two problems with the current approach:

- the behavior of a dma_alloc_coherent() call is not consistent between a
CONFIG_CMA=y and a CONFIG_CMA=n build, which is probably fine as long as
we document that properly

- there is currently no way for a caller of dma_alloc_coherent() to tell
whether the allocation failed because it was interrupted by a signal, a
genuine OOM or something else; this is made considerably worse by the
first problem

>
>> Well, not really. This driver is not an isolated case, there are tons of
>> other networking drivers that do exactly the same thing, and we do
>> expect these dma_alloc_* calls to succeed.
>
> Again, why do you expect them to succeed? The code must handle failures
> correctly anyway so why do you wish to ignore fatal signal?

I guess expecting them to succeed is probably not good, but we should
at least be able to report an accurate error code to the caller and down
to user-space.
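
As a rough illustration of what that could look like, here is a userspace
sketch of an allocator that encodes the error in the returned pointer, in
the style of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers. No such
variant of dma_alloc_coherent() exists as far as this thread is concerned:
mock_dma_alloc_coherent_err() and all other mock_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MOCK_MAX_ERRNO 4095

/* Userspace imitations of ERR_PTR(), IS_ERR() and PTR_ERR(). */
static inline void *mock_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline bool mock_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MOCK_MAX_ERRNO;
}

static inline long mock_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical stand-in for fatal_signal_pending(current). */
static bool mock_fatal_signal_pending;
/* Hypothetical switch to simulate a genuine out-of-memory failure. */
static bool mock_out_of_memory;

/* Hypothetical allocator that reports why it failed. */
static void *mock_dma_alloc_coherent_err(size_t size)
{
	void *buf;

	if (mock_fatal_signal_pending)
		return mock_err_ptr(-EINTR);
	if (mock_out_of_memory)
		return mock_err_ptr(-ENOMEM);

	buf = malloc(size);
	return buf ? buf : mock_err_ptr(-ENOMEM);
}

/* Models an open callback that passes the accurate code to its caller. */
static int mock_open(void **ring_out)
{
	void *ring;

	assert(ring_out != NULL);

	ring = mock_dma_alloc_coherent_err(4 * 32);
	if (mock_is_err(ring))
		return (int)mock_ptr_err(ring);

	*ring_out = ring;
	return 0;
}
```

Here mock_open() returns -EINTR for a pending fatal signal and -ENOMEM for
an out-of-memory failure, so the two cases stay distinguishable all the way
up the call chain.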

Thanks
--
Florian