2008-02-07 00:22:02

by Thomas Gleixner

[permalink] [raw]
Subject: 2.6.24-git: kmap_atomic() WARN_ON()

current mainline triggers:

WARNING: at /home/tglx/work/kernel/x86/linux-2.6/arch/x86/mm/highmem_32.c:52 kmap_atomic_prot+0xe5/0x19b()
Modules linked in: ahci(+) sata_sil libata sd_mod scsi_mod raid1 ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 0, comm: swapper Not tainted 2.6.24 #173
[<c0126b60>] warn_on_slowpath+0x41/0x51
[<c011c5eb>] ? __enqueue_entity+0x9c/0xa4
[<c011c717>] ? enqueue_entity+0x124/0x13b
[<c011cedb>] ? enqueue_task_fair+0x41/0x4c
[<c032ce88>] ? _spin_lock_irqsave+0x14/0x2e
[<c012e885>] ? lock_timer_base+0x1f/0x3e
[<c011ad6d>] kmap_atomic_prot+0xe5/0x19b
[<c011ae37>] kmap_atomic+0x14/0x16
[<f88ea218>] ata_scsi_rbuf_get+0x1e/0x2c [libata]
[<f88ea821>] atapi_qc_complete+0x23f/0x289 [libata]
[<f88e3d00>] __ata_qc_complete+0x8e/0x93 [libata]
[<f88e3fc7>] ata_qc_complete+0x115/0x128 [libata]
[<f88e8d47>] ata_qc_complete_multiple+0x86/0xa0 [libata]
[<f8841a5d>] ahci_interrupt+0x370/0x40d [ahci]
[<c01512c6>] handle_IRQ_event+0x21/0x48
[<c01521ca>] handle_edge_irq+0xc9/0x10a
[<c0152101>] ? handle_edge_irq+0x0/0x10a
[<c0107518>] do_IRQ+0x8b/0xb7
[<c01055db>] common_interrupt+0x23/0x28
[<c032007b>] ? init_chipset_cmd64x+0xb/0x93
[<c010316e>] ? mwait_idle_with_hints+0x39/0x3d
[<c0103172>] ? mwait_idle+0x0/0xf
[<c010317f>] mwait_idle+0xd/0xf
[<c0103761>] cpu_idle+0xb0/0xe4
[<c031b509>] rest_init+0x5d/0x5f

This is not a new problem. It was pointed out some time ago already,
but now the WARN_ON() finally made it into mainline :)

The fix is not obvious, as this code seems to be called from various
call sites.

Thanks,
tglx


2008-02-13 22:40:54

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

Hi Thomas,

On Thursday, 7 of February 2008, Thomas Gleixner wrote:
> current mainline triggers:

Has the issue been fixed in the meantime?

> WARNING: at /home/tglx/work/kernel/x86/linux-2.6/arch/x86/mm/highmem_32.c:52 kmap_atomic_prot+0xe5/0x19b()
> Modules linked in: ahci(+) sata_sil libata sd_mod scsi_mod raid1 ext3 jbd ehci_hcd ohci_hcd uhci_hcd
> Pid: 0, comm: swapper Not tainted 2.6.24 #173
> [<c0126b60>] warn_on_slowpath+0x41/0x51
> [<c011c5eb>] ? __enqueue_entity+0x9c/0xa4
> [<c011c717>] ? enqueue_entity+0x124/0x13b
> [<c011cedb>] ? enqueue_task_fair+0x41/0x4c
> [<c032ce88>] ? _spin_lock_irqsave+0x14/0x2e
> [<c012e885>] ? lock_timer_base+0x1f/0x3e
> [<c011ad6d>] kmap_atomic_prot+0xe5/0x19b
> [<c011ae37>] kmap_atomic+0x14/0x16
> [<f88ea218>] ata_scsi_rbuf_get+0x1e/0x2c [libata]
> [<f88ea821>] atapi_qc_complete+0x23f/0x289 [libata]
> [<f88e3d00>] __ata_qc_complete+0x8e/0x93 [libata]
> [<f88e3fc7>] ata_qc_complete+0x115/0x128 [libata]
> [<f88e8d47>] ata_qc_complete_multiple+0x86/0xa0 [libata]
> [<f8841a5d>] ahci_interrupt+0x370/0x40d [ahci]
> [<c01512c6>] handle_IRQ_event+0x21/0x48
> [<c01521ca>] handle_edge_irq+0xc9/0x10a
> [<c0152101>] ? handle_edge_irq+0x0/0x10a
> [<c0107518>] do_IRQ+0x8b/0xb7
> [<c01055db>] common_interrupt+0x23/0x28
> [<c032007b>] ? init_chipset_cmd64x+0xb/0x93
> [<c010316e>] ? mwait_idle_with_hints+0x39/0x3d
> [<c0103172>] ? mwait_idle+0x0/0xf
> [<c010317f>] mwait_idle+0xd/0xf
> [<c0103761>] cpu_idle+0xb0/0xe4
> [<c031b509>] rest_init+0x5d/0x5f
>
> This is not a new problem. It was pointed out some time ago already,
> but now the WARN_ON() finally made it into mainline :)
>
> The fix is not obvious, as this code seems to be called from various
> call sites.
>
> Thanks,
> tglx

2008-02-14 01:14:36

by Thomas Gleixner

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On Wed, 13 Feb 2008, Rafael J. Wysocki wrote:

> Hi Thomas,
>
> On Thursday, 7 of February 2008, Thomas Gleixner wrote:
> > current mainline triggers:
>
> Has the issue been fixed in the meantime?

Nope.

Just for reference: http://lkml.org/lkml/2007/1/14/38 looks
frighteningly similar.

Thanks,

tglx

2008-02-25 19:59:39

by Björn Steinbrink

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On 2008.02.07 00:58:42 +0100, Thomas Gleixner wrote:
> current mainline triggers:
>
> WARNING: at /home/tglx/work/kernel/x86/linux-2.6/arch/x86/mm/highmem_32.c:52 kmap_atomic_prot+0xe5/0x19b()
> Modules linked in: ahci(+) sata_sil libata sd_mod scsi_mod raid1 ext3 jbd ehci_hcd ohci_hcd uhci_hcd
> Pid: 0, comm: swapper Not tainted 2.6.24 #173
> [<c0126b60>] warn_on_slowpath+0x41/0x51
> [<c011c5eb>] ? __enqueue_entity+0x9c/0xa4
> [<c011c717>] ? enqueue_entity+0x124/0x13b
> [<c011cedb>] ? enqueue_task_fair+0x41/0x4c
> [<c032ce88>] ? _spin_lock_irqsave+0x14/0x2e
> [<c012e885>] ? lock_timer_base+0x1f/0x3e
> [<c011ad6d>] kmap_atomic_prot+0xe5/0x19b
> [<c011ae37>] kmap_atomic+0x14/0x16
> [<f88ea218>] ata_scsi_rbuf_get+0x1e/0x2c [libata]
> [<f88ea821>] atapi_qc_complete+0x23f/0x289 [libata]
> [<f88e3d00>] __ata_qc_complete+0x8e/0x93 [libata]
> [<f88e3fc7>] ata_qc_complete+0x115/0x128 [libata]
> [<f88e8d47>] ata_qc_complete_multiple+0x86/0xa0 [libata]
> [<f8841a5d>] ahci_interrupt+0x370/0x40d [ahci]
> [<c01512c6>] handle_IRQ_event+0x21/0x48
> [<c01521ca>] handle_edge_irq+0xc9/0x10a
> [<c0152101>] ? handle_edge_irq+0x0/0x10a
> [<c0107518>] do_IRQ+0x8b/0xb7
> [<c01055db>] common_interrupt+0x23/0x28
> [<c032007b>] ? init_chipset_cmd64x+0xb/0x93
> [<c010316e>] ? mwait_idle_with_hints+0x39/0x3d
> [<c0103172>] ? mwait_idle+0x0/0xf
> [<c010317f>] mwait_idle+0xd/0xf
> [<c0103761>] cpu_idle+0xb0/0xe4
> [<c031b509>] rest_init+0x5d/0x5f
>
> This is not a new problem. It was pointed out some time ago already,
> but now the WARN_ON() finally made it into mainline :)
>
> The fix is not obvious, as this code seems to be called from various
> call sites.

Hm, do you have lockdep enabled? If not, does lockdep make this go away?
Because lockdep will set IRQF_DISABLED for all interrupt handlers, and
unless that flag is set, handle_IRQ_event will reenable interrupts while
the handler is running. And ahci_interrupt only uses a plain spin_lock,
so interrupts keep being enabled. The patch below should help with that.

Hmhm, maybe that also solves the deadlock you saw? Dunno...

I can't come up with an useful commit message right now, but I'll resend
in suitable form for submission if that thing actually works.

Bj?rn


diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 1db93b6..ae3dbc8 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1739,6 +1739,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
unsigned int i, handled = 0;
void __iomem *mmio;
u32 irq_stat, irq_ack = 0;
+ unsigned long flags;

VPRINTK("ENTER\n");

@@ -1751,7 +1752,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
if (!irq_stat)
return IRQ_NONE;

- spin_lock(&host->lock);
+ spin_lock_irqsave(&host->lock, flags);

for (i = 0; i < host->n_ports; i++) {
struct ata_port *ap;
@@ -1778,7 +1779,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
handled = 1;
}

- spin_unlock(&host->lock);
+ spin_unlock_irqrestore(&host->lock, flags);

VPRINTK("EXIT\n");

2008-02-25 20:08:51

by Jeff Garzik

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

Bj?rn Steinbrink wrote:
> On 2008.02.07 00:58:42 +0100, Thomas Gleixner wrote:
>> current mainline triggers:
>>
>> WARNING: at /home/tglx/work/kernel/x86/linux-2.6/arch/x86/mm/highmem_32.c:52 kmap_atomic_prot+0xe5/0x19b()
>> Modules linked in: ahci(+) sata_sil libata sd_mod scsi_mod raid1 ext3 jbd ehci_hcd ohci_hcd uhci_hcd
>> Pid: 0, comm: swapper Not tainted 2.6.24 #173
>> [<c0126b60>] warn_on_slowpath+0x41/0x51
>> [<c011c5eb>] ? __enqueue_entity+0x9c/0xa4
>> [<c011c717>] ? enqueue_entity+0x124/0x13b
>> [<c011cedb>] ? enqueue_task_fair+0x41/0x4c
>> [<c032ce88>] ? _spin_lock_irqsave+0x14/0x2e
>> [<c012e885>] ? lock_timer_base+0x1f/0x3e
>> [<c011ad6d>] kmap_atomic_prot+0xe5/0x19b
>> [<c011ae37>] kmap_atomic+0x14/0x16
>> [<f88ea218>] ata_scsi_rbuf_get+0x1e/0x2c [libata]
>> [<f88ea821>] atapi_qc_complete+0x23f/0x289 [libata]
>> [<f88e3d00>] __ata_qc_complete+0x8e/0x93 [libata]
>> [<f88e3fc7>] ata_qc_complete+0x115/0x128 [libata]
>> [<f88e8d47>] ata_qc_complete_multiple+0x86/0xa0 [libata]
>> [<f8841a5d>] ahci_interrupt+0x370/0x40d [ahci]
>> [<c01512c6>] handle_IRQ_event+0x21/0x48
>> [<c01521ca>] handle_edge_irq+0xc9/0x10a
>> [<c0152101>] ? handle_edge_irq+0x0/0x10a
>> [<c0107518>] do_IRQ+0x8b/0xb7
>> [<c01055db>] common_interrupt+0x23/0x28
>> [<c032007b>] ? init_chipset_cmd64x+0xb/0x93
>> [<c010316e>] ? mwait_idle_with_hints+0x39/0x3d
>> [<c0103172>] ? mwait_idle+0x0/0xf
>> [<c010317f>] mwait_idle+0xd/0xf
>> [<c0103761>] cpu_idle+0xb0/0xe4
>> [<c031b509>] rest_init+0x5d/0x5f
>>
>> This is not a new problem. It was pointed out some time ago already,
>> but now the WARN_ON() finally made it into mainline :)
>>
>> The fix is not obvious, as this code seems to be called from various
>> call sites.
>
> Hm, do you have lockdep enabled? If not, does lockdep make this go away?
> Because lockdep will set IRQF_DISABLED for all interrupt handlers, and
> unless that flag is set, handle_IRQ_event will reenable interrupts while
> the handler is running. And ahci_interrupt only uses a plain spin_lock,
> so interrupts keep being enabled. The patch below should help with that.
>
> Hmhm, maybe that also solves the deadlock you saw? Dunno...
>
> I can't come up with an useful commit message right now, but I'll resend
> in suitable form for submission if that thing actually works.
>
> Bj?rn
>
>
> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
> index 1db93b6..ae3dbc8 100644
> --- a/drivers/ata/ahci.c
> +++ b/drivers/ata/ahci.c
> @@ -1739,6 +1739,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
> unsigned int i, handled = 0;
> void __iomem *mmio;
> u32 irq_stat, irq_ack = 0;
> + unsigned long flags;
>
> VPRINTK("ENTER\n");
>
> @@ -1751,7 +1752,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
> if (!irq_stat)
> return IRQ_NONE;
>
> - spin_lock(&host->lock);
> + spin_lock_irqsave(&host->lock, flags);
>
> for (i = 0; i < host->n_ports; i++) {
> struct ata_port *ap;
> @@ -1778,7 +1779,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
> handled = 1;
> }
>
> - spin_unlock(&host->lock);
> + spin_unlock_irqrestore(&host->lock, flags);

If this truly fixes the problem, then lockdep is definitely the problem
source.

There are plenty of drivers that do the same thing that ahci does, in
terms of interrupt handler locking... and I will definitely push back
on efforts to convert otherwise-100%-safe spin_lock() into
spin_lock_irqsave() just to quiet lockdep.

Very interesting email, thanks...

Jeff


2008-02-25 20:35:53

by Björn Steinbrink

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On 2008.02.25 15:08:35 -0500, Jeff Garzik wrote:
> Bj?rn Steinbrink wrote:
>> Hm, do you have lockdep enabled? If not, does lockdep make this go away?
>> Because lockdep will set IRQF_DISABLED for all interrupt handlers, and
>> unless that flag is set, handle_IRQ_event will reenable interrupts while
>> the handler is running. And ahci_interrupt only uses a plain spin_lock,
>> so interrupts keep being enabled. The patch below should help with that.
>>
>> Hmhm, maybe that also solves the deadlock you saw? Dunno...
>>
>> I can't come up with an useful commit message right now, but I'll resend
>> in suitable form for submission if that thing actually works.
>>
>> Bj?rn
>>
>>
>> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
>> index 1db93b6..ae3dbc8 100644
>> --- a/drivers/ata/ahci.c
>> +++ b/drivers/ata/ahci.c
>> @@ -1739,6 +1739,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
>> unsigned int i, handled = 0;
>> void __iomem *mmio;
>> u32 irq_stat, irq_ack = 0;
>> + unsigned long flags;
>> VPRINTK("ENTER\n");
>> @@ -1751,7 +1752,7 @@ static irqreturn_t ahci_interrupt(int irq, void
>> *dev_instance)
>> if (!irq_stat)
>> return IRQ_NONE;
>> - spin_lock(&host->lock);
>> + spin_lock_irqsave(&host->lock, flags);
>> for (i = 0; i < host->n_ports; i++) {
>> struct ata_port *ap;
>> @@ -1778,7 +1779,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
>> handled = 1;
>> }
>> - spin_unlock(&host->lock);
>> + spin_unlock_irqrestore(&host->lock, flags);
>
> If this truly fixes the problem, then lockdep is definitely the problem
> source.

Hm, lockdep keeps the interrupts _disabled_. The code that reenables the
interrupts was already in the first revision of Linus' git tree. So
lockdep would actually just hide the problem, not cause it.

Bj?rn

2008-02-25 20:43:35

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On Mon, 25 Feb 2008 15:08:35 -0500 Jeff Garzik <[email protected]> wrote:

> Bj__rn Steinbrink wrote:
> > On 2008.02.07 00:58:42 +0100, Thomas Gleixner wrote:
> >> current mainline triggers:
> >>
> >> WARNING: at /home/tglx/work/kernel/x86/linux-2.6/arch/x86/mm/highmem_32.c:52 kmap_atomic_prot+0xe5/0x19b()
> >> Modules linked in: ahci(+) sata_sil libata sd_mod scsi_mod raid1 ext3 jbd ehci_hcd ohci_hcd uhci_hcd
> >> Pid: 0, comm: swapper Not tainted 2.6.24 #173
> >> [<c0126b60>] warn_on_slowpath+0x41/0x51
> >> [<c011c5eb>] ? __enqueue_entity+0x9c/0xa4
> >> [<c011c717>] ? enqueue_entity+0x124/0x13b
> >> [<c011cedb>] ? enqueue_task_fair+0x41/0x4c
> >> [<c032ce88>] ? _spin_lock_irqsave+0x14/0x2e
> >> [<c012e885>] ? lock_timer_base+0x1f/0x3e
> >> [<c011ad6d>] kmap_atomic_prot+0xe5/0x19b
> >> [<c011ae37>] kmap_atomic+0x14/0x16
> >> [<f88ea218>] ata_scsi_rbuf_get+0x1e/0x2c [libata]
> >> [<f88ea821>] atapi_qc_complete+0x23f/0x289 [libata]
> >> [<f88e3d00>] __ata_qc_complete+0x8e/0x93 [libata]
> >> [<f88e3fc7>] ata_qc_complete+0x115/0x128 [libata]
> >> [<f88e8d47>] ata_qc_complete_multiple+0x86/0xa0 [libata]
> >> [<f8841a5d>] ahci_interrupt+0x370/0x40d [ahci]
> >> [<c01512c6>] handle_IRQ_event+0x21/0x48
> >> [<c01521ca>] handle_edge_irq+0xc9/0x10a
> >> [<c0152101>] ? handle_edge_irq+0x0/0x10a
> >> [<c0107518>] do_IRQ+0x8b/0xb7
> >> [<c01055db>] common_interrupt+0x23/0x28
> >> [<c032007b>] ? init_chipset_cmd64x+0xb/0x93
> >> [<c010316e>] ? mwait_idle_with_hints+0x39/0x3d
> >> [<c0103172>] ? mwait_idle+0x0/0xf
> >> [<c010317f>] mwait_idle+0xd/0xf
> >> [<c0103761>] cpu_idle+0xb0/0xe4
> >> [<c031b509>] rest_init+0x5d/0x5f
> >>
> >> This is not a new problem. It was pointed out some time ago already,
> >> but now the WARN_ON() finally made it into mainline :)
> >>
> >> The fix is not obvious, as this code seems to be called from various
> >> call sites.
> >
> > Hm, do you have lockdep enabled? If not, does lockdep make this go away?
> > Because lockdep will set IRQF_DISABLED for all interrupt handlers, and
> > unless that flag is set, handle_IRQ_event will reenable interrupts while
> > the handler is running. And ahci_interrupt only uses a plain spin_lock,
> > so interrupts keep being enabled. The patch below should help with that.
> >
> > Hmhm, maybe that also solves the deadlock you saw? Dunno...
> >
> > I can't come up with an useful commit message right now, but I'll resend
> > in suitable form for submission if that thing actually works.
> >
> > Bj__rn
> >
> >
> > diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
> > index 1db93b6..ae3dbc8 100644
> > --- a/drivers/ata/ahci.c
> > +++ b/drivers/ata/ahci.c
> > @@ -1739,6 +1739,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
> > unsigned int i, handled = 0;
> > void __iomem *mmio;
> > u32 irq_stat, irq_ack = 0;
> > + unsigned long flags;
> >
> > VPRINTK("ENTER\n");
> >
> > @@ -1751,7 +1752,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
> > if (!irq_stat)
> > return IRQ_NONE;
> >
> > - spin_lock(&host->lock);
> > + spin_lock_irqsave(&host->lock, flags);
> >
> > for (i = 0; i < host->n_ports; i++) {
> > struct ata_port *ap;
> > @@ -1778,7 +1779,7 @@ static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
> > handled = 1;
> > }
> >
> > - spin_unlock(&host->lock);
> > + spin_unlock_irqrestore(&host->lock, flags);
>
> If this truly fixes the problem, then lockdep is definitely the problem
> source.
>
> There are plenty of drivers that do the same thing that ahci does, in
> terms of interrupt handler locking... and I will definitely push back
> on efforts to convert otherwise-100%-safe spin_lock() into
> spin_lock_irqsave() just to quiet lockdep.
>
> Very interesting email, thanks...
>

I suspect this is a bug in my old kmap_atomic debugging patch. It doesn't
know about the implicit irq-disablememnt which interrupt handlers enjoy. I
don't think...

2008-02-25 22:03:28

by Thomas Gleixner

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On Mon, 25 Feb 2008, Andrew Morton wrote:
> On Mon, 25 Feb 2008 15:08:35 -0500 Jeff Garzik <[email protected]> wrote:
> > There are plenty of drivers that do the same thing that ahci does, in
> > terms of interrupt handler locking... and I will definitely push back
> > on efforts to convert otherwise-100%-safe spin_lock() into
> > spin_lock_irqsave() just to quiet lockdep.
> >
> > Very interesting email, thanks...
> >
>
> I suspect this is a bug in my old kmap_atomic debugging patch. It doesn't
> know about the implicit irq-disablememnt which interrupt handlers enjoy. I
> don't think...

I suspect here is confusion. The implicit irq-disablement of lockdep
is actually hiding the warning.

The code which emits the warning is:

if (type == KM_IRQ0 || type == KM_IRQ1 || type == KM_BOUNCE_READ ||
type == KM_BIO_SRC_IRQ || type == KM_BIO_DST_IRQ) {
if (!irqs_disabled()) {
WARN_ON(1);
warn_count--;
}

It checks for _NOT_ irqs_disabled. The calling code is
ata_scsi_rbuf_get() which calls with:

buf = kmap_atomic(sg_page(sg), KM_IRQ0) + sg->offset;

This happens with interrupts enabled. So the warning is according to
the well documented km_type enum and the equally well documented
highmem debug code correct.

Bjoern decoded it very well, just Jeff jumped to very interesting
conclusions.

Thanks,

tglx

2008-02-25 22:22:53

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On Mon, 25 Feb 2008 23:01:59 +0100 (CET) Thomas Gleixner <[email protected]> wrote:

> On Mon, 25 Feb 2008, Andrew Morton wrote:
> > On Mon, 25 Feb 2008 15:08:35 -0500 Jeff Garzik <[email protected]> wrote:
> > > There are plenty of drivers that do the same thing that ahci does, in
> > > terms of interrupt handler locking... and I will definitely push back
> > > on efforts to convert otherwise-100%-safe spin_lock() into
> > > spin_lock_irqsave() just to quiet lockdep.
> > >
> > > Very interesting email, thanks...
> > >
> >
> > I suspect this is a bug in my old kmap_atomic debugging patch. It doesn't
> > know about the implicit irq-disablememnt which interrupt handlers enjoy. I
> > don't think...
>
> I suspect here is confusion. The implicit irq-disablement of lockdep
> is actually hiding the warning.
>
> The code which emits the warning is:
>
> if (type == KM_IRQ0 || type == KM_IRQ1 || type == KM_BOUNCE_READ ||
> type == KM_BIO_SRC_IRQ || type == KM_BIO_DST_IRQ) {
> if (!irqs_disabled()) {
> WARN_ON(1);
> warn_count--;
> }
>
> It checks for _NOT_ irqs_disabled. The calling code is
> ata_scsi_rbuf_get() which calls with:
>
> buf = kmap_atomic(sg_page(sg), KM_IRQ0) + sg->offset;
>
> This happens with interrupts enabled. So the warning is according to
> the well documented km_type enum and the equally well documented
> highmem debug code correct.
>
> Bjoern decoded it very well, just Jeff jumped to very interesting
> conclusions.
>

Ah, OK, yes. ata is wrong. It must disable interrupts here. Otherwise
this CPU could get interrupted by some other device whose handler also uses
KM_IRQ0, resulting in data corruption.

2008-02-25 23:19:30

by Jeff Garzik

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0562b0a..7b1f1ee 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1694,12 +1694,17 @@ void ata_scsi_rbuf_fill(struct ata_scsi_args *args,
u8 *rbuf;
unsigned int buflen, rc;
struct scsi_cmnd *cmd = args->cmd;
+ unsigned long flags;
+
+ local_irq_save(flags);

buflen = ata_scsi_rbuf_get(cmd, &rbuf);
memset(rbuf, 0, buflen);
rc = actor(args, rbuf, buflen);
ata_scsi_rbuf_put(cmd, rbuf);

+ local_irq_restore(flags);
+
if (rc == 0)
cmd->result = SAM_STAT_GOOD;
args->done(cmd);
@@ -2473,6 +2478,9 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
if ((scsicmd[0] == INQUIRY) && ((scsicmd[1] & 0x03) == 0)) {
u8 *buf = NULL;
unsigned int buflen;
+ unsigned long flags;
+
+ local_irq_save(flags);

buflen = ata_scsi_rbuf_get(cmd, &buf);

@@ -2490,6 +2498,8 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
}

ata_scsi_rbuf_put(cmd, buf);
+
+ local_irq_restore(flags);
}

cmd->result = SAM_STAT_GOOD;


Attachments:
patch (1.05 kB)

2008-02-26 08:40:00

by Ingo Molnar

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()


* Jeff Garzik <[email protected]> wrote:

> + unsigned long flags;
> +
> + local_irq_save(flags);

hm, couldnt we attach the irq disabling to some spinlock, in a natural
way? Explicit flags fiddling is a PITA once we do things like threaded
irq handlers, -rt, etc.

Ingo

2008-02-26 08:55:27

by Thomas Gleixner

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On Mon, 25 Feb 2008, Jeff Garzik wrote:

> Welcome to test this... (attached, not tested nor even compiled, really)

Works, but I agree with Ingo vs. the stand alone irq_en/disable.

Thanks,
tglx

2008-02-26 16:32:56

by Jeff Garzik

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

Ingo Molnar wrote:
> * Jeff Garzik <[email protected]> wrote:
>
>> + unsigned long flags;
>> +
>> + local_irq_save(flags);
>
> hm, couldnt we attach the irq disabling to some spinlock, in a natural
> way? Explicit flags fiddling is a PITA once we do things like threaded
> irq handlers, -rt, etc.

Attaching the irq disabling to some spinlock is what would be
artificial... See the ahci.c patch earlier in this thread. It is taken
without spin_lock_irqsave() in the interrupt handler, and there is no
reason to disable interrupts for the entirety of the interrupt handler
run -- only the part where we call kmap.

This is only being done to satisfy kmap_atomic's requirements, not libata's.

I could add a "kmap lock" but that just seems silly.

Jeff



2008-02-26 18:23:27

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On Tue, 26 Feb 2008 11:32:24 -0500 Jeff Garzik <[email protected]> wrote:

> Ingo Molnar wrote:
> > * Jeff Garzik <[email protected]> wrote:
> >
> >> + unsigned long flags;
> >> +
> >> + local_irq_save(flags);
> >
> > hm, couldnt we attach the irq disabling to some spinlock, in a natural
> > way? Explicit flags fiddling is a PITA once we do things like threaded
> > irq handlers, -rt, etc.
>
> Attaching the irq disabling to some spinlock is what would be
> artificial... See the ahci.c patch earlier in this thread. It is taken
> without spin_lock_irqsave() in the interrupt handler, and there is no
> reason to disable interrupts for the entirety of the interrupt handler
> run -- only the part where we call kmap.
>
> This is only being done to satisfy kmap_atomic's requirements, not libata's.
>
> I could add a "kmap lock" but that just seems silly.
>

It's a bit sad to disable interupts across a memset (how big is it?) just
for the small proportion of cases which are accessing a highmem page.

What you could do is to add an `unsigned long *flags' arg to
ata_scsi_rbuf_get() and ata_scsi_rbuf_put(), and then, in
ata_scsi_rbuf_get() do

if (PageHighmem(page))
local_irq_disable(*flags);

2008-02-26 20:50:22

by Ingo Molnar

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()


* Andrew Morton <[email protected]> wrote:

> > This is only being done to satisfy kmap_atomic's requirements, not
> > libata's.
> >
> > I could add a "kmap lock" but that just seems silly.
> >
>
> It's a bit sad to disable interupts across a memset (how big is it?)
> just for the small proportion of cases which are accessing a highmem
> page.
>
> What you could do is to add an `unsigned long *flags' arg to
> ata_scsi_rbuf_get() and ata_scsi_rbuf_put(), and then, in
> ata_scsi_rbuf_get() do
>
> if (PageHighmem(page))
> local_irq_disable(*flags);

it would be much nicer to attach the irq disabling to the object, not to
some arbitrary place in the code.

i.e. to introduce a kmap_atomic_irqsave(...,flags) and
kunmap_atomic_irqrestore() API variant. _That_ then could be mapped by
-rt to a non-irq disabling thing.

Ingo

2008-02-26 21:49:54

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On Tue, 26 Feb 2008 21:49:43 +0100 Ingo Molnar <[email protected]> wrote:

>
> * Andrew Morton <[email protected]> wrote:
>
> > > This is only being done to satisfy kmap_atomic's requirements, not
> > > libata's.
> > >
> > > I could add a "kmap lock" but that just seems silly.
> > >
> >
> > It's a bit sad to disable interupts across a memset (how big is it?)
> > just for the small proportion of cases which are accessing a highmem
> > page.
> >
> > What you could do is to add an `unsigned long *flags' arg to
> > ata_scsi_rbuf_get() and ata_scsi_rbuf_put(), and then, in
> > ata_scsi_rbuf_get() do
> >
> > if (PageHighmem(page))
> > local_irq_disable(*flags);
>
> it would be much nicer to attach the irq disabling to the object, not to
> some arbitrary place in the code.
>
> i.e. to introduce a kmap_atomic_irqsave(...,flags) and
> kunmap_atomic_irqrestore() API variant. _That_ then could be mapped by
> -rt to a non-irq disabling thing.
>

Sure. But iirc we haven't had a need for this before. Which is a bit odd.

2008-02-26 23:00:38

by Jeff Garzik

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

Andrew Morton wrote:
> On Tue, 26 Feb 2008 21:49:43 +0100 Ingo Molnar <[email protected]> wrote:
>> i.e. to introduce a kmap_atomic_irqsave(...,flags) and
>> kunmap_atomic_irqrestore() API variant. _That_ then could be mapped by
>> -rt to a non-irq disabling thing.

> Sure. But iirc we haven't had a need for this before. Which is a bit odd.

We have such a need -- we just always paper over it with
spin_lock_irqsave() or local_irq_save() because those are "the rules."
grep around :)

See ata_pio_sector() in libata-core.c, where we do:

> if (PageHighMem(page)) {
> unsigned long flags;
>
> /* FIXME: use a bounce buffer */
> local_irq_save(flags);
> buf = kmap_atomic(page, KM_IRQ0);
>
> /* do the actual data transfer */
> ap->ops->data_xfer(qc->dev, buf + offset, qc->sect_size, do_write);
>
> kunmap_atomic(buf, KM_IRQ0);
> local_irq_restore(flags);
> } else {
> buf = page_address(page);
> ap->ops->data_xfer(qc->dev, buf + offset, qc->sect_size, do_write);
> }

This is a core non-DMA data transfer routine. Given the high cost of
disabling interrupts during the relatively-slow PIO data transfer, we
added extra logic to only disable interrupts for the highmem case.

The code in libata-scsi generally only deals with a single buffer, on
average less than 128 bytes in length. So the memcpy() isn't really
that expensive in the case being discussed -- unlike the
ata_pio_sector() case, where the cost is very, very high.

Aside: The "FIXME: use bounce buffer" comment above indicates the more
optimal PIO data xfer approach of

local_irq_save()
kmap_atomic()
memcpy into bounce buffer
kunmap_atomic()
local_irq_restore()

/* do slow PIO bitbanging data transfer */
ap->ops->data_xfer(...)

Regards,

Jeff

2008-02-26 23:51:26

by Nick Piggin

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

On Wednesday 27 February 2008 07:49, Ingo Molnar wrote:
> * Andrew Morton <[email protected]> wrote:
> > > This is only being done to satisfy kmap_atomic's requirements, not
> > > libata's.
> > >
> > > I could add a "kmap lock" but that just seems silly.
> >
> > It's a bit sad to disable interupts across a memset (how big is it?)
> > just for the small proportion of cases which are accessing a highmem
> > page.
> >
> > What you could do is to add an `unsigned long *flags' arg to
> > ata_scsi_rbuf_get() and ata_scsi_rbuf_put(), and then, in
> > ata_scsi_rbuf_get() do
> >
> > if (PageHighmem(page))
> > local_irq_disable(*flags);
>
> it would be much nicer to attach the irq disabling to the object, not to
> some arbitrary place in the code.
>
> i.e. to introduce a kmap_atomic_irqsave(...,flags) and
> kunmap_atomic_irqrestore() API variant. _That_ then could be mapped by
> -rt to a non-irq disabling thing.

Yeah that is the right way to fix it, but that name implies that
interrupts are disabled for non-highmem pages too, which we don't
want. Can you think of a better one?

kmap_atomic_irq_safe maybe?

2008-02-27 00:15:33

by Alan

[permalink] [raw]
Subject: Re: 2.6.24-git: kmap_atomic() WARN_ON()

> Aside: The "FIXME: use bounce buffer" comment above indicates the more
> optimal PIO data xfer approach of
>
> local_irq_save()
> kmap_atomic()
> memcpy into bounce buffer
> kunmap_atomic()
> local_irq_restore()
>
> /* do slow PIO bitbanging data transfer */
> ap->ops->data_xfer(...)

Definitely - older PATA controllers are unbuffered. A PIO_0 transfer is
running at ISA speed with IRQs off. Guaranteed to give Ingo's RT a blip.

Alan