2011-06-22 15:05:37

by Alexander Stein

[permalink] [raw]
Subject: [PATCH] pch_dma: Fix channel locking

Fix for the following INFO message

=================================
[ INFO: inconsistent lock state ]
2.6.39+ #89
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
rs232/822 [HC1[1]:SC0[0]:HE0:SE1] takes:
(&(&pd_chan->lock)->rlock){?.....}, at: [<c123b9a1>] pdc_desc_get+0x16/0xab
{HARDIRQ-ON-W} state was registered at:
[<c104fe28>] mark_irqflags+0xbd/0x11a
[<c1050386>] __lock_acquire+0x501/0x6bb
[<c1050945>] lock_acquire+0x63/0x7b
[<c131c51d>] _raw_spin_lock_bh+0x43/0x51
[<c123bee4>] pd_alloc_chan_resources+0x92/0x11e
[<c123ad62>] dma_chan_get+0x9b/0x107
[<c123b2d1>] __dma_request_channel+0x61/0xdc
[<c11ba24b>] pch_request_dma+0x61/0x19e
[<c11bb3b8>] pch_uart_startup+0x16a/0x1a2
[<c11b8446>] uart_startup+0x87/0x147
[<c11b9183>] uart_open+0x117/0x13e
[<c11a5c7d>] tty_open+0x23c/0x34c
[<c1097705>] chrdev_open+0x140/0x15f
[<c10930a6>] __dentry_open.clone.14+0x14a/0x22b
[<c1093dfb>] nameidata_to_filp+0x36/0x40
[<c109f28b>] do_last+0x513/0x635
[<c109f4af>] path_openat+0x9c/0x2aa
[<c109f6e4>] do_filp_open+0x27/0x69
[<c1093f02>] do_sys_open+0xfd/0x184
[<c1093fad>] sys_open+0x24/0x2a
[<c131d58c>] sysenter_do_call+0x12/0x32
irq event stamp: 2522
hardirqs last enabled at (2521): [<c131ca3b>] _raw_spin_unlock_irqrestore+0x36/0x52
hardirqs last disabled at (2522): [<c131db27>] common_interrupt+0x27/0x34
softirqs last enabled at (2354): [<c102fa11>] __do_softirq+0x10a/0x11a
softirqs last disabled at (2299): [<c10041a4>] do_softirq+0x57/0xa4

other info that might help us debug this:
2 locks held by rs232/822:
#0: (&tty->atomic_write_lock){+.+.+.}, at: [<c11a4b7a>] tty_write_lock+0x14/0x3c
#1: (&port_lock_key){-.....}, at: [<c11bad72>] pch_uart_interrupt+0x17/0x1e9

stack backtrace:
Pid: 822, comm: rs232 Not tainted 2.6.39+ #89
Call Trace:
[<c1319f90>] ? printk+0x19/0x1b
[<c104f893>] print_usage_bug+0x184/0x18f
[<c104e5b1>] ? print_irq_inversion_bug+0x10e/0x10e
[<c104f943>] mark_lock_irq+0xa5/0x1f6
[<c104fc9c>] mark_lock+0x208/0x2d7
[<c104fdc0>] mark_irqflags+0x55/0x11a
[<c1050386>] __lock_acquire+0x501/0x6bb
[<c10042ee>] ? dump_trace+0x92/0xb6
[<c1050945>] lock_acquire+0x63/0x7b
[<c123b9a1>] ? pdc_desc_get+0x16/0xab
[<c131c2d0>] _raw_spin_lock+0x3e/0x4c
[<c123b9a1>] ? pdc_desc_get+0x16/0xab
[<c123b9a1>] pdc_desc_get+0x16/0xab
[<c10504d8>] ? __lock_acquire+0x653/0x6bb
[<c123bb2c>] pd_prep_slave_sg+0x7c/0x1cb
[<c1006c3f>] ? nommu_map_sg+0x6e/0x81
[<c11bace6>] dma_handle_tx+0x2cf/0x344
[<c11bad72>] ? pch_uart_interrupt+0x17/0x1e9
[<c11baebb>] pch_uart_interrupt+0x160/0x1e9
[<c10642fb>] handle_irq_event_percpu+0x25/0x127
[<c1064429>] handle_irq_event+0x2c/0x43
[<c1065e0d>] ? handle_fasteoi_irq+0x84/0x84
[<c1065eb9>] handle_edge_irq+0xac/0xce
<IRQ> [<c1003ecb>] ? do_IRQ+0x38/0x9d
[<c131db2e>] ? common_interrupt+0x2e/0x34
[<c105007b>] ? __lock_acquire+0x1f6/0x6bb
[<c131ca3d>] ? _raw_spin_unlock_irqrestore+0x38/0x52
[<c11b798b>] ? uart_start+0x2d/0x32
[<c11b7998>] ? uart_flush_chars+0x8/0xa
[<c11a7962>] ? n_tty_write+0x12c/0x1c6
[<c1027a73>] ? try_to_wake_up+0x251/0x251
[<c11a4d0b>] ? tty_write+0x169/0x1dc
[<c11a7836>] ? n_tty_ioctl+0xb7/0xb7
[<c1094841>] ? vfs_write+0x91/0x10d
[<c11a4ba2>] ? tty_write_lock+0x3c/0x3c
[<c1094a69>] ? sys_write+0x3e/0x63
[<c131d58c>] ? sysenter_do_call+0x12/0x32

Signed-off-by: Alexander Stein <[email protected]>
---
I'm not exactly sure if the dmaengine functions are allowed to be called from
IRQ context. But I found amba-pl011 which also calls device_prep_slave_sg from
IRQ context.

drivers/dma/pch_dma.c | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/dma/pch_dma.c b/drivers/dma/pch_dma.c
index ff5b38f..bfb0f19 100644
--- a/drivers/dma/pch_dma.c
+++ b/drivers/dma/pch_dma.c
@@ -495,11 +495,11 @@ static int pd_alloc_chan_resources(struct dma_chan *chan)
list_add_tail(&desc->desc_node, &tmp_list);
}

- spin_lock_bh(&pd_chan->lock);
+ spin_lock_irq(&pd_chan->lock);
list_splice(&tmp_list, &pd_chan->free_list);
pd_chan->descs_allocated = i;
pd_chan->completed_cookie = chan->cookie = 1;
- spin_unlock_bh(&pd_chan->lock);
+ spin_unlock_irq(&pd_chan->lock);

pdc_enable_irq(chan, 1);

@@ -517,10 +517,10 @@ static void pd_free_chan_resources(struct dma_chan *chan)
BUG_ON(!list_empty(&pd_chan->active_list));
BUG_ON(!list_empty(&pd_chan->queue));

- spin_lock_bh(&pd_chan->lock);
+ spin_lock_irq(&pd_chan->lock);
list_splice_init(&pd_chan->free_list, &tmp_list);
pd_chan->descs_allocated = 0;
- spin_unlock_bh(&pd_chan->lock);
+ spin_unlock_irq(&pd_chan->lock);

list_for_each_entry_safe(desc, _d, &tmp_list, desc_node)
pci_pool_free(pd->pool, desc, desc->txd.phys);
@@ -536,10 +536,10 @@ static enum dma_status pd_tx_status(struct dma_chan *chan, dma_cookie_t cookie,
dma_cookie_t last_completed;
int ret;

- spin_lock_bh(&pd_chan->lock);
+ spin_lock_irq(&pd_chan->lock);
last_completed = pd_chan->completed_cookie;
last_used = chan->cookie;
- spin_unlock_bh(&pd_chan->lock);
+ spin_unlock_irq(&pd_chan->lock);

ret = dma_async_is_complete(cookie, last_completed, last_used);

@@ -654,7 +654,7 @@ static int pd_device_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
if (cmd != DMA_TERMINATE_ALL)
return -ENXIO;

- spin_lock_bh(&pd_chan->lock);
+ spin_lock_irq(&pd_chan->lock);

pdc_set_mode(&pd_chan->chan, DMA_CTL0_DISABLE);

@@ -664,7 +664,7 @@ static int pd_device_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
list_for_each_entry_safe(desc, _d, &list, desc_node)
pdc_chain_complete(pd_chan, desc);

- spin_unlock_bh(&pd_chan->lock);
+ spin_unlock_irq(&pd_chan->lock);

return 0;
}
--
1.7.3.4


2011-06-23 00:11:58

by Tomoya MORINAGA

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

(2011/06/23 0:05), Alexander Stein wrote:
> Fix for the following INFO message
>
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.39+ #89
> ---------------------------------
> inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> rs232/822 [HC1[1]:SC0[0]:HE0:SE1] takes:
> (&(&pd_chan->lock)->rlock){?.....}, at: [<c123b9a1>] pdc_desc_get+0x16/0xab
> {HARDIRQ-ON-W} state was registered at:
> [<c104fe28>] mark_irqflags+0xbd/0x11a
> [<c1050386>] __lock_acquire+0x501/0x6bb
> [<c1050945>] lock_acquire+0x63/0x7b
> [<c131c51d>] _raw_spin_lock_bh+0x43/0x51
> [<c123bee4>] pd_alloc_chan_resources+0x92/0x11e
> [<c123ad62>] dma_chan_get+0x9b/0x107
> [<c123b2d1>] __dma_request_channel+0x61/0xdc
> [<c11ba24b>] pch_request_dma+0x61/0x19e
> [<c11bb3b8>] pch_uart_startup+0x16a/0x1a2
> [<c11b8446>] uart_startup+0x87/0x147
> [<c11b9183>] uart_open+0x117/0x13e
> [<c11a5c7d>] tty_open+0x23c/0x34c
> [<c1097705>] chrdev_open+0x140/0x15f
> [<c10930a6>] __dentry_open.clone.14+0x14a/0x22b
> [<c1093dfb>] nameidata_to_filp+0x36/0x40
> [<c109f28b>] do_last+0x513/0x635
> [<c109f4af>] path_openat+0x9c/0x2aa
> [<c109f6e4>] do_filp_open+0x27/0x69
> [<c1093f02>] do_sys_open+0xfd/0x184
> [<c1093fad>] sys_open+0x24/0x2a
> [<c131d58c>] sysenter_do_call+0x12/0x32
> irq event stamp: 2522
> hardirqs last enabled at (2521): [<c131ca3b>] _raw_spin_unlock_irqrestore+0x36/0x52
> hardirqs last disabled at (2522): [<c131db27>] common_interrupt+0x27/0x34
> softirqs last enabled at (2354): [<c102fa11>] __do_softirq+0x10a/0x11a
> softirqs last disabled at (2299): [<c10041a4>] do_softirq+0x57/0xa4
>
> other info that might help us debug this:
> 2 locks held by rs232/822:
> #0: (&tty->atomic_write_lock){+.+.+.}, at: [<c11a4b7a>] tty_write_lock+0x14/0x3c
> #1: (&port_lock_key){-.....}, at: [<c11bad72>] pch_uart_interrupt+0x17/0x1e9
>
> stack backtrace:
> Pid: 822, comm: rs232 Not tainted 2.6.39+ #89
> Call Trace:
> [<c1319f90>] ? printk+0x19/0x1b
> [<c104f893>] print_usage_bug+0x184/0x18f
> [<c104e5b1>] ? print_irq_inversion_bug+0x10e/0x10e
> [<c104f943>] mark_lock_irq+0xa5/0x1f6
> [<c104fc9c>] mark_lock+0x208/0x2d7
> [<c104fdc0>] mark_irqflags+0x55/0x11a
> [<c1050386>] __lock_acquire+0x501/0x6bb
> [<c10042ee>] ? dump_trace+0x92/0xb6
> [<c1050945>] lock_acquire+0x63/0x7b
> [<c123b9a1>] ? pdc_desc_get+0x16/0xab
> [<c131c2d0>] _raw_spin_lock+0x3e/0x4c
> [<c123b9a1>] ? pdc_desc_get+0x16/0xab
> [<c123b9a1>] pdc_desc_get+0x16/0xab
> [<c10504d8>] ? __lock_acquire+0x653/0x6bb
> [<c123bb2c>] pd_prep_slave_sg+0x7c/0x1cb
> [<c1006c3f>] ? nommu_map_sg+0x6e/0x81
> [<c11bace6>] dma_handle_tx+0x2cf/0x344
> [<c11bad72>] ? pch_uart_interrupt+0x17/0x1e9
> [<c11baebb>] pch_uart_interrupt+0x160/0x1e9
> [<c10642fb>] handle_irq_event_percpu+0x25/0x127
> [<c1064429>] handle_irq_event+0x2c/0x43
> [<c1065e0d>] ? handle_fasteoi_irq+0x84/0x84
> [<c1065eb9>] handle_edge_irq+0xac/0xce
> <IRQ> [<c1003ecb>] ? do_IRQ+0x38/0x9d
> [<c131db2e>] ? common_interrupt+0x2e/0x34
> [<c105007b>] ? __lock_acquire+0x1f6/0x6bb
> [<c131ca3d>] ? _raw_spin_unlock_irqrestore+0x38/0x52
> [<c11b798b>] ? uart_start+0x2d/0x32
> [<c11b7998>] ? uart_flush_chars+0x8/0xa
> [<c11a7962>] ? n_tty_write+0x12c/0x1c6
> [<c1027a73>] ? try_to_wake_up+0x251/0x251
> [<c11a4d0b>] ? tty_write+0x169/0x1dc
> [<c11a7836>] ? n_tty_ioctl+0xb7/0xb7
> [<c1094841>] ? vfs_write+0x91/0x10d
> [<c11a4ba2>] ? tty_write_lock+0x3c/0x3c
> [<c1094a69>] ? sys_write+0x3e/0x63
> [<c131d58c>] ? sysenter_do_call+0x12/0x32
>

To: Yong Y
You wrote these spin_lock_bh(s), right ?
Could you tell us your opinion ?

To: Alexander
Thank you for your report.
How can I see the above kernel error log ?
Let me know your test condition.


--
tomoya
OKI SEMICONDUCTOR CO., LTD.

2011-06-23 07:42:54

by Alexander Stein

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

On Thursday 23 June 2011 02:11:51 Tomoya MORINAGA wrote:
> (2011/06/23 0:05), Alexander Stein wrote:
> > Fix for the following INFO message
> >
> > =================================
> > [ INFO: inconsistent lock state ]
> > 2.6.39+ #89
> > ---------------------------------
> > inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> >
> > rs232/822 [HC1[1]:SC0[0]:HE0:SE1] takes:
> > (&(&pd_chan->lock)->rlock){?.....}, at: [<c123b9a1>]
> > pdc_desc_get+0x16/0xab
> >
> > {HARDIRQ-ON-W} state was registered at:
> > [<c104fe28>] mark_irqflags+0xbd/0x11a
> > [<c1050386>] __lock_acquire+0x501/0x6bb
> > [<c1050945>] lock_acquire+0x63/0x7b
> > [<c131c51d>] _raw_spin_lock_bh+0x43/0x51
> > [<c123bee4>] pd_alloc_chan_resources+0x92/0x11e
> > [<c123ad62>] dma_chan_get+0x9b/0x107
> > [<c123b2d1>] __dma_request_channel+0x61/0xdc
> > [<c11ba24b>] pch_request_dma+0x61/0x19e
> > [<c11bb3b8>] pch_uart_startup+0x16a/0x1a2
> > [<c11b8446>] uart_startup+0x87/0x147
> > [<c11b9183>] uart_open+0x117/0x13e
> > [<c11a5c7d>] tty_open+0x23c/0x34c
> > [<c1097705>] chrdev_open+0x140/0x15f
> > [<c10930a6>] __dentry_open.clone.14+0x14a/0x22b
> > [<c1093dfb>] nameidata_to_filp+0x36/0x40
> > [<c109f28b>] do_last+0x513/0x635
> > [<c109f4af>] path_openat+0x9c/0x2aa
> > [<c109f6e4>] do_filp_open+0x27/0x69
> > [<c1093f02>] do_sys_open+0xfd/0x184
> > [<c1093fad>] sys_open+0x24/0x2a
> > [<c131d58c>] sysenter_do_call+0x12/0x32
> >
> > irq event stamp: 2522
> > hardirqs last enabled at (2521): [<c131ca3b>]
> > _raw_spin_unlock_irqrestore+0x36/0x52 hardirqs last disabled at (2522):
> > [<c131db27>] common_interrupt+0x27/0x34 softirqs last enabled at
> > (2354): [<c102fa11>] __do_softirq+0x10a/0x11a softirqs last disabled at
> > (2299): [<c10041a4>] do_softirq+0x57/0xa4
> >
> > other info that might help us debug this:
> >
> > 2 locks held by rs232/822:
> > #0: (&tty->atomic_write_lock){+.+.+.}, at: [<c11a4b7a>]
> > tty_write_lock+0x14/0x3c #1: (&port_lock_key){-.....}, at:
> > [<c11bad72>] pch_uart_interrupt+0x17/0x1e9
> >
> > stack backtrace:
> > Pid: 822, comm: rs232 Not tainted 2.6.39+ #89
> >
> > Call Trace:
> > [<c1319f90>] ? printk+0x19/0x1b
> > [<c104f893>] print_usage_bug+0x184/0x18f
> > [<c104e5b1>] ? print_irq_inversion_bug+0x10e/0x10e
> > [<c104f943>] mark_lock_irq+0xa5/0x1f6
> > [<c104fc9c>] mark_lock+0x208/0x2d7
> > [<c104fdc0>] mark_irqflags+0x55/0x11a
> > [<c1050386>] __lock_acquire+0x501/0x6bb
> > [<c10042ee>] ? dump_trace+0x92/0xb6
> > [<c1050945>] lock_acquire+0x63/0x7b
> > [<c123b9a1>] ? pdc_desc_get+0x16/0xab
> > [<c131c2d0>] _raw_spin_lock+0x3e/0x4c
> > [<c123b9a1>] ? pdc_desc_get+0x16/0xab
> > [<c123b9a1>] pdc_desc_get+0x16/0xab
> > [<c10504d8>] ? __lock_acquire+0x653/0x6bb
> > [<c123bb2c>] pd_prep_slave_sg+0x7c/0x1cb
> > [<c1006c3f>] ? nommu_map_sg+0x6e/0x81
> > [<c11bace6>] dma_handle_tx+0x2cf/0x344
> > [<c11bad72>] ? pch_uart_interrupt+0x17/0x1e9
> > [<c11baebb>] pch_uart_interrupt+0x160/0x1e9
> > [<c10642fb>] handle_irq_event_percpu+0x25/0x127
> > [<c1064429>] handle_irq_event+0x2c/0x43
> > [<c1065e0d>] ? handle_fasteoi_irq+0x84/0x84
> > [<c1065eb9>] handle_edge_irq+0xac/0xce
> > <IRQ> [<c1003ecb>] ? do_IRQ+0x38/0x9d
> > [<c131db2e>] ? common_interrupt+0x2e/0x34
> > [<c105007b>] ? __lock_acquire+0x1f6/0x6bb
> > [<c131ca3d>] ? _raw_spin_unlock_irqrestore+0x38/0x52
> > [<c11b798b>] ? uart_start+0x2d/0x32
> > [<c11b7998>] ? uart_flush_chars+0x8/0xa
> > [<c11a7962>] ? n_tty_write+0x12c/0x1c6
> > [<c1027a73>] ? try_to_wake_up+0x251/0x251
> > [<c11a4d0b>] ? tty_write+0x169/0x1dc
> > [<c11a7836>] ? n_tty_ioctl+0xb7/0xb7
> > [<c1094841>] ? vfs_write+0x91/0x10d
> > [<c11a4ba2>] ? tty_write_lock+0x3c/0x3c
> > [<c1094a69>] ? sys_write+0x3e/0x63
> > [<c131d58c>] ? sysenter_do_call+0x12/0x32

> To: Alexander
> Thank you for your report.
> How can I see the above kernel error log ?
> Let me know your test condition.

Enable DMA on ttyPCH0 (we use ttyPCH1 for serial login)
and write to the UART.
DMA is _not_ enabled by default, so I had to track back the code to find the
trigger for DMA enable.
After that, I wrote more than FIFO size, which is 256 for UART0), actually 270
chars on ttyPCH0.
See this minimalistic snippet:

#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/serial.h>
int main(int argc, char *argv[])
{
int fd;
char buf[270];
struct serial_struct serial;

fd = open ("/dev/ttyPCH0", O_RDWR);

/* Read current settings */
ioctl(fd, TIOCGSERIAL, &serial);
/* Write themagain to trigger DMA mode enable */
ioctl(fd, TIOCSSERIAL, &serial);

close(fd);
fd = open ("/dev/ttyPCH0", O_RDWR);
write (fd, buf, 270);

close (fd);
return 0;
}

The close and open is actually needed (I had a separate application for that
yesterday).

If all is done without closing again, I get the following bug:

BUG: unable to handle kernel NULL pointer dereference at (null)
dma_handle_tx+0x2bb/0x344
*pde = 00000000
PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:18.0/0000:03:00.0/class
Modules linked in:

Pid: 825, comm: raise_dma_fault Not tainted 2.6.39+ #93
EFLAGS: 00010046 CPU: 0
EIP is at dma_handle_tx+0x2bb/0x344
EAX: 00000000 EBX: f6b53e00 ECX: 00000001 EDX: f7350180
ESI: f7014800 EDI: f7350194 EBP: f700df6c ESP: f700df2c
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process raise_dma_fault (pid: 825, ti=f700c000 task=f6145780 task.ti=f60a4000)
Stack:
00000001 00000000 c11bad79 00000000 f7350180 00000001 00000001 00000000
000000b2 000000b2 00000100 00000001 00000000 f6b53e00 00000000 00000000
f700dfa0 c11baec2 00000000 00000001 00000018 f6145b0c 00000002 00000046
Call Trace:
? pch_uart_interrupt+0x17/0x1e9
pch_uart_interrupt+0x160/0x1e9
handle_irq_event_percpu+0x25/0x127
handle_irq_event+0x2c/0x43
? handle_fasteoi_irq+0x84/0x84
handle_edge_irq+0xac/0xce
<IRQ>
? do_IRQ+0x38/0x9d
? common_interrupt+0x2e/0x34
? lock_acquire+0x72/0x7b
? process_output+0x19/0x41
? mutex_lock_nested+0x45/0x291
? process_output+0x19/0x41
? trace_hardirqs_on+0xb/0xd
? process_output+0x19/0x41
? n_tty_write+0x10c/0x1c6
? try_to_wake_up+0x251/0x251
? tty_write+0x169/0x1dc
? n_tty_ioctl+0xb7/0xb7
? vfs_write+0x91/0x10d
? tty_write_lock+0x3c/0x3c
? sys_write+0x3e/0x63
? sysenter_do_call+0x12/0x32
Code: 89 57 0c 3b 45 dc 8b 55 e4 0f 45 55 e0 89 57 10 40 83 c7 14 03 4d e8 3b
45 ec 7c c3 8b 83 3c 01 00 00 8b 93 44 01 00 00 8b 4d ec <8b> 30 c7 44 24 04
03 00 00 00 c7 04 24 01 00 00 00 ff 56 54 85
dma_handle_tx+0x2bb/0x344 SS:ESP 0068:f700df2c
CR2: 0000000000000000
---[ end trace 0aeb50fd75b47c6f ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 825, comm: raise_dma_fault Tainted: G D 2.6.39+ #93
Call Trace:
? printk+0x19/0x1b
panic+0x62/0x151
oops_end+0x76/0x82
no_context+0x11c/0x125
__bad_area_nosemaphore+0x114/0x11d
? dump_trace+0x92/0xb6
? vmalloc_sync_all+0xeb/0xeb
bad_area_nosemaphore+0x13/0x15
do_page_fault+0x151/0x337
? __lock_acquire+0x653/0x6bb
? vmalloc_sync_all+0xeb/0xeb
error_code+0x5f/0x64
? vmalloc_sync_all+0xeb/0xeb
? dma_handle_tx+0x2bb/0x344
? pch_uart_interrupt+0x17/0x1e9
pch_uart_interrupt+0x160/0x1e9
handle_irq_event_percpu+0x25/0x127
handle_irq_event+0x2c/0x43
? handle_fasteoi_irq+0x84/0x84
handle_edge_irq+0xac/0xce
? do_IRQ+0x38/0x9d
? common_interrupt+0x2e/0x34
? lock_acquire+0x72/0x7b
? process_output+0x19/0x41
? mutex_lock_nested+0x45/0x291
? process_output+0x19/0x41
? trace_hardirqs_on+0xb/0xd
? process_output+0x19/0x41
? n_tty_write+0x10c/0x1c6
? try_to_wake_up+0x251/0x251
? tty_write+0x169/0x1dc
? n_tty_ioctl+0xb7/0xb7
? vfs_write+0x91/0x10d
? tty_write_lock+0x3c/0x3c
? sys_write+0x3e/0x63
? sysenter_do_call+0x12/0x32

Regards,
Alexander

2011-06-24 05:28:17

by Tomoya MORINAGA

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

Hi Alexander Stein

I couldn't see the issue at all.

(2011/06/23 16:42), Alexander Stein wrote:
> The close and open is actually needed (I had a separate application for that
> yesterday).
Does the above mean that we must delete "close" processing from your
test program, right ?
(Modifying like this, I couldn't see the issue)
>
> If all is done without closing again, I get the following bug:
>
Does the above mean that executing your test program many times whose 2
"close" are deleted , did you see the issue ?

My steps like below
Boot Fedora14
Install DMA driver
Install pch_uart driver
Enable DMA (setserial /dev/ttyPCH0 ^low_latency)
Modifying your test program
Delete 2 close() function.
Compile your test program.
Execute the tp.
Execute the tp.
Execute the tp.
I couldn't see any error message.

--
tomoya
OKI SEMICONDUCTOR CO., LTD.

2011-06-27 14:23:01

by Alexander Stein

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

Hello,

On Friday 24 June 2011 07:28:12 Tomoya MORINAGA wrote:
> Hi Alexander Stein
>
> I couldn't see the issue at all.
>
> (2011/06/23 16:42), Alexander Stein wrote:
> > The close and open is actually needed (I had a separate application for
> > that yesterday).
>
> Does the above mean that we must delete "close" processing from your
> test program, right ?
> (Modifying like this, I couldn't see the issue)

As you've written below, you used setserial to enable DMA mode. I used my own
application for that using 2 ioctls.

> > If all is done without closing again, I get the following bug:
> Does the above mean that executing your test program many times whose 2
> "close" are deleted , did you see the issue ?

The issue is only printed once. steps in short:
* Enable DMA
* Send 270 chars

> My steps like below
> Boot Fedora14
> Install DMA driver
> Install pch_uart driver
> Enable DMA (setserial /dev/ttyPCH0 ^low_latency)
> Modifying your test program
> Delete 2 close() function.
> Compile your test program.
> Execute the tp.
> Execute the tp.
> Execute the tp.
> I couldn't see any error message.

Ok. I have a 2.6.39 kernel with the following patches:
pch_phub: Don't panic if dmi_get_system_info returns NULL
drivers/tty/serial/pch_uart.c: don't oops if dmi_get_system_info returns NULL
pch_uart: Add console support

I need these to get a booting board with console support on pch_uart.
After boot (dma and pch_uart compiled into kernel), I do
* setserial /dev/ttyPCH0 ^low_latency
* ./raise_dma_fault
<- Here I get the message

For the reference the used code for raise_dma_fault:
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/serial.h>
int main(int argc, char *argv[])
{
int fd;
char buf[270];

fd = open ("/dev/ttyPCH0", O_RDWR);
write (fd, buf, 270);

close (fd);
return 0;
}

Maybe you need to enable CONFIG_TRACE_IRQFLAGS and CONFIG_PROVE_LOCKING
additionally to see the messages.

Regards,
Alexander

2011-06-28 10:35:27

by Tomoya MORINAGA

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

(2011/06/27 23:22), Alexander Stein wrote:
> Ok. I have a 2.6.39 kernel with the following patches:
> pch_phub: Don't panic if dmi_get_system_info returns NULL
> drivers/tty/serial/pch_uart.c: don't oops if dmi_get_system_info returns NULL
> pch_uart: Add console support
>
> I need these to get a booting board with console support on pch_uart.
> After boot (dma and pch_uart compiled into kernel), I do
> * setserial /dev/ttyPCH0 ^low_latency
> * ./raise_dma_fault
> <- Here I get the message

Thank you for your information.
I could see the kernel message.
However using your patch, it seems the message remains to be seen.

[ 142.401597] pch_uart 0000:02:0a.1: PCH UART : Use DMA Mode
[ 148.011410]
[ 148.011415] =================================
[ 148.011428] [ INFO: inconsistent lock state ]
[ 148.011439] 2.6.39.2+ #5
[ 148.011446] ---------------------------------
[ 148.011455] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[ 148.011467] gnome-terminal/1759 [HC1[1]:SC0[0]:HE0:SE1] takes:
[ 148.011477] (&(&pd_chan->lock)->rlock){?.....}, at: [<c0710cc5>]
pd_prep_slave_sg+0x92/0x26d
[ 148.011509] {HARDIRQ-ON-W} state was registered at:
[ 148.011516] [<c04618a1>] __lock_acquire+0x2d9/0xbcc
[ 148.011533] [<c0462594>] lock_acquire+0x94/0xb7
[ 148.011546] [<c07cec85>] _raw_spin_lock_bh+0x25/0x34
[ 148.011561] [<c07110d0>] pd_alloc_chan_resources+0xb2/0x165
[ 148.011577] [<c070f447>] dma_chan_get+0x63/0xbb
[ 148.011591] [<c070f7a0>] __dma_request_channel+0x168/0x246
[ 148.011606] [<c0655ff4>] pch_uart_startup+0x16b/0x2c4
[ 148.011622] [<c064f2df>] uart_startup+0x8b/0x158
[ 148.011636] [<c0650149>] uart_open+0x13f/0x168
[ 148.011649] [<c063b561>] tty_open+0x2c2/0x3cc
[ 148.011663] [<c04e8113>] chrdev_open+0xf4/0x118
[ 148.011678] [<c04e3a68>] __dentry_open+0x167/0x255
[ 148.011694] [<c04e477b>] nameidata_to_filp+0x4a/0x58
[ 148.011708] [<c04ee74b>] do_last+0x45d/0x573
[ 148.011722] [<c04ef75c>] path_openat+0x9b/0x2b5
[ 148.011735] [<c04ef99c>] do_filp_open+0x26/0x62
[ 148.011749] [<c04e47e2>] do_sys_open+0x59/0xd2
[ 148.011762] [<c04e487e>] sys_open+0x23/0x2b
[ 148.011775] [<c07d4bdf>] sysenter_do_call+0x12/0x38
[ 148.011791] irq event stamp: 485514
[ 148.011799] hardirqs last enabled at (485513): [<c07d4c59>]
sysexit_audit+0x11/0x50
[ 148.011815] hardirqs last disabled at (485514): [<c07d51ae>]
common_interrupt+0x2e/0x3c
[ 148.011832] softirqs last enabled at (485398): [<c043f585>]
__do_softirq+0x153/0x162
[ 148.011849] softirqs last disabled at (485375): [<c040430a>]
do_softirq+0x65/0xba
[ 148.011866]
[ 148.011869] other info that might help us debug this:
[ 148.011879] 1 lock held by gnome-terminal/1759:
[ 148.011886] #0: (&port_lock_key){-.-...}, at: [<c06555ab>]
pch_uart_interrupt+0x19/0x804
[ 148.011912]
[ 148.011915] stack backtrace:
[ 148.011926] Pid: 1759, comm: gnome-terminal Not tainted 2.6.39.2+ #5
[ 148.011934] Call Trace:
[ 148.011949] [<c07ccca4>] ? printk+0x25/0x29
[ 148.011962] [<c0461407>] valid_state+0x133/0x141
[ 148.011976] [<c04614e4>] mark_lock+0xcf/0x1b3
[ 148.011990] [<c0460e57>] ? print_irq_inversion_bug+0xe9/0xe9
[ 148.012004] [<c046182d>] __lock_acquire+0x265/0xbcc
[ 148.012019] [<c0461af7>] ? __lock_acquire+0x52f/0xbcc
[ 148.012034] [<c0710cc5>] ? pd_prep_slave_sg+0x92/0x26d
[ 148.012047] [<c0462594>] lock_acquire+0x94/0xb7
[ 148.012061] [<c0710cc5>] ? pd_prep_slave_sg+0x92/0x26d
[ 148.012076] [<c07ceacb>] _raw_spin_lock+0x20/0x2f
[ 148.012090] [<c0710cc5>] ? pd_prep_slave_sg+0x92/0x26d
[ 148.012104] [<c0710cc5>] pd_prep_slave_sg+0x92/0x26d
[ 148.012120] [<c0655af2>] pch_uart_interrupt+0x560/0x804
[ 148.012138] [<c04894fa>] handle_irq_event_percpu+0x53/0x174
[ 148.012153] [<c048964c>] handle_irq_event+0x31/0x49
[ 148.012167] [<c048b160>] ? handle_simple_irq+0x59/0x59
[ 148.012180] [<c048b1cb>] handle_fasteoi_irq+0x6b/0x89
[ 148.012189] <IRQ> [<c0404028>] ? do_IRQ+0x3c/0x87
[ 148.012213] [<c07d51b5>] ? common_interrupt+0x35/0x3c
[ 148.012219] [<c043007b>] ? enqueue_task_fair+0x319/0x4cd
[ 148.012219] [<c04f4a12>] ? dput+0xc5/0xca
[ 148.012219] [<c04ecacc>] ? path_put+0x13/0x1d
[ 148.012219] [<c047e0e2>] ? audit_free_names+0x45/0x5e
[ 148.012219] [<c047e985>] ? audit_syscall_exit+0x96/0x114
[ 148.012219] [<c07d4c71>] ? sysexit_audit+0x29/0x50

Best Regards,
tomoya

2011-06-28 11:15:35

by Alexander Stein

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

On Tuesday 28 June 2011 12:34:34 Tomoya MORINAGA wrote:
> (2011/06/27 23:22), Alexander Stein wrote:
> > Ok. I have a 2.6.39 kernel with the following patches:
> > pch_phub: Don't panic if dmi_get_system_info returns NULL
> > drivers/tty/serial/pch_uart.c: don't oops if dmi_get_system_info returns
> > NULL pch_uart: Add console support
> >
> > I need these to get a booting board with console support on pch_uart.
> > After boot (dma and pch_uart compiled into kernel), I do
> > * setserial /dev/ttyPCH0 ^low_latency
> > * ./raise_dma_fault
> > <- Here I get the message
>
> Thank you for your information.
> I could see the kernel message.
> However using your patch, it seems the message remains to be seen.
>
> [ 142.401597] pch_uart 0000:02:0a.1: PCH UART : Use DMA Mode
> [ 148.011410]
> [ 148.011415] =================================
> [ 148.011428] [ INFO: inconsistent lock state ]
> [ 148.011439] 2.6.39.2+ #5
> [ 148.011446] ---------------------------------
> [ 148.011455] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> [ 148.011467] gnome-terminal/1759 [HC1[1]:SC0[0]:HE0:SE1] takes:
> [ 148.011477] (&(&pd_chan->lock)->rlock){?.....}, at: [<c0710cc5>]
> pd_prep_slave_sg+0x92/0x26d
> [ 148.011509] {HARDIRQ-ON-W} state was registered at:
> [ 148.011516] [<c04618a1>] __lock_acquire+0x2d9/0xbcc
> [ 148.011533] [<c0462594>] lock_acquire+0x94/0xb7
> [ 148.011546] [<c07cec85>] _raw_spin_lock_bh+0x25/0x34
> [ 148.011561] [<c07110d0>] pd_alloc_chan_resources+0xb2/0x165
> [ 148.011577] [<c070f447>] dma_chan_get+0x63/0xbb
> [ 148.011591] [<c070f7a0>] __dma_request_channel+0x168/0x246
> [ 148.011606] [<c0655ff4>] pch_uart_startup+0x16b/0x2c4
> [ 148.011622] [<c064f2df>] uart_startup+0x8b/0x158
> [ 148.011636] [<c0650149>] uart_open+0x13f/0x168
> [ 148.011649] [<c063b561>] tty_open+0x2c2/0x3cc
> [ 148.011663] [<c04e8113>] chrdev_open+0xf4/0x118
> [ 148.011678] [<c04e3a68>] __dentry_open+0x167/0x255
> [ 148.011694] [<c04e477b>] nameidata_to_filp+0x4a/0x58
> [ 148.011708] [<c04ee74b>] do_last+0x45d/0x573
> [ 148.011722] [<c04ef75c>] path_openat+0x9b/0x2b5
> [ 148.011735] [<c04ef99c>] do_filp_open+0x26/0x62
> [ 148.011749] [<c04e47e2>] do_sys_open+0x59/0xd2
> [ 148.011762] [<c04e487e>] sys_open+0x23/0x2b
> [ 148.011775] [<c07d4bdf>] sysenter_do_call+0x12/0x38
> [ 148.011791] irq event stamp: 485514
> [ 148.011799] hardirqs last enabled at (485513): [<c07d4c59>]
> sysexit_audit+0x11/0x50
> [ 148.011815] hardirqs last disabled at (485514): [<c07d51ae>]
> common_interrupt+0x2e/0x3c
> [ 148.011832] softirqs last enabled at (485398): [<c043f585>]
> __do_softirq+0x153/0x162
> [ 148.011849] softirqs last disabled at (485375): [<c040430a>]
> do_softirq+0x65/0xba
> [ 148.011866]
> [ 148.011869] other info that might help us debug this:
> [ 148.011879] 1 lock held by gnome-terminal/1759:
> [ 148.011886] #0: (&port_lock_key){-.-...}, at: [<c06555ab>]
> pch_uart_interrupt+0x19/0x804
> [ 148.011912]
> [ 148.011915] stack backtrace:
> [ 148.011926] Pid: 1759, comm: gnome-terminal Not tainted 2.6.39.2+ #5
> [ 148.011934] Call Trace:
> [ 148.011949] [<c07ccca4>] ? printk+0x25/0x29
> [ 148.011962] [<c0461407>] valid_state+0x133/0x141
> [ 148.011976] [<c04614e4>] mark_lock+0xcf/0x1b3
> [ 148.011990] [<c0460e57>] ? print_irq_inversion_bug+0xe9/0xe9
> [ 148.012004] [<c046182d>] __lock_acquire+0x265/0xbcc
> [ 148.012019] [<c0461af7>] ? __lock_acquire+0x52f/0xbcc
> [ 148.012034] [<c0710cc5>] ? pd_prep_slave_sg+0x92/0x26d
> [ 148.012047] [<c0462594>] lock_acquire+0x94/0xb7
> [ 148.012061] [<c0710cc5>] ? pd_prep_slave_sg+0x92/0x26d
> [ 148.012076] [<c07ceacb>] _raw_spin_lock+0x20/0x2f
> [ 148.012090] [<c0710cc5>] ? pd_prep_slave_sg+0x92/0x26d
> [ 148.012104] [<c0710cc5>] pd_prep_slave_sg+0x92/0x26d
> [ 148.012120] [<c0655af2>] pch_uart_interrupt+0x560/0x804
> [ 148.012138] [<c04894fa>] handle_irq_event_percpu+0x53/0x174
> [ 148.012153] [<c048964c>] handle_irq_event+0x31/0x49
> [ 148.012167] [<c048b160>] ? handle_simple_irq+0x59/0x59
> [ 148.012180] [<c048b1cb>] handle_fasteoi_irq+0x6b/0x89
> [ 148.012189] <IRQ> [<c0404028>] ? do_IRQ+0x3c/0x87
> [ 148.012213] [<c07d51b5>] ? common_interrupt+0x35/0x3c
> [ 148.012219] [<c043007b>] ? enqueue_task_fair+0x319/0x4cd
> [ 148.012219] [<c04f4a12>] ? dput+0xc5/0xca
> [ 148.012219] [<c04ecacc>] ? path_put+0x13/0x1d
> [ 148.012219] [<c047e0e2>] ? audit_free_names+0x45/0x5e
> [ 148.012219] [<c047e985>] ? audit_syscall_exit+0x96/0x114
> [ 148.012219] [<c07d4c71>] ? sysexit_audit+0x29/0x50

Odd. What does addr2line report for 0xc0710cc5 and for 0xc07110d0?

Regards,
Alexander

2011-06-29 04:04:44

by Tomoya MORINAGA

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

Hi Alexander,

(2011/06/28 20:14), Alexander Stein wrote:
> Odd. What does addr2line report for 0xc0710cc5 and for 0xc07110d0?
>

addr2line -e a.out 0xc0710cc5 0xc07110d0
??:0
??:0

Is the above command true ?
If not, could you tell me how to use addr2line ?

--
tomoya
OKI SEMICONDUCTOR CO., LTD.

2011-06-29 04:25:53

by Tomoya MORINAGA

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

(2011/06/29 13:04), Tomoya MORINAGA wrote:
> Hi Alexander,
>
> (2011/06/28 20:14), Alexander Stein wrote:
>> Odd. What does addr2line report for 0xc0710cc5 and for 0xc07110d0?
>>
>
> addr2line -e a.out 0xc0710cc5 0xc07110d0
> ??:0
> ??:0
>
> Is the above command true ?
> If not, could you tell me how to use addr2line ?
>

I can get the information.
[root@localhost morinaga]# addr2line -e vmlinux 0xc0710cc5 0xc07110d0
/home/morinaga/KORG/linux-2.6.39.y/drivers/dma/pch_dma.c:409
/home/morinaga/KORG/linux-2.6.39.y/include/linux/list.h:188

Best Regards,
--
tomoya
OKI SEMICONDUCTOR CO., LTD.

2011-06-29 06:28:18

by Alexander Stein

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

On Tuesday 28 June 2011 12:34:34 Tomoya MORINAGA wrote:
> (2011/06/27 23:22), Alexander Stein wrote:
> > Ok. I have a 2.6.39 kernel with the following patches:
> > pch_phub: Don't panic if dmi_get_system_info returns NULL
> > drivers/tty/serial/pch_uart.c: don't oops if dmi_get_system_info returns
> > NULL pch_uart: Add console support
> >
> > I need these to get a booting board with console support on pch_uart.
> > After boot (dma and pch_uart compiled into kernel), I do
> > * setserial /dev/ttyPCH0 ^low_latency
> > * ./raise_dma_fault
> > <- Here I get the message
>
> Thank you for your information.
> I could see the kernel message.
> However using your patch, it seems the message remains to be seen.
>
> [ 142.401597] pch_uart 0000:02:0a.1: PCH UART : Use DMA Mode
> [ 148.011410]
> [ 148.011415] =================================
> [ 148.011428] [ INFO: inconsistent lock state ]
> [ 148.011439] 2.6.39.2+ #5
> [ 148.011446] ---------------------------------
> [ 148.011455] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> [ 148.011467] gnome-terminal/1759 [HC1[1]:SC0[0]:HE0:SE1] takes:
> [ 148.011477] (&(&pd_chan->lock)->rlock){?.....}, at: [<c0710cc5>]
> pd_prep_slave_sg+0x92/0x26d
> [ 148.011509] {HARDIRQ-ON-W} state was registered at:
> [ 148.011516] [<c04618a1>] __lock_acquire+0x2d9/0xbcc
> [ 148.011533] [<c0462594>] lock_acquire+0x94/0xb7
> [ 148.011546] [<c07cec85>] _raw_spin_lock_bh+0x25/0x34
> [ 148.011561] [<c07110d0>] pd_alloc_chan_resources+0xb2/0x165

Are you sure you applied the patch? There shouldn't be a _raw_spin_lock_bh
here.

Regards,
Alexander

2011-06-29 10:26:24

by Tomoya MORINAGA

[permalink] [raw]
Subject: Re: [PATCH] pch_dma: Fix channel locking

(2011/06/29 15:28), Alexander Stein wrote:
> Are you sure you applied the patch? There shouldn't be a _raw_spin_lock_bh
> here.
>

You are right.
I tested again, as a result,
I could confirm that after using your patch,
kernel error message doesn't logged at all.
It seems my patching yesterday was failed or operation mistake.
Sorry for the in inconvenience.

--
tomoya
OKI SEMICONDUCTOR CO., LTD.