2013-06-07 18:49:51

by Grygorii Strashko

[permalink] [raw]
Subject: [PATCH 0/5] v3.10-rc4: fix OMAP4 boot failure if CONFIG_SENSORS_LM75=y

Hi All,

I observe, that OMAP4 SDP board boot is failed if CONFIG_SENSORS_LM75=y. See log below.
This issue reproduced with kernel v3.9-rc6, but without hang - just I2C is stuck
in Bus Busy condition. My investigation results are below.

[ 2.048858] TCP: cubic registered
[ 2.052398] Initializing XFRM netlink socket
[ 2.057037] NET: Registered protocol family 17
[ 2.061859] NET: Registered protocol family 15
[ 2.066955] Key type dns_resolver registered
[ 2.071716] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
[ 2.081085] Power Management for TI OMAP4.
[ 2.085418] OMAP4 PM: u-boot >= v2012.07 is required for full PM support
[ 2.092773] ThumbEE CPU extension supported.
[ 2.107788] omap_i2c 48070000.i2c: bus 0 rev0.11 at 400 kHz
[ 2.117858] Skipping twl internal clock init and using bootloader value (unknown osc rate)
[ 2.128570] twl 0-0048: PIH (irq 39) chaining IRQs 416..436
[ 2.137237] twl_rtc rtc.11: Enabling TWL-RTC
[ 2.146301] twl_rtc rtc.11: rtc core: registered rtc.11 as rtc0
[ 2.155181] VAUX1_6030: 1000 <--> 3000 mV at 2800 mV
[ 2.162750] VAUX2_6030: 1200 <--> 2800 mV at 1800 mV
[ 2.170318] VAUX3_6030: 1000 <--> 3000 mV at 1200 mV
[ 2.177856] VMMC: 1200 <--> 3000 mV at 1800 mV
[ 2.184692] VPP: 1800 <--> 2500 mV at 1900 mV
[ 2.191589] VUSIM: 1200 <--> 2900 mV at 1800 mV
[ 2.198394] VDAC: 1800 mV
[ 2.202850] VANA: 2100 mV
[ 2.207763] VCXIO: 1800 mV
[ 2.212310] VUSB: 3300 mV
[ 2.217346] V1V8: 1800 mV
[ 2.222290] V2V1: 2100 mV
[ 2.239837] omap_i2c 48072000.i2c: bus 1 rev0.11 at 400 kHz
[ 2.246307] ------------[ cut here ]------------
[ 2.251220] WARNING: at drivers/bus/omap_l3_noc.c:113 l3_interrupt_handler+0x140/0x184()
[ 2.257781] L3 custom error: MASTER:MPU TARGET:L4 PER2
[ 2.264373] Modules linked in:
[ 2.268463] CPU: 0 PID: 2 Comm: kthreadd Not tainted 3.10.0-rc4-00034-g042dd60-dirty #64
[ 2.270385] [<c001a80c>] (unwind_backtrace+0x0/0xf0) from [<c0017238>] (show_stack+0x10/0x14)
[ 2.286102] [<c0017238>] (show_stack+0x10/0x14) from [<c0040fd0>] (warn_slowpath_common+0x4c/0x68)
[ 2.295593] [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) from [<c0041080>] (warn_slowpath_fmt+0x30/0x40)
[ 2.299987] [<c0041080>] (warn_slowpath_fmt+0x30/0x40) from [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184)
[ 2.315582] [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) from [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258)
[ 2.323364] [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) from [<c00a01f4>] (handle_irq_event+0x3c/0x5c)
[ 2.327819] [<c00a01f4>] (handle_irq_event+0x3c/0x5c) from [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164)
[ 2.337829] [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) from [<c009f978>] (generic_handle_irq+0x20/0x30)
[ 2.357513] [<c009f978>] (generic_handle_irq+0x20/0x30) from [<c0014168>] (handle_IRQ+0x4c/0xac)
[ 2.366821] [<c0014168>] (handle_IRQ+0x4c/0xac) from [<c00085b4>] (gic_handle_irq+0x28/0x5c)
[ 2.375762] [<c00085b4>] (gic_handle_irq+0x28/0x5c) from [<c04f08a4>] (__irq_svc+0x44/0x5c)
[ 2.379821] Exception stack(0xdb085ec0 to 0xdb085f08)
[ 2.389953] 5ec0: 00000001 00000001 00000000 db07ea00 40000113 db2317a8 db084000 000002f1
[ 2.389953] 5ee0: db07ea00 00000000 00000000 00000000 c04fd990 db085f08 c009170c c04f03e8
[ 2.405609] 5f00: 20000113 ffffffff
[ 2.408355] [<c04f08a4>] (__irq_svc+0x44/0x5c) from [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44)
[ 2.418304] [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) from [<c00403c0>] (do_fork+0xa4/0x2d4)
[ 2.427795] [<c00403c0>] (do_fork+0xa4/0x2d4) from [<c0040620>] (kernel_thread+0x30/0x38)
[ 2.437805] [<c0040620>] (kernel_thread+0x30/0x38) from [<c0063888>] (kthreadd+0xd4/0x13c)
[ 2.448364] [<c0063888>] (kthreadd+0xd4/0x13c) from [<c0013308>] (ret_from_fork+0x14/0x2c)
[ 2.448364] ---[ end trace da8cf92c433d1616 ]---
[ 4.028594] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 5.047637] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 6.662322] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 7.687499] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 8.717315] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 9.737365] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 10.747863] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 11.767608] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 12.768035] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 13.777893] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 14.787872] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 15.807128] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 17.318084] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 18.338226] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 19.367645] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 20.388031] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 21.407928] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 22.457336] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 23.487915] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 23.487915] omap_i2c 48060000.i2c: bus 2 rev0.11 at 400 kHz
[ 24.537414] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 24.537445] lm75: probe of 2-0048 failed with error -110
[ 24.578216] omap_i2c 48350000.i2c: bus 3 rev0.11 at 400 kHz
[ 24.585601] omap-dma-engine 4a056000.dma-controller: allocating channel for 62
[ 24.593383] omap-dma-engine 4a056000.dma-controller: allocating channel for 61
[ 24.602386] omap_hsmmc 4809c000.mmc: pins are not configured from the driver
[ 24.641540] omap-dma-engine 4a056000.dma-controller: allocating channel for 48
[ 24.649261] omap-dma-engine 4a056000.dma-controller: allocating channel for 47
[ 24.657104] omap_hsmmc 480b4000.mmc: pins are not configured from the driver
[ 24.713287] twl_rtc rtc.11: setting system clock to 2000-01-01 03:51:29 UTC (946698689)
[ 24.725891] RAMDISK: gzip image found at block 0
[ 24.805297] mmc1: new high speed MMC card at address 0001
[ 24.813659] mmcblk0: mmc1:0001 SEM32G 29.7 GiB
[ 24.819488] mmcblk0boot0: mmc1:0001 SEM32G partition 1 1.00 MiB
[ 24.827301] mmcblk0boot1: mmc1:0001 SEM32G partition 2 1.00 MiB
[ 24.848236] Alternate GPT is invalid, using primary GPT.
[ 24.854034] mmcblk0: p1 p2 p3 p4 p5 p6 p7
[ 24.869995] mmcblk0boot1: unknown partition table
[ 24.878601] mmcblk0boot0: unknown partition table
[ 25.244140] kjournald starting. Commit interval 5 seconds
[ 25.244140] EXT3-fs (ram0): using internal journal
[ 25.244140] EXT3-fs (ram0): mounted filesystem with ordered data mode
[ 25.244140] VFS: Mounted root (ext3 filesystem) on device 1:0.
[ 25.270233] Unhandled fault: imprecise external abort (0x1406) at 0x00207eb0
[ 25.281677] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6
[ 25.290008] Please append a correct "root=" boot option; here are the available partitions:
[ 25.298950] b300 31162368 mmcblk0 driver: mmcblk
[ 25.298950] b301 128 mmcblk0p1 f9f21f00-a8d4-5f0e-9746-594869aec34e
[ 25.312561] b302 256 mmcblk0p2 f9f21f01-a8d4-5f0e-9746-594869aec34e
[ 25.317779] b303 128 mmcblk0p3 f9f21f02-a8d4-5f0e-9746-594869aec34e
[ 25.328216] b304 16384 mmcblk0p4 f9f21f03-a8d4-5f0e-9746-594869aec34e
[ 25.336364] b305 16 mmcblk0p5 f9f21f04-a8d4-5f0e-9746-594869aec34e
[ 25.344116] b306 8192 mmcblk0p6 f9f21f05-a8d4-5f0e-9746-594869aec34e
[ 25.351501] b307 8192 mmcblk0p7 f9f21f06-a8d4-5f0e-9746-594869aec34e
[ 25.360229] b310 1024 mmcblk0boot1 (driver?)
[ 25.364074] b308 1024 mmcblk0boot0 (driver?)
[ 25.370391] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 25.371643] CPU1: stopping
[ 25.383361] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 3.10.0-rc4-00034-g042dd60-dirty #64
[ 25.383361] [<c001a80c>] (unwind_backtrace+0x0/0xf0) from [<c0017238>] (show_stack+0x10/0x14)
[ 25.402069] [<c0017238>] (show_stack+0x10/0x14) from [<c0018840>] (handle_IPI+0x130/0x158)
[ 25.402069] [<c0018840>] (handle_IPI+0x130/0x158) from [<c00085e0>] (gic_handle_irq+0x54/0x5c)
[ 25.410827] [<c00085e0>] (gic_handle_irq+0x54/0x5c) from [<c04f08a4>] (__irq_svc+0x44/0x5c)
[ 25.428771] Exception stack(0xdb0a7f90 to 0xdb0a7fd8)
[ 25.428771] 7f80: 00000001 00000001 00000000 db0a5480
[ 25.428771] 7fa0: db0a6000 00000000 10c03c7d c07f6834 c04fc6a0 c07f4eec c07689ac 00000000
[ 25.451416] 7fc0: 01000000 db0a7fd8 c009170c c00144b4 20000113 ffffffff
[ 25.451416] [<c04f08a4>] (__irq_svc+0x44/0x5c) from [<c00144b4>] (arch_cpu_idle+0x20/0x44)
[ 25.451416] [<c00144b4>] (arch_cpu_idle+0x20/0x44) from [<c0080bd0>] (cpu_startup_entry+0xa8/0x228)
[ 25.451416] [<c0080bd0>] (cpu_startup_entry+0xa8/0x228) from [<804e45c4>] (0x804e45c4)


Investigation results - There are three issues:
1) The crash happens because of the race between omap_i2c_runtime_suspend()
and omap_i2c_isr_thread()

CPU1: CPU2:
|-omap_i2c_xfer |
|- pm_runtime_put_autosuspend() |
|-timeout |-omap_i2c_isr()
|-return IRQ_WAKE_THREAD
|-omap_i2c_runtime_suspend() |
|-omap_i2c_isr_thread()
|- oops is here - I2C module is in idle state

The patch https://lkml.org/lkml/2013/6/3/59
"[PATCH v5] i2c: omap: correct usage of the interrupt enable register" improves
situation and allows to boot sometimes.
The patch https://lkml.org/lkml/2013/4/23/215
"[PATCH 1/2] i2c: omap: convert to module_platform_driver()" hides the issue.

The patch http://www.spinics.net/lists/linux-i2c/msg10169.html
"i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle"
- fixes the boot crash, but I2C bus 2 still stuck in "Bus Busy" state.

[ 23.487915] omap_i2c 48060000.i2c: timeout waiting for bus ready

In addition, I've created two patches to fix wrong IRQ handling and to prevent
such kind of crashes in the future:
i2c: omap: add runtime check in isr to be sure that i2c is enabled
i2c: omap: handle all irqs befor unblocking omap_i2c_xfer_msg()


2) Q: Why I2C is suspended so fast during the boot?
The issue 1 should be reproducible very rare, in general. I traced it to:
omap_i2c_xfer
|- pm_runtime_put_autosuspend()
|- __pm_runtime_suspend
|- rpm_suspend()
|- mod_timer(&dev->power.suspend_timer, expires);

The mod_timer() should schedule next timer event after ~100-200 jiffes,
which is equal to 1 sec (as requested).
But, at boot time, the timer expires after ~50 ms instead!!

[ 23.190002] omap_i2c 48350000.i2c: ==== i2c_add_numbered_adapter
[ 23.204681] omap_i2c 48350000.i2c: bus 3 rev0.11 at 400 kHz
[ 23.211669] omap_i2c 48350000.i2c: ====== rpm_suspend elapsed 0 last_busy 4294939616
[ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939716 last_busy 4294939616 autosuspend_delay 1000
[ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939700
[ 23.237274] omap_i2c 48350000.i2c: ====== rpm_suspend mod_timer expires 4294939700
--- the timer scheduled to 84 jiffes
[ 23.246185] omap_i2c 48350000.i2c: ==== pm_runtime_put_autosuspend
[ 23.253448] omap-dma-engine 4a056000.dma-controller: allocating channel for 62
[ 23.261138] omap-dma-engine 4a056000.dma-controller: allocating channel for 61
[ 23.269500] omap_i2c 48070000.i2c: ==== omap_i2c_runtime_resume
[ 23.275817] omap_i2c 48350000.i2c: ==== omap_i2c_runtime_suspend
--- and expired after ~40 ms

Unfortunatelly, Im not able to find the root cause of such timers behavior - have
not to much knowledge about system clocks code.


3) when lm75_detect() requests to read from bus 2 addr 0x48 (tmp105 sensor)
registers 0x04 and above (registers don't exist) the bus stuck in Bus busy condition
until next I2C re-initialization (PM runtime suspend/resume).

[ 3.300933] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 4.369262] omap_i2c 48060000.i2c: timeout waiting for bus ready
[ 5.381530] omap_i2c 48060000.i2c: timeout waiting for bus ready

I've found that the NACK condition not handled properly:
- when NACK condition is detected the master should generate STT or STP condition
- now, the I2C driver generates STP only if NACK has been received during the
last message transmitting/recieving. As result, the "Bus Busy" state may occur
if NACK has been received during any other messages transmission
/reception, because STP isn't generated.

Patch "i2c: omap: query STP always when NACK is received:" fixes this issue.


This patch series based on top of:
http://patchwork.ozlabs.org/patch/248188/

Grygorii Strashko (4):
i2c: omap: add runtime check in isr to be sure that i2c is enabled
i2c: omap: handle all irqs befor unblocking omap_i2c_xfer_msg()
i2c: omap: query STP always when NACK is received
i2c: omap: remove omap_i2c_isr() hw irq handler

Kevin Hilman (1):
i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle

drivers/i2c/busses/i2c-omap.c | 52 +++++++++++++----------------------------
1 file changed, 16 insertions(+), 36 deletions(-)


CC: Kevin Hilman <[email protected]>
CC: Tomi Valkeinen <[email protected]>
CC: Felipe Balbi <[email protected]>
--
1.7.9.5


2013-06-07 18:50:08

by Grygorii Strashko

[permalink] [raw]
Subject: [PATCH 3/5] i2c: omap: handle all irqs befor unblocking omap_i2c_xfer_msg()

ARDY|NACK and ARDY|AL are set together in OMAP_I2C_STAT_REG, which will be
processed incorrectly now:
iterration 1:
- I2C irq triggered (ARDY|NACK)
- omap_i2c_isr_thread() will ask NACK, fill dev->cmd_err = OMAP_I2C_STAT_NACK
and trigger "cmd_complete" completion.
- omap_i2c_xfer_msg() will be unblocked, hande NACK and finish its execution.
- omap_i2c_xfer() will finish

iteration 2:
- I2C irq triggered (ARDY)
- omap_i2c_isr_thread() will ask ARDY, trigger completion
At this point dev->cmd_err == OMAP_I2C_STAT_NACK, "cmd_complete" is not
initialized and omap_i2c_xfer() may be completed at all.

So, I2C driver is not ready for iteration 2.
Hence, fix it by asking NACK, AL and ARDY all togather in omap_i2c_isr_thread()
before unblocking omap_i2c_xfer_msg() execution.

This is the "old" I2C interrupt handler behavior which was changed by commit:
1d7afc95946487945cc7f5019b41255b72224b70 "i2c: omap: ack IRQ in parts".

CC: Kevin Hilman <[email protected]>
Signed-off-by: Grygorii Strashko <[email protected]>
---
drivers/i2c/busses/i2c-omap.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 2dac598..46fb8a5 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -938,17 +938,15 @@ omap_i2c_isr_thread(int this_irq, void *dev_id)
break;
}

+ omap_i2c_ack_stat(dev, OMAP_I2C_STAT_NACK | OMAP_I2C_STAT_AL);
+
if (stat & OMAP_I2C_STAT_NACK) {
err |= OMAP_I2C_STAT_NACK;
- omap_i2c_ack_stat(dev, OMAP_I2C_STAT_NACK);
- break;
}

if (stat & OMAP_I2C_STAT_AL) {
dev_err(dev->dev, "Arbitration lost\n");
err |= OMAP_I2C_STAT_AL;
- omap_i2c_ack_stat(dev, OMAP_I2C_STAT_AL);
- break;
}

/*
--
1.7.9.5

2013-06-07 18:50:17

by Grygorii Strashko

[permalink] [raw]
Subject: [PATCH 4/5] i2c: omap: query STP always when NACK is received

According to I2C specification the NACK should be handled as folowing:
"When SDA remains HIGH during this ninth clock pulse, this is defined as the Not
Acknowledge signal. The master can then gene rate either a STOP condition to
abort the transfer, or a repeated START condition to start a new transfer."
[http://www.nxp.com/documents/user_manual/UM10204.pdf]

The same is recomened by TI I2C wiki:
http://processors.wiki.ti.com/index.php/I2C_Tips

Currently, the OMAP I2C driver interrupts I2C trunsfer in case of NACK, but
It queries Stop condition OMAP_I2C_CON_REG.STP=1 only if NACK has been received
during the last message transmitting/recieving.
This may lead to stuck Bus in "Bus Busy" until I2C IP reset (idle/enable).

Hence, fix it by querying Stop condition (STP) always when NACK is received.

CC: Kevin Hilman <[email protected]>
CC: Felipe Balbi <[email protected]>
Signed-off-by: Grygorii Strashko <[email protected]>
---
drivers/i2c/busses/i2c-omap.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 46fb8a5..b3daf3f 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -618,11 +618,10 @@ static int omap_i2c_xfer_msg(struct i2c_adapter *adap,
if (dev->cmd_err & OMAP_I2C_STAT_NACK) {
if (msg->flags & I2C_M_IGNORE_NAK)
return 0;
- if (stop) {
- w = omap_i2c_read_reg(dev, OMAP_I2C_CON_REG);
- w |= OMAP_I2C_CON_STP;
- omap_i2c_write_reg(dev, OMAP_I2C_CON_REG, w);
- }
+
+ w = omap_i2c_read_reg(dev, OMAP_I2C_CON_REG);
+ w |= OMAP_I2C_CON_STP;
+ omap_i2c_write_reg(dev, OMAP_I2C_CON_REG, w);
return -EREMOTEIO;
}
return -EIO;
--
1.7.9.5

2013-06-07 18:50:15

by Grygorii Strashko

[permalink] [raw]
Subject: [PATCH 5/5] i2c: omap: remove omap_i2c_isr() hw irq handler

The omap_i2c_isr() does the irq check and schedules threaded handler if any of
enabled IRQs is active, but currently the I2C IRQs are enabled just once,
when I2C IP is enabling (transfer started) and they aren't changed after that.
More over, now the I2C INTC IRQ is disabled when I2C IP is idled.
Thus, I2C IRQs will start coming only when we want that, and there is
no sense to have omap_i2c_isr() anymore:
- omap_i2c_isr_thread() does all needed checks
- synchronization is provided IRQ Core.

So, get rid of omap_i2c_isr(), custom locking and
struct omap_i2c_dev->lock variable.

CC: Kevin Hilman <[email protected]>
Signed-off-by: Grygorii Strashko <[email protected]>
---

This is clean-up patch.

drivers/i2c/busses/i2c-omap.c | 33 +--------------------------------
1 file changed, 1 insertion(+), 32 deletions(-)

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index b3daf3f..749f390 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -183,7 +183,6 @@ enum {
#define OMAP_I2C_IP_V2_INTERRUPTS_MASK 0x6FFF

struct omap_i2c_dev {
- spinlock_t lock; /* IRQ synchronization */
struct device *dev;
void __iomem *base; /* virtual */
int irq;
@@ -876,35 +875,10 @@ static int omap_i2c_transmit_data(struct omap_i2c_dev *dev, u8 num_bytes,
}

static irqreturn_t
-omap_i2c_isr(int irq, void *dev_id)
-{
- struct omap_i2c_dev *dev = dev_id;
- irqreturn_t ret = IRQ_HANDLED;
- u16 mask;
- u16 stat;
-
- spin_lock(&dev->lock);
- if (pm_runtime_suspended(dev->dev)) {
- WARN_ONCE(true, "We should never be here!\n");
- return IRQ_NONE;
- }

- mask = omap_i2c_read_reg(dev, OMAP_I2C_IE_REG);
- stat = omap_i2c_read_reg(dev, OMAP_I2C_STAT_REG);
-
- if (stat & mask)
- ret = IRQ_WAKE_THREAD;
-
- spin_unlock(&dev->lock);
-
- return ret;
-}
-
-static irqreturn_t
omap_i2c_isr_thread(int this_irq, void *dev_id)
{
struct omap_i2c_dev *dev = dev_id;
- unsigned long flags;
u16 bits;
u16 stat;
int err = 0, count = 0;
@@ -914,7 +888,6 @@ omap_i2c_isr_thread(int this_irq, void *dev_id)
return IRQ_NONE;
}

- spin_lock_irqsave(&dev->lock, flags);
do {
bits = omap_i2c_read_reg(dev, OMAP_I2C_IE_REG);
stat = omap_i2c_read_reg(dev, OMAP_I2C_STAT_REG);
@@ -1035,8 +1008,6 @@ omap_i2c_isr_thread(int this_irq, void *dev_id)
omap_i2c_complete_cmd(dev, err);

out:
- spin_unlock_irqrestore(&dev->lock, flags);
-
return IRQ_HANDLED;
}

@@ -1146,8 +1117,6 @@ omap_i2c_probe(struct platform_device *pdev)
dev->dev = &pdev->dev;
dev->irq = irq;

- spin_lock_init(&dev->lock);
-
platform_set_drvdata(pdev, dev);
init_completion(&dev->cmd_complete);

@@ -1229,7 +1198,7 @@ omap_i2c_probe(struct platform_device *pdev)
IRQF_NO_SUSPEND, pdev->name, dev);
else
r = devm_request_threaded_irq(&pdev->dev, dev->irq,
- omap_i2c_isr, omap_i2c_isr_thread,
+ NULL, omap_i2c_isr_thread,
IRQF_NO_SUSPEND | IRQF_ONESHOT,
pdev->name, dev);

--
1.7.9.5

2013-06-07 18:50:01

by Grygorii Strashko

[permalink] [raw]
Subject: [PATCH 1/5] i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle

From: Kevin Hilman <[email protected]>

Currently, runtime PM is used to keep the device enabled only during
active transfers and for a configurable runtime PM autosuspend timout
after an xfer.

In addition to idling the device, driver's ->runtime_suspend() method
currently disables device interrupts when idle. However, on some SoCs
(notably OMAP4+), the I2C hardware may shared with other coprocessors.
This means that the MPU will still recieve interrupts if a coprocessor
is using the I2C device. To avoid this, also disable interrupts at
the MPU INTC when idling the device in ->runtime_suspend() (and
re-enable them in ->runtime_resume().) This part based on an original
patch from Shubhrajyoti Datta. NOTE: for proper sharing the I2C with
a coprocessor, this driver still needs hwspinlock support added.

More over, this patch fixes the kernel boot failure which happens when
CONFIG_SENSORS_LM75=y:

[ 2.251220] WARNING: at drivers/bus/omap_l3_noc.c:113 l3_interrupt_handler+0x140/0x184()
[ 2.257781] L3 custom error: MASTER:MPU TARGET:L4 PER2
[ 2.264373] Modules linked in:
[ 2.268463] CPU: 0 PID: 2 Comm: kthreadd Not tainted 3.10.0-rc4-00034-g042dd60-dirty #64
[ 2.270385] [<c001a80c>] (unwind_backtrace+0x0/0xf0) from [<c0017238>] (show_stack+0x10/0x14)
[ 2.286102] [<c0017238>] (show_stack+0x10/0x14) from [<c0040fd0>] (warn_slowpath_common+0x4c/0x68)
[ 2.295593] [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) from [<c0041080>] (warn_slowpath_fmt+0x30/0x40)
[ 2.299987] [<c0041080>] (warn_slowpath_fmt+0x30/0x40) from [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184)
[ 2.315582] [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) from [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258)
[ 2.323364] [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) from [<c00a01f4>] (handle_irq_event+0x3c/0x5c)
[ 2.327819] [<c00a01f4>] (handle_irq_event+0x3c/0x5c) from [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164)
[ 2.337829] [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) from [<c009f978>] (generic_handle_irq+0x20/0x30)
[ 2.357513] [<c009f978>] (generic_handle_irq+0x20/0x30) from [<c0014168>] (handle_IRQ+0x4c/0xac)
[ 2.366821] [<c0014168>] (handle_IRQ+0x4c/0xac) from [<c00085b4>] (gic_handle_irq+0x28/0x5c)
[ 2.375762] [<c00085b4>] (gic_handle_irq+0x28/0x5c) from [<c04f08a4>] (__irq_svc+0x44/0x5c)
[ 2.379821] Exception stack(0xdb085ec0 to 0xdb085f08)
[ 2.389953] 5ec0: 00000001 00000001 00000000 db07ea00 40000113 db2317a8 db084000 000002f1
[ 2.389953] 5ee0: db07ea00 00000000 00000000 00000000 c04fd990 db085f08 c009170c c04f03e8
[ 2.405609] 5f00: 20000113 ffffffff
[ 2.408355] [<c04f08a4>] (__irq_svc+0x44/0x5c) from [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44)
[ 2.418304] [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) from [<c00403c0>] (do_fork+0xa4/0x2d4)
[ 2.427795] [<c00403c0>] (do_fork+0xa4/0x2d4) from [<c0040620>] (kernel_thread+0x30/0x38)
[ 2.437805] [<c0040620>] (kernel_thread+0x30/0x38) from [<c0063888>] (kthreadd+0xd4/0x13c)
[ 2.448364] [<c0063888>] (kthreadd+0xd4/0x13c) from [<c0013308>] (ret_from_fork+0x14/0x2c)
[ 2.448364] ---[ end trace da8cf92c433d1616 ]---

The root casue of crash is race between omap_i2c_runtime_suspend()
and omap_i2c_isr_thread()

CPU1: CPU2:
|-omap_i2c_xfer |
|- pm_runtime_put_autosuspend() |
|-timeout |-omap_i2c_isr()
|-return IRQ_WAKE_THREAD;
|-omap_i2c_runtime_suspend() |
|-omap_i2c_isr_thread()
|- oops is here - I2C module is in idle state


Cc: Nishanth Menon <[email protected]>
Signed-off-by: Kevin Hilman <[email protected]>
Signed-off-by: Grygorii Strashko <[email protected]>
---
drivers/i2c/busses/i2c-omap.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 64c26f9..97844ff 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -1290,6 +1290,8 @@ static int omap_i2c_runtime_suspend(struct device *dev)
struct platform_device *pdev = to_platform_device(dev);
struct omap_i2c_dev *_dev = platform_get_drvdata(pdev);

+ disable_irq(_dev->irq);
+
_dev->iestate = omap_i2c_read_reg(_dev, OMAP_I2C_IE_REG);

if (_dev->scheme == OMAP_I2C_SCHEME_0)
@@ -1320,6 +1322,8 @@ static int omap_i2c_runtime_resume(struct device *dev)

__omap_i2c_init(_dev);

+ enable_irq(_dev->irq);
+
return 0;
}
#endif /* CONFIG_PM_RUNTIME */
--
1.7.9.5

2013-06-07 18:52:14

by Grygorii Strashko

[permalink] [raw]
Subject: [PATCH 2/5] i2c: omap: add runtime check in isr to be sure that i2c is enabled

Add runtime check at the beginning of omap_i2c_isr/omap_i2c_isr_thread
to be sure that i2c is enabled, before performing IRQ handling and accessing
I2C IP registers:
if (pm_runtime_suspended(dev->dev)) {
WARN_ONCE(true, "We should never be here!\n");
return IRQ_NONE;
}

Produce warning in case if ISR called when i2c is disabled

CC: Kevin Hilman <[email protected]>
Signed-off-by: Grygorii Strashko <[email protected]>
---
drivers/i2c/busses/i2c-omap.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 97844ff..2dac598 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -885,6 +885,11 @@ omap_i2c_isr(int irq, void *dev_id)
u16 stat;

spin_lock(&dev->lock);
+ if (pm_runtime_suspended(dev->dev)) {
+ WARN_ONCE(true, "We should never be here!\n");
+ return IRQ_NONE;
+ }
+
mask = omap_i2c_read_reg(dev, OMAP_I2C_IE_REG);
stat = omap_i2c_read_reg(dev, OMAP_I2C_STAT_REG);

@@ -905,6 +910,11 @@ omap_i2c_isr_thread(int this_irq, void *dev_id)
u16 stat;
int err = 0, count = 0;

+ if (pm_runtime_suspended(dev->dev)) {
+ WARN_ONCE(true, "We should never be here!\n");
+ return IRQ_NONE;
+ }
+
spin_lock_irqsave(&dev->lock, flags);
do {
bits = omap_i2c_read_reg(dev, OMAP_I2C_IE_REG);
--
1.7.9.5

2013-06-07 19:03:15

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH 2/5] i2c: omap: add runtime check in isr to be sure that i2c is enabled

Hi,

On Fri, Jun 07, 2013 at 09:46:05PM +0300, Grygorii Strashko wrote:
> Add runtime check at the beginning of omap_i2c_isr/omap_i2c_isr_thread
> to be sure that i2c is enabled, before performing IRQ handling and accessing
> I2C IP registers:
> if (pm_runtime_suspended(dev->dev)) {
> WARN_ONCE(true, "We should never be here!\n");
> return IRQ_NONE;
> }
>
> Produce warning in case if ISR called when i2c is disabled
>
> CC: Kevin Hilman <[email protected]>
> Signed-off-by: Grygorii Strashko <[email protected]>
> ---
> drivers/i2c/busses/i2c-omap.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
> index 97844ff..2dac598 100644
> --- a/drivers/i2c/busses/i2c-omap.c
> +++ b/drivers/i2c/busses/i2c-omap.c
> @@ -885,6 +885,11 @@ omap_i2c_isr(int irq, void *dev_id)
> u16 stat;
>
> spin_lock(&dev->lock);
> + if (pm_runtime_suspended(dev->dev)) {
> + WARN_ONCE(true, "We should never be here!\n");
> + return IRQ_NONE;
> + }

returning IRQ_NONE is not what you want to do in this case. You want to
setup a flag so that your runtime_resume() knows that there are pending
events to be handled and you handle those in runtime_resume time.

But to be frank, I don't see how this can trigger since we're calling
pm_runtime_get_sync() from omap_i2c_xfer() which means by the time
pm_runtime_get_sync() returns, assuming no errors, i2c module should be
fully resumed and ready to go. Perhaps you have found a bug somewhere
else ?

Also, your 'We should never be here' message isn't helpfull at all.

> @@ -905,6 +910,11 @@ omap_i2c_isr_thread(int this_irq, void *dev_id)
> u16 stat;
> int err = 0, count = 0;
>
> + if (pm_runtime_suspended(dev->dev)) {
> + WARN_ONCE(true, "We should never be here!\n");
> + return IRQ_NONE;
> + }

because of IRQF_ONESHOT I can't see how this would *ever* be a valid
check.

--
balbi


Attachments:
(No filename) (1.89 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2013-06-07 19:06:52

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH 3/5] i2c: omap: handle all irqs befor unblocking omap_i2c_xfer_msg()

Hi,

On Fri, Jun 07, 2013 at 09:46:06PM +0300, Grygorii Strashko wrote:
> ARDY|NACK and ARDY|AL are set together in OMAP_I2C_STAT_REG, which will be

Have you seen that happen ever ? AL is Arbitration Lost, we never put
OMAP in a multi-master environment before.

ARDY | NACK I also find it a bit hard for those two to happen together
since ARDY will be set when you can change controller's register
*again*, mening that a transfer has completed.

Also, we need to follow what the programming model says. And, I don't
have docs with me right now, but IIRC it tells us to bail out if any of
the error conditions are met.

--
balbi


Attachments:
(No filename) (631.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2013-06-07 19:08:11

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH 4/5] i2c: omap: query STP always when NACK is received

Hi,

On Fri, Jun 07, 2013 at 09:46:07PM +0300, Grygorii Strashko wrote:
> According to I2C specification the NACK should be handled as folowing:
> "When SDA remains HIGH during this ninth clock pulse, this is defined as the Not
> Acknowledge signal. The master can then gene rate either a STOP condition to
> abort the transfer, or a repeated START condition to start a new transfer."

right ...

> [http://www.nxp.com/documents/user_manual/UM10204.pdf]
>
> The same is recomened by TI I2C wiki:
> http://processors.wiki.ti.com/index.php/I2C_Tips
>
> Currently, the OMAP I2C driver interrupts I2C trunsfer in case of NACK, but
> It queries Stop condition OMAP_I2C_CON_REG.STP=1 only if NACK has been received
> during the last message transmitting/recieving.
> This may lead to stuck Bus in "Bus Busy" until I2C IP reset (idle/enable).
>
> Hence, fix it by querying Stop condition (STP) always when NACK is received.
>
> CC: Kevin Hilman <[email protected]>
> CC: Felipe Balbi <[email protected]>
> Signed-off-by: Grygorii Strashko <[email protected]>
> ---
> drivers/i2c/busses/i2c-omap.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
> index 46fb8a5..b3daf3f 100644
> --- a/drivers/i2c/busses/i2c-omap.c
> +++ b/drivers/i2c/busses/i2c-omap.c
> @@ -618,11 +618,10 @@ static int omap_i2c_xfer_msg(struct i2c_adapter *adap,
> if (dev->cmd_err & OMAP_I2C_STAT_NACK) {
> if (msg->flags & I2C_M_IGNORE_NAK)
> return 0;
> - if (stop) {
> - w = omap_i2c_read_reg(dev, OMAP_I2C_CON_REG);
> - w |= OMAP_I2C_CON_STP;
> - omap_i2c_write_reg(dev, OMAP_I2C_CON_REG, w);
> - }
> +
> + w = omap_i2c_read_reg(dev, OMAP_I2C_CON_REG);
> + w |= OMAP_I2C_CON_STP;
> + omap_i2c_write_reg(dev, OMAP_I2C_CON_REG, w);

... and this is breaking repeated start.

--
balbi


Attachments:
(No filename) (1.83 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2013-06-07 19:08:59

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH 5/5] i2c: omap: remove omap_i2c_isr() hw irq handler

Hi,

On Fri, Jun 07, 2013 at 09:46:08PM +0300, Grygorii Strashko wrote:
> The omap_i2c_isr() does the irq check and schedules threaded handler if any of
> enabled IRQs is active, but currently the I2C IRQs are enabled just once,
> when I2C IP is enabling (transfer started) and they aren't changed after that.
> More over, now the I2C INTC IRQ is disabled when I2C IP is idled.
> Thus, I2C IRQs will start coming only when we want that, and there is
> no sense to have omap_i2c_isr() anymore:

so ? we still want to check if this device generated IRQs in hardirq
context. What if the IRQ line is shared ?

--
balbi


Attachments:
(No filename) (616.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2013-06-07 20:51:33

by Kevin Hilman

[permalink] [raw]
Subject: Re: [PATCH 1/5] i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle

Grygorii Strashko <[email protected]> writes:

> From: Kevin Hilman <[email protected]>
>
> Currently, runtime PM is used to keep the device enabled only during
> active transfers and for a configurable runtime PM autosuspend timout
> after an xfer.
>
> In addition to idling the device, driver's ->runtime_suspend() method
> currently disables device interrupts when idle. However, on some SoCs
> (notably OMAP4+), the I2C hardware may shared with other coprocessors.
> This means that the MPU will still recieve interrupts if a coprocessor
> is using the I2C device. To avoid this, also disable interrupts at
> the MPU INTC when idling the device in ->runtime_suspend() (and
> re-enable them in ->runtime_resume().) This part based on an original
> patch from Shubhrajyoti Datta. NOTE: for proper sharing the I2C with
> a coprocessor, this driver still needs hwspinlock support added.
>
> More over, this patch fixes the kernel boot failure which happens when
> CONFIG_SENSORS_LM75=y:
>
> [ 2.251220] WARNING: at drivers/bus/omap_l3_noc.c:113 l3_interrupt_handler+0x140/0x184()
> [ 2.257781] L3 custom error: MASTER:MPU TARGET:L4 PER2
> [ 2.264373] Modules linked in:
> [ 2.268463] CPU: 0 PID: 2 Comm: kthreadd Not tainted 3.10.0-rc4-00034-g042dd60-dirty #64
> [ 2.270385] [<c001a80c>] (unwind_backtrace+0x0/0xf0) from [<c0017238>] (show_stack+0x10/0x14)
> [ 2.286102] [<c0017238>] (show_stack+0x10/0x14) from [<c0040fd0>] (warn_slowpath_common+0x4c/0x68)
> [ 2.295593] [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) from [<c0041080>] (warn_slowpath_fmt+0x30/0x40)
> [ 2.299987] [<c0041080>] (warn_slowpath_fmt+0x30/0x40) from [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184)
> [ 2.315582] [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) from [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258)
> [ 2.323364] [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) from [<c00a01f4>] (handle_irq_event+0x3c/0x5c)
> [ 2.327819] [<c00a01f4>] (handle_irq_event+0x3c/0x5c) from [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164)
> [ 2.337829] [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) from [<c009f978>] (generic_handle_irq+0x20/0x30)
> [ 2.357513] [<c009f978>] (generic_handle_irq+0x20/0x30) from [<c0014168>] (handle_IRQ+0x4c/0xac)
> [ 2.366821] [<c0014168>] (handle_IRQ+0x4c/0xac) from [<c00085b4>] (gic_handle_irq+0x28/0x5c)
> [ 2.375762] [<c00085b4>] (gic_handle_irq+0x28/0x5c) from [<c04f08a4>] (__irq_svc+0x44/0x5c)
> [ 2.379821] Exception stack(0xdb085ec0 to 0xdb085f08)
> [ 2.389953] 5ec0: 00000001 00000001 00000000 db07ea00 40000113 db2317a8 db084000 000002f1
> [ 2.389953] 5ee0: db07ea00 00000000 00000000 00000000 c04fd990 db085f08 c009170c c04f03e8
> [ 2.405609] 5f00: 20000113 ffffffff
> [ 2.408355] [<c04f08a4>] (__irq_svc+0x44/0x5c) from [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44)
> [ 2.418304] [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) from [<c00403c0>] (do_fork+0xa4/0x2d4)
> [ 2.427795] [<c00403c0>] (do_fork+0xa4/0x2d4) from [<c0040620>] (kernel_thread+0x30/0x38)
> [ 2.437805] [<c0040620>] (kernel_thread+0x30/0x38) from [<c0063888>] (kthreadd+0xd4/0x13c)
> [ 2.448364] [<c0063888>] (kthreadd+0xd4/0x13c) from [<c0013308>] (ret_from_fork+0x14/0x2c)
> [ 2.448364] ---[ end trace da8cf92c433d1616 ]---
>
> The root casue of crash is race between omap_i2c_runtime_suspend()
> and omap_i2c_isr_thread()

If there's a race here, then it's not the same problem which is
described above, unless the CPU2 IRQ is happening because of a shared
user of I2C, in which case it doesn't need any additional explanation.

>
> CPU1: CPU2:
> |-omap_i2c_xfer |
> |- pm_runtime_put_autosuspend() |
> |-timeout |-omap_i2c_isr()
> |-return IRQ_WAKE_THREAD;
> |-omap_i2c_runtime_suspend() |
> |-omap_i2c_isr_thread()
> |- oops is here - I2C module is in idle state

If this is happening, I don't think it's related to $SUBJECT patch (but
is probably hidden by it.)

Instead, what probably needs to happen in this is case is that
omap_i2c_isr() should be doing a "mark last busy" to reset the
autosuspend timeout. And, that should be done in a separate patch.

> Cc: Nishanth Menon <[email protected]>
> Signed-off-by: Kevin Hilman <[email protected]>

Both the From: and Signed-off should be from my TI address since the
work was done while I was working for TI.

Also, if you change the original patch/changelog, you should add a line
here like:

[[email protected]]: updated changlog

> Signed-off-by: Grygorii Strashko <[email protected]>

Kevin

2013-06-19 18:40:07

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [PATCH 1/5] i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle

On 06/07/2013 11:51 PM, Kevin Hilman wrote:
> Grygorii Strashko <[email protected]> writes:
>
>> From: Kevin Hilman <[email protected]>
>>
>> Currently, runtime PM is used to keep the device enabled only during
>> active transfers and for a configurable runtime PM autosuspend timout
>> after an xfer.
>>
>> In addition to idling the device, driver's ->runtime_suspend() method
>> currently disables device interrupts when idle. However, on some SoCs
>> (notably OMAP4+), the I2C hardware may shared with other coprocessors.
>> This means that the MPU will still recieve interrupts if a coprocessor
>> is using the I2C device. To avoid this, also disable interrupts at
>> the MPU INTC when idling the device in ->runtime_suspend() (and
>> re-enable them in ->runtime_resume().) This part based on an original
>> patch from Shubhrajyoti Datta. NOTE: for proper sharing the I2C with
>> a coprocessor, this driver still needs hwspinlock support added.
>>
>> More over, this patch fixes the kernel boot failure which happens when
>> CONFIG_SENSORS_LM75=y:
>>
>> [ 2.251220] WARNING: at drivers/bus/omap_l3_noc.c:113 l3_interrupt_handler+0x140/0x184()
>> [ 2.257781] L3 custom error: MASTER:MPU TARGET:L4 PER2
>> [ 2.264373] Modules linked in:
>> [ 2.268463] CPU: 0 PID: 2 Comm: kthreadd Not tainted 3.10.0-rc4-00034-g042dd60-dirty #64
>> [ 2.270385] [<c001a80c>] (unwind_backtrace+0x0/0xf0) from [<c0017238>] (show_stack+0x10/0x14)
>> [ 2.286102] [<c0017238>] (show_stack+0x10/0x14) from [<c0040fd0>] (warn_slowpath_common+0x4c/0x68)
>> [ 2.295593] [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) from [<c0041080>] (warn_slowpath_fmt+0x30/0x40)
>> [ 2.299987] [<c0041080>] (warn_slowpath_fmt+0x30/0x40) from [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184)
>> [ 2.315582] [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) from [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258)
>> [ 2.323364] [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) from [<c00a01f4>] (handle_irq_event+0x3c/0x5c)
>> [ 2.327819] [<c00a01f4>] (handle_irq_event+0x3c/0x5c) from [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164)
>> [ 2.337829] [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) from [<c009f978>] (generic_handle_irq+0x20/0x30)
>> [ 2.357513] [<c009f978>] (generic_handle_irq+0x20/0x30) from [<c0014168>] (handle_IRQ+0x4c/0xac)
>> [ 2.366821] [<c0014168>] (handle_IRQ+0x4c/0xac) from [<c00085b4>] (gic_handle_irq+0x28/0x5c)
>> [ 2.375762] [<c00085b4>] (gic_handle_irq+0x28/0x5c) from [<c04f08a4>] (__irq_svc+0x44/0x5c)
>> [ 2.379821] Exception stack(0xdb085ec0 to 0xdb085f08)
>> [ 2.389953] 5ec0: 00000001 00000001 00000000 db07ea00 40000113 db2317a8 db084000 000002f1
>> [ 2.389953] 5ee0: db07ea00 00000000 00000000 00000000 c04fd990 db085f08 c009170c c04f03e8
>> [ 2.405609] 5f00: 20000113 ffffffff
>> [ 2.408355] [<c04f08a4>] (__irq_svc+0x44/0x5c) from [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44)
>> [ 2.418304] [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) from [<c00403c0>] (do_fork+0xa4/0x2d4)
>> [ 2.427795] [<c00403c0>] (do_fork+0xa4/0x2d4) from [<c0040620>] (kernel_thread+0x30/0x38)
>> [ 2.437805] [<c0040620>] (kernel_thread+0x30/0x38) from [<c0063888>] (kthreadd+0xd4/0x13c)
>> [ 2.448364] [<c0063888>] (kthreadd+0xd4/0x13c) from [<c0013308>] (ret_from_fork+0x14/0x2c)
>> [ 2.448364] ---[ end trace da8cf92c433d1616 ]---
>>
>> The root casue of crash is race between omap_i2c_runtime_suspend()
>> and omap_i2c_isr_thread()
> If there's a race here, then it's not the same problem which is
> described above, unless the CPU2 IRQ is happening because of a shared
> user of I2C, in which case it doesn't need any additional explanation.
no shared users here
>> CPU1: CPU2:
>> |-omap_i2c_xfer |
>> |- pm_runtime_put_autosuspend() |
>> |-timeout |-omap_i2c_isr()
>> |-return IRQ_WAKE_THREAD;
>> |-omap_i2c_runtime_suspend() |
>> |-omap_i2c_isr_thread()
>> |- oops is here - I2C module is in idle state
> If this is happening, I don't think it's related to $SUBJECT patch (but
> is probably hidden by it.)
I can remove "fix spurious IRQs: " from $SUBJECT. What do you think?
>
> Instead, what probably needs to happen in this is case is that
> omap_i2c_isr() should be doing a "mark last busy" to reset the
> autosuspend timeout. And, that should be done in a separate patch.
Yes - from one side. From other side, this patch prevents such situation
to happen.
disable_irq(_dev->irq); - waits for any pending IRQ handlers for this
interrupt
to complete before returning.

If we are in .runtime_idle() callback - it means I2C transaction has
been finished
(and It doesn't matter successfully or not) and we don't want to receive
IRQs any more.

As I mentioned in cover-latter this happens because PM runtime
auto-suspend isn't
working properly during the boot:

[ 23.190002] omap_i2c 48350000.i2c: ==== i2c_add_numbered_adapter
[ 23.204681] omap_i2c 48350000.i2c: bus 3 rev0.11 at 400 kHz
[ 23.211669] omap_i2c 48350000.i2c: ====== rpm_suspend elapsed 0 last_busy 4294939616
[ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939716 last_busy 4294939616 autosuspend_delay 1000
[ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939700
[ 23.237274] omap_i2c 48350000.i2c: ====== rpm_suspend mod_timer expires 4294939700
--- the timer scheduled to 84 jiffes
[ 23.246185] omap_i2c 48350000.i2c: ==== pm_runtime_put_autosuspend
[ 23.253448] omap-dma-engine 4a056000.dma-controller: allocating channel for 62
[ 23.261138] omap-dma-engine 4a056000.dma-controller: allocating channel for 61
[ 23.269500] omap_i2c 48070000.i2c: ==== omap_i2c_runtime_resume
[ 23.275817] omap_i2c 48350000.i2c: ==== omap_i2c_runtime_suspend
--- and expired after ~40 ms

So, there are no guaranty that this race will not happen even if I'll do
"mark last busy"
in HW IRQ handler (, because in high CPU load use case the threaded IRQ
can be delayed.

I2C driver should not receive IRQs after finishing the transfer.

>> Cc: Nishanth Menon <[email protected]>
>> Signed-off-by: Kevin Hilman <[email protected]>
> Both the From: and Signed-off should be from my TI address since the
> work was done while I was working for TI.
>
> Also, if you change the original patch/changelog, you should add a line
> here like:
>
> [[email protected]]: updated changlog
ok.
>> Signed-off-by: Grygorii Strashko <[email protected]>
> Kevin

Actually, I can confirm, that this patch really fixes "spurious IRQs" issues
when M3 Ducati is enabled :).

Sorry, for delayed reply - I've had problems with my e-mail.

- grygorii

2013-06-19 18:47:35

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [PATCH 3/5] i2c: omap: handle all irqs befor unblocking omap_i2c_xfer_msg()

Hi Felipe,
On 06/07/2013 10:05 PM, Felipe Balbi wrote:
> Hi,
>
> On Fri, Jun 07, 2013 at 09:46:06PM +0300, Grygorii Strashko wrote:
>> ARDY|NACK and ARDY|AL are set together in OMAP_I2C_STAT_REG, which will be
> Have you seen that happen ever ? AL is Arbitration Lost, we never put
> OMAP in a multi-master environment before.
This is an example from real life:
[ 0.271942] omap_i2c omap_i2c.1: bus 1 rev2.4.0 at 400 kHz
[ 1.283416] omap_i2c omap_i2c.1: timeout waiting for bus ready
[ 1.300109] OMAP_I2C DEBUG: stat=1001
[ 1.300140] omap_i2c omap_i2c.1: Arbitration lost
[ 1.300140] OMAP_I2C DEBUG: IE=601F
[ 1.300140] OMAP_I2C DEBUG: STAT=1000
[ 1.300170] OMAP_I2C DEBUG: IV=636F
[ 1.300170] OMAP_I2C DEBUG: WE=636F
[ 1.300170] OMAP_I2C DEBUG: SYSS=1
[ 1.300170] OMAP_I2C DEBUG: BUF=707
[ 1.300201] OMAP_I2C DEBUG: CNT=1
[ 1.300201] OMAP_I2C DEBUG: DATA=1
[ 1.300201] OMAP_I2C DEBUG: SYSC=215
[ 1.300201] OMAP_I2C DEBUG: CON=8200
[ 1.300231] OMAP_I2C DEBUG: OA=0
[ 1.300231] OMAP_I2C DEBUG: SA=49
[ 1.300231] OMAP_I2C DEBUG: PSC=9
[ 1.300262] OMAP_I2C DEBUG: SCLL=9
[ 1.300262] OMAP_I2C DEBUG: SCLH=3
[ 1.300262] OMAP_I2C DEBUG: SYSTEST=1E0
[ 1.300262] OMAP_I2C DEBUG: BUFSTAT=4000

and my headache now :..(
>
> ARDY | NACK I also find it a bit hard for those two to happen together
> since ARDY will be set when you can change controller's register
> *again*, mening that a transfer has completed.
There are examples:
[ 3.544952] omap_i2c 48060000.i2c: IRQ (ISR = 0x0006)

[ 25.574523] omap_i2c 48350000.i2c: IRQ (ISR = 0x0014)
[ 25.579925] omap_i2c 48350000.i2c: IRQ (ISR = 0x0012)

to see it - enable debug output in omap_i2c_isr_thread:
dev_dbg(dev->dev, "IRQ (ISR = 0x%04x)\n", stat);

>
> Also, we need to follow what the programming model says. And, I don't
> have docs with me right now, but IIRC it tells us to bail out if any of
> the error conditions are met.
>
yep, but first of all - all IRQs need to be acked before exit.

Sorry, for delayed reply - I've had problems with my e-mail.

- grygorii

2013-06-19 18:48:08

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [PATCH 5/5] i2c: omap: remove omap_i2c_isr() hw irq handler

Hi Felipe,

On 06/07/2013 10:07 PM, Felipe Balbi wrote:
> Hi,
>
> On Fri, Jun 07, 2013 at 09:46:08PM +0300, Grygorii Strashko wrote:
>> The omap_i2c_isr() does the irq check and schedules threaded handler if any of
>> enabled IRQs is active, but currently the I2C IRQs are enabled just once,
>> when I2C IP is enabling (transfer started) and they aren't changed after that.
>> More over, now the I2C INTC IRQ is disabled when I2C IP is idled.
>> Thus, I2C IRQs will start coming only when we want that, and there is
>> no sense to have omap_i2c_isr() anymore:
> so ? we still want to check if this device generated IRQs in hardirq
> context. What if the IRQ line is shared ?
>
Pleas see, https://patchwork.kernel.org/patch/2689211/
[1/5] i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle

It covers shared IRQ problem

Sorry, for delayed reply - I've had problems with my e-mail.

- grygorii

2013-06-19 18:48:57

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [PATCH 2/5] i2c: omap: add runtime check in isr to be sure that i2c is enabled

Hi Felipe,
On 06/07/2013 10:02 PM, Felipe Balbi wrote:
> Hi,
>
> On Fri, Jun 07, 2013 at 09:46:05PM +0300, Grygorii Strashko wrote:
>> Add runtime check at the beginning of omap_i2c_isr/omap_i2c_isr_thread
>> to be sure that i2c is enabled, before performing IRQ handling and accessing
>> I2C IP registers:
>> if (pm_runtime_suspended(dev->dev)) {
>> WARN_ONCE(true, "We should never be here!\n");
>> return IRQ_NONE;
>> }
>>
>> Produce warning in case if ISR called when i2c is disabled
>>
>> CC: Kevin Hilman <[email protected]>
>> Signed-off-by: Grygorii Strashko <[email protected]>
>> ---
>> drivers/i2c/busses/i2c-omap.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
>> index 97844ff..2dac598 100644
>> --- a/drivers/i2c/busses/i2c-omap.c
>> +++ b/drivers/i2c/busses/i2c-omap.c
>> @@ -885,6 +885,11 @@ omap_i2c_isr(int irq, void *dev_id)
>> u16 stat;
>>
>> spin_lock(&dev->lock);
>> + if (pm_runtime_suspended(dev->dev)) {
>> + WARN_ONCE(true, "We should never be here!\n");
>> + return IRQ_NONE;
>> + }
> returning IRQ_NONE is not what you want to do in this case. You want to
> setup a flag so that your runtime_resume() knows that there are pending
> events to be handled and you handle those in runtime_resume time.
I don't want to handle this IRQ - we should never be here.
Will be changed to IRQ_HANDLED.
>
> But to be frank, I don't see how this can trigger since we're calling
> pm_runtime_get_sync() from omap_i2c_xfer() which means by the time
> pm_runtime_get_sync() returns, assuming no errors, i2c module should be
> fully resumed and ready to go. Perhaps you have found a bug somewhere
> else ?
May be it's better to revert this patch:
e3a36b207f76364c281aeecaf14c1b22a7247278
i2c: omap: remove unnecessary pm_runtime_suspended check

which doesn't cover case when transfer is *finished*.
Please, see https://patchwork.kernel.org/patch/2689211/ and
cover-latter.
>
> Also, your 'We should never be here' message isn't helpfull at all.
>
>> @@ -905,6 +910,11 @@ omap_i2c_isr_thread(int this_irq, void *dev_id)
>> u16 stat;
>> int err = 0, count = 0;
>>
>> + if (pm_runtime_suspended(dev->dev)) {
>> + WARN_ONCE(true, "We should never be here!\n");
>> + return IRQ_NONE;
>> + }
> because of IRQF_ONESHOT I can't see how this would *ever* be a valid
> check.
>
Please, see https://patchwork.kernel.org/patch/2689211/ and
cover-latter.

Sorry, for delayed reply - I've had problems with my e-mail.

- grygorii

2013-06-19 19:31:19

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH 1/5] i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle

Hi,

On Wed, Jun 19, 2013 at 09:35:38PM +0300, Grygorii Strashko wrote:
> On 06/07/2013 11:51 PM, Kevin Hilman wrote:
> >Grygorii Strashko <[email protected]> writes:
> >
> >>From: Kevin Hilman <[email protected]>
> >>
> >>Currently, runtime PM is used to keep the device enabled only during
> >>active transfers and for a configurable runtime PM autosuspend timout
> >>after an xfer.
> >>
> >>In addition to idling the device, driver's ->runtime_suspend() method
> >>currently disables device interrupts when idle. However, on some SoCs
> >>(notably OMAP4+), the I2C hardware may shared with other coprocessors.
> >>This means that the MPU will still recieve interrupts if a coprocessor
> >>is using the I2C device. To avoid this, also disable interrupts at
> >>the MPU INTC when idling the device in ->runtime_suspend() (and
> >>re-enable them in ->runtime_resume().) This part based on an original
> >>patch from Shubhrajyoti Datta. NOTE: for proper sharing the I2C with
> >>a coprocessor, this driver still needs hwspinlock support added.
> >>
> >>More over, this patch fixes the kernel boot failure which happens when
> >>CONFIG_SENSORS_LM75=y:
> >>
> >>[ 2.251220] WARNING: at drivers/bus/omap_l3_noc.c:113 l3_interrupt_handler+0x140/0x184()
> >>[ 2.257781] L3 custom error: MASTER:MPU TARGET:L4 PER2
> >>[ 2.264373] Modules linked in:
> >>[ 2.268463] CPU: 0 PID: 2 Comm: kthreadd Not tainted 3.10.0-rc4-00034-g042dd60-dirty #64
> >>[ 2.270385] [<c001a80c>] (unwind_backtrace+0x0/0xf0) from [<c0017238>] (show_stack+0x10/0x14)
> >>[ 2.286102] [<c0017238>] (show_stack+0x10/0x14) from [<c0040fd0>] (warn_slowpath_common+0x4c/0x68)
> >>[ 2.295593] [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) from [<c0041080>] (warn_slowpath_fmt+0x30/0x40)
> >>[ 2.299987] [<c0041080>] (warn_slowpath_fmt+0x30/0x40) from [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184)
> >>[ 2.315582] [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) from [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258)
> >>[ 2.323364] [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) from [<c00a01f4>] (handle_irq_event+0x3c/0x5c)
> >>[ 2.327819] [<c00a01f4>] (handle_irq_event+0x3c/0x5c) from [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164)
> >>[ 2.337829] [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) from [<c009f978>] (generic_handle_irq+0x20/0x30)
> >>[ 2.357513] [<c009f978>] (generic_handle_irq+0x20/0x30) from [<c0014168>] (handle_IRQ+0x4c/0xac)
> >>[ 2.366821] [<c0014168>] (handle_IRQ+0x4c/0xac) from [<c00085b4>] (gic_handle_irq+0x28/0x5c)
> >>[ 2.375762] [<c00085b4>] (gic_handle_irq+0x28/0x5c) from [<c04f08a4>] (__irq_svc+0x44/0x5c)
> >>[ 2.379821] Exception stack(0xdb085ec0 to 0xdb085f08)
> >>[ 2.389953] 5ec0: 00000001 00000001 00000000 db07ea00 40000113 db2317a8 db084000 000002f1
> >>[ 2.389953] 5ee0: db07ea00 00000000 00000000 00000000 c04fd990 db085f08 c009170c c04f03e8
> >>[ 2.405609] 5f00: 20000113 ffffffff
> >>[ 2.408355] [<c04f08a4>] (__irq_svc+0x44/0x5c) from [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44)
> >>[ 2.418304] [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) from [<c00403c0>] (do_fork+0xa4/0x2d4)
> >>[ 2.427795] [<c00403c0>] (do_fork+0xa4/0x2d4) from [<c0040620>] (kernel_thread+0x30/0x38)
> >>[ 2.437805] [<c0040620>] (kernel_thread+0x30/0x38) from [<c0063888>] (kthreadd+0xd4/0x13c)
> >>[ 2.448364] [<c0063888>] (kthreadd+0xd4/0x13c) from [<c0013308>] (ret_from_fork+0x14/0x2c)
> >>[ 2.448364] ---[ end trace da8cf92c433d1616 ]---
> >>
> >>The root casue of crash is race between omap_i2c_runtime_suspend()
> >> and omap_i2c_isr_thread()
> >If there's a race here, then it's not the same problem which is
> >described above, unless the CPU2 IRQ is happening because of a shared
> >user of I2C, in which case it doesn't need any additional explanation.
> no shared users here
> >>CPU1: CPU2:
> >>|-omap_i2c_xfer |
> >> |- pm_runtime_put_autosuspend() |
> >> |-timeout |-omap_i2c_isr()
> >> |-return IRQ_WAKE_THREAD;
> >> |-omap_i2c_runtime_suspend() |
> >> |-omap_i2c_isr_thread()
> >> |- oops is here - I2C module is in idle state
> >If this is happening, I don't think it's related to $SUBJECT patch (but
> >is probably hidden by it.)
> I can remove "fix spurious IRQs: " from $SUBJECT. What do you think?
> >
> >Instead, what probably needs to happen in this is case is that
> >omap_i2c_isr() should be doing a "mark last busy" to reset the
> >autosuspend timeout. And, that should be done in a separate patch.

this doesn't make sense... mark last busy is done after the I2C message
is actually complete, so is pm_runtime_put_autosuspend() which is done
following mark_last_busy. autosuspend timer shouldn't be firing since
put won't be called until we're dead sure I2C message (all of them, in
fact) are finalized.

> Yes - from one side. From other side, this patch prevents such
> situation to happen.
> disable_irq(_dev->irq); - waits for any pending IRQ handlers for this
> interrupt
> to complete before returning.

IRQ subsystem makes sure to disable our IRQ line while our handler is
running and, because of IRQF_ONESHOT, IRQ will only be re-enabled after
thread has finished.

> If we are in .runtime_idle() callback - it means I2C transaction has
> been finished

we don't even use runtime_idle() and your statement is wrong, btw. If
you have 200 pm_runtime_get() followed by 200 pm_runtime_put() (put
is called only after 200 gets, no put-get ping-pong), your
->runtime_resume() gets called once, your ->runtime_suspend() gets
called once but your ->runtime_idle() will get called 200 times.

assuming I2C transaction has finished is wrong here. What if we decide
to add pm_runtime_get() on hardirq handler and pm_runtime_put() to the
end of the thread handler ?

If you mask IRQs on ->runtime_idle(), you might miss some events.

> As I mentioned in cover-latter this happens because PM runtime
> auto-suspend isn't
> working properly during the boot:
>
> [ 23.190002] omap_i2c 48350000.i2c: ==== i2c_add_numbered_adapter
> [ 23.204681] omap_i2c 48350000.i2c: bus 3 rev0.11 at 400 kHz
> [ 23.211669] omap_i2c 48350000.i2c: ====== rpm_suspend elapsed 0 last_busy 4294939616
> [ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939716 last_busy 4294939616 autosuspend_delay 1000
> [ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939700
> [ 23.237274] omap_i2c 48350000.i2c: ====== rpm_suspend mod_timer expires 4294939700
> --- the timer scheduled to 84 jiffes
> [ 23.246185] omap_i2c 48350000.i2c: ==== pm_runtime_put_autosuspend
> [ 23.253448] omap-dma-engine 4a056000.dma-controller: allocating channel for 62
> [ 23.261138] omap-dma-engine 4a056000.dma-controller: allocating channel for 61
> [ 23.269500] omap_i2c 48070000.i2c: ==== omap_i2c_runtime_resume
> [ 23.275817] omap_i2c 48350000.i2c: ==== omap_i2c_runtime_suspend
> --- and expired after ~40 ms
>
> So, there are no guaranty that this race will not happen even if I'll
> do "mark last busy"
> in HW IRQ handler (, because in high CPU load use case the threaded
> IRQ can be delayed.

then the bug is on pm_runtime time keeping, no ? Not really on the I2C
driver.

--
balbi


Attachments:
(No filename) (7.28 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2013-06-19 19:39:24

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH 2/5] i2c: omap: add runtime check in isr to be sure that i2c is enabled

On Wed, Jun 19, 2013 at 09:42:25PM +0300, Grygorii Strashko wrote:
> Hi Felipe,
> On 06/07/2013 10:02 PM, Felipe Balbi wrote:
> >Hi,
> >
> >On Fri, Jun 07, 2013 at 09:46:05PM +0300, Grygorii Strashko wrote:
> >>Add runtime check at the beginning of omap_i2c_isr/omap_i2c_isr_thread
> >>to be sure that i2c is enabled, before performing IRQ handling and accessing
> >>I2C IP registers:
> >> if (pm_runtime_suspended(dev->dev)) {
> >> WARN_ONCE(true, "We should never be here!\n");
> >> return IRQ_NONE;
> >> }
> >>
> >>Produce warning in case if ISR called when i2c is disabled
> >>
> >>CC: Kevin Hilman <[email protected]>
> >>Signed-off-by: Grygorii Strashko <[email protected]>
> >>---
> >> drivers/i2c/busses/i2c-omap.c | 10 ++++++++++
> >> 1 file changed, 10 insertions(+)
> >>
> >>diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
> >>index 97844ff..2dac598 100644
> >>--- a/drivers/i2c/busses/i2c-omap.c
> >>+++ b/drivers/i2c/busses/i2c-omap.c
> >>@@ -885,6 +885,11 @@ omap_i2c_isr(int irq, void *dev_id)
> >> u16 stat;
> >> spin_lock(&dev->lock);
> >>+ if (pm_runtime_suspended(dev->dev)) {
> >>+ WARN_ONCE(true, "We should never be here!\n");
> >>+ return IRQ_NONE;
> >>+ }
> >returning IRQ_NONE is not what you want to do in this case. You want to
> >setup a flag so that your runtime_resume() knows that there are pending
> >events to be handled and you handle those in runtime_resume time.
> I don't want to handle this IRQ - we should never be here.
> Will be changed to IRQ_HANDLED.

blindly returning IRQ_HANDLED won't do you any good either. Your line
will be re-enabled anyway and you'll get another IRQ even being fired.

If you have found a bug in the driver, fix it, don't try to mask it.

> >But to be frank, I don't see how this can trigger since we're calling
> >pm_runtime_get_sync() from omap_i2c_xfer() which means by the time
> >pm_runtime_get_sync() returns, assuming no errors, i2c module should be
> >fully resumed and ready to go. Perhaps you have found a bug somewhere
> >else ?
> May be it's better to revert this patch:
> e3a36b207f76364c281aeecaf14c1b22a7247278
> i2c: omap: remove unnecessary pm_runtime_suspended check
>
> which doesn't cover case when transfer is *finished*.

what happens when transfer is finished ? After we come out of the loop
in omap_i2c_xfer() we will mark last busy and
pm_runtime_put_autosuspend()

633 static int
634 omap_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num)
635 {
636 struct omap_i2c_dev *dev = i2c_get_adapdata(adap);
637 int i;
638 int r;
639
640 r = pm_runtime_get_sync(dev->dev);
641 if (IS_ERR_VALUE(r))
642 goto out;
643
644 r = omap_i2c_wait_for_bb(dev);
645 if (r < 0)
646 goto out;
647
648 if (dev->set_mpu_wkup_lat != NULL)
649 dev->set_mpu_wkup_lat(dev->dev, dev->latency);
650
651 for (i = 0; i < num; i++) {
652 r = omap_i2c_xfer_msg(adap, &msgs[i], (i == (num - 1)));
653 if (r != 0)
654 break;
655 }
656
657 if (r == 0)
658 r = num;
659
660 omap_i2c_wait_for_bb(dev);
661
662 if (dev->set_mpu_wkup_lat != NULL)
663 dev->set_mpu_wkup_lat(dev->dev, -1);
664
665 out:
666 pm_runtime_mark_last_busy(dev->dev);
667 pm_runtime_put_autosuspend(dev->dev);
668 return r;
669 }

When that timer expires, we will mask the controller's IRQs on our
->runtime_suspend().

Now, if another IRQ comes before we ->runtime_suspend(), then we need to
figure out what's generating that event, we don't want to be "fixing"
what's not broken in this driver.

> Please, see https://patchwork.kernel.org/patch/2689211/ and
> cover-latter.

that patch is fine, but doesn't seem like has nothing to do with what
you're talking about here.

> >Also, your 'We should never be here' message isn't helpfull at all.

nor is your explanation of the problem. It's not sufficient.

I'm telling you that we should never reach this case because that's the
assumption that the driver makes. It assumes that no IRQs will be fired
unless a transfer is started and by the time that for loop ends, no
transfer will be started.

> >>@@ -905,6 +910,11 @@ omap_i2c_isr_thread(int this_irq, void *dev_id)
> >> u16 stat;
> >> int err = 0, count = 0;
> >>+ if (pm_runtime_suspended(dev->dev)) {
> >>+ WARN_ONCE(true, "We should never be here!\n");
> >>+ return IRQ_NONE;
> >>+ }
> >because of IRQF_ONESHOT I can't see how this would *ever* be a valid
> >check.
> >
> Please, see https://patchwork.kernel.org/patch/2689211/ and
> cover-latter.

explain to me how would we get to this point, meaning the IRQ thread
handler, with the device disabled.

--
balbi


Attachments:
(No filename) (4.77 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2013-06-19 19:44:22

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH 3/5] i2c: omap: handle all irqs befor unblocking omap_i2c_xfer_msg()

On Wed, Jun 19, 2013 at 09:43:04PM +0300, Grygorii Strashko wrote:
> Hi Felipe,
> On 06/07/2013 10:05 PM, Felipe Balbi wrote:
> >Hi,
> >
> >On Fri, Jun 07, 2013 at 09:46:06PM +0300, Grygorii Strashko wrote:
> >>ARDY|NACK and ARDY|AL are set together in OMAP_I2C_STAT_REG, which will be
> >Have you seen that happen ever ? AL is Arbitration Lost, we never put
> >OMAP in a multi-master environment before.
> This is an example from real life:
> [ 0.271942] omap_i2c omap_i2c.1: bus 1 rev2.4.0 at 400 kHz
> [ 1.283416] omap_i2c omap_i2c.1: timeout waiting for bus ready
> [ 1.300109] OMAP_I2C DEBUG: stat=1001
> [ 1.300140] omap_i2c omap_i2c.1: Arbitration lost
> [ 1.300140] OMAP_I2C DEBUG: IE=601F
> [ 1.300140] OMAP_I2C DEBUG: STAT=1000
> [ 1.300170] OMAP_I2C DEBUG: IV=636F
> [ 1.300170] OMAP_I2C DEBUG: WE=636F
> [ 1.300170] OMAP_I2C DEBUG: SYSS=1
> [ 1.300170] OMAP_I2C DEBUG: BUF=707
> [ 1.300201] OMAP_I2C DEBUG: CNT=1
> [ 1.300201] OMAP_I2C DEBUG: DATA=1
> [ 1.300201] OMAP_I2C DEBUG: SYSC=215
> [ 1.300201] OMAP_I2C DEBUG: CON=8200
> [ 1.300231] OMAP_I2C DEBUG: OA=0
> [ 1.300231] OMAP_I2C DEBUG: SA=49
> [ 1.300231] OMAP_I2C DEBUG: PSC=9
> [ 1.300262] OMAP_I2C DEBUG: SCLL=9
> [ 1.300262] OMAP_I2C DEBUG: SCLH=3
> [ 1.300262] OMAP_I2C DEBUG: SYSTEST=1E0
> [ 1.300262] OMAP_I2C DEBUG: BUFSTAT=4000
>
> and my headache now :..(

have you looked for erratas around that ? Maybe you just found a silicon
issue. Why is AL bit being set ? Have you tried to reach the IP owner ?

If there are no other I2C masters in the system, there will be no
arbitration, hence we would never loose the arbitration.

> >ARDY | NACK I also find it a bit hard for those two to happen together
> >since ARDY will be set when you can change controller's register
> >*again*, mening that a transfer has completed.
> There are examples:
> [ 3.544952] omap_i2c 48060000.i2c: IRQ (ISR = 0x0006)
>
> [ 25.574523] omap_i2c 48350000.i2c: IRQ (ISR = 0x0014)
> [ 25.579925] omap_i2c 48350000.i2c: IRQ (ISR = 0x0012)
>
> to see it - enable debug output in omap_i2c_isr_thread:
> dev_dbg(dev->dev, "IRQ (ISR = 0x%04x)\n", stat);

then you need to figure out why that's happening, right ? What do you do
to trigger that particular condition, have you looked in the wire to see
if you find a real NACK or is the OMAP I2C controller misbehaving ?

> >Also, we need to follow what the programming model says. And, I don't
> >have docs with me right now, but IIRC it tells us to bail out if any of
> >the error conditions are met.
> >
> yep, but first of all - all IRQs need to be acked before exit.

that's alright, then fix only that... OTOH, you don't want to ack a
read while data is still sitting in the FIFO, data you haven't read out
of the FIFO, I mean. Not sure if that could happen though.

--
balbi


Attachments:
(No filename) (2.79 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2013-06-19 19:44:53

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH 5/5] i2c: omap: remove omap_i2c_isr() hw irq handler

On Wed, Jun 19, 2013 at 09:43:17PM +0300, Grygorii Strashko wrote:
> Hi Felipe,
>
> On 06/07/2013 10:07 PM, Felipe Balbi wrote:
> >Hi,
> >
> >On Fri, Jun 07, 2013 at 09:46:08PM +0300, Grygorii Strashko wrote:
> >>The omap_i2c_isr() does the irq check and schedules threaded handler if any of
> >>enabled IRQs is active, but currently the I2C IRQs are enabled just once,
> >>when I2C IP is enabling (transfer started) and they aren't changed after that.
> >>More over, now the I2C INTC IRQ is disabled when I2C IP is idled.
> >>Thus, I2C IRQs will start coming only when we want that, and there is
> >>no sense to have omap_i2c_isr() anymore:
> >so ? we still want to check if this device generated IRQs in hardirq
> >context. What if the IRQ line is shared ?
> >
> Pleas see, https://patchwork.kernel.org/patch/2689211/
> [1/5] i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle
>
> It covers shared IRQ problem

then you don't need $SUBJECT.

--
balbi


Attachments:
(No filename) (972.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2013-06-19 20:01:36

by Kevin Hilman

[permalink] [raw]
Subject: Re: [PATCH 1/5] i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle

Felipe Balbi <[email protected]> writes:

[...]

> If you have 200 pm_runtime_get() followed by 200 pm_runtime_put() (put
> is called only after 200 gets, no put-get ping-pong), your
> ->runtime_resume() gets called once, your ->runtime_suspend() gets
> called once but your ->runtime_idle() will get called 200 times.

No. The driver's ->runtime_idle() will only be called when the usecount
goes to zero.

Kevin

2013-06-19 20:56:14

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH 1/5] i2c: omap: fix spurious IRQs: disable/enable IRQ at INTC when idle

Hi,

On Wed, Jun 19, 2013 at 01:01:28PM -0700, Kevin Hilman wrote:
> Felipe Balbi <[email protected]> writes:
>
> [...]
>
> > If you have 200 pm_runtime_get() followed by 200 pm_runtime_put() (put
> > is called only after 200 gets, no put-get ping-pong), your
> > ->runtime_resume() gets called once, your ->runtime_suspend() gets
> > called once but your ->runtime_idle() will get called 200 times.
>
> No. The driver's ->runtime_idle() will only be called when the usecount
> goes to zero.

Indeed, just re-read the code.

--
balbi


Attachments:
(No filename) (533.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments