Hi Gregory,
Are you able to provide a bit more context around
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6cf70ae928bae
To save you a click, that's commit 6cf70ae928ba ("i2c: mv64xxx: Fix bus
hang on A0 version of the Armada XP SoCs"). Basically, you added the
"marvell,mv78230-a0-i2c" compatible and used that to disable the I2C
transfer offload feature. It was almost 10 years ago so I don't really
expect anyone to remember.
I've been chasing an issue where certain I2C bus conditions (which I'm
now injecting using another board and the i2c-gpio fault injection)
cause a system-wide lockup on some Marvell SoCs. The response I've got
from Marvell via their FAE is that these adverse bus conditions make the
I2C controller assume that another master is accessing the bus. It then
waits for the other master to generate a STOP condition (which never
happens).
Their suggestion was to check for the bus being idle (SDA/SCL high)
before launching the transfer. That avoids the issue if SCL or SDA is
shorted to ground but doesn't help with the lockup caused by the
incomplete_address_phase or incomplete_write_byte fault injectors. Their
response to that was basically "meh, protocol error".
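For reference, the idle check they suggested amounts to something like
the sketch below. This is only an illustration and not the actual
mv64xxx driver code: struct i2c_lines, read_lines_fn, i2c_bus_is_idle()
and fixed_lines() are hypothetical names, and how the lines are really
sampled (e.g. via GPIO) is left to the callback.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the bus lines; true means the line reads high. */
struct i2c_lines {
	bool scl;
	bool sda;
};

/* Hypothetical callback that samples SCL and SDA once. */
typedef struct i2c_lines (*read_lines_fn)(void *ctx);

/*
 * Return true only if SCL and SDA both read high on every one of
 * `samples` consecutive reads (at least one read is always done).
 */
static bool i2c_bus_is_idle(read_lines_fn read_lines, void *ctx, int samples)
{
	if (samples < 1)
		samples = 1;

	for (int i = 0; i < samples; i++) {
		struct i2c_lines l = read_lines(ctx);

		if (!l.scl || !l.sda)
			return false;	/* a line is held low: bus not idle */
	}
	return true;
}

/* Example reader that just returns a fixed state, handy for testing. */
static struct i2c_lines fixed_lines(void *ctx)
{
	return *(struct i2c_lines *)ctx;
}
```

A check like this only catches a line that is stuck low before the
transfer starts. It cannot see a fault that appears part way through a
transfer, which is why it doesn't help with the incomplete_* cases.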
As a temporary workaround we ended up putting the MPP into GPIO mode and
making use of the i2c-gpio bus driver. That works but has its own
downsides when the CPU gets busy.
Initially I thought this affected only the newer ARM64 SoCs (CN9130 and
AC5), but I eventually found that since commit fbffee74986c ("ARM: dts:
Fix I2C repeated start issue on Armada-38x") we've been using the
"marvell,mv78230-a0-i2c" compatible string on the Armada-38x, which is
likely why I can't reproduce it on an Armada-385 based board. Using that
compatible string to disable the offload on my AC5 based board and the
CN9130-CRB seems to avoid the issue as well.
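My understanding of how the compatible string ends up controlling the
offload is roughly the sketch below. It is a simplified standalone model
rather than the driver source: offload_wanted() is a hypothetical helper,
and I'm assuming that "marvell,mv78230-i2c" is the compatible that turns
the offload on, with the a0 variant overriding it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the decision: given a node's compatible list,
 * report whether the transfer offload would be used.
 */
static bool offload_wanted(const char *const *compat, size_t n)
{
	bool offload = false;

	for (size_t i = 0; i < n; i++) {
		/* Assumption: this compatible enables the offload. */
		if (strcmp(compat[i], "marvell,mv78230-i2c") == 0)
			offload = true;
	}

	for (size_t i = 0; i < n; i++) {
		/* The a0 compatible always disables the offload. */
		if (strcmp(compat[i], "marvell,mv78230-a0-i2c") == 0)
			return false;
	}

	return offload;
}
```

So a node listing "marvell,mv78230-a0-i2c" falls back to plain
register-driven transfers, which is the configuration the Armada-38x has
been running with.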
I need to do more testing but it's likely we'll run with that as a
change for our boards. I'm also thinking that the I2C offload feature is
not really suitable for boards where the I2C bus is not completely
reliable (in my case the bus is connected to SFP cages and we've seen
all kinds of weird and wonderful errors due to different SFPs causing
shorts or just generally misbehaving).
Does any of that sound like the issue from the A0 Armada XP?
Thanks,
Chris