Date: Mon, 13 Jul 2015 12:03:44 -0500
From: Michael Welling
To: Sebastian Reichel
Cc: Tony Lindgren, Pali Rohár, Pavel Machek, Ivaylo Dimitrov, Aaro Koskinen, Nishanth Menon, linux-omap@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: linux 4.2-rc1 broken Nokia N900
Message-ID: <20150713170344.GA5825@deathray>
In-Reply-To: <20150713080920.GA25585@earth>

On Mon, Jul 13, 2015 at 10:09:21AM +0200, Sebastian Reichel wrote:
> [+cc Michael Welling, author of all omap-spi patches between 4.1 and 4.2-rc1]
>
> Hi,
>
> On Sun, Jul 12, 2015 at 11:44:25PM -0700, Tony Lindgren wrote:
> > * Pali Rohár [150711 05:07]:
> > > Hello,
> > >
> > > now I tested 4.2-rc1 release on Nokia N900 and couple of drivers are
> > > broken and cause kernel oops...
> > >
> > > Basically wifi, touchscreen and rtc drivers not working...
> > >
> > > Here are some relevant snippets from dmesg:
> > >
> > > [ 13.933959] Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa09802c
> > > [ 13.940490] pgd = cfb38000
> > > [ 13.946594] [fa09802c] *pgd=48011452(bad)
> > > [ 13.952758] Internal error: : 1028 [#1] PREEMPT ARM
> > > [ 13.958862] Modules linked in: tsc2005(+) omap_sham twl4030_wdt omap_wdt
> > > [ 13.965332] CPU: 0 PID: 183 Comm: modprobe Not tainted 4.2.0-rc1+ #363
> > > [ 13.971801] Hardware name: Nokia RX-51 board
> > > [ 13.978302] task: cf572300 ti: cb1f2000 task.ti: cb1f2000
> > > [ 13.984924] PC is at omap2_mcspi_set_cs+0x44/0x4c

Here is the disassembly of the omap2_mcspi_set_cs function from my compiler:

00000040 <omap2_mcspi_set_cs>:
  40:   e2803e25    add     r3, r0, #592        ; 0x250
  44:   e5902258    ldr     r2, [r0, #600]      ; 0x258
  48:   e1d330b2    ldrh    r3, [r3, #2]
  4c:   e3130004    tst     r3, #4
  50:   12211001    eorne   r1, r1, #1
  54:   e3520000    cmp     r2, #0
  58:   012fff1e    bxeq    lr
  5c:   e5923018    ldr     r3, [r2, #24]
  60:   e3510000    cmp     r1, #0
  64:   13c33601    bicne   r3, r3, #1048576    ; 0x100000
  68:   03833601    orreq   r3, r3, #1048576    ; 0x100000
  6c:   e5823018    str     r3, [r2, #24]
  70:   e5902258    ldr     r2, [r0, #600]      ; 0x258
  74:   e5922000    ldr     r2, [r2]
  78:   e582302c    str     r3, [r2, #44]       ; 0x2c
  7c:   e5903258    ldr     r3, [r0, #600]      ; 0x258
  80:   e5933000    ldr     r3, [r3]
  84:   e593202c    ldr     r2, [r3, #44]       ; 0x2c
  88:   e12fff1e    bx      lr

The omap2_mcspi_set_cs function is being called before controller_state
is initialized in omap2_mcspi_setup. That is why there is a conditional
checking whether controller_state is NULL.

Perhaps controller_state is uninitialized but holds garbage instead of
NULL, causing the data abort. That does not make much sense, though,
because a similar check in the setup function did not cause a data abort
in the past.

Not sure what is going wrong here. Could you do an objdump with the
compiler you are using?

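For readers who prefer C to ARM assembly, the listing above reads roughly like the sketch below. It is a userspace reconstruction from the disassembly only, not the driver source: the struct layouts, the plain array standing in for the register window, and every name starting with fake_ or sketch_ are made up for illustration. Only the mode bit 0x4, the 0x100000 mask, the 0x2c register offset, and the NULL check are taken from the listing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Constants read off the listing: tst #4, bic/orr #0x100000, str [r2, #44]. */
#define MODE_CS_HIGH   0x04u      /* assumed to correspond to SPI_CS_HIGH */
#define CHCONF_FORCE   0x100000u  /* bit 20 of the cached configuration word */
#define CHCONF0_OFFSET 0x2cu      /* byte offset of the register from the base */

/* Hypothetical stand-in for the per-device controller state. */
struct fake_cs {
	uint32_t *base;    /* register window; an ordinary array in this sketch */
	uint32_t chconf0;  /* cached copy of the register value */
};

/* Hypothetical stand-in for the SPI device. */
struct fake_spi_device {
	uint16_t mode;
	struct fake_cs *controller_state;  /* NULL until setup has run */
};

static void sketch_set_cs(struct fake_spi_device *spi, bool enable)
{
	volatile uint32_t *reg;
	uint32_t l;

	assert(spi != NULL);

	/* eorne r1, r1, #1: undo the inversion for active-high chip selects. */
	if (spi->mode & MODE_CS_HIGH)
		enable = !enable;

	/* cmp r2, #0 / bxeq lr: nothing to do before setup has run. */
	if (!spi->controller_state)
		return;

	/* Clear the FORCE bit when enable is set, set it otherwise. */
	l = spi->controller_state->chconf0;
	if (enable)
		l &= ~CHCONF_FORCE;
	else
		l |= CHCONF_FORCE;
	spi->controller_state->chconf0 = l;

	/* Write the register, then read it back as the listing does. */
	reg = spi->controller_state->base + CHCONF0_OFFSET / sizeof(uint32_t);
	*reg = l;
	(void)*reg;
}
```

In this sketch a NULL controller_state returns before any register access, which is the case the check is meant to cover. A stale non-NULL pointer would fall through to the register write and read-back, the two accesses at offset 0x2c in the listing.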
> > > [ 13.991485] LR is at spi_set_cs+0x5c/0x60
> > > [ 13.997985] pc : [] lr : [] psr: 20000013
> > > [ 13.997985] sp : cb1f3dd0 ip : 00000001 fp : 00000004
> > > [ 14.011260] r10: cfce5be8 r9 : 00000fff r8 : c0654f98
> > > [ 14.017913] r7 : 00000000 r6 : 00000000 r5 : 00000000 r4 : 00000000
> > > [ 14.024505] r3 : 200103dc r2 : fa098000 r1 : 00000001 r0 : cf09bc00
> > > [ 14.031036] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
> > > [ 14.037689] Control: 10c5387d Table: 8fb38019 DAC: 00000015
> > > [ 14.044403] Process modprobe (pid: 183, stack limit = 0xcb1f2210)
> > > [ 14.051300] Stack: (0xcb1f3dd0 to 0xcb1f4000)
> > > [ 14.058105] 3dc0: cf09bc00 c02bafa4 cf09bc00 cf09bc00
> > > [ 14.065277] 3de0: bf013444 bf01254c cf0e2230 cf0e2230 00000001 c0654f98 00000fff 00000fff
> > > [ 14.072570] 3e00: 00000008 00000002 00000118 00001f40 00000031 cf09bc00 ffffffed bf013444
> > > [ 14.080078] 3e20: 00000031 c0654f98 cb1f2000 00000000 00000000 c02bb5c0 cf09bc00 00000000
> > > [ 14.087738] 3e40: bf013454 c027a2f4 00000000 cf09bc00 bf013454 bf013454 00000000 c027a594
> > > [ 14.095367] 3e60: 00000000 cf09bc00 cf09bc34 c027a60c bf013454 cb1f3e80 c027a5ac c0278ec8
> > > [ 14.102935] 3e80: cf972c4c cf09d630 bf013454 bf013454 cbb55300 c06848d8 00000000 c0279c84
> > > [ 14.110473] 3ea0: bf01327c bf01327d 00000000 bf013454 cb889180 00000000 c0654f98 c027b0c8
> > > [ 14.117980] 3ec0: 00000000 bf015000 cb889180 c00095b0 0040003e cfe6a080 0040003f 00000000
> > > [ 14.125457] 3ee0: 00080000 cfcf9000 cb1f2000 60000013 0040003e cbf1bbc0 00000000 00000001
> > > [ 14.132843] 3f00: bf0134cc cb1f2000 bf0134c0 cb1f3f58 00000000 c04352d0 cf801f00 000000d0
> > > [ 14.140136] 3f20: bf0134c0 bf0134c0 0000416c cb889040 00000080 c000ebe4 cb1f2000 c0089f68
> > > [ 14.147308] 3f40: bf0134c0 cbf1bc00 001a9193 0000416c 001f8d20 c008ab30 d0b10000 0000416c
> > > [ 14.154571] 3f60: d0b1267c d0b1252b d0b13514 000016c0 00001ad0 00000000 00000000 00000000
> > > [ 14.161865] 3f80: 0000001f 00000020 00000017 00000014 00000012 00000000 00201208 00000000
> > > [ 14.169097] 3fa0: 00000000 c000ea60 00201208 00000000 001f8d20 0000416c 001a9193 00000000
> > > [ 14.176177] 3fc0: 00201208 00000000 00000000 00000080 00208c20 001a9193 bee09e98 00000000
> > > [ 14.183197] 3fe0: b6f742b4 bee09ae4 000153f0 000093e4 60000010 001f8d20 72757463 69665f65
> > > [ 14.190277] [] (omap2_mcspi_set_cs) from [] (spi_set_cs+0x5c/0x60)
> > > [ 14.197479] [] (spi_set_cs) from [] (spi_setup+0xd4/0x10c)
> > > [ 14.204833] [] (spi_setup) from [] (tsc2005_probe+0x104/0x484 [tsc2005])
> > > [ 14.212249] [] (tsc2005_probe [tsc2005]) from [] (spi_drv_probe+0x50/0x6c)
> > > [ 14.219818] [] (spi_drv_probe) from [] (really_probe+0xd4/0x230)
> > > [ 14.227478] [] (really_probe) from [] (driver_probe_device+0x30/0x48)
> > > [ 14.235290] [] (driver_probe_device) from [] (__driver_attach+0x60/0x84)
> > > [ 14.243286] [] (__driver_attach) from [] (bus_for_each_dev+0x50/0x84)
> > > [ 14.251281] [] (bus_for_each_dev) from [] (bus_add_driver+0xcc/0x1e0)
> > > [ 14.259246] [] (bus_add_driver) from [] (driver_register+0x9c/0xe0)
> > > [ 14.267272] [] (driver_register) from [] (do_one_initcall+0x100/0x1b0)
> > > [ 14.275421] [] (do_one_initcall) from [] (do_init_module+0x58/0x1bc)
> > > [ 14.283477] [] (do_init_module) from [] (SyS_init_module+0x54/0x64)
> > > [ 14.291412] [] (SyS_init_module) from [] (ret_fast_syscall+0x0/0x3c)
> > > [ 14.299407] Code: e5823018 e5902188 e5922000 e582302c (e592302c)
> > > [ 14.307403] ---[ end trace d21553dcaefcb5ac ]---
> >
> > That seems to be a regression with the SPI driver. Care to git bisect it?
> > It's probably one of the following commits:
> >
> > $ git log --pretty=oneline v4.1..v4.2-rc2 drivers/spi/spi-omap2-mcspi.c
> >
> > Looks like just modprobe tsc2005 is enough to reproduce it.
>
> mh omap2_mcspi_set_cs has been introduced in this range
> (ddcad7e9068) and from the commit message it seems to be
> a fix for the first commit (b28cb9414d) in this range.
>
> Just looking at the commit log, I suggest starting with testing if
> ddcad7e9068 is affected and if b28cb9414d~1 is not affected.
>
> -- Sebastian