From: "Madhani, Himanshu"
To: Abdul Haleem
CC: Bart Van Assche, "linuxppc-dev@lists.ozlabs.org", "linux-kernel@vger.kernel.org", "linux-block@vger.kernel.org", "keescook@chromium.org", "sim@linux.vnet.ibm.com", "linux-scsi@vger.kernel.org", "sfr@canb.auug.org.au", "linux-next@vger.kernel.org", "sachinp@linux.vnet.ibm.com", "mpe@ellerman.id.au"
Subject: Re: [linux-next][qla2xxx][85caa95]kernel BUG at lib/list_debug.c:31!
Date: Fri, 12 Jan 2018 00:21:44 +0000
References: <1515489264.3648.18.camel@abdul.in.ibm.com> <1515513297.2721.2.camel@wdc.com> <1515649111.3516.4.camel@abdul.in.ibm.com>
In-Reply-To: <1515649111.3516.4.camel@abdul.in.ibm.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

> On Jan 10, 2018, at 9:38 PM, Abdul Haleem wrote:
>
> On Tue, 2018-01-09 at 18:09 +0000, Madhani, Himanshu wrote:
>> Hello Abdul,
>>
>>> On Jan 9, 2018, at 7:54 AM, Bart Van Assche wrote:
>>>
>>> On Tue, 2018-01-09 at 14:44 +0530, Abdul Haleem wrote:
>>>> Greeting's,
>>>>
>>>> Linux next kernel panics on powerpc when module qla2xxx is load/unload.
>>>>
>>>> Machine Type: Power 8 PowerVM LPAR
>>>> Kernel : 4.15.0-rc2-next-20171211
>>>> gcc : version 4.8.5
>>>> Test type: module load/unload few times
>>>>
>>>> Trace messages:
>>>> ---------------
>>>> qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.03-k.
>>>> qla2xxx [0106:a0:00.0]-001a: : MSI-X vector count: 32.
>>>> qla2xxx [0106:a0:00.0]-001d: : Found an ISP2532 irq 505 iobase 0x00000000aeb324e6.
>>>> qla2xxx [0106:a0:00.0]-00c6:1: MSI-X: Failed to enable support with 32 vectors, using 16 vectors.
>>>> qla2xxx [0106:a0:00.0]-00fb:1: QLogic QLE2562 - PCIe 2-port 8Gb FC Adapter.
>>>> qla2xxx [0106:a0:00.0]-00fc:1: ISP2532: PCIe (5.0GT/s x8) @ 0106:a0:00.0 hdma- host#=1 fw=8.06.00 (90d5).
>>>> qla2xxx [0106:a0:00.1]-001a: : MSI-X vector count: 32.
>>>> qla2xxx [0106:a0:00.1]-001d: : Found an ISP2532 irq 506 iobase 0x00000000a46f1774.
>>>> qla2xxx [0106:a0:00.1]-00c6:2: MSI-X: Failed to enable support with 32 vectors, using 16 vectors.
>>>> 2xxx
>>>> qla2xxx [0106:a0:00.1]-00fb:2: QLogic QLE2562 - PCIe 2-port 8Gb FC Adapter.
>>>> qla2xxx [0106:a0:00.1]-00fc:2: ISP2532: PCIe (5.0GT/s x8) @ 0106:a0:00.1 hdma- host#=2 fw=8.06.00 (90d5).
>>>> 0:00.0]-500a:1: LOOP UP detected (8 Gbps).
>>>> qla2xxx [0106:a0:00.1]-500a:2: LOOP UP detected (8 Gbps).
>>>> list_add double add: new=000000008d33e594, prev=000000008d33e594, next=00000000adef1df4.
>>>> ------------[ cut here ]------------
>>>> kernel BUG at lib/list_debug.c:31!
>>>> Oops: Exception in kernel mode, sig: 5 [#1]
>>>> LE SMP NR_CPUS=2048 NUMA pSeries
>>>> Dumping ftrace buffer:
>>>> (ftrace buffer empty)
>>>> Modules linked in: qla2xxx(E) tg3(E) ibmveth(E) xt_CHECKSUM(E)
>>>> iptable_mangle(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E)
>>>> iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack_ipv4(E)
>>>> nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E)
>>>> nf_reject_ipv4(E) tun(E) bridge(E) stp(E) llc(E) kvm_pr(E) kvm(E)
>>>> sctp_diag(E) sctp(E) libcrc32c(E) tcp_diag(E) udp_diag(E)
>>>> ebtable_filter(E) ebtables(E) dccp_diag(E) ip6table_filter(E) dccp(E)
>>>> ip6_tables(E) iptable_filter(E) inet_diag(E) unix_diag(E)
>>>> af_packet_diag(E) netlink_diag(E) xts(E) sg(E) vmx_crypto(E)
>>>> pseries_rng(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E)
>>>> sunrpc(E) binfmt_misc(E) ip_tables(E) ext4(E) mbcache(E) jbd2(E)
>>>> fscrypto(E) sd_mod(E) ibmvscsi(E) scsi_transport_srp(E) nvme_fc(E)
>>>> nvme_fabrics(E) nvme_core(E) scsi_transport_fc(E)
>>>> ptp(E) pps_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
>>>> [last unloaded: qla2xxx]
>>>> CPU: 7 PID: 22230 Comm: qla2xxx_1_dpc Tainted: G E 4.15.0-rc2-next-20171211-autotest-autotest #1
>>>> NIP: c000000000511040 LR: c00000000051103c CTR: 0000000000655170
>>>> REGS: 000000009b7356fa TRAP: 0700 Tainted: G E (4.15.0-rc2-next-20171211-autotest-autotest)
>>>> MSR: 800000010282b033 CR: 22000022 XER: 00000009
>>>> CFAR: c000000000170594 SOFTE: 0
>>>> GPR00: c00000000051103c c0000000fc293ac0 c0000000010f1d00 0000000000000058
>>>> GPR04: c00000028fcccdd0 c00000028fce3798 80000000374060b8 ffffffffffffffff
>>>> GPR08: 0000000000000000 c000000000d435ec 000000028ef90000 0000000000002717
>>>> GPR12: 0000000000000000 c00000000e734980 c0000000001215d8 c0000002886996c0
>>>> GPR16: 0000000000000000 0000000000000020 c0000002813d83f8 0000000000000001
>>>> GPR20: 0000000020000000 0000000000002000 0000000000000002 c0000002813dc808
>>>> GPR24: 0000000000000003 0000000000000001 c00000027f5a5c20 c0000002813dced0
>>>> GPR28: c00000027f5a5d90 c00000027f5a5d90 c00000027f5a5c00 c0000002813dc7f8
>>>> NIP [c000000000511040] __list_add_valid+0x70/0xb0
>>>> LR [c00000000051103c] __list_add_valid+0x6c/0xb0
>>>> Call Trace:
>>>> [c0000000fc293ac0] [c00000000051103c] __list_add_valid+0x6c/0xb0 (unreliable)
>>>> [c0000000fc293b20] [d0000000051f1a08] qla24xx_async_gnl+0x108/0x420 [qla2xxx]
>>>> [c0000000fc293bc0] [d0000000051e762c] qla2x00_do_work+0x18c/0x8c0 [qla2xxx]
>>>> [c0000000fc293ce0] [d0000000051e8180] qla2x00_relogin+0x420/0xff0 [qla2xxx]
>>>> [c0000000fc293dc0] [c00000000012172c] kthread+0x15c/0x1a0
>>>> [c0000000fc293e30] [c00000000000b4e8] ret_from_kernel_thread+0x5c/0x74
>>>> Instruction dump:
>>>> 41de0018 38210060 38600001 e8010010 7c0803a6 4e800020 3c62ffae 7d445378
>>>> 38631748 7d254b78 4bc5f51d 60000000 <0fe00000> 3c62ffae 7cc43378 386316f8
>>>> ---[ end trace a41bc8bd434657f1 ]---
>>>>
>>>> Kernel panic - not syncing: Fatal exception
>>>> Dumping ftrace buffer:
>>>> (ftrace buffer empty)
>>>> Rebooting in 10 seconds..
>>>>
>>>> This trace back to the below code path:
>>>>
>>>> # gdb -batch vmlinux -ex 'list *(0xc000000000511040)'
>>>> 0xc000000000511040 is in __list_add_valid (lib/list_debug.c:29).
>>>> 24 "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
>>>> 25 prev, next->prev, next) ||
>>>> 26 CHECK_DATA_CORRUPTION(prev->next != next,
>>>> 27 "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
>>>> 28 next, prev->next, prev) ||
>>>> 29 CHECK_DATA_CORRUPTION(new == prev || new == next,
>>>> 30 "list_add double add: new=%p, prev=%p, next=%p.\n",
>>>> 31 new, prev, next))
>>>> 32 return false;
>>>> 33
>>>
>>> (+linux-scsi)
>>>
>>> Hello Abdul,
>>>
>>> Please report SCSI LLD issues on the linux-scsi mailing list.
>>>
>>> Bart.
>>
>> We have fixed this issue with following patch
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=4.16/scsi-queue&id=5d3300a9b8b122b4743aed5a178bf12c87e2b8c9
>>
>> Can you apply this on your setup and retry your test.
>
> I see the patch already applied to next-20171211 and the problem is also
> seen with the patch.
>
> I am attaching the kernel configuration file.
>
> Regard's
> Abdul
>

Looks like these 3 patches should help with the issue. Can you check whether your tree has them applied?

https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=4.16/scsi-queue&id=6d67492764b39ad6efb6822816ad73dc141752f4
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=4.16/scsi-queue&id=3dbec59bdf63f3c82323bd6ab8a4bd2946abaaec
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=4.16/scsi-queue&id=3dbec59bdf63f3c82323bd6ab8a4bd2946abaaec

If not, please pull them into your tree and retest.

Thanks,
- Himanshu
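
For readers following the trace: the "list_add double add" message comes from the `new == prev || new == next` check shown in the quoted gdb listing, and in the report `new` and `prev` print the same value. The following is a minimal user-space sketch of how a second tail insertion of the same entry trips that check. It assumes a simplified stand-in for the kernel's `struct list_head`; the helper names `init_list_head`, `list_add_valid` and `list_add_tail_checked` are hypothetical and are not the kernel's implementation, and the sketch says nothing about where in qla2xxx the second add actually happens.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* An empty circular list points at itself in both directions. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Hypothetical user-space analogue of the three conditions in the quoted
 * listing. Returns false instead of raising a BUG.
 */
static bool list_add_valid(struct list_head *entry,
                           struct list_head *prev,
                           struct list_head *next)
{
    if (next->prev != prev)             /* neighbours disagree: corruption */
        return false;
    if (prev->next != next)             /* neighbours disagree: corruption */
        return false;
    if (entry == prev || entry == next) /* entry already at this position: double add */
        return false;
    return true;
}

/*
 * Insert entry just before head (i.e. at the tail). The list is left
 * untouched and false is returned when validation fails.
 */
static bool list_add_tail_checked(struct list_head *entry,
                                  struct list_head *head)
{
    struct list_head *prev = head->prev;
    struct list_head *next = head;

    if (!list_add_valid(entry, prev, next))
        return false;

    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
    return true;
}
```

With an empty list, the first `list_add_tail_checked(&node, &head)` succeeds. A second call with the same `node` finds `head->prev == &node`, so `entry == prev` and the add is rejected, which corresponds to the `new == prev` pattern in the reported message.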