Subject: Re: WARN @lib/refcount.c:128 during hot unplug of I/O adapter.
To: Sachin Sant, linuxppc-dev@ozlabs.org
Cc: LKML, "nfont >> Nathan Fontenot"
From: Tyrel Datwyler
Date: Thu, 6 Apr 2017 13:44:07 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/06/2017 03:27 AM, Sachin Sant wrote:
> On a POWER8 LPAR running 4.11.0-rc5, a hot unplug operation on
> any I/O adapter results in the following warning
>
> This problem has been in the code for some time now. I had first seen this in
> -next tree.
>
> [ 269.589441] rpadlpar_io: slot PHB 72 removed
> [ 270.589997] refcount_t: underflow; use-after-free.
> [ 270.590019] ------------[ cut here ]------------
> [ 270.590025] WARNING: CPU: 5 PID: 3335 at lib/refcount.c:128 refcount_sub_and_test+0xf4/0x110
> [ 270.590028] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc rpadlpar_io rpaphp kvm_pr kvm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag ghash_generic xts gf128mul vmx_crypto tpm_ibmvtpm tpm sg pseries_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables xfs libcrc32c sr_mod sd_mod cdrom ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
> [ 270.590076] CPU: 5 PID: 3335 Comm: drmgr Not tainted 4.11.0-rc5 #3
> [ 270.590079] task: c0000005d8df8600 task.stack: c0000000fb3a8000
> [ 270.590081] NIP: c000000001aa3ca4 LR: c000000001aa3ca0 CTR: 00000000006338e4
> [ 270.590084] REGS: c0000000fb3ab8a0 TRAP: 0700 Not tainted (4.11.0-rc5)
> [ 270.590087] MSR: 8000000000029033
> [ 270.590090] CR: 22002422 XER: 00000007
> [ 270.590093] CFAR: c000000001edaabc SOFTE: 1
> [ 270.590093] GPR00: c000000001aa3ca0 c0000000fb3abb20 c0000000025ea900 0000000000000026
> [ 270.590093] GPR04: c00000077fc4ada0 c00000077fc617b8 00000000000f0c33 0000000000000000
> [ 270.590093] GPR08: 0000000000000000 c00000000227146c 000000077d9e0000 0000000000003ff0
> [ 270.590093] GPR12: 0000000000002200 c00000000e802d00 0000000000000000 0000000000000000
> [ 270.590093] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 270.590093] GPR20: 0000000000000000 000000001001b5a8 0000000010018338 0000000010016650
> [ 270.590093] GPR24: 000000001001b278 c000000776e0fdcc 0000000010016650 0000000000000000
> [ 270.590093] GPR28: c00000077ffea910 c0000000fbf79180 c000000776e0fdc0 c0000000fbf791d8
> [ 270.590126] NIP [c000000001aa3ca4] refcount_sub_and_test+0xf4/0x110
> [ 270.590129] LR [c000000001aa3ca0] refcount_sub_and_test+0xf0/0x110
> [ 270.590132] Call Trace:
> [ 270.590134] [c0000000fb3abb20] [c000000001aa3ca0] refcount_sub_and_test+0xf0/0x110 (unreliable)
> [ 270.590139] [c0000000fb3abb80] [c000000001a8221c] kobject_put+0x3c/0xa0
> [ 270.590143] [c0000000fb3abbf0] [c000000001d22d34] of_node_put+0x24/0x40
> [ 270.590147] [c0000000fb3abc10] [c00000000165c874] ofdt_write+0x204/0x6b0
> [ 270.590151] [c0000000fb3abcd0] [c00000000197a220] proc_reg_write+0x80/0xd0
> [ 270.590155] [c0000000fb3abd00] [c0000000018de680] __vfs_write+0x40/0x1c0
> [ 270.590158] [c0000000fb3abd90] [c0000000018dffd8] vfs_write+0xc8/0x240
> [ 270.590162] [c0000000fb3abde0] [c0000000018e1c40] SyS_write+0x60/0x110
> [ 270.590165] [c0000000fb3abe30] [c0000000015cb184] system_call+0x38/0xe0
> [ 270.590168] Instruction dump:
> [ 270.590170] 7863d182 4e800020 7c0802a6 39200001 3d42fff8 3c62ffb1 386371a8 992a0171
> [ 270.590175] f8010010 f821ffa1 48436de1 60000000 <0fe00000> 38210060 38600000 e8010010
> [ 270.590180] ---[ end trace 08c7a2f3c8bead33 ]---
>
> Have attached the dmesg log from the system. Let me know if any additional
> information is required to help debug this problem.

I remember you mentioning this when the issue was brought up for CPUs. I
assume the case is the same here where the issue is only seen with
adapters that were hot-added after boot (ie. hot-remove of adapter
present at boot doesn't trip the warning)?

-Tyrel

>
> Thanks
> -Sachin
>
>
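
For readers following the trace: the warning is raised in refcount_sub_and_test, reached via kobject_put from of_node_put in ofdt_write. The sketch below is a small userspace model in Python, not kernel code, of how a saturating reference counter can report an underflow when a reference is dropped more times than it was taken. The class name ModelRefcount and its methods are hypothetical, get() is simplified, and nothing here is meant to identify the actual cause of the warning in this report.

```python
UINT_MAX = 0xFFFFFFFF


class ModelRefcount:
    """Userspace model of a saturating reference counter (illustrative only)."""

    def __init__(self, initial=1):
        self.value = initial
        self.warnings = []

    def get(self):
        # Simplified: take one reference unless the counter is saturated.
        if self.value != UINT_MAX:
            self.value += 1

    def put(self):
        """Drop one reference; return True only when the count reaches zero."""
        if self.value == UINT_MAX:
            # A saturated counter is never released.
            return False
        if self.value == 0:
            # More puts than gets: record a warning, leave the counter alone.
            self.warnings.append("underflow; use-after-free")
            return False
        self.value -= 1
        return self.value == 0


if __name__ == "__main__":
    node = ModelRefcount(initial=1)  # reference held by the creator
    node.get()                       # one extra user takes a reference
    print(node.put())                # extra user drops it
    print(node.put())                # creator drops it; object would be freed
    print(node.put())                # unbalanced put
    print(node.warnings)
```

In this model the third put() is the unbalanced one: the count is already zero, so the call records a warning instead of wrapping around.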