From: Håkon Bugge
Subject: Bug introduced by commit ebeeb1ad9b8a
Message-Id: <8EEB4CE2-F6E5-4128-AB04-6326F8315E31@oracle.com>
Date: Wed, 3 Oct 2018 13:20:44 +0200
To: Greg Kroah-Hartman
Cc: Sowmini Varadhan, Santosh Shilimkar, "David S. Miller", Ka-Cheong Poon, netdev@vger.kernel.or, OFED mailing list, rds-devel@oss.oracle.com, linux-kernel@vger.kernel.org, Yanjun Zhu
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Greg,

I hope you will find this note appropriate.

The stable cherry-pick of upstream commit ebeeb1ad9b8a ("rds: tcp: use
rds_destroy_pending() to synchronize netns/module teardown and rds
connection/workq management") provokes the following stack trace when
running with debugging enabled:

kernel: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:748
kernel: =============================
kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 4392, name: rds-stress
kernel: 1 lock held by rds-stress/4392:
kernel: #0: 00000000df837d5e
kernel: WARNING: suspicious RCU usage
kernel: 4.18.8 #1 Not tainted
kernel: -----------------------------
kernel: ./include/linux/rcupdate.h:303 Illegal context switch in RCU read-side critical section!
kernel: (
kernel: #012other info that might help us debug this:
kernel: #012rcu_scheduler_active = 2, debug_locks = 1
kernel: rcu_read_lock){....}
kernel: 1 lock held by rds-stress/4393:
kernel: #0:
kernel: , at: __rds_conn_create+0x604/0x960 [rds]
kernel: 00000000df837d5e
kernel: CPU: 38 PID: 4392 Comm: rds-stress Not tainted 4.18.8 #1
kernel: Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
kernel: (rcu_read_lock
kernel: Call Trace:
kernel: ){....}
kernel: dump_stack+0x81/0xb8
kernel: , at: __rds_conn_create+0x604/0x960 [rds]
kernel: #012stack backtrace:
kernel: ___might_sleep+0x239/0x260
kernel: __might_sleep+0x4a/0x80
kernel: __mutex_lock+0x58/0x9c0
kernel: ? __lock_acquire+0x47f/0x7e0
kernel: ? pcpu_alloc+0x429/0x860
kernel: ? find_held_lock+0x40/0xb0
kernel: ? create_object+0x22f/0x320
kernel: ? _raw_write_unlock_irqrestore+0x36/0x60
kernel: mutex_lock_killable_nested+0x1b/0x20
kernel: pcpu_alloc+0x429/0x860
kernel: ? create_object+0x22f/0x320
kernel: __alloc_percpu+0x15/0x20
kernel: rds_ib_recv_alloc_cache+0x1c/0x80 [rds_rdma]
kernel: rds_ib_recv_alloc_caches+0x1d/0x60 [rds_rdma]
kernel: rds_ib_conn_alloc+0x46/0x170 [rds_rdma]
kernel: __rds_conn_create+0x68d/0x960 [rds]
kernel: ? __rds_conn_create+0x604/0x960 [rds]
kernel: rds_conn_create_outgoing+0x14/0x20 [rds]
kernel: rds_sendmsg+0x2e8/0xcd0 [rds]
kernel: ? copy_msghdr_from_user+0xdb/0x140
kernel: sock_sendmsg+0x38/0x50
kernel: ___sys_sendmsg+0x27b/0x290
kernel: ? __lock_acquire+0x47f/0x7e0
kernel: ? find_held_lock+0x40/0xb0
kernel: ? __audit_syscall_entry+0xdf/0x160
kernel: ? ktime_get_coarse_real_ts64+0x6e/0xe0
kernel: ? trace_hardirqs_on_caller+0x128/0x1b0
kernel: ? trace_hardirqs_on+0xd/0x10
kernel: ? __audit_syscall_entry+0xdf/0x160
kernel: ? __audit_syscall_entry+0xdf/0x160
kernel: __sys_sendmsg+0x5d/0xb0
kernel: __x64_sys_sendmsg+0x1f/0x30
kernel: do_syscall_64+0x5f/0x220
kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe

Command line:

$ rds-stress -r & sleep 1; rds-stress -r -s -T 10

Deliberately or accidentally, Ka-Cheong's commit f394ad28feff ("rds:
rds_ib_recv_alloc_cache() should call alloc_percpu_gfp() instead") fixes
the bug introduced by commit ebeeb1ad9b8a. Kudos to Zhu Yanjun, who
quickly detected this.

But be aware that commit f394ad28feff does not contain the "Fixes:" tag.

Hence, I suggest that all stable releases containing commit
ebeeb1ad9b8a also include f394ad28feff.

Thxs, Håkon