Date: Thu, 10 Dec 2015 11:43:46 -0800
From: David Daney
To: Will Deacon, Davidlohr Bueso, "Peter Zijlstra (Intel)", Thomas Gleixner, "Paul E. McKenney", Ingo Molnar
CC: Linux Kernel Mailing List, "linux-arm-kernel@lists.infradead.org", "Pinski, Andrew"
Subject: Commit 81a43adae3b9 (locking/mutex: Use acquire/release semantics) causing failures on arm64 (ThunderX)
Message-ID: <5669D5F2.5050004@caviumnetworks.com>

Hi,

We are getting soft lockup OOPs on Cavium CN88XX (A.K.A. ThunderX), which is an arm64 implementation.

A typical failure shows multiple threads stuck in mutex operations like this:

.
.
.
[ 68.909873] Task dump for CPU 18:
[ 68.909876] systemd-udevd R running task 0 537 534 0x00000002
[ 68.909877] Call trace:
[ 68.909880] [] dump_backtrace+0x0/0x17c
[ 68.909883] [] show_stack+0x24/0x2c
[ 68.909885] [] sched_show_task+0xb0/0x104
[ 68.909888] [] dump_cpu_task+0x48/0x54
[ 68.909890] [] rcu_dump_cpu_stacks+0x9c/0xec
[ 68.909893] [] rcu_check_callbacks+0x524/0xa18
[ 68.909896] [] update_process_times+0x44/0x74
[ 68.909899] [] tick_sched_timer+0x78/0x1ac
[ 68.909901] [] __hrtimer_run_queues+0x148/0x2d4
[ 68.909903] [] hrtimer_interrupt+0xb0/0x1f4
[ 68.909906] [] arch_timer_handler_phys+0x3c/0x48
[ 68.909909] [] handle_percpu_devid_irq+0xb0/0x1b0
[ 68.909912] [] generic_handle_irq+0x34/0x4c
[ 68.909914] [] __handle_domain_irq+0x90/0xfc
[ 68.909916] [] gic_handle_irq+0x90/0x18c
[ 68.909918] Exception stack(0xfffffe03f14e3920 to 0xfffffe03f14e3a40)
[ 68.909921] 3920: fffffe03fd5c5800 fffffe0000c55800 fffffe03f14e3a80 fffffe00000dabd8
[ 68.909924] 3940: 00000000a0000145 0000000000000015 fffffe03e9602400 fffffe00002fddb0
[ 68.909927] 3960: 0000000000000000 0000000000000000 fffffe03fd5c5810 fffffe03f14e0000
[ 68.909929] 3980: 0000000000000001 ffffffffff000000 fffffe03db307e38 0000000000000000
[ 68.909932] 39a0: 0000000000737973 00000000ffffffff 0000000000000000 000000003b364d50
[ 68.909935] 39c0: 0000000000000018 ffffffffa99641af 0016fd71b6000000 003b9aca00000000
[ 68.909937] 39e0: fffffe00001f1508 000003ff9b9fd028 000003ffed7a0a10 fffffe03fd5c5800
[ 68.909940] 3a00: fffffe0000c55800 fffffe0000cea1c8 fffffe03fd5a5800 fffffe0000ca2eb0
[ 68.909943] 3a20: 0000000000000015 fffffe03e9602400 fffffe0000cea1c8 fffffe0000712000
[ 68.909945] [] el1_irq+0x68/0xd8
[ 68.909948] [] mutex_optimistic_spin+0x9c/0x1d0
[ 68.909951] [] __mutex_lock_slowpath+0x44/0x158
[ 68.909953] [] mutex_lock+0x54/0x58
[ 68.909956] [] kernfs_iop_permission+0x38/0x70
[ 68.909959] [] __inode_permission+0x88/0xd8
[ 68.909961] [] inode_permission+0x30/0x6c
[ 68.909964] [] link_path_walk+0x68/0x4d4
[ 68.909966] [] path_openat+0xb4/0x2bc
[ 68.909968] [] do_filp_open+0x74/0xd0
[ 68.909971] [] do_sys_open+0x14c/0x228
[ 68.909973] [] SyS_openat+0x3c/0x48
[ 68.909976] [] el0_svc_naked+0x24/0x28
.
.
.

Reverting 81a43adae3b9 (locking/mutex: Use acquire/release semantics) makes the problem go away.

At this point it is unknown if this patch is incorrect, or if the underlying ARM64 atomic_*_{acquire,release} primitives are defective, or if the problem lies elsewhere.

I am not requesting any specific action with this e-mail, but wanted to draw attention to the issue. Undoubtedly we will be able to provide more detailed information about the issue in the coming days.

Thanks,
David Daney
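
Illustrative note on the "acquire/release semantics" named in the subject line: below is a minimal sketch in userspace C11, not kernel code and not the contents of commit 81a43adae3b9. The type toy_mutex and its functions are hypothetical names made up for this example. It only shows the general idea that taking a lock needs ACQUIRE ordering and dropping it needs RELEASE ordering, rather than a full barrier on both sides. It says nothing about where the ThunderX failure actually comes from.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock for illustration only: 1 = unlocked, 0 = locked. */
struct toy_mutex {
    atomic_int count;
};

static inline void toy_mutex_init(struct toy_mutex *m)
{
    atomic_init(&m->count, 1);
}

/*
 * One attempt to take the lock. On success the compare-and-swap has
 * ACQUIRE ordering: memory accesses inside the critical section cannot
 * be reordered before it. On failure nothing was acquired, so relaxed
 * ordering is enough.
 */
static inline bool toy_mutex_trylock(struct toy_mutex *m)
{
    int expected = 1;

    return atomic_compare_exchange_strong_explicit(
        &m->count, &expected, 0,
        memory_order_acquire,    /* ordering on success */
        memory_order_relaxed);   /* ordering on failure */
}

/* Spin until the lock is taken (a real mutex would sleep instead). */
static inline void toy_mutex_lock(struct toy_mutex *m)
{
    while (!toy_mutex_trylock(m))
        ;
}

/*
 * Drop the lock. The store has RELEASE ordering: memory accesses inside
 * the critical section cannot be reordered after it, so the next thread
 * whose acquire observes this store also observes those accesses.
 */
static inline void toy_mutex_unlock(struct toy_mutex *m)
{
    /* Debug check: unlocking a lock that is not held is a caller bug. */
    assert(atomic_load_explicit(&m->count, memory_order_relaxed) == 0);
    atomic_store_explicit(&m->count, 1, memory_order_release);
}
```

A caller would run toy_mutex_init() once and then bracket each critical section with toy_mutex_lock() and toy_mutex_unlock(). If either side were weaker than acquire or release respectively, accesses could leak out of the critical section on a weakly ordered CPU.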