Subject: Re: [PATCH 4.18 000/123] 4.18.6-stable review
To: Linus Torvalds
Cc: Greg Kroah-Hartman, Linux Kernel Mailing List, Andrew Morton, Shuah Khan, patches@kernelci.org, Ben Hutchings, lkft-triage@lists.linaro.org, stable
References: <20180903165719.499675257@linuxfoundation.org> <20180904162434.GA16396@roeck-us.net> <20180905090110.GC30538@kroah.com> <7d4d11ab-c769-44b4-0037-d1be7f45e2c8@roeck-us.net>
From: Guenter Roeck
Date: Sat, 8 Sep 2018 20:58:24 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/05/2018 10:01 AM, Linus Torvalds wrote:
> On Wed, Sep 5, 2018 at 8:34 AM Guenter Roeck wrote:
>>
>> On 09/05/2018 02:01 AM, Greg Kroah-Hartman wrote:
>>>> ---
>>>> [ 9990.754641] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:155]
>>>> [ 9990.762601] RIP: 0010:smp_call_function_many+0x208/0x270
>>>> [ 9990.762601] Code: e8 0d d1 77 00 3b 05 cb f0 24 01 0f 83 86 fe ff ff 48 63 d0 49 8b 0c 24 48 03 0c d5 00 f7 11 a7 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c7 0f b6 4d d0 4c 89 f2 4c 89 ee 44 89
>
> It's stuck in this loop:
>
>   loop:
>     pause
>     mov    0x18(%rcx),%edx
>     and    $0x1,%edx
>     jne    loop
>
> which is csd_lock_wait().
>
> Judging by the offset in smp_call_function_many(), it's the final one
> (there's two: the other one is part of "csd_lock()"). But that's just
> a guess.
>
> Anyway, it means that we're waiting for another CPU to finish
> processing an IPI - either a previous one we sent asynchronously (if
> it's the earlier csd_lock() case) or the TLB IPI we just sent and
> we're waiting for completion of.
>
>> Not tested, but I see it in v4.17.19 and in v4.18.6-rc2. Turns out it is
>> related to heavy load, not to suspend/resume. At this point I suspect that
>> it may be an AMD/Ryzen specific problem - it looks like it disappears if I
>> add "kernel.randomize_va_space = 0" to /etc/sysctl.conf. No idea if it is a
>> CPU bug or some AMD specific code problem. I'll try to analyze it further.
>
> Ouch. Some IPI sending/receiving problem would be very very painful to
> debug if it's hw related.
>

Turns out this is a well known problem with Ryzen CPUs:

https://bugzilla.kernel.org/show_bug.cgi?id=196683

Guenter