Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp624402ybb; Wed, 8 Apr 2020 06:47:53 -0700 (PDT) X-Google-Smtp-Source: APiQypKOGgsGXOhxj0tsFfbkJed8/8cO3LltMYGGw/2fDHFXpUCBKATcoRxVS3L3AcIG2e1MUg1K X-Received: by 2002:a05:6830:1348:: with SMTP id r8mr5519577otq.57.1586353673494; Wed, 08 Apr 2020 06:47:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1586353673; cv=none; d=google.com; s=arc-20160816; b=oIMZdBpByNQoKObugPuVFA+u44oxpx9MER60XeSdThdL/MU6EqILUZtHifg8z8Ljve 8O4zpWUCblhuwmYQKCIPZ+5MlsKoCwL5yO68LGLvHLEon1TiW0uKX78fwUsjNYmChK6z XJ1n8mC5Lt2GeZ1ARN+QwdZwge4CnLUPrGekyCcIrEISzIKVDOSZPGXR3c3gOb+bnpiW PFVkkJYAFKALHGC8hZBm7HGTtqDNHsRDeJQgsiP7I+5tkCDnLsnslvDgANZOxZKK8Yw/ WQazJ9aGDE8cg3wF2fbvJeFC3QrUSIwbl4IvruNNF/1LC+IYfOkWlsg7O1mL4CObPds+ PSng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from:dkim-signature; bh=a55Nshtqc7e/xjVF/YuhI8vysvRE6/BHxyFHHtWQTJw=; b=LB5oqPJ31pz3R8ylEv1Dqb10I0ofLBuu8sPC0uqp+ejiceA4zhoA7HmI7EG4M4uNfN K2+alXw+1S4MU+wiViwsHYNr3oh7bUDEw35vQiUPDeWhYEyhYpUyX9IDaWgmaa37UXBB 4jqwjA2/ZUb0ZVgA+tyitx7Ud1nr79nVmGwNUpW/m/BRvafgrxGlBZYIME7Mvqa1wgNU fRTfFYN4P8k1JkvyTv3dWYETptAWgbSiD+zK5NSuOcUrJ3nODJqtbh398BUP4QNocP3i rsL9lhUw+yhW/i0c5oPRY84XSThidmIcKX7iDYMiXbyf+ZUZUpNLh+Cm35g/i/3q2Upz AJ5A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ellerman.id.au header.s=201909 header.b=MaMka7xA; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t5si2433821otl.193.2020.04.08.06.47.39; Wed, 08 Apr 2020 06:47:53 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@ellerman.id.au header.s=201909 header.b=MaMka7xA; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728725AbgDHMVW (ORCPT + 99 others); Wed, 8 Apr 2020 08:21:22 -0400 Received: from ozlabs.org ([203.11.71.1]:58909 "EHLO ozlabs.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728705AbgDHMVU (ORCPT ); Wed, 8 Apr 2020 08:21:20 -0400 Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 48y3KV3x0wz9sTG; Wed, 8 Apr 2020 22:21:18 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ellerman.id.au; s=201909; t=1586348478; bh=4rUz+H7v2Waw0n15vK+mmX8ZhtuacfaRbQ798xCdaM0=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=MaMka7xAc3QkSqtY2z/tAQFXfmV4ucoolb1YqB7wefK3LbcrOrxfX6f+CLllXsIuK lmZVWGu2izJKrnQfReazlBh48q/o6auoc12hucM6QxPR1ju+Gyi71q4qkEA3tv7OrA kGWbDF6LxWTs9R9CyQKRIQd2IdCOStaxL5kCklL8tIRpRqitBPYdrI90PkbHGAOgKp ehDhGwwFteQ6pwz2dDzSmkyyeNzlrbRvu5f5ZAlri2k93vHk0+NNUIbaToBShXTabn ipiEocP6c8sr74GvXc0BbiybpCI0B4XxuwKC7d0MaNLGVKLThKG/blEw6nUmwhV1xF 3CA6d2ywW4snQ== From: Michael Ellerman To: Leonardo Bras , Nicholas Piggin , Alexios Zavras , Benjamin Herrenschmidt , Christophe Leroy , Greg Kroah-Hartman , Enrico Weigelt , Paul Mackerras , peterz@infradead.org, Thomas Gleixner Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org Subject: Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash In-Reply-To: References: <20200401000020.590447-1-leonardo@linux.ibm.com> <871rp6t9di.fsf@mpe.ellerman.id.au> <02e74be19534ab1db2f16a0c89ecb164e380c12a.camel@linux.ibm.com> <1585895551.7o9oa0ey62.astroid@bobo.none> Date: Wed, 08 Apr 2020 22:21:29 +1000 Message-ID: <87v9majhh2.fsf@mpe.ellerman.id.au> MIME-Version: 1.0 Content-Type: text/plain Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Leonardo Bras writes: > Hello Nick, Michael, > > On Fri, 2020-04-03 at 16:41 +1000, Nicholas Piggin wrote: > [...] >> > > PAPR says we are not allowed to have multiple CPUs calling RTAS at once, >> > > except for a very small list of RTAS calls. So if we bust the RTAS lock >> > > there's a risk we violate that part of PAPR and crash even harder. >> > >> > Interesting, I was not aware. >> > >> > > Also it's not specific to kdump, we can't even get through a normal >> > > reboot if we crash with the RTAS lock held. >> > > >> > > Anyway here's a patch with some ideas. That allows me to get from a >> > > crash with the RTAS lock held through kdump into the 2nd kernel. But it >> > > only works if it's the crashing CPU that holds the RTAS lock. >> > > >> > >> > Nice idea. >> > But my test environment is just triggering a crash from sysrq, so I >> > think it would not improve the result, given that this thread is >> > probably not holding the lock by the time. >> >> Crash paths should not take that RTAS lock, it's a massive pain. I'm >> fixing it for machine check, for other crashes I think it can be removed >> too, it just needs to be unpicked. The good thing with crashing is that >> you can reasonably *know* that you're single threaded, so you can >> usually reason through situations like above. >> >> > I noticed that when rtas is locked, irqs and preemption are also >> > disabled. >> > >> > Should the IPI send by crash be able to interrupt a thread with >> > disabled irqs? >> >> Yes. It's been a bit painful, but in the long term it means that a CPU >> which hangs with interrupts off can be debugged, and it means we can >> take it offline to crash without risking that it will be clobbering what >> we're doing. >> >> Arguably what I should have done is try a regular IPI first, wait a few >> seconds, then NMI IPI. >> >> A couple of problems with that. Firstly it probably avoids this issue >> you hit almost all the time, so it won't get fixed. So when we really >> need the NMI IPI in the field, it'll still be riddled with deadlocks. >> >> Secondly, sending the IPI first in theory can be more intrusive to the >> state that we want to debug. It uses the currently running stack, paca >> save areas, ec. NMI IPI uses its own stack and save regions so it's a >> little more isolated. Maybe this is only a small advantage but I'd like >> to have it if we can. > > I think the printk issue is solved (sent a patch on that), now what is > missing is the rtas call spinlock. > > I noticed that rtas.lock is taken on machine_kexec_mask_interrupts(), > which happen after crashing the other threads and getting into > realmode. > > The following rtas are called each IRQ with valid interrupt descriptor: > ibm,int-off : Reset mask bit for that interrupt > ibm,set_xive : Set XIVE priority to 0xff > > By what I could understand, these rtas calls happen to put the next > kexec kernel (kdump kernel) in a safer environment, so I think it's not > safe to just remove them. Yes. > (See commit d6c1a9081080c6c4658acf2a06d851feb2855933) In hindsight the person who wrote that commit was being lazy. We *should* have made the 2nd kernel robust against the IRQ state being messed up. > On the other hand, busting the rtas.lock could be dangerous, because > it's code we can't control. > > According with LoPAR, for both of these rtas-calls, we have: > > For the PowerPC External Interrupt option: The call must be reentrant > to the number of processors on the platform. > For the PowerPC External Interrupt option: The argument call buffer for > each simultaneous call must be physically unique. Oh well spotted. Where is that in the doc? > Which I think means this rtas-calls can be done simultaneously. I think so too. I'll read PAPR in the morning and make sure. > Would it mean that busting the rtas.lock for these calls would be safe? What would be better is to make those specific calls not take the global RTAS lock to begin with. We should be able to just allocate the rtas_args on the stack, it's only ~80 odd bytes. And then we can use rtas_call_unlocked() which doesn't take the global lock. cheers