Message-ID: <3923f27e-2a35-3116-fffb-560fdb544de2@linux.ibm.com>
Date: Wed, 23 Nov 2022 21:01:59 +0530
Subject: Re: [PATCH v14 0/7] crash:
 Kernel handling of CPU and memory hot un/plug
To: Eric DeVolder, linux-kernel@vger.kernel.org, x86@kernel.org,
 kexec@lists.infradead.org, ebiederm@xmission.com, dyoung@redhat.com,
 bhe@redhat.com, vgoyal@redhat.com
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, hpa@zytor.com, nramas@linux.microsoft.com,
 thomas.lendacky@amd.com, robh@kernel.org, efault@gmx.de, rppt@kernel.org,
 david@redhat.com, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com
References: <20221116214643.6384-1-eric.devolder@oracle.com>
From: Sourabh Jain
In-Reply-To: <20221116214643.6384-1-eric.devolder@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 17/11/22 03:16, Eric DeVolder wrote:
> When the kdump service is loaded, if a CPU or memory is hot
> un/plugged, the crash elfcorehdr, which describes the CPUs
> and memory in the system, must
> also be updated, else the resulting
> vmcore is inaccurate (e.g. missing either CPU context or memory
> regions).
>
> The current solution utilizes udev to initiate an unload-then-reload
> of the kdump image (e.g. kernel, initrd, boot_params, purgatory and
> elfcorehdr) by the userspace kexec utility. In previous posts I have
> outlined the significant performance problems related to offloading
> this activity to userspace.
>
> This patchset introduces a generic crash hot un/plug handler that
> registers with the CPU and memory notifiers. Upon CPU or memory
> changes, this generic handler is invoked and performs important
> housekeeping, for example obtaining the appropriate lock, and then
> invokes an architecture-specific handler to do the appropriate
> updates.
>
> In the case of x86_64, the arch-specific handler generates a new
> elfcorehdr, and overwrites the old one in memory; thus no
> involvement with userspace is needed.
>
> To realize the benefits/test this patchset, one must make a couple
> of minor changes to userspace:
>
>  - Prevent udev from updating kdump crash kernel on hot un/plug changes.
>    Add the following as the first lines to the RHEL udev rule file
>    /usr/lib/udev/rules.d/98-kexec.rules:
>
>    # The kernel handles updates to crash elfcorehdr for cpu and memory changes
>    SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
>    SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
>
>    These lines will cause cpu and memory hot un/plug events to be
>    skipped within this rule file, with this changeset applied.
>
>  - Change to the kexec_file_load for loading the kdump kernel:
>    E.g. on RHEL: in /usr/bin/kdumpctl, change to:
>     standard_kexec_args="-p -d -s"
>    which adds the -s to select kexec_file_load syscall.
>
> This kernel patchset also supports kexec_load() with a modified kexec
> userspace utility.
> A working changeset to the kexec userspace utility
> is posted to the kexec-tools mailing list here:
>
>  http://lists.infradead.org/pipermail/kexec/2022-October/026032.html
>
> To use the kexec-tools patch, apply, build and install kexec-tools,
> then change the kdumpctl's standard_kexec_args to replace the -s with
> --hotplug. The removal of -s reverts to the kexec_load syscall and
> the addition of --hotplug invokes the changes put forth in the
> kexec-tools patch.
>
> Regards,
> eric
> ---
> v14: 16nov2022
>  - Rebased onto 6.1.0-rc5
>  - Introduced CRASH_HOTPLUG Kconfig item to better fine tune
>    compilation of feature components, per Boris.
>  - Removed hp_action parameter to arch_crash_handle_hotplug_event()
>    as it is unused.

But what if an architecture has different actions based on hp type?

- Sourabh Jain
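
To make the question concrete, here is a minimal userspace sketch (not kernel code) of a generic handler that still passes the action down to the arch handler. Only the name arch_crash_handle_hotplug_event() comes from the patchset; the enum values, the counters, and the lock flag are hypothetical names made up for illustration, and the split into a "CPU path" and a "memory path" is just an assumed example of an architecture that wants to treat the two differently.

```c
#include <assert.h>

/* Hypothetical action codes, invented for this sketch. */
enum hp_action {
	HP_ADD_CPU,
	HP_REMOVE_CPU,
	HP_ADD_MEMORY,
	HP_REMOVE_MEMORY,
};

/* Stand-in for the lock taken by the generic handler (a plain flag here). */
static int crash_lock_held;

/* Counters standing in for two different arch-specific update paths. */
static int cpu_updates;
static int memory_updates;

/*
 * Arch handler that still receives the action, so it can pick a
 * different update path for CPU events and for memory events.
 */
static void arch_crash_handle_hotplug_event(enum hp_action action)
{
	assert(crash_lock_held);	/* must be called under the lock */

	switch (action) {
	case HP_ADD_CPU:
	case HP_REMOVE_CPU:
		cpu_updates++;
		break;
	case HP_ADD_MEMORY:
	case HP_REMOVE_MEMORY:
		memory_updates++;
		break;
	}
}

/* Generic handler: take the lock, dispatch to the arch handler, release. */
static void handle_hotplug_event(enum hp_action action)
{
	assert(!crash_lock_held);
	crash_lock_held = 1;
	arch_crash_handle_hotplug_event(action);
	crash_lock_held = 0;
}
```

With this shape, handle_hotplug_event(HP_ADD_CPU) only touches the CPU path and leaves the memory path alone. Without the action argument, such an arch handler would presumably have to redo both updates on every event.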