Date: Thu, 30 Nov 2023 14:41:12 +0100
From: Michal Hocko
To: Baoquan He
Cc: Donald Dutile, Jiri Bohac, Pingfan Liu, Tao Liu, Vivek Goyal, Dave Young, kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] kdump: crashkernel reservation from CMA
References: <91a31ce5-63d1-7470-18f7-92b039fda8e6@redhat.com>

On Thu 30-11-23 20:31:44, Baoquan He wrote:
[...]
> > > which doesn't use the proper pinning API (which would migrate away from
> > > the CMA) then what is the worst case? We will get crash kernel corrupted
> > > potentially and fail to take a proper kernel crash, right? Is this
> > > worrisome? Yes. Is it a real roadblock? I do not think so. The problem
>
> We may fail to take a proper kernel crash, why isn't it a roadblock?

It would be if the threat was practical. So far I only see very
theoretical what-if concerns. And I do not mean to downplay those at
all. As already explained, proper CMA users shouldn't ever leak out any
writes across kernel reboot.

> We
> have stable way with a little more memory, why would we take risk to
> take another way, just for saving memory? Usually only high end server
> needs the big memory for crashkernel and the big end server usually have
> huge system ram. The big memory will be a very small percentage relative
> to huge system RAM.

Jiri will likely be more specific about that, but our experience tells
us that proper crashkernel memory scaling has turned out to be a real
maintainability problem, because existing setups tend to break with
major kernel version upgrades or non-trivial changes.

> > > seems theoretical to me and it is not CMA usage at fault here IMHO. It
> > > is the said theoretical driver that needs fixing anyway.
>
> Now, what we want to make clear is if it's a theoretical possibility, or
> very likely happen.
> We have met several on-flight DMA stomping into
> kexec kernel's initrd in the past two years because device driver didn't
> provide shutdown() method properly. For kdump, once it happen, the pain
> is we don't know how to debug. For kexec reboot, customer allows to
> login their system to reproduce and figure out the stomping. For kdump,
> the system corruption rarely happened, and the stomping could rarely
> happen too.

Yes, this is understood.

> The code change looks simple and the benefit is very attractive. I
> surely like it if finally people confirm there's no risk. As I said, we
> can't afford to take the risk if it possibly happen. But I don't object
> if other people would rather take risk, we can let it land in kernel.

I think it is fair to be cautious and I wouldn't impose the new method
as a default. Only time can tell how safe this really is. It is hard to
protect against theoretical issues though. Bugs should be fixed.
I believe this option would make configuring kdump much easier and less
fragile.

> My personal opinion, thanks for sharing your thought.

Thanks for sharing.
-- 
Michal Hocko
SUSE Labs