Date: Thu, 30 Nov 2023 14:43:08 +0100
From: Michal Hocko
To: Pingfan Liu
Cc: Baoquan He, Donald Dutile, Jiri Bohac, Tao Liu, Vivek Goyal, Dave Young, kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] kdump: crashkernel reservation from CMA
References: <91a31ce5-63d1-7470-18f7-92b039fda8e6@redhat.com>

On Thu 30-11-23 21:33:04, Pingfan Liu wrote:
> On Thu, Nov 30, 2023 at 9:29 PM Michal Hocko wrote:
> >
> > On Thu 30-11-23 20:04:59, Baoquan He wrote:
> > > On 11/30/23 at 11:16am, Michal Hocko wrote:
> > > > On Thu 30-11-23 11:00:48, Baoquan He wrote:
> > > > [...]
> > > > > Now, we are worried if there's risk if the CMA area is retaken into kdump
> > > > > kernel as system RAM. E.g. is it possible that 1st kernel's ongoing RDMA
> > > > > or DMA will interfere with kdump kernel's normal memory accessing?
> > > > > Because kdump kernel usually only reset and initialize the needed
> > > > > device, e.g. dump target. Those unneeded devices will be unshutdown and
> > > > > let go.
> > > >
> > > > I do not really want to discount your concerns but I am a bit confused why
> > > > this matters so much. First of all, if there is a buggy RDMA driver
> > > > which doesn't use the proper pinning API (which would migrate away from
> > > > the CMA) then what is the worst case? We will get crash kernel corrupted
> > > > potentially and fail to take a proper kernel crash, right? Is this
> > > > worrisome? Yes. Is it a real roadblock? I do not think so. The problem
> > > > seems theoretical to me and it is not CMA usage at fault here IMHO. It
> > > > is the said theoretical driver that needs fixing anyway.
> > > >
> > > > Now, it is really fair to mention that CMA backed crash kernel memory
> > > > has some limitations
> > > >   - CMA reservation can only be used by the userspace in the
> > > >     primary kernel. If the size is overshot this might have
> > > >     negative impact on kernel allocations
> > > >   - userspace memory dumping in the crash kernel is fundamentally
> > > >     incomplete.
> > >
> > > I am not sure if we are talking about the same thing. My concern is:
> > > ====================================================================
> > > 1) system corruption happened, crash dumping is prepared, cpu and
> > > interrupt controllers are shut down;
> > > 2) all pci devices are kept alive;
> > > 3) kdump kernel boot up, initialization is only done on those devices
> > > which drivers are added into kdump kernel's initrd;
> > > 4) those in-flight DMA engine could be still working if their kernel
> > > module is not loaded;
> > >
> > > In this case, if the DMA's destination is located in crashkernel=,cma
> > > region, the DMA writing could continue even when kdump kernel has put
> > > important kernel data into the area. Is this possible or absolutely not
> > > possible with DMA, RDMA, or any other stuff which could keep accessing
> > > that area?
> >
> > I do understand your concern. But as already stated if anybody uses
> > movable memory (CMA included) as a target of {R}DMA then that memory
> > should be properly pinned. That would mean that the memory will be
> > migrated to somewhere outside of movable (CMA) memory before the
> > transfer is configured. So modulo bugs this shouldn't really happen.
> > Are there {R}DMA drivers that do not pin memory correctly? Possibly. Is
> > that a roadblock to not using CMA to back crash kernel memory, I do
> > not think so. Those drivers should be fixed instead.
> >
> I think that is our concern. Is there any method to guarantee that
> will not happen instead of 'should be' ?
> Any static analysis during compiling time or dynamic checking method?

I am not aware of any method to detect that a driver is going to
configure an RDMA transfer.

> If this can be resolved, I think this method is promising.

Are you indicating this is a mandatory prerequisite?
-- 
Michal Hocko
SUSE Labs
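
For illustration only, the pinning argument in this thread can be sketched as a toy model. This is not kernel code: the frame numbers, the Memory class and its methods are all hypothetical, and the only behaviour modelled is the one described above, namely that a proper pin migrates the buffer out of the CMA area before the transfer is configured, while a buggy driver that skips the pin leaves its DMA target inside the area that the kdump kernel would later reuse.

```python
from dataclasses import dataclass, field

# Toy model only: every name here is hypothetical and nothing mirrors a kernel API.
CMA_RANGE = range(100, 200)  # page frames assumed to be reserved via crashkernel=,cma


@dataclass
class Memory:
    # Maps a userspace buffer name to the page frame currently backing it.
    backing: dict = field(default_factory=dict)
    next_normal: int = 0  # next free frame outside the CMA area

    def pin_longterm(self, buf: str) -> int:
        """Pin a buffer for DMA, first migrating it out of CMA if needed."""
        pfn = self.backing[buf]
        if pfn in CMA_RANGE:
            pfn = self.next_normal
            self.next_normal += 1
            self.backing[buf] = pfn
        return pfn

    def buggy_dma_target(self, buf: str) -> int:
        """Model a buggy driver that programs DMA without pinning."""
        return self.backing[buf]


def corrupts_crash_kernel(dma_target_pfn: int) -> bool:
    """In-flight DMA can hit the kdump kernel only if it lands in the CMA area."""
    return dma_target_pfn in CMA_RANGE


mem = Memory(backing={"rdma_buf": 150})
buggy_target = mem.buggy_dma_target("rdma_buf")  # frame 150, still inside CMA
pinned_target = mem.pin_longterm("rdma_buf")     # migrated to frame 0

print(corrupts_crash_kernel(buggy_target))   # True
print(corrupts_crash_kernel(pinned_target))  # False
```

The model has no way to tell in advance which driver will take the buggy path, which is the same gap the question about static or dynamic checking points at.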