From: Pingfan Liu
Date: Thu, 30 Nov 2023 21:33:04 +0800
Subject: Re: [PATCH 0/4] kdump: crashkernel reservation from CMA
To: Michal Hocko
Cc: Baoquan He, Donald Dutile, Jiri Bohac, Tao Liu, Vivek Goyal, Dave Young,
    kexec@lists.infradead.org, linux-kernel@vger.kernel.org

On Thu, Nov 30, 2023 at 9:29 PM Michal Hocko wrote:
>
> On Thu 30-11-23 20:04:59, Baoquan He wrote:
> > On 11/30/23 at 11:16am, Michal Hocko wrote:
> > > On Thu 30-11-23 11:00:48, Baoquan He wrote:
> > > [...]
> > > > Now, we are worried if there's risk if the CMA area is retaken into kdump
> > > > kernel as system RAM. E.g., is it possible that 1st kernel's ongoing RDMA
> > > > or DMA will interfere with kdump kernel's normal memory accessing?
> > > > Because kdump kernel usually only resets and initializes the needed
> > > > devices, e.g. the dump target. Those unneeded devices are not shut down
> > > > and are left alone.
> > >
> > > I do not really want to discount your concerns but I am a bit confused why
> > > this matters so much. First of all, if there is a buggy RDMA driver
> > > which doesn't use the proper pinning API (which would migrate away from
> > > the CMA) then what is the worst case? We will get crash kernel corrupted
> > > potentially and fail to take a proper kernel crash dump, right? Is this
> > > worrisome? Yes. Is it a real roadblock? I do not think so. The problem
> > > seems theoretical to me and it is not CMA usage at fault here IMHO. It
> > > is the said theoretical driver that needs fixing anyway.
> > >
> > > Now, it is really fair to mention that CMA backed crash kernel memory
> > > has some limitations
> > >       - CMA reservation can only be used by the userspace in the
> > >         primary kernel. If the size is overshot this might have
> > >         a negative impact on kernel allocations
> > >       - userspace memory dumping in the crash kernel is fundamentally
> > >         incomplete.
> >
> > I am not sure if we are talking about the same thing. My concern is:
> > ====================================================================
> > 1) system corruption happened, crash dumping is prepared, cpu and
> >    interrupt controllers are shut down;
> > 2) all pci devices are kept alive;
> > 3) kdump kernel boots up, initialization is only done on those devices
> >    whose drivers are added into kdump kernel's initrd;
> > 4) those in-flight DMA engines could still be working if their kernel
> >    module is not loaded;
> >
> > In this case, if the DMA's destination is located in crashkernel=,cma
> > region, the DMA writing could continue even when kdump kernel has put
> > important kernel data into the area. Is this possible or absolutely not
> > possible with DMA, RDMA, or any other stuff which could keep accessing
> > that area?
>
> I do understand your concern. But as already stated if anybody uses
> movable memory (CMA included) as a target of {R}DMA then that memory
> should be properly pinned. That would mean that the memory will be
> migrated to somewhere outside of movable (CMA) memory before the
> transfer is configured. So modulo bugs this shouldn't really happen.
> Are there {R}DMA drivers that do not pin memory correctly? Possibly. Is
> that a roadblock to using CMA to back crash kernel memory? I do
> not think so. Those drivers should be fixed instead.
>

I think that is our concern.
Is there any way to guarantee that this will not happen, rather than
relying on 'should be'? Any static analysis at compile time, or a
dynamic checking method?

If this can be resolved, I think this method is promising.

Thanks,

Pingfan

> > The existing crashkernel= syntax can guarantee the reserved crashkernel
> > area for kdump kernel is safe.
>
> I do not think this is true. If a DMA is misconfigured it can still
> target crash kernel memory even if it is not mapped AFAICS. But those
> are theoreticals. Or am I missing something?
> --
> Michal Hocko
> SUSE Labs
>
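
To make the "dynamic checking" question above a little more concrete, here is a rough user-space model of the core test such a check would need: does any active DMA target range touch the CMA-backed crashkernel region? This is only a sketch under stated assumptions. `PhysRange` and `dma_targets_in_region` are made-up names, the addresses are invented, and nothing here is a kernel interface. It also sidesteps the hard part, which is how the set of live DMA targets would be obtained in the first place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysRange:
    """Half-open physical address range [start, end). Hypothetical type."""
    start: int
    end: int

    def overlaps(self, other: "PhysRange") -> bool:
        # Two half-open ranges share a byte iff each starts before the other ends.
        return self.start < other.end and other.start < self.end


def dma_targets_in_region(targets, region):
    """Return the DMA target ranges that touch `region` at all."""
    return [t for t in targets if t.overlaps(region)]


# Made-up addresses, for illustration only.
cma = PhysRange(0x4000_0000, 0x5000_0000)
targets = [
    PhysRange(0x1000_0000, 0x1000_1000),  # well outside the region
    PhysRange(0x4FFF_F000, 0x5000_1000),  # straddles the region's end
    PhysRange(0x5000_0000, 0x5000_1000),  # adjacent, shares no byte
]
offenders = dma_targets_in_region(targets, cma)
```

In this model only the straddling range ends up in `offenders`. A non-empty result is the condition that would have to be reported (or prevented) before the region is handed to the kdump kernel as system RAM.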