Subject: Re: [PATCH v1 00/10] mm: online/offline 4MB chunks controlled by device driver
To: Dave Young
Cc: Michal Hocko, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Alexander Potapenko, Andrew Morton, Andrey Ryabinin, Balbir Singh,
 Baoquan He, Benjamin Herrenschmidt, Boris Ostrovsky, Dan Williams,
 Dmitry Vyukov, Greg Kroah-Hartman, Hari Bathini, Huang Ying,
 Hugh Dickins, Ingo Molnar, Jaewon Kim, Jan Kara, Jérôme Glisse,
 Joonsoo Kim, Juergen Gross, Kate Stewart, "Kirill A. Shutemov",
 Matthew Wilcox, Mel Gorman, Michael Ellerman, Miles Chen,
 Oscar Salvador, Paul Mackerras, Pavel Tatashin, Philippe Ombredanne,
 Rashmica Gupta, Reza Arbab, Souptick Joarder, Tetsuo Handa,
 Thomas Gleixner, Vlastimil Babka
References: <20180523151151.6730-1-david@redhat.com>
 <20180524075327.GU20441@dhcp22.suse.cz>
 <14d79dad-ad47-f090-2ec0-c5daf87ac529@redhat.com>
 <20180524085610.GA5467@dhcp-128-65.nay.redhat.com>
 <20180528082846.GA7884@dhcp-128-65.nay.redhat.com>
From: David Hildenbrand <david@redhat.com>
Organization: Red Hat GmbH
Message-ID: <4b181422-334d-7ede-c00f-c967e4e3d13e@redhat.com>
Date: Mon, 28 May 2018 12:03:11 +0200
In-Reply-To: <20180528082846.GA7884@dhcp-128-65.nay.redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 28.05.2018 10:28, Dave Young wrote:
> On 05/24/18 at 11:14am, David Hildenbrand wrote:
>> On 24.05.2018 10:56, Dave Young wrote:
>>> Hi,
>>>
>>> [snip]
>>>>>
>>>>>> For kdump and onlining/offlining code, we
>>>>>> have to mark pages as offline before a new segment is visible to the system
>>>>>> (e.g. as these pages might not be backed by real memory in the hypervisor).
>>>>>
>>>>> Please expand on the kdump part. That is really confusing because
>>>>> hotplug should simply not depend on kdump at all.
>>>>> Moreover, why don't you
>>>>> simply mark those pages reserved and pull them out from the page
>>>>> allocator?
>>>>
>>>> 1. "hotplug should simply not depend on kdump at all"
>>>>
>>>> In theory yes. In the current state we already have to trigger kdump to
>>>> reload whenever we add/remove a memory block.
>>>>
>>>>
>>>> 2. kdump part
>>>>
>>>> Whenever we offline a page and tell the hypervisor about it ("unplug"),
>>>> we should not assume that we can read that page again. Now, if dumping
>>>> tools assume they can read all memory that is offline, we are in trouble.
>>>>
>>>> It is the same thing as we already have with PG_hwpoison. Just a
>>>> different meaning - "don't touch this page, it is offline" compared to
>>>> "don't touch this page, hw is broken".
>>>
>>> Does that mean in case of an offline, no kdump reload as mentioned in 1)?
>>>
>>> If we have the offline event and reload kdump, I assume the memory state
>>> is refreshed, so kdump will not read the memory offlined. Am I missing
>>> something?
>>
>> If a whole section is offline: yes. (ACPI hotplug)

After my investigation and reply to the other subthread, I think this is
not the case. If a section/memory block is offline, it will currently
still be dumped as far as I can see. The ONLINE flag for sections is not
(yet) interpreted in makedumpfile.

>>
>> If pages are online but broken ("logically offline" - hwpoison): no.
>>
>> If single pages are logically offline: no. (Balloon inflation - let's
>> call it unplug, as that's what some people refer to.)
>>
>> If only subsections (4MB chunks) are offline: no.
>>
>> Exporting memory ranges to kdump in a smaller granularity than section
>> size would a) be heavily complicated, b) introduce a lot of overhead for
>> this tracking data, and c) make us retrigger kdump way too often.
>>
>> So simply marking pages offline in the struct pages and telling kdump
>> about it is the straightforward thing to do.
>> And it is fairly easy to
>> add and implement, as we have the exact same thing in place for hwpoison.
>
> Ok, it is clear enough. In case a fine-grained offline page is like
> a hwpoison page, a userspace patch for makedumpfile is needed to
> exclude them when copying the vmcore.

Exactly, to not touch pages that have no backing in the hypervisor. Even
if the pages were readable on the hypervisor side, it makes no sense to
read/process them, as they are logically offline and their content is of
no importance anymore - performance improvement, possible dump size
reduction.

-- 

Thanks,

David / dhildenb