Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp431789pxk; Wed, 2 Sep 2020 05:45:02 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzLNPrdKl2rJ0h+JchaiHZ6/sYnesD1bEHgdj61xIdMeTD0LXD4OQXCjDgO4JxQreudXBQ6 X-Received: by 2002:a17:906:1d08:: with SMTP id n8mr6432712ejh.236.1599050701988; Wed, 02 Sep 2020 05:45:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1599050701; cv=none; d=google.com; s=arc-20160816; b=L3isbzAVsnUqJTS+bsO0mNOfXjt4rvCcZeAx8wsyBxX2Co5PNXCWxKbes8lBQycL9j GWOhWS2SJcup2pmdHE1qR65iR6zvnQNbUJXyYRnbsPOGEPNQggQv2mA1UbNZbmL5ll8z 8GzJMhJCAYbYzaiiMF3XJc6gyKsep3UE5D9x4oE9uHftimmEx7vDpPyK8PnA6QNoU2/i klMcw4ln6XaMtEdKKxHv+pf6Ki7GyQaHBjfoZhdJOY9YL/jnndPyEzjXlD6BdWy1/npQ qFOAzuNbMIW/2Xhq6WwOv0z+uOGA2+I7x0lXzpOotdoLZ/8C9lMilS6cMwyvGorsw2ZJ SFxA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=SccKNTt3YC7XbTuW2H30Zmdads/idBqIfZTM94c1dTk=; b=NhP0LWDyiLZ8gW8ZI/8h3q+H3hj0ft/kucD2Wf9hWwKQdCRDXvlpbqQjS9fDn7EurZ fisc/IbInzOWT0MwLd4KW95SHK3ZMUnqUMoZSy4acvM7F1GKwKcoxSrsw8+VYxzBTDzf CJx+QwsIo07aJR1TXCxYZnec84kW5L8cWuyirZ28K6FV6dzyKmJw3hhnRw10aPLQVDI6 7RgVOgm+uES2gzbRINSriwGlXWgQsl2UA6/5YsQHNKxPsZ5A29yZFTKYaKEYhfAkFxEY zj9MJ67NwRwCV//FcT6C8zVKQtft5Sp1fglL7R6F86U8omObfKLgjsWpG4ToeLe8czYK HSNA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@soleen.com header.s=google header.b=V8U7jhgO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id a12si2307112edn.536.2020.09.02.05.44.38; Wed, 02 Sep 2020 05:45:01 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@soleen.com header.s=google header.b=V8U7jhgO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726871AbgIBMnY (ORCPT + 99 others); Wed, 2 Sep 2020 08:43:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52330 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726686AbgIBMnB (ORCPT ); Wed, 2 Sep 2020 08:43:01 -0400 Received: from mail-ej1-x643.google.com (mail-ej1-x643.google.com [IPv6:2a00:1450:4864:20::643]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2DC52C061244 for ; Wed, 2 Sep 2020 05:42:51 -0700 (PDT) Received: by mail-ej1-x643.google.com with SMTP id d26so6475948ejr.1 for ; Wed, 02 Sep 2020 05:42:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=SccKNTt3YC7XbTuW2H30Zmdads/idBqIfZTM94c1dTk=; b=V8U7jhgODsB8A//a8e72+JnBkGzNrjWlf+cOescmbiJ3DdRRPUNnX7AH0GN4CKTzGQ +/DnJnNFfFFBJdJNHpWGYydROSt6ust0nCR9krxGx1RYNVxs3B9Vt6pXMrhZ8ezZilSg 5iMLITForaLMdJBbuQ8/D4kKbKs5exxbDRNVJo9rsu4teVNsFhLVlefNJLDkf5vnWjK6 zCRx5PQacyjnr2low+ElUA4KMRIwH64tKzzsty1u/A5yrzKTSqH1Y5b+yFxgftrNkQrZ nsrBzKgtGopGLoDGEl4qx5f3eYHngO7vqpj+i4F57egPLI66c960+Mpld71TPJl37vP/ c1sA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=SccKNTt3YC7XbTuW2H30Zmdads/idBqIfZTM94c1dTk=; b=oencepJrrcELnjiFuES+/Eoa1HrIalgBgFyJ3+TfOoPMmLVa1RZUJtwPwNGK01Whld RXQMtu6Qy7uvSrDHuQ7UvFqXjwuUJ26CmayBkkj0PlSZJiKGP7vhJFm+b9Bu7rmjG1ZQ QhG193lq5uU+iy6ziLPzWEVeDzm5e/ZE08arQviRT432uSfacp2fBZxuTvmcd7xJnPUV D6HIkNO2wTM1c3HtECobvSk+T+V5TPlKWQMJmgI3wF94XL1ktfJoQ7NxY1xpzhVTnZM3 NpRhU3BcsrY+xtdka/hqOKkL1HClqHDdiW/DLeHTOJ5X+O9i82gLrHFC+5VhchuNg+G6 J/iQ== X-Gm-Message-State: AOAM530O7QiMfmFatn+ebTAyV8DC16uKDDIGfIvMVulz/OsYqULpXOlK tnhTh7UPMaLw09iZlFN7uc+q2hkedd+uD5WqiTXByg== X-Received: by 2002:a17:906:715b:: with SMTP id z27mr2767202ejj.166.1599050569831; Wed, 02 Sep 2020 05:42:49 -0700 (PDT) MIME-Version: 1.0 References: <6469324e-afa2-18b4-81fb-9e96466c1bf3@suse.cz> In-Reply-To: From: Pavel Tatashin Date: Wed, 2 Sep 2020 08:42:13 -0400 Message-ID: Subject: Re: [PATCH v2 00/28] The new cgroup slab memory controller To: David Hildenbrand Cc: Vlastimil Babka , Roman Gushchin , Bharata B Rao , "linux-mm@kvack.org" , Andrew Morton , Michal Hocko , Johannes Weiner , Shakeel Butt , Vladimir Davydov , "linux-kernel@vger.kernel.org" , Kernel Team , Yafang Shao , stable , Linus Torvalds , Sasha Levin , Greg Kroah-Hartman , David Hildenbrand Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > > Am 02.09.2020 um 11:53 schrieb Vlastimil Babka : > > > > =EF=BB=BFOn 8/28/20 6:47 PM, Pavel Tatashin wrote: > >> There appears to be another problem that is related to the > >> cgroup_mutex -> mem_hotplug_lock deadlock described above. > >> > >> In the original deadlock that I described, the workaround is to > >> replace crash dump from piping to Linux traditional save to files > >> method. However, after trying this workaround, I still observed > >> hardware watchdog resets during machine shutdown. > >> > >> The new problem occurs for the following reason: upon shutdown systemd > >> calls a service that hot-removes memory, and if hot-removing fails for > > > > Why is that hotremove even needed if we're shutting down? Are there any > > (virtualization?) platforms where it makes some difference over plain > > shutdown/restart? > > If all it=E2=80=98s doing is offlining random memory that sounds unnecess= ary and dangerous. Any pointers to this service so we can figure out what i= t=E2=80=98s doing and why? (Arch? Hypervisor?) Hi David, This is how we are using it at Microsoft: there is a very large number of small memory machines (8G each) with low downtime requirements (reboot must be under a second). There is also a large state ~2G of memory that we need to transfer during reboot, otherwise it is very expensive to recreate the state. We have 2G of system memory memory reserved as a pmem in the device tree, and use it to pass information across reboots. Once the information is not needed we hot-add that memory and use it during runtime, before shutdown we hot-remove the 2G, save the program state on it, and do the reboot. Pasha