Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754750Ab3GXQnO (ORCPT ); Wed, 24 Jul 2013 12:43:14 -0400
Received: from www.sr71.net ([198.145.64.142]:53843 "EHLO blackbird.sr71.net"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1754140Ab3GXQnN (ORCPT ); Wed, 24 Jul 2013 12:43:13 -0400
Message-ID: <51F00415.8070104@sr71.net>
Date: Wed, 24 Jul 2013 09:43:01 -0700
From: Dave Hansen
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130623 Thunderbird/17.0.7
MIME-Version: 1.0
To: KY Srinivasan
CC: Dave Hansen, Michal Hocko, "gregkh@linuxfoundation.org",
        "linux-kernel@vger.kernel.org", "devel@linuxdriverproject.org",
        "olaf@aepfle.de", "apw@canonical.com", "andi@firstfloor.org",
        "akpm@linux-foundation.org", "linux-mm@kvack.org",
        "kamezawa.hiroyuki@gmail.com", "hannes@cmpxchg.org",
        "yinghan@google.com", "jasowang@redhat.com", "kay@vrfy.org"
Subject: Re: [PATCH 1/1] Drivers: base: memory: Export symbols for onlining memory blocks
References: <1374261785-1615-1-git-send-email-kys@microsoft.com>
        <20130722123716.GB24400@dhcp22.suse.cz>
        <51EEA11D.4030007@intel.com>
        <3318be0a96cb4d05838d76dc9d088cc0@SN2PR03MB061.namprd03.prod.outlook.com>
        <51EEA89F.9070309@intel.com>
        <9f351a549e76483d9148f87535567ea0@SN2PR03MB061.namprd03.prod.outlook.com>
In-Reply-To: <9f351a549e76483d9148f87535567ea0@SN2PR03MB061.namprd03.prod.outlook.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3424
Lines: 64

On 07/23/2013 10:21 AM, KY Srinivasan wrote:
>> You have allocated some large, physically contiguous areas of memory
>> under heavy pressure. But you also contend that there is too much
>> memory pressure to run a small userspace helper. Under heavy memory
>> pressure, I'd expect large, kernel allocations to fail much more often
>> than running a small userspace helper.
>
> I am only reporting what I am seeing. Broadly, I have two main failure conditions to
> deal with: (a) resource related failure (add_memory() returning -ENOMEM) and (b) not being
> able to online a segment that has been successfully hot-added. I have seen both these failures
> under high memory pressure. By supporting "in context" onlining, we can eliminate one failure
> case. Our inability to online is not a recoverable failure from the host's point of view - the memory
> is committed to the guest (since hot add succeeded) but is not usable since it is not onlined.

Could you please report precisely, and in detail, what you are seeing?
Where are the -ENOMEMs coming from? Which allocation site? Are you
seeing OOMs or page allocation failure messages on the console?

The operation was split up into two parts for good reason. It's
actually for your _precise_ use case.

A system under memory pressure is going to have trouble doing a
hot-add. You need memory to add memory. Of the two operations ("add"
and "online"), "add" is the one vastly more likely to fail. It has to
allocate several large swaths of contiguous physical memory. For that
reason, the system was designed so that you could "add" and "online"
separately. The intention was that you could "add" far in advance and
then "online" under memory pressure, with the "online" having *VASTLY*
smaller memory requirements and being much more likely to succeed.

You're lumping the "allocate several large swaths of contiguous
physical memory" failures into the same class as "run a small
userspace helper". They are _really_ different problems. Both are
prone to allocation failures for sure, but they are _very_ separate
problems. Please don't conflate them.

>> It _sounds_ like you really want to be able to have the host retry the
>> operation if it fails, and you return success/failure from inside the
>> kernel.
>> It's hard for you to tell if running the userspace helper
>> failed, so your solution is to move what was previously done in
>> userspace into the kernel so that you can more easily tell if it failed
>> or succeeded.
>>
>> Is that right?
>
> No; I am able to get the proper error code for recoverable failures (hot add failures
> because of lack of memory). By doing what I am proposing here, we can avoid one class
> of failures completely and I think this is what resulted in a better "hot add" experience in the
> guest.

I think you're taking a huge leap here: "We could not online memory,
thus we must take userspace out of the loop."

You might be right. There might be only one way out of this
situation. But you need to provide a little more supporting evidence
before we all arrive at the same conclusion.

BTW, it doesn't _require_ udev. There could easily be another listener
for hotplug events.
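
For illustration only, the "online" half of the add/online split can be
driven by a very small userspace helper, with no udev involved. The
sketch below assumes the standard sysfs memory-block interface, where
each block appears as /sys/devices/system/memory/memoryN and has a
"state" file that accepts "online". The function name
online_offline_blocks and its return shape are hypothetical, chosen for
this example; this is not code from the patch under discussion.

```python
import os
import re

# Standard sysfs location of memory blocks (assumed present on the guest).
SYSFS_MEMORY = "/sys/devices/system/memory"

# Only directories named memoryN are blocks; other entries are skipped.
_BLOCK_RE = re.compile(r"^memory\d+$")


def online_offline_blocks(root=SYSFS_MEMORY):
    """Try to online every memory block under root that reports 'offline'.

    Returns a dict mapping block name to None on success, or to the
    OSError that was raised while reading or writing its state file.
    Blocks that are not offline are left alone and not reported.
    """
    results = {}
    for name in sorted(os.listdir(root)):
        if not _BLOCK_RE.match(name):
            continue
        state_path = os.path.join(root, name, "state")
        try:
            with open(state_path) as f:
                state = f.read().strip()
            if state != "offline":
                continue
            # On real sysfs the kernel performs the onlining when this
            # write is flushed; a failure surfaces as OSError.
            with open(state_path, "w") as f:
                f.write("online")
            results[name] = None
        except OSError as err:
            results[name] = err
    return results
```

Calling online_offline_blocks() as root after a successful hot-add
would attempt to online whatever blocks are still offline and report
per-block errors, which is the kind of detail (which step failed, and
with what errno) that would help separate "add" failures from "online"
failures.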