Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753765AbbGXTtS (ORCPT ); Fri, 24 Jul 2015 15:49:18 -0400 Received: from mail-pa0-f51.google.com ([209.85.220.51]:35156 "EHLO mail-pa0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750934AbbGXTtQ (ORCPT ); Fri, 24 Jul 2015 15:49:16 -0400 Date: Fri, 24 Jul 2015 12:49:14 -0700 (PDT) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: =?UTF-8?Q?J=C3=B6rn_Engel?= cc: Spencer Baugh , Andrew Morton , Naoya Horiguchi , Davidlohr Bueso , Mike Kravetz , Luiz Capitulino , "open list:MEMORY MANAGEMENT" , open list , Spencer Baugh , Joern Engel Subject: Re: [PATCH] hugetlb: cond_resched for set_max_huge_pages and follow_hugetlb_page In-Reply-To: <20150723230928.GI24876@Sligo.logfs.org> Message-ID: References: <1437688476-3399-1-git-send-email-sbaugh@catern.com> <20150723223651.GH24876@Sligo.logfs.org> <20150723230928.GI24876@Sligo.logfs.org> User-Agent: Alpine 2.10 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="397176738-21862817-1437767156=:5215" Content-ID: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2789 Lines: 69 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --397176738-21862817-1437767156=:5215 Content-Type: TEXT/PLAIN; CHARSET=UTF-8 Content-Transfer-Encoding: 8BIT Content-ID: On Thu, 23 Jul 2015, Jörn Engel wrote: > > The loop is dropping the lock simply to do the allocation and it needs to > > compare with the user-written number of hugepages to allocate. > > And at this point the existing code is racy. Page allocation might > block for minutes trying to free some memory. A cond_resched doesn't > change that - it only increases the odds of hitting the race window. > The existing code has always been racy, it explicitly admits this, the problem is that your patch is making the race window larger. > Are we looking at the same code? Mine looks like this: > while (count > persistent_huge_pages(h)) { > /* > * If this allocation races such that we no longer need the > * page, free_huge_page will handle it by freeing the page > * and reducing the surplus. > */ > spin_unlock(&hugetlb_lock); > if (hstate_is_gigantic(h)) > ret = alloc_fresh_gigantic_page(h, nodes_allowed); > else > ret = alloc_fresh_huge_page(h, nodes_allowed); > spin_lock(&hugetlb_lock); > if (!ret) > goto out; > > /* Bail for signals. Probably ctrl-c from user */ > if (signal_pending(current)) > goto out; > } > I don't see the cond_resched() you propose to add, but the need for it is obvious with a large user-written nr_hugepages in the above loop. The suggestion is to check the conditional, reschedule if needed (and if so, recheck the conditional), and then allocate. Your third option looks fine and the best place to do the cond_resched(). I was looking at your second option when I responded and compared it to the first. We don't want to do cond_resched() immediately before or after the allocation, the net result is the same: we may be pointlessly allocating the hugepage and each hugepage allocation can be very heavyweight. So I agree with your third option from the previous email. You may also want to include the actual text of the warning from the kernel log in your commit message. When people encounter this, then will probably grep in the kernel logs for some keywords to see if it was already fixed and I fear your current commit message may allow it to be missed. --397176738-21862817-1437767156=:5215-- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/