Date: Fri, 15 Jun 2018 21:50:05 -0700
From: Matthew Wilcox
To:
    Wei Wang
Cc: virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,
    virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
    linux-mm@kvack.org, mst@redhat.com, mhocko@kernel.org,
    akpm@linux-foundation.org, torvalds@linux-foundation.org,
    pbonzini@redhat.com, liliang.opensource@gmail.com,
    yang.zhang.wz@gmail.com, quan.xu0@gmail.com, nilal@redhat.com,
    riel@redhat.com, peterx@redhat.com
Subject: Re: [PATCH v33 1/4] mm: add a function to get free page blocks
Message-ID: <20180616045005.GA14936@bombadil.infradead.org>
References: <1529037793-35521-1-git-send-email-wei.w.wang@intel.com>
    <1529037793-35521-2-git-send-email-wei.w.wang@intel.com>
In-Reply-To: <1529037793-35521-2-git-send-email-wei.w.wang@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 15, 2018 at 12:43:10PM +0800, Wei Wang wrote:
> +/**
> + * get_from_free_page_list - get free page blocks from a free page list
> + * @order: the order of the free page list to check
> + * @buf: the array to store the physical addresses of the free page blocks
> + * @size: the array size
> + *
> + * This function offers hints about free pages. There is no guarantee that
> + * the obtained free pages are still on the free page list after the function
> + * returns. pfn_to_page on the obtained free pages is strongly discouraged
> + * and if there is an absolute need for that, make sure to contact MM people
> + * to discuss potential problems.
> + *
> + * The addresses are currently stored to the array in little endian. This
> + * avoids the overhead of converting endianness by the caller who needs data
> + * in the little endian format. Big endian support can be added on demand in
> + * the future.
> + *
> + * Return the number of free page blocks obtained from the free page list.
> + * The maximum number of free page blocks that can be obtained is limited to
> + * the caller's array size.
> + */

Please use:

 * Return: The number of free page blocks obtained from the free page list.

Also, please include a

 * Context: Any context.
or
 * Context: Process context.

or whatever other context this function can be called from.  Since you're
taking the lock irqsafe, I assume this can be called from any context, but
I wonder if it makes sense to have this function callable from interrupt
context.  Maybe this should be callable from process context only.

> +uint32_t get_from_free_page_list(int order, __le64 buf[], uint32_t size)
> +{
> +	struct zone *zone;
> +	enum migratetype mt;
> +	struct page *page;
> +	struct list_head *list;
> +	unsigned long addr, flags;
> +	uint32_t index = 0;
> +
> +	for_each_populated_zone(zone) {
> +		spin_lock_irqsave(&zone->lock, flags);
> +		for (mt = 0; mt < MIGRATE_TYPES; mt++) {
> +			list = &zone->free_area[order].free_list[mt];
> +			list_for_each_entry(page, list, lru) {
> +				addr = page_to_pfn(page) << PAGE_SHIFT;
> +				if (likely(index < size)) {
> +					buf[index++] = cpu_to_le64(addr);
> +				} else {
> +					spin_unlock_irqrestore(&zone->lock,
> +							       flags);
> +					return index;
> +				}
> +			}
> +		}
> +		spin_unlock_irqrestore(&zone->lock, flags);
> +	}
> +
> +	return index;
> +}

I wonder if (to address Michael's concern), you shouldn't instead use
the first free chunk of pages to return the addresses of all the pages.
i.e. something like this:

	__le64 *ret = NULL;
	unsigned int max = (PAGE_SIZE << order) / sizeof(__le64);

	for_each_populated_zone(zone) {
		spin_lock_irq(&zone->lock);
		for (mt = 0; mt < MIGRATE_TYPES; mt++) {
			list = &zone->free_area[order].free_list[mt];
			list_for_each_entry_safe(page, list, lru, ...) {
				if (index == size)
					break;
				addr = page_to_pfn(page) << PAGE_SHIFT;
				if (!ret) {
					list_del(...);
					ret = addr;
				}
				ret[index++] = cpu_to_le64(addr);
			}
		}
		spin_unlock_irq(&zone->lock);
	}

	return ret;
}

You'll need to return the page to the freelist afterwards, but free_pages()
should take care of that.
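
To make the shape of that idea concrete, here is a rough user-space sketch,
not kernel code.  Everything in it is a hypothetical stand-in: BLOCK_SIZE
plays the role of (PAGE_SIZE << order), struct free_list and struct
free_block replace the zone free lists, and free_list_push(),
add_free_block() and collect_free_block_hints() are names made up for this
example.  Locking, zones, migrate types and the little-endian conversion
are all left out.  The only point is that the first free block is taken
off the list and then reused as the array that holds the addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096u	/* assumption: stands in for PAGE_SIZE << order */

/* Hypothetical free block: the list link is stored inside the free memory. */
struct free_block {
	struct free_block *next;
};

/* Hypothetical free list: a singly linked stack of free blocks. */
struct free_list {
	struct free_block *head;
};

/* Put a block (back) on the free list. */
static void free_list_push(struct free_list *fl, void *block)
{
	struct free_block *b = block;

	b->next = fl->head;
	fl->head = b;
}

/* Demo helper: allocate one block and place it on the free list. */
static void *add_free_block(struct free_list *fl)
{
	void *block = malloc(BLOCK_SIZE);

	if (block)
		free_list_push(fl, block);
	return block;
}

/**
 * collect_free_block_hints - record free block addresses in a free block
 * @fl: the free list to scan
 * @count: set to the number of addresses stored
 *
 * Removes the first block from @fl and uses it as the array that receives
 * the address of every free block seen (itself included), limited to the
 * number of entries that fit in one block.  The other blocks stay on @fl.
 *
 * Context: Caller must serialise access to @fl; no locking is done here.
 * Return: The block holding the addresses, or NULL if @fl is empty.
 */
static uint64_t *collect_free_block_hints(struct free_list *fl, size_t *count)
{
	const size_t max = BLOCK_SIZE / sizeof(uint64_t);
	struct free_block *b;
	uint64_t *ret;
	size_t index = 0;

	assert(fl && count);
	*count = 0;

	b = fl->head;
	if (!b)
		return NULL;

	/* Unlink the first block before its link field is overwritten. */
	fl->head = b->next;
	ret = (uint64_t *)b;
	ret[index++] = (uint64_t)(uintptr_t)b;

	/* Record the remaining blocks without removing them from the list. */
	for (b = fl->head; b && index < max; b = b->next)
		ret[index++] = (uint64_t)(uintptr_t)b;

	*count = index;
	return ret;
}
```

Once the caller is done with the array, handing it back with
free_list_push(&fl, hints) plays the role that free_pages() would play in
the real thing.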