Subject: Re: [PATCH 2/8] xen/balloon: Move common memory reservation routines to a module
To: Boris Ostrovsky, xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org, jgross@suse.com,
 konrad.wilk@oracle.com
Cc: daniel.vetter@intel.com, dongwon.kim@intel.com, matthew.d.roper@intel.com,
 Oleksandr Andrushchenko
References: <20180525153331.31188-1-andr2000@gmail.com>
 <20180525153331.31188-3-andr2000@gmail.com>
 <59ab73b0-967b-a82f-3b0d-95f1b0dc40a5@oracle.com>
 <89de7bdb-8759-419f-63bf-8ed0d57650f0@gmail.com>
From: Oleksandr Andrushchenko
Message-ID: <6ca7f428-eede-2c14-85fe-da4a20bcea0d@gmail.com>
Date: Wed, 30 May 2018 11:29:48 +0300

On 05/29/2018 11:03 PM, Boris Ostrovsky wrote:
> On 05/29/2018 02:22 PM, Oleksandr Andrushchenko wrote:
>> On 05/29/2018 09:04 PM, Boris Ostrovsky wrote:
>>> On 05/25/2018 11:33 AM, Oleksandr Andrushchenko wrote:
>>> @@ -463,11 +457,6 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>>       int rc;
>>>       unsigned long i;
>>>       struct page   *page;
>>> -    struct xen_memory_reservation reservation = {
>>> -        .address_bits = 0,
>>> -        .extent_order = EXTENT_ORDER,
>>> -        .domid        = DOMID_SELF
>>> -    };
>>>
>>>       if (nr_pages > ARRAY_SIZE(frame_list))
>>>           nr_pages = ARRAY_SIZE(frame_list);
>>> @@ -486,9 +475,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>>           page = balloon_next_page(page);
>>>       }
>>>
>>> -    set_xen_guest_handle(reservation.extent_start, frame_list);
>>> -    reservation.nr_extents = nr_pages;
>>> -    rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
>>> +    rc = xenmem_reservation_increase(nr_pages, frame_list);
>>>       if (rc <= 0)
>>>           return BP_EAGAIN;
>>>
>>> @@ -496,29 +483,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>>           page = balloon_retrieve(false);
>>>           BUG_ON(page == NULL);
>>>
>>> -#ifdef CONFIG_XEN_HAVE_PVMMU
>>> -        /*
>>> -         * We don't support PV MMU when Linux and Xen is using
>>> -         * different page granularity.
>>> -         */
>>> -        BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
>>> -
>>> -        if (!xen_feature(XENFEAT_auto_translated_physmap)) {
>>> -            unsigned long pfn = page_to_pfn(page);
>>> -
>>> -            set_phys_to_machine(pfn, frame_list[i]);
>>> -
>>> -            /* Link back into the page tables if not highmem. */
>>> -            if (!PageHighMem(page)) {
>>> -                int ret;
>>> -                ret = HYPERVISOR_update_va_mapping(
>>> -                        (unsigned long)__va(pfn << PAGE_SHIFT),
>>> -                        mfn_pte(frame_list[i], PAGE_KERNEL),
>>> -                        0);
>>> -                BUG_ON(ret);
>>> -            }
>>> -        }
>>> -#endif
>>> +        xenmem_reservation_va_mapping_update(1, &page, &frame_list[i]);
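[Editor's note: for context, the new xenmem_reservation_increase() helper
presumably wraps the hypercall plumbing removed above; a minimal sketch under
that assumption (the mem-reservation module in the series may differ in
details):]

    int xenmem_reservation_increase(int count, xen_pfn_t *frames)
    {
        struct xen_memory_reservation reservation = {
            .address_bits = 0,
            .extent_order = EXTENT_ORDER,
            .domid        = DOMID_SELF
        };

        /* Hand the guest frame array to the hypervisor and populate it. */
        set_xen_guest_handle(reservation.extent_start, frames);
        reservation.nr_extents = count;
        return HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
    }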
>>>
>>> Can you make a single call to xenmem_reservation_va_mapping_update(rc,
>>> ...)? You need to keep track of pages but presumably they can be put
>>> into an array (or a list). In fact, perhaps we can have
>>> balloon_retrieve() return a set of pages.
>> This is actually how it is used later on for dma-buf, but I just didn't
>> want to alter the original balloon code too much. This can be done; in
>> order of simplicity:
>>
>> 1. Similar to frame_list, e.g. a static array of struct page* of size
>> ARRAY_SIZE(frame_list): more static memory is used, but no allocations.
>>
>> 2. Allocated at run-time with kcalloc: the allocation can fail.
>
> If this is called in the DMA buffer freeing code path or in an error path,
> then we shouldn't do it.
>
>> 3. Make balloon_retrieve() return a set of pages: will require list/array
>> allocation and handling, the allocation may fail, and the balloon_retrieve
>> prototype changes.
>
> Balloon pages are strung on the lru list. Can we have balloon_retrieve
> return a list of pages on that list?

First of all, before we go deep into details, let me highlight the goal of
the requested change: for the balloon driver we call

    xenmem_reservation_va_mapping_update(*1*, &page, &frame_list[i]);

from increase_reservation, and

    xenmem_reservation_va_mapping_reset(*1*, &page);

from decrease_reservation, and it seems inelegant that a single page/frame is
passed per call while we might pass multiple pages/frames at once.

In the balloon driver the producer of pages for increase_reservation is
balloon_retrieve(false), and for decrease_reservation it is alloc_page(gfp).
In case of decrease_reservation the page is added to a list:

    LIST_HEAD(pages);
    [...]
    list_add(&page->lru, &pages);

and in case of increase_reservation it is retrieved page by page and can be
put on a list as well with the same code as in decrease_reservation, e.g.

    LIST_HEAD(pages);
    [...]
    list_add(&page->lru, &pages);

Thus, both decrease_reservation and increase_reservation may hold their pages
on a list before calling xenmem_reservation_va_mapping_{update|reset}.

For that we need a prototype change:

    xenmem_reservation_va_mapping_reset(<nr_pages>, <page list>);

but for xenmem_reservation_va_mapping_update it will look like:

    xenmem_reservation_va_mapping_update(<nr_pages>, <page list>, <frame array>);

which seems inconsistent. Converting entries of the static frame_list array
into a corresponding list doesn't look nice either. And for the dma-buf
use-case arrays are preferable, as dma-buf constructs scatter-gather tables
from arrays of pages etc., so if a page list is passed it has to be converted
into a page array anyway.

So, we can:

1. Keep the prototypes as is, e.g. accept an array of pages and use
nr_pages == 1 in the balloon driver (existing code).

2. Statically allocate a struct page* array in the balloon driver and fill
it with pages as those pages are retrieved (sketched right after this list):

    static struct page *page_list[ARRAY_SIZE(frame_list)];

This takes an additional 8KiB of space on a 64-bit platform, but simplifies
things a lot.

3. Allocate struct page *page_list[ARRAY_SIZE(frame_list)] dynamically.
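[Editor's note: a minimal sketch of option 2 above; the page_list name and
the loop placement inside increase_reservation() are illustrative
assumptions, not the final patch:]

    static struct page *page_list[ARRAY_SIZE(frame_list)];

    unsigned long i;
    struct page *page;

    /* In increase_reservation(), after xenmem_reservation_increase()
     * succeeded for rc extents: collect the pages first... */
    for (i = 0; i < rc; i++) {
        page = balloon_retrieve(false);
        BUG_ON(page == NULL);
        page_list[i] = page;
    }

    /* ...then restore the VA mappings with one call instead of rc calls. */
    xenmem_reservation_va_mapping_update(rc, page_list, frame_list);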
As to Boris' suggestion "Balloon pages are strung on the lru list. Can we
have balloon_retrieve return a list of pages on that list?": because of
alloc_xenballooned_pages' retry logic for page retrieval, e.g.

    while (pgno < nr_pages) {
        page = balloon_retrieve(true);
        if (page) {
[...]
        } else {
            ret = add_ballooned_pages(nr_pages - pgno);
[...]
    }

I wouldn't change things that much.

IMO, we can keep the 1-page based API, with the only overhead for the balloon
driver being a function call to xenmem_reservation_va_mapping_{update|reset}
for each page.

> -boris

Thank you,
Oleksandr

>
>> Could you please tell which of the above will fit better?
>>
>>>
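[Editor's note: for reference, a minimal sketch of the 1-page based API this
reply settles on, as it could look in decrease_reservation() with pages
collected on a local list as described above; the loop body details are
assumptions:]

    struct page *page, *tmp;
    LIST_HEAD(pages);

    /* Pages produced by alloc_page(gfp) were list_add()'ed to "pages". */
    list_for_each_entry_safe(page, tmp, &pages, lru) {
        /* Per-page call, nr_pages == 1, as kept in this version. */
        xenmem_reservation_va_mapping_reset(1, &page);
        list_del(&page->lru);
        [...]
    }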