Subject: Re: [PATCH 2/8] xen/balloon: Move common memory reservation routines to a module
From: Oleksandr Andrushchenko
To: Boris Ostrovsky, xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org,
    dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org, jgross@suse.com,
    konrad.wilk@oracle.com
Cc: daniel.vetter@intel.com, dongwon.kim@intel.com, matthew.d.roper@intel.com,
    Oleksandr Andrushchenko
Date: Wed, 30 May 2018 20:46:00 +0300
Message-ID: <43c17501-8865-6e1f-1a92-d947755d8fa8@gmail.com>
In-Reply-To: <5dd3378d-ac32-691e-1f80-7218a5d07fd6@oracle.com>
References: <20180525153331.31188-1-andr2000@gmail.com>
 <20180525153331.31188-3-andr2000@gmail.com>
 <59ab73b0-967b-a82f-3b0d-95f1b0dc40a5@oracle.com>
 <89de7bdb-8759-419f-63bf-8ed0d57650f0@gmail.com>
 <6ca7f428-eede-2c14-85fe-da4a20bcea0d@gmail.com>
 <5dd3378d-ac32-691e-1f80-7218a5d07fd6@oracle.com>

On 05/30/2018 06:54 PM, Boris Ostrovsky wrote:
> On 05/30/2018 04:29 AM, Oleksandr Andrushchenko wrote:
>> On 05/29/2018 11:03 PM, Boris Ostrovsky wrote:
>>> On 05/29/2018 02:22 PM, Oleksandr Andrushchenko wrote:
>>>> On 05/29/2018 09:04 PM, Boris Ostrovsky wrote:
>>>>> On 05/25/2018 11:33 AM, Oleksandr Andrushchenko wrote:
>>>>> @@ -463,11 +457,6 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>>>>        int rc;
>>>>>        unsigned long i;
>>>>>        struct page   *page;
>>>>> -    struct xen_memory_reservation reservation = {
>>>>> -        .address_bits = 0,
>>>>> -        .extent_order = EXTENT_ORDER,
>>>>> -        .domid        = DOMID_SELF
>>>>> -    };
>>>>>
>>>>>        if (nr_pages > ARRAY_SIZE(frame_list))
>>>>>            nr_pages = ARRAY_SIZE(frame_list);
>>>>> @@ -486,9 +475,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>>>>            page = balloon_next_page(page);
>>>>>        }
>>>>>
>>>>> -    set_xen_guest_handle(reservation.extent_start, frame_list);
>>>>> -    reservation.nr_extents = nr_pages;
>>>>> -    rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
>>>>> +    rc = xenmem_reservation_increase(nr_pages, frame_list);
>>>>>        if (rc <= 0)
>>>>>            return BP_EAGAIN;
>>>>>
>>>>> @@ -496,29 +483,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>>>>            page = balloon_retrieve(false);
>>>>>            BUG_ON(page == NULL);
>>>>>
>>>>> -#ifdef CONFIG_XEN_HAVE_PVMMU
>>>>> -        /*
>>>>> -         * We don't support PV MMU when Linux and Xen is using
>>>>> -         * different page granularity.
>>>>> -         */
>>>>> -        BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
>>>>> -
>>>>> -        if (!xen_feature(XENFEAT_auto_translated_physmap)) {
>>>>> -            unsigned long pfn = page_to_pfn(page);
>>>>> -
>>>>> -            set_phys_to_machine(pfn, frame_list[i]);
>>>>> -
>>>>> -            /* Link back into the page tables if not highmem. */
>>>>> -            if (!PageHighMem(page)) {
>>>>> -                int ret;
>>>>> -                ret = HYPERVISOR_update_va_mapping(
>>>>> -                        (unsigned long)__va(pfn << PAGE_SHIFT),
>>>>> -                        mfn_pte(frame_list[i], PAGE_KERNEL),
>>>>> -                        0);
>>>>> -                BUG_ON(ret);
>>>>> -            }
>>>>> -        }
>>>>> -#endif
>>>>> +        xenmem_reservation_va_mapping_update(1, &page, &frame_list[i]);
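[Editorial note: a minimal sketch of what xenmem_reservation_increase() presumably wraps, assuming the patch simply relocates the hypercall code removed in the hunk above. The signature (int count, xen_pfn_t *frames) is inferred from the call site rc = xenmem_reservation_increase(nr_pages, frame_list) and may differ from the actual patch:]

int xenmem_reservation_increase(int count, xen_pfn_t *frames)
{
        struct xen_memory_reservation reservation = {
                .address_bits = 0,
                .extent_order = EXTENT_ORDER,
                .domid        = DOMID_SELF
        };

        /*
         * Same hypercall the balloon driver used to issue directly:
         * hand the frame array to Xen and ask it to populate the
         * corresponding machine memory.
         */
        set_xen_guest_handle(reservation.extent_start, frames);
        reservation.nr_extents = count;
        return HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
}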
>>>>>
>>>>> Can you make a single call to xenmem_reservation_va_mapping_update(rc,
>>>>> ...)? You need to keep track of pages but presumably they can be put
>>>>> into an array (or a list). In fact, perhaps we can have
>>>>> balloon_retrieve() return a set of pages.
>>>> This is actually how it is used later on for dma-buf, but I just didn't
>>>> want to alter the original balloon code too much. Still, it can be done;
>>>> in order of simplicity:
>>>>
>>>> 1. Similar to frame_list, e.g. a static array of struct page* of size
>>>> ARRAY_SIZE(frame_list): more static memory is used, but no allocations
>>>>
>>>> 2. Allocated at run-time with kcalloc: the allocation can fail
>>> If this is called in the DMA buffer freeing code path or in an error path
>>> then we shouldn't do it.
>>>
>>>> 3. Make balloon_retrieve() return a set of pages: this will require
>>>> list/array allocation and handling, the allocation may fail, and the
>>>> balloon_retrieve prototype changes
>>> Balloon pages are strung on the lru list. Can we have balloon_retrieve
>>> return a list of pages on that list?
>> First of all, before we go deep into details, I will highlight the goal
>> of the requested change: for the balloon driver we call
>> xenmem_reservation_va_mapping_update(*1*, &page, &frame_list[i]);
>> from increase_reservation and
>> xenmem_reservation_va_mapping_reset(*1*, &page);
>> from decrease_reservation, and this looks inelegant because a single
>> page/frame is passed per call while we might pass multiple pages/frames
>> at once.
>>
>> In the balloon driver the producer of pages for increase_reservation
>> is balloon_retrieve(false) and for decrease_reservation it is
>> alloc_page(gfp).
>> In case of decrease_reservation the page is added to a list:
>>     LIST_HEAD(pages);
>>     [...]
>>     list_add(&page->lru, &pages);
>>
>> and in case of increase_reservation it is retrieved page by page
>> and can be put on a list as well with the same code as in
>> decrease_reservation, e.g.
>>     LIST_HEAD(pages);
>>     [...]
>>     list_add(&page->lru, &pages);
>>
>> Thus, both decrease_reservation and increase_reservation may hold
>> their pages on a list before calling
>> xenmem_reservation_va_mapping_{update|reset}.
>>
>> For that we need a prototype change:
>>     xenmem_reservation_va_mapping_reset(<nr_pages>, <list of pages>);
>> But for xenmem_reservation_va_mapping_update it will look like:
>>     xenmem_reservation_va_mapping_update(<nr_pages>, <list of pages>,
>>                                          <array of frames>)
>> which seems inconsistent. Converting entries of the static frame_list
>> array into a corresponding list doesn't look nice either.
>>
>> For the dma-buf use-case arrays are preferable, as dma-buf constructs
>> scatter-gather tables from arrays of pages etc., and if a page list is
>> passed then it needs to be converted into a page array anyway.
>>
>> So, we can:
>> 1. Keep the prototypes as is, e.g. accept an array of pages and use
>> nr_pages == 1 in case of the balloon driver (existing code)
>> 2. Statically allocate a struct page* array in the balloon driver and
>> fill it with pages when those pages are retrieved:
>>     static struct page *page_list[ARRAY_SIZE(frame_list)];
>> which will take an additional 8KiB of space on a 64-bit platform, but
>> simplify things a lot (see the sketch after this list).
>> 3. Allocate struct page *page_list[ARRAY_SIZE(frame_list)] dynamically
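[Editorial note: a rough sketch of option 2 above, assuming the rest of increase_reservation() stays as in the quoted hunk and that rc, i and page keep their existing declarations there; illustrative only, not the code from the patch:]

/*
 * Hypothetical file-scope companion to frame_list (option 2): one
 * struct page slot per frame that can be processed in a single batch.
 */
static struct page *page_list[ARRAY_SIZE(frame_list)];

        /*
         * In increase_reservation(), after xenmem_reservation_increase()
         * has succeeded and returned rc frames: collect the balloon pages
         * first instead of handling them one by one...
         */
        for (i = 0; i < rc; i++) {
                page = balloon_retrieve(false);
                BUG_ON(page == NULL);
                page_list[i] = page;
        }

        /* ...then restore the VA mappings for the whole batch in one call. */
        xenmem_reservation_va_mapping_update(rc, page_list, frame_list);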
>>
>> As to Boris' suggestion "Balloon pages are strung on the lru list. Can
>> we have balloon_retrieve return a list of pages on that list?":
>> because of alloc_xenballooned_pages' retry logic for page retrieval, e.g.
>>     while (pgno < nr_pages) {
>>         page = balloon_retrieve(true);
>>         if (page) {
>> [...]
>>         } else {
>>             ret = add_ballooned_pages(nr_pages - pgno);
>> [...]
>>     }
>> I wouldn't change things that much.
>>
>> IMO, we can keep the 1-page based API, with the only overhead for the
>> balloon driver being the function calls to
>> xenmem_reservation_va_mapping_{update|reset} for each page.
>
>
> I still think what I suggested is doable but we can come back to it
> later and keep your per-page implementation for now.
>
> BTW, I also think you can further simplify the
> xenmem_reservation_va_mapping_* routines by bailing out right away if
> xen_feature(XENFEAT_auto_translated_physmap). In fact, you might even
> make them inlines, along the lines of
>
> inline void xenmem_reservation_va_mapping_reset(unsigned long count,
>                                                 struct page **pages)
> {
> #ifdef CONFIG_XEN_HAVE_PVMMU
>     if (!xen_feature(XENFEAT_auto_translated_physmap))
>         __xenmem_reservation_va_mapping_reset(...)
> #endif
> }

How about:

#ifdef CONFIG_XEN_HAVE_PVMMU
static inline void __xenmem_reservation_va_mapping_reset(struct page *page)
{
[...]
}
#endif

and

void xenmem_reservation_va_mapping_reset(unsigned long count,
                                         struct page **pages)
{
#ifdef CONFIG_XEN_HAVE_PVMMU
    if (!xen_feature(XENFEAT_auto_translated_physmap)) {
        int i;

        for (i = 0; i < count; i++)
            __xenmem_reservation_va_mapping_reset(pages[i]);
    }
#endif
}

This way I can use
    __xenmem_reservation_va_mapping_reset(page);
instead of
    xenmem_reservation_va_mapping_reset(1, &page);

> Or some such.
>
> -boris
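[Editorial note: for reference, a sketch of how the split proposed above could look for the _update counterpart. The per-page helper body is lifted from the PV MMU code removed from increase_reservation() in the hunk quoted earlier; the exported wrapper follows the loop structure proposed above. The _reset body is not shown in this thread, so no attempt is made to reconstruct it here:]

#ifdef CONFIG_XEN_HAVE_PVMMU
static inline void __xenmem_reservation_va_mapping_update(struct page *page,
                                                          xen_pfn_t frame)
{
        unsigned long pfn = page_to_pfn(page);

        /*
         * We don't support PV MMU when Linux and Xen are using
         * different page granularity.
         */
        BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);

        set_phys_to_machine(pfn, frame);

        /* Link back into the page tables if not highmem. */
        if (!PageHighMem(page)) {
                int ret;

                ret = HYPERVISOR_update_va_mapping(
                                (unsigned long)__va(pfn << PAGE_SHIFT),
                                mfn_pte(frame, PAGE_KERNEL),
                                0);
                BUG_ON(ret);
        }
}
#endif

void xenmem_reservation_va_mapping_update(unsigned long count,
                                          struct page **pages,
                                          xen_pfn_t *frames)
{
#ifdef CONFIG_XEN_HAVE_PVMMU
        /* Bail out right away for auto-translated guests, as suggested. */
        if (!xen_feature(XENFEAT_auto_translated_physmap)) {
                unsigned long i;

                for (i = 0; i < count; i++)
                        __xenmem_reservation_va_mapping_update(pages[i],
                                                               frames[i]);
        }
#endif
}

The exported wrapper keeps the (count, pages, frames) prototype discussed earlier, so the balloon driver can keep calling it with count == 1 while dma-buf code can pass whole arrays.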