Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1033402AbdD0BJ2 (ORCPT ); Wed, 26 Apr 2017 21:09:28 -0400 Received: from mail.kernel.org ([198.145.29.136]:43168 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1033382AbdD0BJT (ORCPT ); Wed, 26 Apr 2017 21:09:19 -0400 MIME-Version: 1.0 In-Reply-To: References: From: Andy Lutomirski Date: Wed, 26 Apr 2017 18:08:55 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: xen_exit_mmap() questions To: Boris Ostrovsky Cc: Andy Lutomirski , "xen-devel@lists.xenproject.org" , Juergen Gross , X86 ML , Borislav Petkov , "linux-kernel@vger.kernel.org" Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2869 Lines: 81 On Wed, Apr 26, 2017 at 5:55 PM, Boris Ostrovsky wrote: > > > On 04/26/2017 06:49 PM, Andy Lutomirski wrote: >> >> On Wed, Apr 26, 2017 at 3:45 PM, Boris Ostrovsky >> wrote: >>> >>> On 04/26/2017 04:52 PM, Andy Lutomirski wrote: >>>> >>>> I was trying to understand xen_drop_mm_ref() to update it for some >>>> changes I'm working on, and I'm wondering whether we need >>>> xen_exit_mmap() at all. >>>> >>>> AFAICS the intent is to force all CPUs to drop their lazy uses of the >>>> mm being destroyed so it can be unpinned before tearing down the page >>>> tables, thus making it faster to tear down the page tables. This >>>> seems like it'll speed up xen_set_pud() and xen_set_pmd(), but this >>>> seems like it may be of rather limited value. >>> >>> >>> Why do you think it's of limited value? Without it we will end up with a >>> hypercall for each update. >>> >>> Or is your point that the number of those update is relatively small >>> when we are tearing down? >> >> >> The latter. Also, unless I'm missing something, xen_set_pte() doesn't >> have the optimization. I haven't looked at exactly how page table >> teardown works, but if it clears each PTE individually, then that's >> the bulk of the work. >> >>> >>> >>>> Could we get away with >>>> deleting it? >>>> >>>> Also, this code in drop_other_mm_ref() looks dubious to me: >>>> >>>> /* If this cpu still has a stale cr3 reference, then make sure >>>> it has been flushed. */ >>>> if (this_cpu_read(xen_current_cr3) == __pa(mm->pgd)) >>>> load_cr3(swapper_pg_dir); >>>> >>>> If cr3 hasn't been flushed to the hypervisor because we're in a lazy >>>> mode, why would load_cr3() help? Shouldn't this be xen_mc_flush() >>>> instead? >>> >>> >>> load_cr3() actually ends with xen_mc_flush() by way of xen_write_cr3() >>> -> xen_mc_issue(). >> >> >> xen_mc_issue() does: >> >> if ((paravirt_get_lazy_mode() & mode) == 0) >> xen_mc_flush(); >> >> I assume the load_cr3() is intended to deal with the case where we're >> in lazy mode, but we'll still be in lazy mode, right? Or does it >> serve some other purpose? > > > Of course. I can't read (I ignored the "== 0" part). > > Apparently the early version had an explicit flush but then it disappeared > (commit 9f79991d4186089e228274196413572cc000143b). > > The point of CR3 loading here, I believe, is to make sure the hypervisor > knows that the (v)CPU is no longer using the the mm's cr3 (we are loading > swapper_pgdir here). But that's what leave_mm() does. To be fair, the x86 lazy TLB management is a big mess, and this came up because I'm trying to clean it up without removing it. I suppose I can try to keep xen_exit_mmap() working. Is there a simple way to try to unpin but to not treat it as an error if the hypervisor rejects it?