From: Vitaly Kuznetsov
To: Peter Zijlstra
Cc: Jork Loeser, KY Srinivasan, Simon Xiao, Haiyang Zhang,
    Stephen Hemminger, "torvalds@linux-foundation.org",
    "luto@kernel.org", "hpa@zytor.com", "linux-kernel@vger.kernel.org",
    "rostedt@goodmis.org", "andy.shevchenko@gmail.com",
    "tglx@linutronix.de", "mingo@kernel.org",
    "linux-tip-commits@vger.kernel.org"
Subject: Re: [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush
Date: Fri, 11 Aug 2017 11:23:10 +0200
Message-ID: <87lgmqqwzl.fsf@vitty.brq.redhat.com>

Peter Zijlstra writes:

> On Thu, Aug 10, 2017 at 07:08:22PM +0000, Jork Loeser wrote:
>
>> > > Subject: Re: [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush
>>
>> > > Hold on.. if we don't IPI for TLB invalidation. What serializes our
>> > > software page table walkers like fast_gup() ?
>> >
>> > Hypervisor may implement this functionality via an IPI.
>> >
>> > K. Y
>>
>> HvFlushVirtualAddressList() states:
>> This call guarantees that by the time control returns back to the
>> caller, the observable effects of all flushes on the specified virtual
>> processors have occurred.
>>
>> HvFlushVirtualAddressListEx() refers to HvFlushVirtualAddressList() as adding sparse target VP lists.
>>
>> Is this enough of a guarantee, or do you see other races?
>
> That's nowhere near enough. We need the remote CPU to have completed any
> guest IF section that was in progress at the time of the call.
>
> So if a host IPI can interrupt a guest while the guest has IF cleared,
> and we then process the host IPI -- clear the TLBs -- before resuming the
> guest, which still has IF cleared, we've got a problem.
>
> Because at that point, our software page-table walker, that relies on IF
> being clear to guarantee the page-tables exist, because it holds off the
> TLB invalidate and thereby the freeing of the pages, gets its pages
> ripped out from under it.

Oh, I see your concern. Hyper-V, however, is not the first x86
hypervisor trying to avoid IPIs on remote TLB flush; Xen does this
too. Briefly looking at xen_flush_tlb_others(), I don't see anything
special. Do we know how serialization is achieved there?

-- 
  Vitaly
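
For reference, the race Peter describes can be sketched as a toy, single-threaded user-space model. This is not kernel code: struct vcpu, ipi_flush_done(), hypercall_flush_done() and free_races_with_walk() are hypothetical names made up for illustration. The model assumes that an IPI-based flush cannot complete while the remote guest has IF cleared, and that a hypercall-based flush completes regardless of the guest's IF state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one remote vCPU. All names are hypothetical. */
struct vcpu {
    bool irqs_enabled;   /* models the guest IF flag */
    bool walking;        /* inside a lockless page-table walk */
    bool flush_pending;  /* flush IPI queued but not yet handled */
};

/*
 * IPI-based flush: the handler only runs once the guest has IF set,
 * so the sender cannot see the flush as complete before then.
 */
static bool ipi_flush_done(struct vcpu *v)
{
    assert(v != NULL);
    if (!v->irqs_enabled) {
        v->flush_pending = true;
        return false;    /* sender keeps waiting, nothing is freed */
    }
    v->flush_pending = false;
    return true;
}

/*
 * Hypercall-based flush (assumption): the host flushes the remote TLB
 * without waiting for the guest to set IF again.
 */
static bool hypercall_flush_done(struct vcpu *v)
{
    assert(v != NULL);
    return true;
}

/*
 * Returns true if the page-table page would be freed while the
 * remote walker is still inside its IF-cleared section.
 */
static bool free_races_with_walk(bool (*flush_done)(struct vcpu *))
{
    struct vcpu remote = { .irqs_enabled = true };

    /* Remote CPU: disable interrupts and start the lockless walk. */
    remote.irqs_enabled = false;
    remote.walking = true;

    /* Local CPU: unhook the page table, flush, then free. */
    bool freed = flush_done(&remote);
    bool race = freed && remote.walking;

    /* Remote CPU: finish the walk and re-enable interrupts. */
    remote.walking = false;
    remote.irqs_enabled = true;
    return race;
}
```

In this model, free_races_with_walk(ipi_flush_done) reports no race because the free is held off until the walker re-enables interrupts, while free_races_with_walk(hypercall_flush_done) reports one. Whether a real hypervisor behaves like the second case is exactly the open question in this thread.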