Date: Fri, 29 Sep 2017 09:38:48 -0700
From: "Paul E. McKenney"
To: Paolo Bonzini
Cc: Peter Zijlstra, Boqun Feng, "Levin, Alexander (Sasha Levin)", Sasha Levin,
	linux-kernel@vger.kernel.org, Ingo Molnar, jiangshanlai@gmail.com,
	dipankar@in.ibm.com, Andrew Morton, Mathieu Desnoyers, Josh Triplett,
	Thomas Gleixner, dhowells@redhat.com, Eric Dumazet, Frédéric Weisbecker,
	Oleg Nesterov, bobby.prani@gmail.com, Radim Krčmář, kvm@vger.kernel.org
Subject: Re: [PATCH v3 tip/core/rcu 40/40] rcu: Make non-preemptive schedule be Tasks RCU quiescent state
Reply-To: paulmck@linux.vnet.ibm.com
In-Reply-To: <03e52ee5-b5b6-edd6-c26a-54bc1aaefd63@redhat.com>
Message-Id: <20170929163848.GA3521@linux.vnet.ibm.com>

On Fri, Sep 29, 2017 at 01:44:56PM +0200, Paolo Bonzini wrote:
> On 29/09/2017 12:34, Peter Zijlstra wrote:
> > On Fri, Sep 29, 2017 at 12:01:24PM +0200, Paolo Bonzini wrote:
> >>> Does this mean whenever we get a page fault in an RCU read-side
> >>> critical section, we may hit this?
> >>>
> >>> Could we simply avoid calling schedule() in kvm_async_pf_task_wait()
> >>> if the faulting process is in an RCU read-side critical section, as
> >>> follows?
> >>>
> >>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> >>> index aa60a08b65b1..291ea13b23d2 100644
> >>> --- a/arch/x86/kernel/kvm.c
> >>> +++ b/arch/x86/kernel/kvm.c
> >>> @@ -140,7 +140,7 @@ void kvm_async_pf_task_wait(u32 token)
> >>>
> >>>  	n.token = token;
> >>>  	n.cpu = smp_processor_id();
> >>> -	n.halted = is_idle_task(current) || preempt_count() > 1;
> >>> +	n.halted = is_idle_task(current) || preempt_count() > 1 || rcu_preempt_depth();
> >>>  	init_swait_queue_head(&n.wq);
> >>>  	hlist_add_head(&n.link, &b->list);
> >>>  	raw_spin_unlock(&b->lock);
> >>>
> >>> (KVM folks and list added to Cc)
> >>
> >> Yes, that would work. Mind sending it as a proper patch?
> >
> > I'm confused, why would we do an ASYNC PF at all here? Thing is, a
> > printk() shouldn't trigger a major fault _ever_. At worst it triggers
> > something like a vmalloc minor fault.
> > And I'm thinking we should not do the whole ASYNC machinery for minor
> > faults.
>
> Async page faults are page faults _on the host_ side, and you cannot
> control what the host pages out. Of course the hypervisor filters out
> some cases itself (e.g. IF=0), but in general you could get one at any
> time.

Just to make sure I am understanding this... You take a page fault on
the host, and this causes a schedule() on the guest? Or did I lose the
thread here?

							Thanx, Paul