From: Avi Kivity
To: Paolo Bonzini, Radim Krčmář, linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v2 4/5] KVM: add KVM_USER_EXIT vcpu ioctl for userspace exit
Date: Wed, 19 Aug 2015 09:43:55 +0300
Message-ID: <55D425AB.8000800@gmail.com>
In-Reply-To: <55D38E37.5060709@redhat.com>

On 08/18/2015 10:57 PM, Paolo Bonzini wrote:
>
> On 18/08/2015 11:30, Avi Kivity wrote:
>>> KVM_USER_EXIT in practice should be so rare (at least with in-kernel
>>> LAPIC) that I don't think this matters.  KVM_USER_EXIT is relatively
>>> uninteresting; it only exists to provide an alternative to signals that
>>> doesn't require expensive atomics on each and every KVM_RUN. :(
>> Ah, so the idea is to remove the cost of changing the signal mask?
> Yes, it's explained in the cover letter.
>
>> Yes, although it looks like a thread-local operation, it takes a
>> process-wide lock.
> IIRC the lock was only task-wide and uncontended.
> Problem is, it's on
> the node that created the thread rather than the node that is running
> it, and inter-node atomics are really, really slow.

Cached inter-node atomics are (relatively) fast, but I think it really
is a process-wide lock; sigprocmask calls:

    void __set_current_blocked(const sigset_t *newset)
    {
            struct task_struct *tsk = current;

            spin_lock_irq(&tsk->sighand->siglock);
            __set_task_blocked(tsk, newset);
            spin_unlock_irq(&tsk->sighand->siglock);
    }

    struct sighand_struct {
            atomic_t                count;
            struct k_sigaction      action[_NSIG];
            spinlock_t              siglock;
            wait_queue_head_t       signalfd_wqh;
    };

Since sigaction is usually process-wide, I conclude that tsk->sighand
will be too.

>
> For guests spanning >1 host NUMA nodes it's not really practical to
> ensure that the thread is created on the right node.  Even for guests
> that fit into 1 host node, if you rely on AutoNUMA the VCPUs are created
> too early for AutoNUMA to have any effect.  And newer machines have
> frighteningly small nodes (two nodes per socket, so it's something like
> 7 pCPUs if you don't have hyper-threading enabled).  True, the NUMA
> penalty within the same socket is not huge, but it still costs a few
> thousand clock cycles on vmexit.flat and this feature sweeps it away
> completely.
>
>> I expect most user wakeups are via irqfd, so indeed the performance of
>> KVM_USER_EXIT is uninteresting.
> Yup, either irqfd or KVM_SET_SIGNAL_MSI.
>
> Paolo