From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: KY Srinivasan <kys@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>, Haiyang Zhang <haiyangz@microsoft.com>,
        "devel\@linuxdriverproject.org" <devel@linuxdriverproject.org>,
        "linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/6] Drivers: hv: vmbus: distribute subchannels among all vcpus
References: <1429626460-7947-1-git-send-email-vkuznets@redhat.com>
	<1429626460-7947-6-git-send-email-vkuznets@redhat.com>
	<091d0fd6321f4dd490e61a574d5b5b50@SIXPR30MB031.064d.mgd.msft.net>
	<87oamdgalr.fsf@vitty.brq.redhat.com>
	<BN1PR0301MB0707CF5E57279A9E0244684AA0EC0@BN1PR0301MB0707.namprd03.prod.outlook.com>
Date: Mon, 27 Apr 2015 15:30:23 +0200
In-Reply-To: <BN1PR0301MB0707CF5E57279A9E0244684AA0EC0@BN1PR0301MB0707.namprd03.prod.outlook.com>
	(KY Srinivasan's message of "Fri, 24 Apr 2015 16:46:52 +0000")
Message-ID: <87oam9snps.fsf@vitty.brq.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3855
Lines: 91

KY Srinivasan <kys@microsoft.com> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
>> Sent: Friday, April 24, 2015 2:05 AM
>> To: Dexuan Cui
>> Cc: KY Srinivasan; Haiyang Zhang; devel@linuxdriverproject.org; linux-
>> kernel@vger.kernel.org
>> Subject: Re: [PATCH 5/6] Drivers: hv: vmbus: distribute subchannels among
>> all vcpus
>> 
>> Dexuan Cui <decui@microsoft.com> writes:
>> 
>> >> -----Original Message-----
>> >> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
>> >> Sent: Tuesday, April 21, 2015 22:28
>> >> To: KY Srinivasan
>> >> Cc: Haiyang Zhang; devel@linuxdriverproject.org; linux-
>> >> kernel@vger.kernel.org; Dexuan Cui
>> >> Subject: [PATCH 5/6] Drivers: hv: vmbus: distribute subchannels among all
>> >> vcpus
>> >>
>> >> Primary channels are distributed evenly across all vcpus we have. When the
>> >> host asks us to create subchannels it usually makes us num_cpus-1 offers
>> >
>> > Hi Vitaly,
>> > AFAIK, in the VSP of storvsc, the number of subchannels is
>> >  (the_number_of_vcpus - 1) / 4.
>> >
>> > This means for an 8-vCPU guest, there is only 1 subchannel.
>> >
>> > Your new algorithm tends to make the vCPUs with small-number busier:
>> > e.g., in the 8-vCPU case, assuming we have 4 SCSI controllers:
>> > vCPU0: scsi0's PrimaryChannel (P)
>> > vCPU1: scsi0's SubChannel (S) + scsi1's P
>> > vCPU2: scsi1's S + scsi2's P
>> > vCPU3: scsi2's S + scsi3's P
>> > vCPU4: scsi3's S
>> > vCPU5, 6 and 7 are idle.
>> >
>> > In this special case, the existing algorithm is better. :-)
>> >
>> > However, I do like this idea in your patch, that is, making sure a device's
>> > primary/sub channels are assigned to different vCPUs.
>> 
>> Under special circumstances, with the current code we can end up with
>> all subchannels on the same vCPU as the primary channel, I guess
>> :-) This is not something common, but possible.
>> 
>> >
>> > I'm just wondering if we should use an even better (and more complex)
>> > algorithm :-)
>> 
>> The question here is - does sticking to the current vCPU help? If it
>> does, I can suggest the following (I think I even mentioned that in my
>> PATCH 00): first we try to find a (sub)channel with target_cpu ==
>> current_vcpu and only when we fail we do the round robin. I'd like to
>> hear K.Y.'s opinion here as he's the original author :-)
>
> Sorry for the delayed response. Initially I had implemented a scheme that
> would pick an outgoing CPU that was closest to the CPU on which the request
> came (to maintain cache locality, especially on NUMA systems). I changed this
> algorithm to spread the load more uniformly as we were trying to improve Linux
> IOPS on Azure XIO (premium storage). We are currently testing this code on our
> Converged Offering - CPS and I am finding that the perf as measured by IOS has
> regressed. I have not narrowed down the reason for this regression and it may
> very well be the change in the algorithm for selecting the outgoing channel.
> In general, I don't think the logic here needs to be exact, and locality
> (being on the same CPU or within the same NUMA node) is important. Any change
> to this algorithm will have to be validated on different MSFT environments
> (Azure XIO, CPS etc.).

Thanks. Could you please compare the two algorithms here:
1) Simple round robin (the one my patch series implements, but with the
issues fixed; I'll send v2).
2) Try to find a (sub)channel with a matching vCPU and fall back to round
robin when that fails (I can include this in v2 as well).
We can then decide based on the test results.
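
To make the comparison concrete, here is a rough userspace sketch of the
two strategies. It is only an illustration: struct chan, struct hv_dev,
pick_round_robin() and pick_local_first() are made-up names, not the
actual vmbus structures or functions, and locking is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's vmbus structures. */
struct chan {
	unsigned int target_cpu;	/* vCPU this channel is bound to */
};

struct hv_dev {
	struct chan *chans;		/* chans[0] is the primary channel */
	size_t num_chans;		/* primary + subchannels, must be >= 1 */
	size_t next;			/* round-robin cursor */
};

/* Strategy 1: plain round robin over the primary and all subchannels. */
static struct chan *pick_round_robin(struct hv_dev *dev)
{
	struct chan *c;

	assert(dev->num_chans > 0);
	c = &dev->chans[dev->next];
	dev->next = (dev->next + 1) % dev->num_chans;
	return c;
}

/*
 * Strategy 2: prefer a channel whose target_cpu matches the CPU the
 * request was issued on; fall back to round robin only when none matches.
 */
static struct chan *pick_local_first(struct hv_dev *dev, unsigned int cur_cpu)
{
	size_t i;

	assert(dev->num_chans > 0);
	for (i = 0; i < dev->num_chans; i++)
		if (dev->chans[i].target_cpu == cur_cpu)
			return &dev->chans[i];

	return pick_round_robin(dev);
}
```

Note that in this sketch a local hit in strategy 2 does not advance the
round-robin cursor, so the fallback path keeps spreading load on its own
schedule. Whether that is the right behaviour is one of the things the
testing would have to show.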

>
> Regards,
>
> K. Y

-- 
  Vitaly