From: Long Li
To: "K . Y . Srinivasan", Haiyang Zhang, Stephen Hemminger, "James E . J . Bottomley", "Martin K .
Petersen", devel@linuxdriverproject.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Cc: Long Li
Subject: [Resend Patch 3/3] Storvsc: Select channel based on available percentage of ring buffer to write
Date: Tue, 27 Mar 2018 17:48:40 -0700
Message-Id: <20180328004840.22787-3-longli@linuxonhyperv.com>
X-Mailer: git-send-email 2.15.1
In-Reply-To: <20180328004840.22787-1-longli@linuxonhyperv.com>
References: <20180328004840.22787-1-longli@linuxonhyperv.com>
Reply-To: longli@microsoft.com
X-Mailing-List: linux-kernel@vger.kernel.org

From: Long Li

This is a best-effort estimate of how busy a channel's ring buffer is,
based on the percentage of the buffer that is available to write. It is
still possible that, by the time of the actual ring buffer write, the
space is no longer available because other processes are writing to the
ring at the same time.

Selecting a channel based on how full it is reduces the chance that a
ring buffer write fails, and avoids the situation where one channel
becomes overly busy.

With this change, storvsc can use a smaller ring buffer size (e.g. 40k
bytes) to take advantage of cache locality.

Signed-off-by: Long Li
---
 drivers/scsi/storvsc_drv.c | 62 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 50 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index a2ec0bc9e9fa..b1a87072b3ab 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -395,6 +395,12 @@ MODULE_PARM_DESC(storvsc_ringbuffer_size, "Ring buffer size (bytes)");
 
 module_param(storvsc_vcpus_per_sub_channel, int, S_IRUGO);
 MODULE_PARM_DESC(storvsc_vcpus_per_sub_channel, "Ratio of VCPUs to subchannels");
+
+static int ring_avail_percent_lowater = 10;
+module_param(ring_avail_percent_lowater, int, S_IRUGO);
+MODULE_PARM_DESC(ring_avail_percent_lowater,
+		"Select a channel if available ring size > this in percent");
+
 /*
  * Timeout in seconds for all devices managed by this driver.
  */
@@ -1285,9 +1291,9 @@ static int storvsc_do_io(struct hv_device *device,
 {
 	struct storvsc_device *stor_device;
 	struct vstor_packet *vstor_packet;
-	struct vmbus_channel *outgoing_channel;
+	struct vmbus_channel *outgoing_channel, *channel;
 	int ret = 0;
-	struct cpumask alloced_mask;
+	struct cpumask alloced_mask, other_numa_mask;
 	int tgt_cpu;
 
 	vstor_packet = &request->vstor_packet;
@@ -1301,22 +1307,53 @@ static int storvsc_do_io(struct hv_device *device,
 	/*
 	 * Select an an appropriate channel to send the request out.
 	 */
-
 	if (stor_device->stor_chns[q_num] != NULL) {
 		outgoing_channel = stor_device->stor_chns[q_num];
-		if (outgoing_channel->target_cpu == smp_processor_id()) {
+		if (outgoing_channel->target_cpu == q_num) {
 			/*
 			 * Ideally, we want to pick a different channel if
 			 * available on the same NUMA node.
 			 */
 			cpumask_and(&alloced_mask, &stor_device->alloced_cpus,
 				    cpumask_of_node(cpu_to_node(q_num)));
-			for_each_cpu_wrap(tgt_cpu, &alloced_mask,
-					outgoing_channel->target_cpu + 1) {
-				if (tgt_cpu != outgoing_channel->target_cpu) {
-					outgoing_channel =
-					stor_device->stor_chns[tgt_cpu];
-					break;
+
+			for_each_cpu_wrap(tgt_cpu, &alloced_mask, q_num + 1) {
+				if (tgt_cpu == q_num)
+					continue;
+				channel = stor_device->stor_chns[tgt_cpu];
+				if (hv_get_avail_to_write_percent(
+							&channel->outbound)
+						> ring_avail_percent_lowater) {
+					outgoing_channel = channel;
+					goto found_channel;
+				}
+			}
+
+			/*
+			 * All the other channels on the same NUMA node are
+			 * busy. Try to use the channel on the current CPU
+			 */
+			if (hv_get_avail_to_write_percent(
+						&outgoing_channel->outbound)
+					> ring_avail_percent_lowater)
+				goto found_channel;
+
+			/*
+			 * If we reach here, all the channels on the current
+			 * NUMA node are busy. Try to find a channel in
+			 * other NUMA nodes
+			 */
+			cpumask_andnot(&other_numa_mask,
+					&stor_device->alloced_cpus,
+					cpumask_of_node(cpu_to_node(q_num)));
+
+			for_each_cpu(tgt_cpu, &other_numa_mask) {
+				channel = stor_device->stor_chns[tgt_cpu];
+				if (hv_get_avail_to_write_percent(
+							&channel->outbound)
+						> ring_avail_percent_lowater) {
+					outgoing_channel = channel;
+					goto found_channel;
 				}
 			}
 		}
@@ -1324,7 +1361,7 @@ static int storvsc_do_io(struct hv_device *device,
 		outgoing_channel = get_og_chn(stor_device, q_num);
 	}
 
-
+found_channel:
 	vstor_packet->flags |= REQUEST_COMPLETION_FLAG;
 
 	vstor_packet->vm_srb.length = (sizeof(struct vmscsi_request) -
@@ -1733,7 +1770,8 @@ static int storvsc_probe(struct hv_device *device,
 	}
 
 	scsi_driver.can_queue = (max_outstanding_req_per_channel *
-				 (max_sub_channels + 1));
+				 (max_sub_channels + 1)) *
+				 (100 - ring_avail_percent_lowater) / 100;
 
 	host = scsi_host_alloc(&scsi_driver,
 			       sizeof(struct hv_host_device));
-- 
2.14.1
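
For anyone who wants to experiment with the selection order outside the
kernel, the following is a minimal user-space sketch in C. It is not the
driver code and makes several simplifying assumptions: struct chan,
avail_percent(), pick_channel() and scaled_can_queue() are hypothetical
names introduced only for illustration, every CPU is assumed to own a
channel, ring occupancy is modelled as a plain "used bytes" counter, and
the percentage is computed with ordinary integer division rather than
whatever hv_get_avail_to_write_percent() does internally.

```c
#include <assert.h>

/* Hypothetical model of one channel's outbound ring (not a kernel type). */
struct chan {
	unsigned int size;	/* ring data size in bytes */
	unsigned int used;	/* bytes currently occupied */
	int node;		/* NUMA node of the CPU owning this channel */
};

/* Percentage of the ring that is free to write, rounded down. */
static unsigned int avail_percent(const struct chan *c)
{
	assert(c->size > 0 && c->used <= c->size);
	return (unsigned int)((unsigned long long)(c->size - c->used) *
			      100 / c->size);
}

/*
 * Mirror the order used in the patch, under the assumptions above:
 *   1. other channels on q_num's NUMA node, starting at q_num + 1 and
 *      wrapping around;
 *   2. q_num's own channel;
 *   3. channels on other NUMA nodes, in ascending CPU order;
 *   4. if nothing is above the low-water mark, keep q_num's channel.
 * Returns the index of the chosen channel.
 */
static int pick_channel(const struct chan *ch, int n, int q_num,
			unsigned int lowater)
{
	int i, cpu;

	for (i = 1; i < n; i++) {
		cpu = (q_num + i) % n;
		if (ch[cpu].node == ch[q_num].node &&
		    avail_percent(&ch[cpu]) > lowater)
			return cpu;
	}

	if (avail_percent(&ch[q_num]) > lowater)
		return q_num;

	for (cpu = 0; cpu < n; cpu++) {
		if (ch[cpu].node != ch[q_num].node &&
		    avail_percent(&ch[cpu]) > lowater)
			return cpu;
	}

	return q_num;
}

/* Queue depth scaled down by the low-water percentage, as in the probe hunk. */
static unsigned int scaled_can_queue(unsigned int per_channel,
				     unsigned int sub_channels,
				     unsigned int lowater)
{
	return per_channel * (sub_channels + 1) * (100 - lowater) / 100;
}
```

With four channels of 40960 bytes split across two nodes, a request for
q_num 0 stays on channel 0 while its sibling on the same node is below
the 10% mark, and only moves to the other node once channel 0 itself
drops below it. The queue-depth helper shows the effect of the last
hunk: with an illustrative 100 requests per channel and 3 subchannels, a
low-water mark of 10 reduces 400 to 360.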