From: kys@exchange.microsoft.com
To: gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org,
        devel@linuxdriverproject.org, olaf@aepfle.de, apw@canonical.com,
        vkuznets@redhat.com, jasowang@redhat.com,
        leann.ogasawara@canonical.com, marcelo.cerri@canonical.com
Cc: Stephen Hemminger <stephen@networkplumber.org>,
        Stephen Hemminger <sthemmin@microsoft.com>,
        "K. Y. Srinivasan" <kys@microsoft.com>
Subject: [PATCH 1/9] vmbus: only reschedule tasklet if time limit exceeded
Date: Sat,  4 Mar 2017 18:27:10 -0700
Message-Id: <1488677238-5150-1-git-send-email-kys@exchange.microsoft.com>
In-Reply-To: <1488677204-5059-1-git-send-email-kys@exchange.microsoft.com>
References: <1488677204-5059-1-git-send-email-kys@exchange.microsoft.com>
Reply-To: kys@microsoft.com
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3660
Lines: 102

From: Stephen Hemminger <stephen@networkplumber.org>

The change to reschedule tasklet if more data arrives in ring buffer
can cause performance regression if host timing is such that the
next response happens in small window.

Go back to a modified version of the original looping behavior.
If the race occurs in a small time, then loop. But if the tasklet
has been running for a long interval due to flood, then reschedule
the tasklet to allow migration to ksoftirqd.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/connection.c |   65 ++++++++++++++++++++++++----------------------
 1 files changed, 34 insertions(+), 31 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index a8366fe..fce27fb 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -296,44 +296,47 @@ struct vmbus_channel *relid2channel(u32 relid)
 
 /*
  * vmbus_on_event - Process a channel event notification
+ *
+ * For batched channels (default) optimize host to guest signaling
+ * by ensuring:
+ * 1. While reading the channel, we disable interrupts from host.
+ * 2. Ensure that we process all posted messages from the host
+ *    before returning from this callback.
+ * 3. Once we return, enable signaling from the host. Once this
+ *    state is set we check to see if additional packets are
+ *    available to read. In this case we repeat the process.
+ *    If this tasklet has been running for a long time
+ *    then reschedule ourselves.
  */
 void vmbus_on_event(unsigned long data)
 {
 	struct vmbus_channel *channel = (void *) data;
-	void (*callback_fn)(void *);
+	unsigned long time_limit = jiffies + 2;
 
-	/*
-	 * A channel once created is persistent even when there
-	 * is no driver handling the device. An unloading driver
-	 * sets the onchannel_callback to NULL on the same CPU
-	 * as where this interrupt is handled (in an interrupt context).
-	 * Thus, checking and invoking the driver specific callback takes
-	 * care of orderly unloading of the driver.
-	 */
-	callback_fn = READ_ONCE(channel->onchannel_callback);
-	if (unlikely(callback_fn == NULL))
-		return;
-
-	(*callback_fn)(channel->channel_callback_context);
-
-	if (channel->callback_mode == HV_CALL_BATCHED) {
-		/*
-		 * This callback reads the messages sent by the host.
-		 * We can optimize host to guest signaling by ensuring:
-		 * 1. While reading the channel, we disable interrupts from
-		 *    host.
-		 * 2. Ensure that we process all posted messages from the host
-		 *    before returning from this callback.
-		 * 3. Once we return, enable signaling from the host. Once this
-		 *    state is set we check to see if additional packets are
-		 *    available to read. In this case we repeat the process.
+	do {
+		void (*callback_fn)(void *);
+
+		/* A channel once created is persistent even when
+		 * there is no driver handling the device. An
+		 * unloading driver sets the onchannel_callback to NULL.
 		 */
-		if (hv_end_read(&channel->inbound) != 0) {
-			hv_begin_read(&channel->inbound);
+		callback_fn = READ_ONCE(channel->onchannel_callback);
+		if (unlikely(callback_fn == NULL))
+			return;
 
-			tasklet_schedule(&channel->callback_event);
-		}
-	}
+		(*callback_fn)(channel->channel_callback_context);
+
+		if (channel->callback_mode != HV_CALL_BATCHED)
+			return;
+
+		if (likely(hv_end_read(&channel->inbound) == 0))
+			return;
+
+		hv_begin_read(&channel->inbound);
+	} while (likely(time_before(jiffies, time_limit)));
+
+	/* The time limit (2 jiffies) has been reached */
+	tasklet_schedule(&channel->callback_event);
 }
 
 /*
-- 
1.7.1