Message-Id: <200410050237.i952bx620740@unix-os.sc.intel.com>
From: "Chen, Kenneth W"
Subject: bug in sched.c:task_hot()
Date: Mon, 4 Oct 2004 19:38:01 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

The current implementation of task_hot() has a performance bug: it is
prone to integer underflow. The variable "now" (typically passed in as
rq->timestamp_last_tick) and p->timestamp are both defined as unsigned
long long. If the former is smaller than the latter, the subtraction
underflows and the result wraps around to a huge positive number. That
result is then compared with sd->cache_hot_time, so a cache-hot task is
wrongly identified as cache cold.

This bug causes a large number of incorrect process migrations across
CPUs (a stunning 10,000 per second). We lose cache affinity very quickly
and saw an almost double-digit performance regression on a db
transaction processing workload.

The patch below fixes the bug. Diffed against 2.6.9-rc3.

Signed-off-by: Ken Chen

--- linux-2.6.9-rc3/kernel/sched.c.orig	2004-10-04 19:11:21.000000000 -0700
+++ linux-2.6.9-rc3/kernel/sched.c	2004-10-04 19:19:27.000000000 -0700
@@ -180,7 +180,8 @@ static unsigned int task_timeslice(task_
 	else
 		return SCALE_PRIO(DEF_TIMESLICE, p->static_prio);
 }
-#define task_hot(p, now, sd) ((now) - (p)->timestamp < (sd)->cache_hot_time)
+#define task_hot(p, now, sd) ((long long) ((now) - (p)->timestamp)	\
+				< (long long) (sd)->cache_hot_time)
 
 enum idle_type
 {
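
For anyone who wants to see the wraparound outside the kernel, here is a
minimal userspace sketch of the two comparisons. It is not kernel code:
struct task, struct sched_domain, hot_before_patch() and hot_after_patch()
are hypothetical stand-ins that model only the fields task_hot() reads,
and the unsigned long long type of cache_hot_time is an assumption based
on the cast in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * read by task_hot() are modelled. cache_hot_time is assumed to be
 * unsigned long long, as the cast in the patch suggests. */
struct task {
	unsigned long long timestamp;
};

struct sched_domain {
	unsigned long long cache_hot_time;
};

/* Pre-patch comparison: when now < timestamp the unsigned subtraction
 * wraps to a value close to 2^64, which is never below cache_hot_time. */
#define task_hot_unsigned(p, now, sd) \
	((now) - (p)->timestamp < (sd)->cache_hot_time)

/* Patched comparison: the difference is reinterpreted as signed, so a
 * timestamp slightly ahead of "now" yields a small negative number.
 * Converting an out-of-range unsigned value to long long is
 * implementation-defined in C; this sketch assumes the usual
 * two's-complement wraparound that gcc provides. */
#define task_hot_signed(p, now, sd) \
	((long long)((now) - (p)->timestamp) < (long long)(sd)->cache_hot_time)

/* Small wrappers so both variants can be called with plain numbers. */
bool hot_before_patch(unsigned long long now, unsigned long long stamp,
		      unsigned long long hot_time)
{
	struct task p = { .timestamp = stamp };
	struct sched_domain sd = { .cache_hot_time = hot_time };

	return task_hot_unsigned(&p, now, &sd);
}

bool hot_after_patch(unsigned long long now, unsigned long long stamp,
		     unsigned long long hot_time)
{
	struct task p = { .timestamp = stamp };
	struct sched_domain sd = { .cache_hot_time = hot_time };

	return task_hot_signed(&p, now, &sd);
}
```

With now = 1000, a task timestamp of 1500 and a cache_hot_time of
2500000 (values picked only for illustration), hot_before_patch()
reports the task as cold while hot_after_patch() reports it as hot.
When now is ahead of the timestamp, the two variants agree.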