From: Robert Olsson
Date: Fri, 3 Jan 2003 22:20:03 +0100
To: "Avery Fay"
Cc: linux-kernel@vger.kernel.org
Subject: Gigabit/SMP performance problem

Avery Fay writes:

 > I'm working with a dual Xeon platform with 4 dual e1000 cards on
 > different PCI-X buses. I'm having trouble getting better performance
 > with the second CPU enabled (HT disabled). With a UP kernel (Red Hat's
 > 2.4.18), I can route about 2.9 gigabits/s at around 90% CPU
 > utilization. With an SMP kernel (Red Hat's 2.4.18), I can route about
 > 2.8 gigabits/s with both CPUs at around 90% utilization. This suggests
 > to me that the network code is serialized. I would expect one of two
 > things from my understanding of the 2.4.x networking improvements
 > (softirqs allowing execution on more than one CPU):

Well, you have a gigabit router. :-) How is your routing set up? What
packet size? Also, you will never see increased performance for a single
flow with SMP; aggregated performance over several flows is the best you
can hope for. I have been fighting with this for some time too.

There is some useful data in /proc/net/softnet_stat: it holds per-CPU
counters, and the packets-received and "cpu collision" counters should
interest you.

As far as I understand, there is no serialization in the forwarding path
except where it has to be -- when softirqs running on different CPUs
queue packets onto a single output device. This shows up in the "cpu
collision" counter.

Here we also hit the inherent SMP cache-bouncing problem with TX
interrupts, when the TX ring holds skb's that were created and processed
on different CPUs. Which CPU is going to take the interrupt? No matter
where we end up running kfree, we are going to see a lot of cache
bouncing.

For systems where the same interface is used for both in and out,
smp_affinity can be used to keep everything on one CPU. In practice this
is impossible for forwarding, and the bouncing hurts especially with
small packets...

A little TX test illustrates this. The sender runs on cpu0:

  UP                              186 kpps
  SMP, affinity to cpu0           160 kpps
  SMP, affinity to cpu0 and cpu1  124 kpps
  SMP, affinity to cpu1           106 kpps

We are playing with some code that might reduce this problem.

Cheers.

						--ro
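
The softnet_stat counters mentioned above can be inspected with a small
user-space program. This is only a rough sketch: the column layout is
assumed to match the 2.4-era kernels, where the first (hexadecimal) field
on each per-CPU line is packets processed and the last field is
cpu_collision; later kernels append more columns.

/* softnet.c -- print per-CPU packet and collision counters from
 * /proc/net/softnet_stat.  Rough sketch: assumes the 2.4-era layout
 * (first field = packets processed, last field = cpu_collision). */
#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/net/softnet_stat", "r");
	char line[512];
	int cpu = 0;

	if (!f) {
		perror("/proc/net/softnet_stat");
		return 1;
	}

	while (fgets(line, sizeof(line), f)) {
		unsigned long field[16];
		int n = 0;
		char *p = strtok(line, " \n");

		while (p && n < 16) {
			sscanf(p, "%lx", &field[n]);	/* counters are hex */
			n++;
			p = strtok(NULL, " \n");
		}
		if (n >= 2)
			printf("cpu%d: processed %lu  cpu_collision %lu\n",
			       cpu, field[0], field[n - 1]);
		cpu++;
	}
	fclose(f);
	return 0;
}

Compile with "gcc -o softnet softnet.c" and compare the cpu_collision
count against the processed count while the forwarding test runs.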
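
For the single-interface case where smp_affinity helps, the CPU mask is
simply written to /proc/irq/<N>/smp_affinity. Another rough sketch, with a
placeholder IRQ number and mask; the real e1000 IRQ numbers have to be
looked up in /proc/interrupts for your box:

/* set_affinity.c -- bind one IRQ to one CPU by writing a hex CPU mask to
 * /proc/irq/<irq>/smp_affinity.  The default IRQ and mask below are only
 * placeholders; check /proc/interrupts for the e1000 entries. */
#include <stdio.h>

int main(int argc, char **argv)
{
	const char *irq  = argc > 1 ? argv[1] : "24";	/* placeholder IRQ  */
	const char *mask = argc > 2 ? argv[2] : "1";	/* cpu0 = bit 0 set */
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%s/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		return 1;
	}
	fprintf(f, "%s\n", mask);
	fclose(f);
	return 0;
}

The same thing can of course be done from a shell with
"echo 1 > /proc/irq/<N>/smp_affinity".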