Date: Tue, 14 Jan 2014 21:10:42 +0100
From: Hannes Frederic Sowa
To: Eric Dumazet
Cc: Austin S Hemmelgarn, netdev@vger.kernel.org, dborkman@redhat.com,
    linux-kernel@vger.kernel.org, darkjames-ws@darkjames.pl
Subject: Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
Message-ID: <20140114201042.GA13804@order.stressinduktion.org>
In-Reply-To: <1389729032.31367.262.camel@edumazet-glaptop2.roam.corp.google.com>

On Tue, Jan 14, 2014 at 11:50:32AM -0800, Eric Dumazet wrote:
> On Tue, 2014-01-14 at 14:22 -0500, Austin S Hemmelgarn wrote:
>
> > I disagree with the statement that current CPUs have reasonably fast
> > dividers. A lot of embedded processors and many low-end x86 CPUs do
> > not in fact have any hardware divider, and usually provide it using
> > microcode-based emulation if they provide it at all. The AMD Jaguar
> > micro-architecture in particular comes to mind: it uses an iterative
> > division algorithm provided by the microcode that only produces 2 bits
> > of quotient per cycle. Even in the best case (two 8-bit integers and an
> > integral 8-bit quotient) this still takes 4 cycles, which is twice as
> > slow as any other math operation on the same processor.
>
> I doubt you run any BPF filter with a divide instruction in it on these
> platforms.
>
> Get real, do not over-optimize things where it does not matter.

If I read the instruction tables correctly, we could halve the latency
with reciprocal divide even on Haswell.

What a pity that the Intel Architecture Code Analyzer supports neither
the imul nor the div instruction. :(
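For readers who have not looked at the technique: reciprocal divide
replaces a division by a fixed divisor with a multiply and a few shifts,
using constants precomputed once per divisor. Below is a minimal
userspace sketch of the Granlund/Montgomery-style variant, assuming
32-bit unsigned operands. The names struct recip, recip_init() and
recip_div() are made up for illustration and are not taken from the
patch under discussion; treat it as a sketch, not as the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: per-divisor constants, computed once. */
struct recip {
	uint32_t m;	/* magic multiplier */
	uint8_t sh1;	/* first shift, 0 or 1 */
	uint8_t sh2;	/* second shift */
};

/* Smallest l such that 2^l >= d, for d >= 1. */
static unsigned int ceil_log2(uint32_t d)
{
	unsigned int l = 0;

	while (l < 32 && ((uint64_t)1 << l) < d)
		l++;
	return l;
}

/* Precompute the constants for divisor d. This is the slow path. */
static struct recip recip_init(uint32_t d)
{
	struct recip r;
	unsigned int l;
	uint64_t m;

	assert(d != 0);	/* division by zero stays undefined */

	l = ceil_log2(d);
	/* m = floor(2^32 * (2^l - d) / d) + 1, which fits in 32 bits */
	m = (((uint64_t)1 << 32) * (((uint64_t)1 << l) - d)) / d + 1;

	r.m = (uint32_t)m;
	r.sh1 = l < 1 ? l : 1;		/* min(l, 1) */
	r.sh2 = l > 1 ? l - 1 : 0;	/* max(l - 1, 0) */
	return r;
}

/* Fast path: a / d using one widening multiply, adds and shifts. */
static uint32_t recip_div(uint32_t a, struct recip r)
{
	/* high 32 bits of the 64-bit product a * m */
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	/* t + ((a - t) >> sh1) never exceeds a, so it cannot overflow */
	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}
```

The intended use is to call recip_init() once when the divisor becomes
known and recip_div() for every subsequent division. Whether this
actually beats a hardware div depends on the CPU's multiply and divide
latencies, which is exactly the point under debate in this thread.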