Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S935955AbcKPUby (ORCPT ); Wed, 16 Nov 2016 15:31:54 -0500
Received: from mail-it0-f54.google.com ([209.85.214.54]:38185 "EHLO
	mail-it0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S933245AbcKPUbu (ORCPT );
	Wed, 16 Nov 2016 15:31:50 -0500
MIME-Version: 1.0
In-Reply-To:
References: <1479324933-8161-1-git-send-email-cmetcalf@mellanox.com>
From: John Stultz
Date: Wed, 16 Nov 2016 12:31:05 -0800
Message-ID:
Subject: Re: [PATCH v2] tile: avoid using clocksource_cyc2ns with absolute cycle count
To: Chris Metcalf
Cc: Thomas Gleixner, Salman Qazi, Paul Turner, Tony Lindgren, Steven Miao, lkml, Peter Zijlstra
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 903
Lines: 21

On Wed, Nov 16, 2016 at 12:29 PM, John Stultz wrote:
> On Wed, Nov 16, 2016 at 12:16 PM, Chris Metcalf wrote:
>> Change 4cecf6d401a0 results in essentially identical code for x86 as
>> this proposed change does for tile. In fact a follow-on change by
>> Salman introduced mult_frac() and switched to using it, so it was
>> identical at that point.
>>
>> PeterZ (cc'ed) then improved it to use __int128 math via
>> mul_u64_u32_shr(), but that doesn't help tile; we only do one multiply
>> instead of two, but the multiply is handled by an out-of-line call to
>> __multi3, and the sched_clock() function ends up about 2.5x slower as
>> a result.
>>
>> Thanks for thinking about this!
>
> Heh. Thanks for the history lesson and apologies for my forgetfulness. :)

Oh.. and some of these details might be useful to have in the commit message!

thanks
-john
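
For readers following the thread, here is a rough userspace sketch of the trade-off Chris describes. It is not the kernel code: SHIFT and the cyc2ns_*() names are hypothetical, chosen only for illustration. The naive form wraps once cycles * mult no longer fits in 64 bits, which is the risk with an absolute cycle count. The split form is in the spirit of mult_frac() and needs two multiplies. The wide form is in the spirit of mul_u64_u32_shr() and needs one 128-bit multiply, assuming a GCC/Clang target that provides unsigned __int128.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scale factor, standing in for a clocksource shift. */
#define SHIFT 10

/* Naive conversion: the 64-bit product wraps for large absolute counts. */
static inline uint64_t cyc2ns_naive(uint64_t cycles, uint32_t mult)
{
	return (cycles * mult) >> SHIFT;
}

/*
 * Split conversion: scale the high part and the low SHIFT bits separately.
 * Two multiplies, and neither overflows as long as the final result
 * itself fits in 64 bits.
 */
static inline uint64_t cyc2ns_split(uint64_t cycles, uint32_t mult)
{
	uint64_t quot = cycles >> SHIFT;
	uint64_t rem = cycles & ((1ULL << SHIFT) - 1);

	return quot * mult + ((rem * mult) >> SHIFT);
}

/*
 * Wide conversion: a single multiply carried out in 128 bits.
 * unsigned __int128 is a GCC/Clang extension; on targets without a
 * native 64x64 multiply the compiler may emit an out-of-line helper call.
 */
static inline uint64_t cyc2ns_wide(uint64_t cycles, uint32_t mult)
{
	return (uint64_t)(((unsigned __int128)cycles * mult) >> SHIFT);
}

/* Sanity check: the two overflow-safe variants should always agree. */
static inline void cyc2ns_check(uint64_t cycles, uint32_t mult)
{
	assert(cyc2ns_split(cycles, mult) == cyc2ns_wide(cycles, mult));
}
```

Calling cyc2ns_check() with a large count such as (1ULL << 60) + 12345 exercises the case where the naive form has already wrapped. Which of the two safe variants is faster depends on how the target implements the wide multiply, which is the point made above about tile.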