Date: Mon, 6 Oct 2014 08:31:54 -0700 (PDT)
From: David Lang
To: Christoph Lameter
cc: Thomas Gleixner, Richard Cochran, linux-kernel@vger.kernel.org
Subject: Re: Why do we still have 32 bit counters? Interrupt counters overflow within 50 days
References: <20141003120345.GA6652@localhost.localdomain>

On Mon, 6 Oct 2014, Christoph Lameter wrote:

> On Mon, 6 Oct 2014, Thomas Gleixner wrote:
>
>> So if you want to fix that as well, you really need to think about the
>> 32 bit case because there is no serialization for the interrupts which
>> are delivered directly from their own vector. And no, we should not
>> diverge 32 and 64 bit artificially here simply because the same 50
>> days wrap applies to both.
>
> Is it a divergence if both 64 bit and 32 bit are using unsigned long?
>
>> I really start to wonder whether all this is worth the trouble. It has
>> been this way forever and 1k timer interrupts per second is not really
>> a new thing either. So we did not change anything which suddenly makes
>> tools confused.
>
> Tools expect the number of interrupts to increase linearly and not jump
> by 2^32 once in a while. There are functions in the kernel (/proc/stat)
> that sum up various interrupt counters and that are of type unsigned
> long. These larger numbers can suddenly jump by 2^32.
> It's pretty unusual for a 64 bit counter to do that and it required
> some head scratching until we figured that one out.

No, tools recognize that things happen (wraps, reboots, etc.) and apply a
threshold: "if this value changes by more than the threshold, something
happened and it's not valid to use this delta."

This has been the case for decades. If you have a monitoring tool that does
not account for this sort of thing, you have an immature tool.

David Lang
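
To illustrate the threshold check described above, here is a minimal sketch in Python. The names days_until_wrap and checked_delta, and the max_delta parameter, are hypothetical and not taken from any particular monitoring tool; the sketch only assumes the tool samples a counter periodically and compares consecutive readings. It also works out the wrap time quoted in the subject line: a 32-bit counter incremented 1000 times per second wraps after roughly 49.7 days.

```python
from typing import Optional

WRAP_32 = 2 ** 32  # number of distinct values in a 32-bit counter


def days_until_wrap(events_per_second: float) -> float:
    """Days for a 32-bit counter to wrap at a constant event rate."""
    return WRAP_32 / events_per_second / 86_400


def checked_delta(prev: int, curr: int, max_delta: int) -> Optional[int]:
    """Return curr - prev, or None when the change is not plausible.

    A negative change (counter wrapped or machine rebooted) or a change
    larger than max_delta is treated as "something happened", so the
    caller skips this sample instead of reporting a bogus rate.
    """
    delta = curr - prev
    if delta < 0 or delta > max_delta:
        return None
    return delta


if __name__ == "__main__":
    # Wrap time for a 1 kHz timer interrupt.
    print(f"{days_until_wrap(1000):.1f} days")
    # Ordinary sample: the delta is usable.
    print(checked_delta(1_000, 1_500, max_delta=10_000))
    # Counter wrapped between samples: the delta is discarded.
    print(checked_delta(WRAP_32 - 10, 5, max_delta=10_000))
```

A tool built this way drops one sample per wrap or reboot rather than trying to guess the counter width, which matches the "not valid to use this delta" behaviour described in the message. The right value for max_delta depends on the sampling interval and the highest plausible event rate.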