Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756322AbcCCNYk (ORCPT ); Thu, 3 Mar 2016 08:24:40 -0500 Received: from mail-wm0-f68.google.com ([74.125.82.68]:35591 "EHLO mail-wm0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751390AbcCCNYj (ORCPT ); Thu, 3 Mar 2016 08:24:39 -0500 Date: Thu, 3 Mar 2016 14:24:34 +0100 From: Ingo Molnar To: Jakub Jelinek Cc: Arnaldo Carvalho de Melo , Peter Zijlstra , Colin King , Ingo Molnar , linux-kernel@vger.kernel.org, Richard Henderson , Dan Carpenter , Linus Torvalds , Andrew Morton Subject: Re: Q: why didn't GCC warn about this uninitialized variable? (was: Re: [PATCH] perf tests: initialize sa.sa_flags) Message-ID: <20160303132433.GA9460@gmail.com> References: <1456923322-29697-1-git-send-email-colin.king@canonical.com> <20160302125901.GF6356@twins.programming.kicks-ass.net> <20160302130350.GO3604@kernel.org> <20160302132127.GG6356@twins.programming.kicks-ass.net> <20160302132323.GP3604@kernel.org> <20160303121944.GB2484@gmail.com> <20160303125542.GD3017@tucnak.redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160303125542.GD3017@tucnak.redhat.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4208 Lines: 102 * Jakub Jelinek wrote: > On Thu, Mar 03, 2016 at 01:19:44PM +0100, Ingo Molnar wrote: > > struct sigaction sa; > > > > ... > > > > sigfillset(&sa.sa_mask); > > sa.sa_sigaction = segfault_handler; > > sigaction(SIGSEGV, &sa, NULL); > > > > ... which uninitialized sa.sa_flags field GCC merrily accepted as proper C code, > > despite us turning on essentially _all_ GCC warnings for the perf build that exist > > under the sun: > > GCC -W{,maybe-}uninitialized warning works only on SSA form, so in order to > get that warning, either it needs to be some scalar that is uninitialized, > or Scalar Replacement of Aggregates needs to be able to turn the structure > into independent scalars. Neither is the case here, as you take address of > the struct when passing its address to sigaction, and that call can't be > inlined nor is in any other way visible to the compiler, so that it could > optimize both the caller and sigaction itself at the same time. > > Even if GCC added infrastructure for tracking which bits/bytes in > aggregates are or might be uninitialized at which point, generally, > struct XYZ abc; > foo (&abc); > is so common pattern that warning here that the fields are uninitialized > would have extremely high false positive ratio. So at least for the kernel, people rely on external tools that do something like this anyway, and which are essentially annotated manually that duplicates much of the effort it would take to make a simple GCC solution work. So in the aggregate, we already have this overhead _anyway_, except that: - some of the best tools are closed (so the techniques never enter the free software world) - the fashion we get the feedback is per tool and inefficient - that there's also an inevitable lag between code added upstream and tools finding uninitialized variables bugs. So it's all highly inefficient and fragile. There's also another cost, the cost of finding the bugs themselves - for example here's a recent upstream kernel fix: commit e01d8718de4170373cd7fbf5cf6f9cb61cebb1e9 Author: Peter Zijlstra Date: Wed Jan 27 23:24:29 2016 +0100 perf/x86: Fix uninitialized value usage When calling intel_alt_er() with .idx != EXTRA_REG_RSP_* we will not initialize alt_idx and then use this uninitialized value to index an array. When that is not fatal, it can result in an infinite loop in its caller __intel_shared_reg_get_constraints(), with IRQs disabled. Alternative error modes are random memory corruption due to the cpuc->shared_regs->regs[] array overrun, which manifest in either get_constraints or put_constraints doing weird stuff. Only took 6 hours of painful debugging to find this. Neither GCC nor ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Smatch warnings flagged this bug. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ --- a/arch/x86/kernel/cpu/perf_event_intel.c +++ b/arch/x86/kernel/cpu/perf_event_intel.c @@ -1960,7 +1960,8 @@ intel_bts_constraints(struct perf_event *event) static int intel_alt_er(int idx, u64 config) { - int alt_idx; + int alt_idx = idx; + if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1)) return idx; 6 hours of PeterZ time translates to quite a bit of code restructuring overhead to eliminate false positive warnings... So it would scale far better if we could do this kind of checking and annotation in the kernel code, C module by C module, interface by interface. We could also push the detection to the stage where such bugs are introduced: when new code is written - which scales a lot better than the current method of a handful of people looking at static analysis tools. If GCC could warn in some really simplistic fashion (accepting tons of false positives), I'd definitely try to wade through the warnings, eliminate them step by step and make it all work for a couple of key subsystems I maintain. Most on-stack structures in the kernel are small, so there's very little reason to be overly clever with not initializing them. Thanks, Ingo