Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756565AbcDLLRG (ORCPT ); Tue, 12 Apr 2016 07:17:06 -0400 Received: from mail-wm0-f45.google.com ([74.125.82.45]:34574 "EHLO mail-wm0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756272AbcDLLRC convert rfc822-to-8bit (ORCPT ); Tue, 12 Apr 2016 07:17:02 -0400 MIME-Version: 1.0 In-Reply-To: References: <57cb1b66d85b85eadea28ef3304a62b1327ded45.1459432254.git.glider@google.com> <20160331142908.GG26532@leverpostej> <20160331160052.GA26393@leverpostej> <20160331171434.GC26393@leverpostej> Date: Tue, 12 Apr 2016 13:17:00 +0200 Message-ID: Subject: Re: [PATCH v1] arm64: allow building with kcov coverage on ARM64 From: Alexander Potapenko To: Dmitry Vyukov Cc: Mark Rutland , Catalin Marinas , Quentin Casasnovas , Will Deacon , Kostya Serebryany , Andrew Morton , syzkaller , LKML , linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , marc.zyngier@arm.com, Christoffer Dall Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4681 Lines: 113 On Mon, Apr 4, 2016 at 7:30 PM, Dmitry Vyukov wrote: > On Thu, Mar 31, 2016 at 7:18 PM, Alexander Potapenko wrote: >> On Thu, Mar 31, 2016 at 7:14 PM, Mark Rutland wrote: >>> On Thu, Mar 31, 2016 at 06:33:24PM +0200, Alexander Potapenko wrote: >>>> On Thu, Mar 31, 2016 at 6:00 PM, Mark Rutland wrote: >>>> > On Thu, Mar 31, 2016 at 05:09:29PM +0200, Alexander Potapenko wrote: >>>> >> Currently kcov instrumentation is disabled for the following files: >>>> > >>>> >> arch/x86/boot/* >>>> >> arch/x86/boot/compressed/* >>>> >> arch/x86/entry/vdso/* >>>> >> arch/x86/realmode/rm/* >>>> > >>>> > These are executed outside of the usual kernel context / address space, >>>> > so excluding these makes sense to me. >>>> > >>>> >> arch/x86/kernel/* >>>> >> arch/x86/kernel/apic/* >>>> >> arch/x86/kernel/cpu/common.c >>>> >> arch/x86/kernel/cpu/perf_event.c >>>> >> arch/x86/lib/delay.c >>>> >> arch/x86/mm/tlb.c >>>> > >>>> > For these, it's not immediately clear to me why instrumentation is >>>> > disabled, so I don't know whether or not we can instrument the analogous >>>> > arm64 code. >>>> According to the comments in >>>> https://github.com/torvalds/linux/commit/5c9a8750a6409c63a0f01d51a9024861022f6593, >>>> instrumentation of arch/x86/kernel/apic/* and arch/x86/lib/delay.c >>>> leads to non-deterministic coverage, >>> >>> To what extent does determinism matter? Are we just ruling out the worst >>> cases, or is this likely to turn into a whack-a-mole game? >> I guess we'd better ask Dmitry who excluded these files on x86 and >> experimented with coverage a lot. >> Dmitry, can you clarify this, please? >>> Do we exclude clocksources and other driver code? >>> >>> Looking at the arm64 delay timer code, it looks like everything will be >>> inlined (and therefore coverage should be deterministic so long as the >>> delay functions are called deterministically). That said, the same looks >>> basically true of the x86 code, so I guess I've misunderstood. >>> >>>> instrumenting others prevent the kernel from booting. >>> >>> I haven't been able to come up with a scenario whereby kcov would be >>> fatal for the above, so it's difficult to say if we have equivalent >>> problems. >>> >>> For reference, do we have any examples as to why any of these prevent >>> booting? >> Not sure there's any documentation so far except for the comments in >> the original kcov patch. > > > I did not look at all boot crashes and hangs. The low level arch code > like interrupts and early bootstrap is not interesting in this > setting, so I just bisected down to file level and excluded it. I > looked at one crash, though. It was related to setup of permanent > per-cpu storage, the kcov callback was emitted into a critical > sequence of instructions that switches per-cpu storage from bootstrap > to the real one, and access to 'current' faulted in that callback. In > general, for the boot issue it's better to exclude files lazily as we > discover new issues. > > Besides the boot issues, other files are excluded for two reasons: > 1. non-deterministic coverage (like interrupts and mutex slow paths). > 2. excessive coverage, for example memcpy-like loop will produce O(N) > coverage since kcov is trace-based. I guess that delay.c falls into > this category. > > We don't need 100% deterministic coverage. I agree that it's not > feasible. User-space part of syzkaller (kcov-based fuzzer) tries to > work around it with some heuristics. But I've tried to to eliminate > some frequent and common sources of non-determinism. I've repeatedly > collected coverage from a simple program containing > mmap-open-read-close, and eliminated all frequent, large spikes of > coverage one by one. > > Re delay.c: on x86 it is not inlined, and some parts are written in C > so disable of instrumentation worked. Is it inlined on arm64? I see at > least the following in the c file: > > void __delay(unsigned long cycles) > { > cycles_t start = get_cycles(); > > while ((get_cycles() - start) < cycles) > cpu_relax(); > } Mark, Looks like we haven't reached the consensus on this topic yet. Do you have anything to comment on what Dmitry said? I also wonder if we can, say, land the change to arch/arm64/Kconfig separately from makefile changes that improve the precision or fix certain build configurations. Alex -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg