From: Dmitry Vyukov
Date: Thu, 2 Aug 2018 13:36:25 +0200
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer
To: Catalin Marinas
Cc: Will Deacon, Mark Rutland, Kate Stewart, linux-doc@vger.kernel.org,
    Paul Lawrence, Linux Memory Management List, Alexander Potapenko,
    Chintan Pandya, Christoph Lameter, Ingo Molnar, Jacob Bramley,
    Jann Horn, Mark Brand, kasan-dev, linux-sparse@vger.kernel.org,
    Geert Uytterhoeven, Andrey Ryabinin, Dave Martin, Evgeniy Stepanov,
    Arnd Bergmann, Linux Kbuild mailing list, Marc Zyngier,
    Andrey Konovalov, Ramana Radhakrishnan, Ruben Ayrapetyan,
    Mike Rapoport, Linux ARM, Kostya Serebryany, Ard Biesheuvel,
    Greg Kroah-Hartman, Nick Desaulniers, LKML, "Eric W . Biederman",
    Lee Smith, Andrew Morton, "Kirill A . Shutemov"
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 2, 2018 at 1:10 PM, Catalin Marinas wrote:
> On Wed, Aug 01, 2018 at 06:52:09PM +0200, Dmitry Vyukov wrote:
>> On Wed, Aug 1, 2018 at 6:35 PM, Will Deacon wrote:
>> > On Tue, Jul 31, 2018 at 03:22:13PM +0200, Andrey Konovalov wrote:
>> >> On Wed, Jul 18, 2018 at 7:16 PM, Andrey Konovalov wrote:
>> >> > On Tue, Jul 3, 2018 at 7:36 PM, Will Deacon wrote:
>> >> >> Hmm, but elsewhere in this thread, Evgenii is motivating the need for this
>> >> >> patch set precisely because the lower overhead means it's suitable for
>> >> >> "near-production" use. So I don't think writing this off as a debugging
>> >> >> feature is the right approach, and we instead need to put effort into
>> >> >> analysing the impact of address tags on the kernel as a whole. Playing
>> >> >> whack-a-mole with subtle tag issues sounds like the worst possible outcome
>> >> >> for the long-term.
>> >> >
>> >> > I don't see a way to find cases where pointer tags would matter
>> >> > statically, so I've implemented the dynamic approach that I mentioned
>> >> > above. I've instrumented all pointer comparisons/subtractions in an
>> >> > LLVM compiler pass and used a kernel module that would print a bug
>> >> > report whenever two pointers with different tags are being
>> >> > compared/subtracted (ignoring comparisons with NULL pointers and with
>> >> > pointers obtained by casting an error code to a pointer type). Then I
>> >> > tried booting the kernel in QEMU and on an Odroid C2 board and I ran
>> >> > syzkaller overnight.
>> >> >
>> >> > This yielded the following results.
>> >> >
>> >> > ======
>> >> >
>> >> > The two places that look interesting are:
>> >> >
>> >> > is_vmalloc_addr in include/linux/mm.h (already mentioned by Catalin)
>> >> > is_kernel_rodata in mm/util.c
>> >> >
>> >> > Here we compare a pointer with some fixed untagged values to make sure
>> >> > that the pointer lies in a particular part of the kernel address
>> >> > space. Since KHWASAN doesn't add tags to pointers that belong to
>> >> > rodata or vmalloc regions, this should work as is. To make sure I've
>> >> > added debug checks to those two functions that check that the result
>> >> > doesn't change whether we operate on pointers with or without
>> >> > untagging.
>> >> >
>> >> > ======
>> >> >
>> >> > A few other cases that don't look that interesting:
>> >> >
>> >> > Comparing pointers to achieve unique sorting order of pointee objects
>> >> > (e.g. sorting locks addresses before performing a double lock):
>> >> >
>> >> > tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c
>> >> > pipe_double_lock in fs/pipe.c
>> >> > unix_state_double_lock in net/unix/af_unix.c
>> >> > lock_two_nondirectories in fs/inode.c
>> >> > mutex_lock_double in kernel/events/core.c
>> >> >
>> >> > ep_cmp_ffd in fs/eventpoll.c
>> >> > fsnotify_compare_groups fs/notify/mark.c
>> >> >
>> >> > Nothing needs to be done here, since the tags embedded into pointers
>> >> > don't change, so the sorting order would still be unique.
>> >> >
>> >> > Check that a pointer belongs to some particular allocation:
>> >> >
>> >> > is_sibling_entry lib/radix-tree.c
>> >> > object_is_on_stack in include/linux/sched/task_stack.h
>> >> >
>> >> > Nothing needs to be done here either, since two pointers can only belong to
>> >> > the same allocation if they have the same tag.
>> >> >
>> >> > ======
>> >> >
>> >> > Will, Catalin, WDYT?
>> >>
>> >> ping
>> >
>> > Thanks for tracking these cases down and going through each of them. The
>> > obvious follow-up question is: how do we ensure that we keep on top of
>> > this in mainline? Are you going to repeat your experiment at every kernel
>> > release or every -rc or something else? I really can't see how we can
>> > maintain this in the long run, especially given that the coverage we have
>> > is only dynamic -- do you have an idea of how much coverage you're actually
>> > getting for, say, a defconfig+modules build?
>> >
>> > I'd really like to enable pointer tagging in the kernel, I'm just still
>> > failing to see how we can do it in a controlled manner where we can reason
>> > about the semantic changes using something other than a best-effort,
>> > case-by-case basis which is likely to be fragile and error-prone.
>> > Unfortunately, if that's all we have, then this gets relegated to a
>> > debug feature, which sort of defeats the point in my opinion.
>>
>> Well, in some cases there is no other way than resorting to dynamic testing.
>> How do we ensure that kernel does not dereference NULL pointers, does
>> not access objects after free or out of bounds?
>
> We should not confuse software bugs (like NULL pointer dereference) with
> unexpected software behaviour introduced by khwasan where pointers no
> longer represent only an address range (e.g. calling find_vmap_area())
> but rather an address and a tag.

These are also software bugs, not different from NULL derefs that we
do not detect statically.

> Parts of the kernel rely on pointers
> being just address ranges.
>
> It's the latter that we'd like to identify more easily and avoid subtle
> bugs or change in behaviour when running correctly written code.

You mean _previously_ correct code, now it's just incorrect code. Not
different from any other types of incorrect code, and we do have
thousands of types of incorrect code already, most of these types are
not detectable statically.

>> And, yes, it's
>> constant maintenance burden resolved via dynamic testing.
>> In some sense HWASAN is better in this regard because it's like, say,
>> LOCKDEP in this regard. It's enabled only when one does dynamic
>> testing and collect, analyze and fix everything that pops up. Any
>> false positives will fail loudly (as opposed to, say, silent memory
>> corruptions due to use-after-frees), so any false positives will be
>> just first things to fix during the tool application.
>
> Again, you are talking about the bugs that khwasan would discover. We
> don't deny its value and false positives are acceptable here.

I am talking about the same thing you are talking about. New crashes
or changes in behavior will also pop up and will need to be fixed.

> However, not untagging a pointer when converting to long may have
> side-effects in some cases and I consider these bugs introduced by the
> khwasan support rather than bugs in the original kernel code. Ideally
> we'd need some tooling on top of khwasan to detect such shortcomings but
> I'm not sure we can do this statically, as Andrey already mentioned. For
> __user pointers, things are slightly better as we can detect the
> conversion either with sparse (modified) or some LLVM changes.

I agree. Ideally we have strict static checking for this type of bugs.
Ideally we have it for all types of bugs. NULL derefs,
use-after-frees, or say confusion between a function returning 0/1 for
ok/failure with a function returning true/false. How do we detect that
statically? Nohow.

For example, LOCKDEP has the same problem. Previously correct code can
become incorrect and require finer-grained lock class annotations.
KMEMLEAK has the same problem: previously correct code that hides a
pointer may now need changes to unhide the pointer.

If somebody has a practical idea how to detect these statically, let's
do it. Otherwise let's go with the traditional solution to this --
dynamic testing. The patch series shows that the problem is not a
disaster and we won't need to change just every line of kernel code.
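
As a rough illustration of the class of issue discussed in this thread (this is not code from the patch series), the sketch below models a 64-bit pointer as a Python integer with a tag in the top byte. The tag position, the 0xFF "native" tag, the region bounds and all helper names are assumptions made up for the example. It shows why a range check against fixed untagged bounds only keeps its old behaviour if the pointer is untagged (or was never tagged), and the kind of condition a dynamic checker could report when two pointers with different tags are compared.

```python
# Hypothetical model: a 64-bit kernel pointer whose top byte holds a tag.
# Constants and helper names are illustrative, not kernel definitions.
TAG_SHIFT = 56
ADDR_MASK = (1 << TAG_SHIFT) - 1
NATIVE_TAG = 0xFF  # top byte of an untagged kernel pointer in this model

# Made-up bounds standing in for a vmalloc-style address range.
REGION_START = 0xFFFF_0000_1000_0000
REGION_END = 0xFFFF_0000_2000_0000


def set_tag(ptr: int, tag: int) -> int:
    """Return ptr with its top byte replaced by tag."""
    return (ptr & ADDR_MASK) | (tag << TAG_SHIFT)


def get_tag(ptr: int) -> int:
    """Return the top byte of ptr."""
    return ptr >> TAG_SHIFT


def untag(ptr: int) -> int:
    """Restore the native top byte so the value is a plain address again."""
    return set_tag(ptr, NATIVE_TAG)


def in_region(ptr: int) -> bool:
    """Range check against fixed untagged bounds."""
    return REGION_START <= ptr < REGION_END


def tags_differ(a: int, b: int) -> bool:
    """Condition a dynamic checker could report on a comparison."""
    return get_tag(a) != get_tag(b)


addr = 0xFFFF_0000_1234_5000      # untagged pointer inside the region
tagged = set_tag(addr, 0x2A)      # same address carrying a non-native tag
```

In this model `in_region(addr)` holds, `in_region(tagged)` does not, and `in_region(untag(tagged))` holds again, while `tags_differ(tagged, REGION_START)` is the case a run-time check would flag.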