Received: by 2002:a6b:500f:0:0:0:0:0 with SMTP id e15csp1984259iob; Thu, 5 May 2022 12:28:29 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwPHIBCAATtNeutk1tK4fXL95iJ9zYKEYGPzlFtCIcpezY6r+uRUk2FOrJF4SqjUcvJHblh X-Received: by 2002:a17:906:478c:b0:6f4:e6c6:526f with SMTP id cw12-20020a170906478c00b006f4e6c6526fmr6685120ejc.41.1651778909343; Thu, 05 May 2022 12:28:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1651778909; cv=none; d=google.com; s=arc-20160816; b=H98w35qb3VDAew08FQZ7kcKTBVmDym/bp/Vxqf5pYkhZQY7BRmZbjsVxIDshjYWuWO TDk81Gao5r/e4ObB4rUhfzGbsIrltOA0rHMSL52OsThwQdmJ3vGz75HmwDrYYzWxJ0rE AzA+x6MyQ71Kf5f2i14k9r+p/7xjnOgJb67V1HoHJIlRRH0dMsz5W4J9pI9KL5DO5vo1 vPOMRia93nzvEMEXatYirzzk0eodaxJWbAGYlNYWh/sNjkwz5JGLyXHmhXZMaFGkwaF5 B9+hT2H7A2MW8GgpVrs6nJFbUjQCz28wIEQjSBsZR+o7XrQIAHtEsZtJo8glb7o0eA19 e3FA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=lfs9ILt8sxsLN7d0CVj7IGkIjZk0ddULlJ98hbLjdl0=; b=PKGMUSqoakp0hvPHqLZa4GR/6lzFUGjImJfa8vt/o73ayrm5WZFu1NpSGcXNlE0VW2 apbS6vdOfXrMx4MAkQ9TK2jHB32fKHVVgyHzU3DPongJ5cUsqN5PEY1wvrh1LFVyT5ti wd4sWpxlZSXkjDWyW5G1uNdtjnKn+pPhiUojqn0scWC60hBVGpN4OkvO1RK9IzEkYocN 2Wx6M3lp71/JiB9TE9Po0HoMr0IZtm5vIRCGDxv6JDJyIeIrA6UAl1gh0Na+Htcv4fk7 WHRREMnQfsxybMIMaTTgCPZ7QvQ67L504cluCQgpxnFD1VPDAqFxIaNCNKaaLTmE0ppb SGkA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=Ndgpe6IF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id bh25-20020a170906a0d900b006f4adc54180si3150984ejb.611.2022.05.05.12.28.06; Thu, 05 May 2022 12:28:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=Ndgpe6IF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1382973AbiEESJI (ORCPT + 99 others); Thu, 5 May 2022 14:09:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55954 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1382978AbiEESJC (ORCPT ); Thu, 5 May 2022 14:09:02 -0400 Received: from mail-yb1-xb35.google.com (mail-yb1-xb35.google.com [IPv6:2607:f8b0:4864:20::b35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 195A326E1 for ; Thu, 5 May 2022 11:05:22 -0700 (PDT) Received: by mail-yb1-xb35.google.com with SMTP id s30so8975869ybi.8 for ; Thu, 05 May 2022 11:05:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=lfs9ILt8sxsLN7d0CVj7IGkIjZk0ddULlJ98hbLjdl0=; b=Ndgpe6IFiFofWceJK8I/DTEcQWT+/sHZsEWM1fYblnv9x81eSyj1R7FuXA8ILnq7mL MirG8+CssXftrYGqNJ4vf1ERQtOm6a4Nq9IPOlrkw8GYT4MTbR51+m8zuOlLdoGmwuqT LCH7motTy7WzByAkTNq0edQigKWkmNqayX293USnIaJr7BJOSxH8GmifZMRZNAE+kkk2 zWArlAg0ixMHNJJTifGAlU5k07E2zrU6JttPU9X300dt1NgXtfBycBtCfouCZIPcZCWE LiBSab1rlgeKcaxikFZxkwfIaXNH19dfTS2sMDmjcRpqtla1OP0W8r34tKP8eXHD/c2L VqsA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=lfs9ILt8sxsLN7d0CVj7IGkIjZk0ddULlJ98hbLjdl0=; b=hv/pfQzi/uc06m6BBwfRCeBe+95kiboHGv1dybE5vTeP0tdCkqiC5HpePZWHpeBoz9 sCGuBhyn3KKqRAKOQPEwXxU0RtF/+o74jdQZy6iqIYdh90Q7UxZSJYXTgbbdao+11RrI YUqDlRYDXz/54BS2K/mjxmY87SZc8yFLGIgIEoMQBCV7Bdnisgn7E/2JMiITzeKk+QFW 7XmL8ScHlQrU5S4ZDka9GOP3Q2mqx5cyeQP0Yi26SsTNKwYTGV/VelPvy93OJibhK/TO zMQz2ej4w33Jo7uwWHBd1QgoeDZAZxpUBHtpXIbAcx1nfaGTaOhwhIY721IJMe/rHUuA OhIw== X-Gm-Message-State: AOAM530NUS4Epa0puwyI0jHj/X6YpXYrGW4yB3E21FKTDIjqQXwIlzWd YKpRw8DuG94rAGxy2bAJi0qBFaPq94QlRtIZIgeg/A== X-Received: by 2002:a25:aa62:0:b0:648:590f:5a53 with SMTP id s89-20020a25aa62000000b00648590f5a53mr23847856ybi.5.1651773921071; Thu, 05 May 2022 11:05:21 -0700 (PDT) MIME-Version: 1.0 References: <20220426164315.625149-1-glider@google.com> <20220426164315.625149-29-glider@google.com> <87a6c6y7mg.ffs@tglx> <87y1zjlhmj.ffs@tglx> In-Reply-To: <87y1zjlhmj.ffs@tglx> From: Alexander Potapenko Date: Thu, 5 May 2022 20:04:44 +0200 Message-ID: Subject: Re: [PATCH v3 28/46] kmsan: entry: handle register passing from uninstrumented code To: Thomas Gleixner Cc: Alexander Viro , Andrew Morton , Andrey Konovalov , Andy Lutomirski , Arnd Bergmann , Borislav Petkov , Christoph Hellwig , Christoph Lameter , David Rientjes , Dmitry Vyukov , Eric Dumazet , Greg Kroah-Hartman , Herbert Xu , Ilya Leoshkevich , Ingo Molnar , Jens Axboe , Joonsoo Kim , Kees Cook , Marco Elver , Mark Rutland , Matthew Wilcox , "Michael S. Tsirkin" , Pekka Enberg , Peter Zijlstra , Petr Mladek , Steven Rostedt , Vasily Gorbik , Vegard Nossum , Vlastimil Babka , kasan-dev , Linux Memory Management List , Linux-Arch , LKML Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 3, 2022 at 12:00 AM Thomas Gleixner wrote: > > Alexander, First of all, thanks a lot for the comments, those are greatly appreciated! I tried to revert this patch and the previous one ("kmsan: instrumentation.h: add instrumentation_begin_with_regs()") and reimplement unpoisoning pt_regs without breaking into instrumentation_begin(), see below. > > > > Regarding the regs, you are right. It should be enough to unpoison the > > regs at idtentry prologue instead. > > I tried that initially, but IIRC it required patching each of the > > DEFINE_IDTENTRY_XXX macros, which already use instrumentation_begin(). > > Exactly 4 instances :) > Not really, I had to add a call to `kmsan_unpoison_memory(regs, sizeof(*regs));` to the following places in arch/x86/include/asm/idtentry.h: - DEFINE_IDTENTRY() - DEFINE_IDTENTRY_ERRORCODE() - DEFINE_IDTENTRY_RAW() - DEFINE_IDTENTRY_RAW_ERRORCODE() - DEFINE_IDTENTRY_IRQ() - DEFINE_IDTENTRY_SYSVEC() - DEFINE_IDTENTRY_SYSVEC_SIMPLE() - DEFINE_IDTENTRY_DF() , but even that wasn't enough. For some reason I also had to unpoison pt_regs directly in DEFINE_IDTENTRY_SYSVEC(sysvec_apic_timer_interrupt) and DEFINE_IDTENTRY_IRQ(common_interrupt). In the latter case, this could have been caused by asm_common_interrupt being entered from irq_entries_start(), but I am not sure what is so special about sysvec_apic_timer_interrupt(). Ideally, it would be great to find that single point where pt_regs are set up before being passed to all IDT entries. I used to do that by inserting calls to kmsan_unpoison_memory right into arch/x86/entry/entry_64.S (https://github.com/google/kmsan/commit/3b0583f45f74f3a09f4c7e0e0588169cef9= 18026), but that required storing/restoring all GP registers. Maybe there's a better way? > > then > > instrumentation_begin(); > foo(fargs...); > bar(bargs...); > instrumentation_end(); > > is a source of potential false positives because the state is not > guaranteed to be correct, neither for foo() nor for bar(), even if you > wipe the state in instrumentation_begin(), right? Yes, this is right. > This approximation approach smells fishy and it's inevitably going to be > a constant source of 'add yet another kmsan annotation/fixup' patches, > which I'm not interested in at all. > > As this needs compiler support anyway, then why not doing the obvious: > > #define noinstr \ > .... __kmsan_conditional > > #define instrumentation_begin() \ > ..... __kmsan_cond_begin > > #define instrumentation_end() \ > __kmsan_cond_end ....... > > and let the compiler stick whatever is required into that code section > between instrumentation_begin() and instrumentation_end()? We define noinstr as __attribute__((disable_sanitizer_instrumentation)) (https://llvm.org/docs/LangRef.html#:~:text=3Ddisable_sanitizer_instrumenta= tion), which means no instrumentation will be applied to the annotated function. Changing that behavior by adding subregions that can be instrumented sounds questionable. C also doesn't have good syntactic means to define these subregions - perhaps some __xxx_begin()/__xxx_end() intrinsics would work, but they would require both compile-time and run-time validation. Fortunately, I don't think we need to insert extra instrumentation into instrumentation_begin()/instrumentation_end() regions. What I have in mind is adding a bool flag to kmsan_context_state, that the instrumentation sets to true before the function call. When entering an instrumented function, KMSAN would check that flag and set it to false, so that the context state can be only used once. If a function is called from another instrumented function, the context state is properly set up, and there is nothing to worry about. If it is called from non-instrumented code (either noinstr or the skipped files that have KMSAN_SANITIZE:=3Dn), KMSAN would detect that and wipe the context state before use. By the way, I've noticed that at least for now (with pt_regs unpoisoning performed in IDT entries) the problem with false positives in noinstr code is entirely gone, so maybe we don't even have to bother. > Yes, it's more work on the tooling side, but the tooling side is mostly > a one time effort while chasing the false positives is a long term > nightmare. Well said. > Thanks, > > tglx --=20 Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Stra=C3=9Fe, 33 80636 M=C3=BCnchen Gesch=C3=A4ftsf=C3=BChrer: Paul Manicle, Liana Sebastian Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Diese E-Mail ist vertraulich. Falls Sie diese f=C3=A4lschlicherweise erhalten haben sollten, leiten Sie diese bitte nicht an jemand anderes weiter, l=C3=B6schen Sie alle Kopien und Anh=C3=A4nge davon und lassen Sie = mich bitte wissen, dass die E-Mail an die falsche Person gesendet wurde. This e-mail is confidential. If you received this communication by mistake, please don't forward it to anyone else, please erase all copies and attachments, and please let me know that it has gone to the wrong person.