Subject: Re: [PATCH 5.10] kprobes/x86: Fix kprobe debug exception handling logic
To: Greg KH
References: <20230630020845.227939-1-lihuafei1@huawei.com> <2023063039-dotted-improper-7b3c@gregkh>
From: Li Huafei
Message-ID: <6cbfbd13-b2f6-4c76-8d0d-ac07f59b23e7@huawei.com>
Date: Sat, 1 Jul 2023 16:43:46 +0800
In-Reply-To: <2023063039-dotted-improper-7b3c@gregkh>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023/6/30 13:21, Greg KH wrote:
> On Fri, Jun 30, 2023 at 10:08:45AM +0800, Li Huafei wrote:
>> We get the following crash caused by a null pointer access:
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000000
>> ...
>> RIP: 0010:resume_execution+0x35/0x190
>> ...
>> Call Trace:
>> <#DB>
>> kprobe_debug_handler+0x41/0xd0
>> exc_debug+0xe5/0x1b0
>> asm_exc_debug+0x19/0x30
>> RIP: 0010:copy_from_kernel_nofault.part.0+0x55/0xc0
>> ...
>>
>> process_fetch_insn+0xfb/0x720
>> kprobe_trace_func+0x199/0x2c0
>> ? kernel_clone+0x5/0x2f0
>> kprobe_dispatcher+0x3d/0x60
>> aggr_pre_handler+0x40/0x80
>> ? kernel_clone+0x1/0x2f0
>> kprobe_ftrace_handler+0x82/0xf0
>> ? __se_sys_clone+0x65/0x90
>> ftrace_ops_assist_func+0x86/0x110
>> ? rcu_nocb_try_bypass+0x1f3/0x370
>> 0xffffffffc07e60c8
>> ? kernel_clone+0x1/0x2f0
>> kernel_clone+0x5/0x2f0
>>
>> The analysis reveals that kprobe and hardware breakpoints conflict in
>> the use of debug exceptions.
>>
>> If we set a hardware breakpoint on a memory address and also have a
>> kprobe event to fetch the memory at this address. Then when kprobe
>> triggers, it goes to read the memory and triggers hardware breakpoint
>> monitoring.
>> This time, since kprobe handles debug exceptions earlier
>> than hardware breakpoints, it will cause kprobe to incorrectly assume
>> that the exception is a kprobe trigger.
>>
>> Notice that after the mainline commit 6256e668b7af ("x86/kprobes: Use
>> int3 instead of debug trap for single-step"), kprobe no longer uses
>> debug trap, avoiding the conflict with hardware breakpoints here. This
>> commit is to remove the IRET that returns to kernel, not to fix the
>> problem we have here. Also there are a bunch of merge conflicts when
>> trying to apply this commit to older kernels, so fixing it directly in
>> older kernels is probably a better option.
>
> What is the list of commits that it would take to resolve this in these
> kernels? We would almost always prefer to do that instead of taking
> changes that are not upstream.

I have worked out that 9 patches need to be backported for 5.10:

 #9 8924779df820 ("x86/kprobes: Fix JNG/JNLE emulation")
 #8 dec8784c9088 ("x86/kprobes: Update kcb status flag after singlestepping")
 #7 2304d14db659 ("x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration")
 #6 2f706e0e5e26 ("x86/kprobes: Fix to identify indirect jmp and others using range case")
 #5 6256e668b7af ("x86/kprobes: Use int3 instead of debug trap for single-step")
 #4 a194acd316f9 ("x86/kprobes: Identify far indirect JMP correctly")
 #3 d60ad3d46f1d ("x86/kprobes: Retrieve correct opcode for group instruction")
 #2 abd82e533d88 ("x86/kprobes: Do not decode opcode in resume_execution()")
 #1 e689b300c99c ("kprobes/x86: Fix fall-through warnings for Clang")

The main one we need to backport is patch 5. Patches 1-4 are
prerequisites for it, and patches 6-9 are fixes for patch 5. The major
modifications are patch 2 and patch 5.
Patch 2 optimizes resume_execution() to avoid repeated instruction
decoding, and patch 5 uses int3 instead of the debug trap for
single-stepping. As Masami said in the commit message of patch 5, it
changes some kprobe behavior, but this has almost no effect on actual
usage.

I'm not sure whether backporting these patches is acceptable. Do I need
to send them out for review?

Thanks,
Huafei

>
> thanks,
>
> greg k-h
> .
>
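
For reference, the patch list in this mail is written newest-first, so the apply order is the reverse of how it reads. Below is a minimal sketch of that order, assuming lower patch numbers are applied first (patches before 5 as prerequisites, patches after 5 as fixes) and assuming the backport would be done with `git cherry-pick -x`; the mail does not say which tooling would actually be used. The names `BACKPORT_SERIES` and `cherry_pick_commands` are hypothetical, and the script only prints commands, it does not run git.

```python
# Commits from the list in this mail, keyed by the patch number used there.
# Assumption: lower numbers are applied first, so the series is walked in
# ascending order (prerequisites, then patch 5, then its follow-up fixes).
BACKPORT_SERIES = {
    1: ("e689b300c99c", "kprobes/x86: Fix fall-through warnings for Clang"),
    2: ("abd82e533d88", "x86/kprobes: Do not decode opcode in resume_execution()"),
    3: ("d60ad3d46f1d", "x86/kprobes: Retrieve correct opcode for group instruction"),
    4: ("a194acd316f9", "x86/kprobes: Identify far indirect JMP correctly"),
    5: ("6256e668b7af", "x86/kprobes: Use int3 instead of debug trap for single-step"),
    6: ("2f706e0e5e26", "x86/kprobes: Fix to identify indirect jmp and others using range case"),
    7: ("2304d14db659", "x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration"),
    8: ("dec8784c9088", "x86/kprobes: Update kcb status flag after singlestepping"),
    9: ("8924779df820", "x86/kprobes: Fix JNG/JNLE emulation"),
}


def cherry_pick_commands(series):
    """Return one 'git cherry-pick -x <sha>' line per patch, in apply order."""
    return [
        f"git cherry-pick -x {sha}  # patch {num}: {title}"
        for num, (sha, title) in sorted(series.items())
    ]


if __name__ == "__main__":
    # Dry run only: print the commands so the order can be reviewed first.
    for command in cherry_pick_commands(BACKPORT_SERIES):
        print(command)
```

The `-x` option makes git append a "(cherry picked from commit ...)" line to each backported commit message, which keeps the link to the upstream commit visible. Any merge conflicts still have to be resolved by hand.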