From: Uladzislau Rezki
Date: Tue, 21 Jun 2022 16:29:06 +0200
To: Zhaoyang Huang
Cc: Uladzislau Rezki, "zhaoyang.huang", Andrew Morton, "open list:MEMORY MANAGEMENT", LKML, Ke Wang, Christoph Hellwig
Subject: Re: [PATCH] mm: fix racing of vb->va when kasan enabled
References: <1653447164-15017-1-git-send-email-zhaoyang.huang@unisoc.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Tue, Jun 21, 2022 at 5:27 PM Uladzislau Rezki wrote:
> >
> > > On Mon, Jun 20, 2022 at 6:44 PM Uladzislau Rezki wrote:
> > > > >
> > > > > >
> > > > > > Is it easy to reproduce? If so could you please describe the steps? As i see
> > > > > > the freeing of the "vb" is RCU safe whereas vb->va is not. But from the first
> > > > > > glance i do not see how it can accessed twice. Hm..
> > > > > It was raised from a monkey test on A13_k515 system and got 1/20 pcs
> > > > > failed. IMO, vb->va which out of vmap_purge_lock protection could race
> > > > > with a concurrent ra freeing within __purge_vmap_area_lazy.
> > > > >
> > > > Do you have exact steps how you run "monkey" test?
> > > There are about 30+ kos inserted during startup which could be a
> > > specific criteria for reproduction. Do you have doubts about the test
> > > result or the solution?
> >
> >
> > I do not have any doubt about your test results, so if you can trigger it
> > then there is an issue at least on the 5.4.161-android12 kernel.
> >
> > 1. With your fix we get expanded mutex range, thus the worst case of vmalloc
> > allocation can be increased when it fails and repeat. Because it also invokes
> > the purge_vmap_area_lazy() that access the same mutex.
> I am not sure I get your point. _vm_unmap_aliases calls
> _purge_vmap_area_lazy instead of purge_vmap_area_lazy. Do you have any
> other solutions? I really don't think my patch is the best way as I
> don't have a full view of vmalloc mechanism.
>
Yep, but it holds the mutex:

    mutex_lock(&vmap_purge_lock);
    purge_fragmented_blocks_allcpus();
    if (!__purge_vmap_area_lazy(start, end) && flush)
        flush_tlb_kernel_range(start, end);
    mutex_unlock(&vmap_purge_lock);

I do not have a solution yet. I am still trying to figure out how you can
trigger it.

    rcu_read_lock();
    list_for_each_entry_rcu(vb, &vbq->free, free_list) {
        spin_lock(&vb->lock);
        if (vb->dirty && vb->dirty != VMAP_BBMAP_BITS) {
            unsigned long va_start = vb->va->va_start;

so you say that "vb->va->va_start" can be accessed twice.
I do not see how it can happen. The purge_fragmented_blocks() removes "vb" from
the free_list and sets vb->dirty to VMAP_BBMAP_BITS to prevent purging it
again. It is protected by the spin_lock(&vb->lock):

    spin_lock(&vb->lock);
    if (vb->free + vb->dirty == VMAP_BBMAP_BITS && vb->dirty != VMAP_BBMAP_BITS) {
        vb->free = 0; /* prevent further allocs after releasing lock */
        vb->dirty = VMAP_BBMAP_BITS; /* prevent purging it again */
        vb->dirty_min = 0;
        vb->dirty_max = VMAP_BBMAP_BITS;

so VMAP_BBMAP_BITS is set under the spinlock. The _vm_unmap_aliases() checks it:

    list_for_each_entry_rcu(vb, &vbq->free, free_list) {
        spin_lock(&vb->lock);
        if (vb->dirty && vb->dirty != VMAP_BBMAP_BITS) {
            unsigned long va_start = vb->va->va_start;
            unsigned long s, e;

that is, the "vb->dirty != VMAP_BBMAP_BITS" condition. Am I missing your point here?

> >
> > 2. You run 5.4.161-android12 kernel what is quite old. Could you please
> > retest with latest kernel? I am asking because on the latest kernel with
> > CONFIG_KASAN i am not able to reproduce it.
> >
> > I do a lot of: vm_map_ram()/vm_unmap_ram()/vmalloc()/vfree() in parallel
> > by 64 kthreads on my 64 CPUs test system.
> The failure generates at 20s from starting up, I think it is a rare timing.
> >
> > Could you please confirm that you can trigger an issue on the latest kernel?
> Sorry, I don't have an available latest kernel for now.
>
Can you do: "gdb ./vmlinux", execute "l *_vm_unmap_aliases+0x164" and provide output?

--
Uladzislau Rezki
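
The vb->dirty argument above can be modelled in userspace. This is only a rough sketch, not kernel code: struct model_block, model_purge(), model_may_touch(), model_demo() and the BBMAP_BITS value are all made-up names for illustration, and a pthread mutex stands in for the vb->lock spinlock. It does not model the RCU list walk or the lifetime of vb->va, so it says nothing about whether the reported race exists; it only shows that once the purge side has set the sentinel under the lock, the check used by _vm_unmap_aliases() refuses the block.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define BBMAP_BITS 1024UL	/* hypothetical stand-in for VMAP_BBMAP_BITS */

/* Hypothetical userspace stand-in for the fields of a vmap block used above. */
struct model_block {
	pthread_mutex_t lock;	/* plays the role of vb->lock */
	unsigned long free;
	unsigned long dirty;
};

/* Purge side: set the sentinel under the lock so a block is purged only once. */
static bool model_purge(struct model_block *vb)
{
	bool purged = false;

	pthread_mutex_lock(&vb->lock);
	if (vb->free + vb->dirty == BBMAP_BITS && vb->dirty != BBMAP_BITS) {
		vb->free = 0;		/* no further allocations */
		vb->dirty = BBMAP_BITS;	/* sentinel: already purged */
		purged = true;
	}
	pthread_mutex_unlock(&vb->lock);
	return purged;
}

/* Unmap-aliases side: accept only blocks that are dirty but not yet purged. */
static bool model_may_touch(struct model_block *vb)
{
	bool ok;

	pthread_mutex_lock(&vb->lock);
	ok = vb->dirty && vb->dirty != BBMAP_BITS;
	pthread_mutex_unlock(&vb->lock);
	return ok;
}

/* Usage: after one successful purge, both paths refuse the block. */
static void model_demo(void)
{
	struct model_block vb = { PTHREAD_MUTEX_INITIALIZER, 24, 1000 };

	assert(model_may_touch(&vb));	/* dirty, not purged yet */
	assert(model_purge(&vb));	/* first purge wins */
	assert(!model_purge(&vb));	/* second purge is refused */
	assert(!model_may_touch(&vb));	/* sentinel blocks the other path */
}
```

Calling model_demo() from a test program (built with -pthread) exercises both paths on a single block.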