Message-ID: <4d99add1-5cf7-c608-a131-18959b85e5dc@suse.cz>
Date: Thu, 14 Oct 2021 11:33:03 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0
Content-Language: en-US
To: kernel test robot
Cc: 0day robot, Dmitry Vyukov, Marco Elver, Vijayanand Jitta,
 Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
 Daniel Vetter, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
 Geert Uytterhoeven, Oliver Glitta, Imran Khan, LKML, lkp@lists.01.org,
 Andrew Morton, linux-mm@kvack.org, dri-devel@lists.freedesktop.org,
 intel-gfx@lists.freedesktop.org, kasan-dev@googlegroups.com, Mike Rapoport
References: <20211014085450.GC18719@xsang-OptiPlex-9020>
From: Vlastimil Babka
Subject: Re: [lib/stackdepot] 1cd8ce52c5: BUG:unable_to_handle_page_fault_for_address
In-Reply-To: <20211014085450.GC18719@xsang-OptiPlex-9020>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/14/21 10:54, kernel test robot wrote:
>
>
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: 1cd8ce52c520c26c513899fb5aee42b8e5f60d0d ("[PATCH v2] lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()")
> url: https://github.com/0day-ci/linux/commits/Vlastimil-Babka/lib-stackdepot-allow-optional-init-and-stack_table-allocation-by-kvmalloc/20211012-170816
> base: git://anongit.freedesktop.org/drm-intel for-linux-next
>
> in testcase: rcutorture
> version:
> with following parameters:
>
>	runtime: 300s
>	test: cpuhotplug
>	torture_type: srcud
>
> test-description: rcutorture is rcutorture kernel module load/unload test.
> test-url: https://www.kernel.org/doc/Documentation/RCU/torture.txt
>
>
> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
> +---------------------------------------------+------------+------------+
> |                                             | a94a6d76c9 | 1cd8ce52c5 |
> +---------------------------------------------+------------+------------+
> | boot_successes                              | 30         | 0          |
> | boot_failures                               | 0          | 7          |
> | BUG:kernel_NULL_pointer_dereference,address | 0          | 2          |
> | Oops:#[##]                                  | 0          | 7          |
> | EIP:stack_depot_save                        | 0          | 7          |
> | Kernel_panic-not_syncing:Fatal_exception    | 0          | 7          |
> | BUG:unable_to_handle_page_fault_for_address | 0          | 5          |
> +---------------------------------------------+------------+------------+
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot
>
>
>
> [ 319.147926][ T259] BUG: unable to handle page fault for address: 0ec74110
> [ 319.149309][ T259] #PF: supervisor read access in kernel mode
> [ 319.150362][ T259] #PF: error_code(0x0000) - not-present page
> [ 319.151372][ T259] *pde = 00000000
> [ 319.151964][ T259] Oops: 0000 [#1] SMP
> [ 319.152617][ T259] CPU: 0 PID: 259 Comm: systemd-rc-loca Not tainted 5.15.0-rc1-00270-g1cd8ce52c520 #1
> [ 319.154514][ T259] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [ 319.156200][ T259] EIP: stack_depot_save+0x12a/0x4d0

Cc Mike Rapoport, looks like:
- memblock_alloc() should have failed (I think, because page allocator
  already took over?), but didn't. So apparently we got some area that wasn't
  fully mapped.
- using slab_is_available() is not accurate enough to detect when to use
  memblock or page allocator (kvmalloc in case of my patch). I have used it
  because memblock_alloc_internal() checks the same condition to issue a
  warning.

Relevant part of dmesg.xz that was attached:
[ 1.589075][ T0] Dentry cache hash table entries: 524288 (order: 9, 2097152 bytes, linear)
[ 1.592396][ T0] Inode-cache hash table entries: 262144 (order: 8, 1048576 bytes, linear)
[ 2.916844][ T0] allocated 31496920 bytes of page_ext

- this means we were allocating from page allocator by alloc_pages_exact_nid() already

[ 2.918197][ T0] mem auto-init: stack:off, heap alloc:off, heap free:on
[ 2.919683][ T0] mem auto-init: clearing system memory may take some time...
[ 2.921239][ T0] Initializing HighMem for node 0 (000b67fe:000bffe0)
[ 23.023619][ T0] Initializing Movable for node 0 (00000000:00000000)
[ 245.194520][ T0] Checking if this processor honours the WP bit even in supervisor mode...Ok.
[ 245.196847][ T0] Memory: 2914460K/3145208K available (20645K kernel code, 5953K rwdata, 12624K rodata, 760K init, 8112K bss, 230748K reserved, 0K cma-reserved, 155528K highmem)
[ 245.200521][ T0] Stack Depot allocating hash table with memblock_alloc

- initializing stack depot as part of initializing page_owner, uses
  memblock_alloc() because slab_is_available() is still false

[ 245.212005][ T0] Node 0, zone Normal: page owner found early allocated 0 pages
[ 245.213867][ T0] Node 0, zone HighMem: page owner found early allocated 0 pages
[ 245.216126][ T0] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1

- printed by slub's kmem_cache_init() after create_kmalloc_caches() has set
  slab_state to UP, which makes slab_is_available() true, but too late

In my local testing of the patch, when stackdepot was initialized through
page owner init, it was using kvmalloc(), so slab_is_available() was true.
It looks like the exact order of slab vs page_owner alloc is
non-deterministic; it could be arch-dependent or just random ordering of init
calls. A wrong order will expose the apparent fact that slab_is_available()
is not a good indicator of whether to use memblock or the page allocator, and
we would need a better one. Thoughts?
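
To make the suspected window concrete, here is a minimal userspace sketch, not kernel code. Everything in it is hypothetical: the three boot phases, model_slab_is_available(), choose_by_slab(), memblock_is_safe() and check_is_misleading() are names made up for illustration. The assumption that memblock stops being safe once the page allocator has taken over follows the guess above and is not something verified here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the early-boot hand-over; not kernel code. */
enum boot_phase {
	PHASE_MEMBLOCK,   /* only memblock hands out memory */
	PHASE_PAGE_ALLOC, /* page allocator took over, slab caches not created yet */
	PHASE_SLAB_UP,    /* slab is up; slab_is_available() would return true */
};

enum allocator { ALLOC_MEMBLOCK, ALLOC_KVMALLOC };

/* Stand-in for slab_is_available(): true only once slab is up. */
static bool model_slab_is_available(enum boot_phase phase)
{
	return phase >= PHASE_SLAB_UP;
}

/* The decision described above: key the choice on slab availability. */
static enum allocator choose_by_slab(enum boot_phase phase)
{
	return model_slab_is_available(phase) ? ALLOC_KVMALLOC : ALLOC_MEMBLOCK;
}

/* ASSUMPTION: memblock is only safe before the page allocator takes over. */
static bool memblock_is_safe(enum boot_phase phase)
{
	return phase == PHASE_MEMBLOCK;
}

/* True when the slab-based check picks memblock in a phase where it is unsafe. */
static bool check_is_misleading(enum boot_phase phase)
{
	assert(phase <= PHASE_SLAB_UP);
	return choose_by_slab(phase) == ALLOC_MEMBLOCK && !memblock_is_safe(phase);
}
```

In this model check_is_misleading() is true only for PHASE_PAGE_ALLOC, which would correspond to the gap between the page_ext allocation and the SLUB line in the dmesg excerpt. A better indicator would have to tell that middle phase apart from the first one.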