Date: Mon, 29 Jun 2020 15:01:05 +0100
From: Marc Zyngier <maz@kernel.org>
To: Zenghui Yu <yuzenghui@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>, Jason Cooper <jason@lakedaemon.net>,
    Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
    wanghaibin.wang@huawei.com, kuhn.chenqun@huawei.com,
    wangjingyi11@huawei.com
Subject: Re: [BUG] irqchip/gic-v4.1: sleeping function called from invalid context
In-Reply-To: <1d673e99-0dd2-d287-aedf-65686eed5194@huawei.com>
References: <1d673e99-0dd2-d287-aedf-65686eed5194@huawei.com>

Hi Zenghui,

On 2020-06-29 10:39, Zenghui Yu wrote:
> Hi All,
>
> Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1 enabled
> box, I get the following kernel splat:
>
> [    0.053766] BUG: sleeping function called from invalid context at mm/slab.h:567
> [    0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
> [    0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+ #23
> [    0.053770] Call trace:
> [    0.053774]  dump_backtrace+0x0/0x218
> [    0.053775]  show_stack+0x2c/0x38
> [    0.053777]  dump_stack+0xc4/0x10c
> [    0.053779]  ___might_sleep+0xfc/0x140
> [    0.053780]  __might_sleep+0x58/0x90
> [    0.053782]  slab_pre_alloc_hook+0x7c/0x90
> [    0.053783]  kmem_cache_alloc_trace+0x60/0x2f0
> [    0.053785]  its_cpu_init+0x6f4/0xe40
> [    0.053786]  gic_starting_cpu+0x24/0x38
> [    0.053788]  cpuhp_invoke_callback+0xa0/0x710
> [    0.053789]  notify_cpu_starting+0xcc/0xd8
> [    0.053790]  secondary_start_kernel+0x148/0x200
>
> # ./scripts/faddr2line vmlinux its_cpu_init+0x6f4/0xe40
> its_cpu_init+0x6f4/0xe40:
> allocate_vpe_l1_table at drivers/irqchip/irq-gic-v3-its.c:2818
>  (inlined by) its_cpu_init_lpis at drivers/irqchip/irq-gic-v3-its.c:3138
>  (inlined by) its_cpu_init at drivers/irqchip/irq-gic-v3-its.c:5166

Let me guess: a system with more than a single CommonLPIAff group?

> I've tried to replace GFP_KERNEL flag with GFP_ATOMIC to allocate memory
> in this atomic context, and the splat disappears. But after a quick look
> at [*], it seems not a good idea to allocate memory within the CPU
> hotplug notifier. I really don't know much about it, please have a look.
>
> [*] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=11e37d357f6ba7a9af850a872396082cc0a0001f

The allocation of the cpumask is pretty benign, and could either be
allocated upfront for all RDs (and freed on detecting that we share
the same CommonLPIAff group) or made atomic.

The much bigger issue is the alloc_pages call just after. Allocating
this upfront probably is the wrong thing to do, as you are likely to
allocate way too much memory, even if you free it quickly afterwards.

At this stage, I'd rather we turn this into an atomic allocation. A
notifier is just another atomic context, and if this fails at such an
early stage, then the CPU is unlikely to continue booting...

Would you like to write a patch for this? Given that you have tested
something, it probably already exists. Or do you want me to do it?

Thanks,

        M.
-- 
Jazz is not dead. It just smells funny...