Subject: Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
To: shikemeng, mingo@redhat.com, peterz@infradead.org
Cc: linux-kernel@vger.kernel.org
References: <1568516867-11300-1-git-send-email-shikemeng@huawei.com>
From: Valentin Schneider
Message-ID: <2e856d95-0fbd-7239-000e-11cd7a1d05eb@arm.com>
Date: Sun, 15 Sep 2019 18:00:53 +0100
In-Reply-To: <1568516867-11300-1-git-send-email-shikemeng@huawei.com>

On 15/09/2019 04:07, shikemeng wrote:
> From:
>
> reason: migration to invalid cpu in __set_cpus_allowed_ptr
> archive path: patches/euleros/sched
>
> An Oops occurs when running qemu on arm64:
> Unable to handle kernel paging request at virtual address ffff000008effe40
> Internal error: Oops: 96000007 [#1] SMP
> Process migration/0 (pid: 12, stack limit = 0x00000000084e3736)
> pstate: 20000085 (nzCv daIf -PAN -UAO)
> pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
> lr : move_queued_task.isra.21+0x124/0x298
> ...
> Call trace:
> __ll_sc___cmpxchg_case_acq_4+0x4/0x20
> __migrate_task+0xc8/0xe0
> migration_cpu_stop+0x170/0x180
> cpu_stopper_thread+0xec/0x178
> smpboot_thread_fn+0x1ac/0x1e8
> kthread+0x134/0x138
> ret_from_fork+0x10/0x18
>
> __set_cpus_allowed_ptr will choose an active dest_cpu in the affinity mask to migrate the process if the process is not
> currently running on any one of the CPUs specified in the affinity mask. __set_cpus_allowed_ptr will choose an invalid
> dest_cpu (>= nr_cpu_ids, 1024 in my virtual machine) if the CPUs in the affinity mask are deactivated by cpu_down after
> the cpumask_intersects check. The cpumask_test_cpu of dest_cpu afterwards overflows and may pass if the corresponding bit
> is coincidentally set. As a consequence, the kernel will access an invalid rq address associated with the invalid cpu in
> migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs. The following process may trigger the Oops:
> 1) A process repeatedly binds itself to cpu0 and cpu1 in turn by calling sched_setaffinity
> 2) A shell script repeatedly runs "echo 0 > /sys/devices/system/cpu/cpu1/online" and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn
> 3) The Oops appears if the bit for the invalid cpu happens to be set in the memory after the tested cpumask.
>
> Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40
> Signed-off-by:

The log still isn't wrapped to 75 chars, and the change-id still hasn't
been removed. The subject should also mention that this is v2 of the
patch; again, this is all in the process documentation.

The fix itself looks fine though, so once the log respects the rules:

Reviewed-by: Valentin Schneider
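
For readers following the quoted log, here is a rough user-space sketch of
the failure mode it describes. It is not the patch and not kernel code:
NR_CPU_IDS, mask_t, any_and() and pick_dest_cpu() are hypothetical
stand-ins, assuming only that a search over two masks returns a value
>= nr_cpu_ids when they share no CPU, as the log says happens once the
wanted CPU goes offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's nr_cpu_ids; 4 keeps things small. */
#define NR_CPU_IDS 4u

/* Hypothetical model of a cpumask as a plain bit field (bit n = CPU n). */
typedef unsigned int mask_t;

/*
 * Model of "pick any CPU set in both masks". When the masks share no
 * CPU it returns NR_CPU_IDS, i.e. an index that is NOT a valid CPU.
 */
static unsigned int any_and(mask_t a, mask_t b)
{
    mask_t both = a & b;

    for (unsigned int cpu = 0; cpu < NR_CPU_IDS; cpu++)
        if (both & (1u << cpu))
            return cpu;
    return NR_CPU_IDS;
}

/*
 * Pick a destination CPU from the wanted mask among the active CPUs.
 * Returns true and stores the CPU in *dest only if a valid one exists;
 * *dest is left untouched otherwise, so callers never index per-CPU
 * data with an out-of-range value.
 */
static bool pick_dest_cpu(mask_t active, mask_t wanted, unsigned int *dest)
{
    unsigned int cpu;

    assert(dest != NULL);
    cpu = any_and(active, wanted);
    if (cpu >= NR_CPU_IDS)  /* range check on the result itself */
        return false;
    *dest = cpu;
    return true;
}
```

With CPUs 0 and 1 active (0x3) and CPU 1 wanted (0x2), pick_dest_cpu()
stores 1. If CPU 1 goes offline in between (active becomes 0x1), any_and()
returns NR_CPU_IDS and pick_dest_cpu() refuses, where testing a bit at that
index in a real bitmap would instead read past its end.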