To:
Cc:
From: shikemeng
Subject: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
Message-ID: <4a492fe8-09b7-d8ee-8a16-602e592b00b6@huawei.com>
Date: Thu, 12 Sep 2019 09:55:27 +0800
List-ID: linux-kernel@vger.kernel.org

From 089dbf0216628ac6ae98742ab90725ca9c2bf201 Mon Sep 17 00:00:00 2001
From:
Date: Tue, 10 Sep 2019 09:44:58 -0400
Subject: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr

reason: migration to invalid cpu in __set_cpus_allowed_ptr
archive path: patches/euleros/sched

An Oops occurs when running qemu on arm64:

 Unable to handle kernel paging request at virtual address ffff000008effe40
 Internal error: Oops: 96000007 [#1] SMP
 Process migration/0 (pid: 12, stack limit = 0x00000000084e3736)
 pstate: 20000085 (nzCv daIf -PAN -UAO)
 pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
 lr : move_queued_task.isra.21+0x124/0x298
 ...
 Call trace:
  __ll_sc___cmpxchg_case_acq_4+0x4/0x20
  __migrate_task+0xc8/0xe0
  migration_cpu_stop+0x170/0x180
  cpu_stopper_thread+0xec/0x178
  smpboot_thread_fn+0x1ac/0x1e8
  kthread+0x134/0x138
  ret_from_fork+0x10/0x18

__set_cpus_allowed_ptr() chooses an active dest_cpu from the affinity
mask to migrate the process to when the process is not currently
running on any of the CPUs specified in that mask. However, it can pick
an invalid dest_cpu (>= nr_cpu_ids, 1024 in my virtual machine) if the
CPUs in the affinity mask are deactivated by cpu_down() after the
cpumask_intersects() check. The subsequent cpumask_test_cpu() of
dest_cpu then reads past the end of the cpumask and may pass if the
corresponding bit happens to be set. As a consequence, the kernel
accesses an invalid rq address associated with the invalid CPU in
migration_cpu_stop() -> __migrate_task() -> move_queued_task() and the
Oops occurs.

The following sequence may trigger the Oops:
1) A process repeatedly binds itself to cpu0 and cpu1 in turn by
   calling sched_setaffinity().
2) A shell script repeatedly runs
   "echo 0 > /sys/devices/system/cpu/cpu1/online" and
   "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn.
3) The Oops appears if the bit for the invalid CPU happens to be set in
   memory just past the tested cpumask.

Fix this by choosing dest_cpu at the point where the intersection with
cpu_valid_mask is checked, and returning -EINVAL if no valid CPU is
found, so that an out-of-range dest_cpu can never reach the migration
path.

Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40
---
 kernel/sched/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4b63fef..5181ea9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1112,7 +1112,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
 	if (cpumask_equal(&p->cpus_allowed, new_mask))
 		goto out;

-	if (!cpumask_intersects(new_mask, cpu_valid_mask)) {
+	dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
+	if (dest_cpu >= nr_cpu_ids) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -1133,7 +1134,6 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
 	if (cpumask_test_cpu(task_cpu(p), new_mask))
 		goto out;

-	dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
 	if (task_running(rq, p) || p->state == TASK_WAKING) {
 		struct migration_arg arg = { p, dest_cpu };
 		/* Need help from migration thread: drop lock and wait. */
--
1.8.5.6
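
For reference, a userspace loop along the lines of step 1) is sketched
below. The exact loop shape, CPU numbers and error handling are
illustrative assumptions rather than the reproducer used in the report
above; it is meant to run alongside a shell loop toggling
/sys/devices/system/cpu/cpu1/online as in step 2).

  /*
   * Illustrative sketch for step 1): pin the calling process to cpu0
   * and cpu1 alternately as fast as possible.
   */
  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>

  int main(void)
  {
          cpu_set_t set;
          int cpu = 0;

          for (;;) {
                  CPU_ZERO(&set);
                  CPU_SET(cpu, &set);
                  /* pid 0 means the calling process. */
                  if (sched_setaffinity(0, sizeof(set), &set))
                          perror("sched_setaffinity");
                  cpu ^= 1;       /* alternate between cpu0 and cpu1 */
          }
          return 0;
  }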