Received: by 2002:a05:6a10:a852:0:0:0:0 with SMTP id d18csp791888pxy; Wed, 5 May 2021 14:07:39 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxZ/LFUjNKWRbbraescio68cp6wIRdwJARwzKwTf6gSWxMAGLvTrIVkLBb5v9Lv0Vxf9Smy X-Received: by 2002:a17:906:2559:: with SMTP id j25mr727825ejb.42.1620248859423; Wed, 05 May 2021 14:07:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1620248859; cv=none; d=google.com; s=arc-20160816; b=JrNW5eFwYh3O60uvMexzIxiS6xFx/nNQTi3tPCrCOxBxpyzOG6KM7M1Cj/lgUW75tY LFdfsmsNmHQDp1clZikdtObl+IlKMck3BmLe5X7axExypIzY91iV+gdhVU/Nsf6abYsa MBLSkqpOaRwyKVag1y7viyiVX8p7NHD9buttl6xs+BGeU09MpLc1aC0e+BKy7ABmiP0p 0hT3A3Hj1KTkfn2mryzyu1m/AyLPOKskzkYeL9/iNE/+wGrzHc1GK/Q9rUcmvbdccY80 kZHLuH8j8vg7ojjaVIvHOgwJtvFt5vrUhoos5OrPtXr+fPLJ8cZnglsOvjQh7nNkpsAx 0erQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:message-id:date:subject:cc:to:from :dkim-signature; bh=JUmGq2wlaOeze3MXhpw6Yda3Y4YIjTKC44x0gDwnWr8=; b=N7q4F/XKBixSSYMGCe5SxUlv7WcjuhHcO3IEHcwRDNMqTDxamqfMDuxpkM1q70z+No oaAG8oNjZF3jR1eL3j+mPvuvtwWW3iaQ9x1ygfIKs9OhRwk0t6SCrsQ6VT4KZFO3IfLG xnM1Jmz3mRGuhgYNBUYGqKIQvHjVb9xhMV7jc1bUduKW1Xut/r8fIbXPUnyj2WvNm5uT iDO8e3fE+9/lenX0XCBwPxUXSWxYAv5Iurmg5LRsDGjQo47SdKHs5p4VRyQHL8bFtemh 9p2h04IM6jGzk4viBUGFAmISHxBbn7fIO1BsvABrM4xtPwnn19vMt9rRTskHW6iUizso shAw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BRvbfE4N; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b6si398029ejb.398.2021.05.05.14.07.16; Wed, 05 May 2021 14:07:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BRvbfE4N; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233590AbhEEUHh (ORCPT + 99 others); Wed, 5 May 2021 16:07:37 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:45576 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233322AbhEEUHg (ORCPT ); Wed, 5 May 2021 16:07:36 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1620245199; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc; bh=JUmGq2wlaOeze3MXhpw6Yda3Y4YIjTKC44x0gDwnWr8=; b=BRvbfE4NJaCRq2Eg2RlHMirR/wrFOsXAaMnUXM/Zm8lO1RzgiahgTDTGcHiC1c3YEk9yuJ i044HOpMLuV+fz78uH9lrz9To1Vwubq7XH48HPQiTt3S7qpbsbeHeg6DJmzZrM7GZJ3ash 6nR2VYN0kpV0DIU9DpuO5S9gjnCMwSs= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-147-JAddeHJ5On-KhEpdV2Purg-1; Wed, 05 May 2021 16:06:37 -0400 X-MC-Unique: JAddeHJ5On-KhEpdV2Purg-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3A6DB107ACED; Wed, 5 May 2021 20:06:35 +0000 (UTC) Received: from llong.com (ovpn-117-4.rdu2.redhat.com [10.10.117.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id D7C5710016FD; Wed, 5 May 2021 20:06:29 +0000 (UTC) From: Waiman Long To: Johannes Weiner , Michal Hocko , Vladimir Davydov , Andrew Morton , Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Vlastimil Babka , Roman Gushchin , Shakeel Butt Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org, Waiman Long Subject: [PATCH v4 0/3] mm: memcg/slab: Fix objcg pointer array handling problem Date: Wed, 5 May 2021 16:06:07 -0400 Message-Id: <20210505200610.13943-1-longman@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org v4: - Incorporate suggestions from others like setting SLAB_ACCOUNT for KMALLOC_CGROUP caches into patch 2. - Add a new patch 3 to disable caches merging for KMALLOC_NORMAL caches as suggested by Roman. v3: - Update patch 2 commit log and rework kmalloc_type() to make it easier to read. v2: - Take suggestion from Vlastimil to use a new set of kmalloc-cg-* to handle the objcg pointer array allocation and freeing problems. Since the merging of the new slab memory controller in v5.9, the page structure stores a pointer to objcg pointer array for slab pages. When the slab has no used objects, it can be freed in free_slab() which will call kfree() to free the objcg pointer array in memcg_alloc_page_obj_cgroups(). If it happens that the objcg pointer array is the last used object in its slab, that slab may then be freed which may caused kfree() to be called again. With the right workload, the slab cache may be set up in a way that allows the recursive kfree() calling loop to nest deep enough to cause a kernel stack overflow and panic the system. In fact, we have a reproducer that can cause kernel stack overflow on a s390 system involving kmalloc-rcl-256 and kmalloc-rcl-128 slabs with the following kfree() loop recursively called 74 times: [ 285.520739] [<000000000ec432fc>] kfree+0x4bc/0x560 [ 285.520740] [<000000000ec43466>] __free_slab+0xc6/0x228 [ 285.520741] [<000000000ec41fc2>] __slab_free+0x3c2/0x3e0 [ 285.520742] [<000000000ec432fc>] kfree+0x4bc/0x560 : While investigating this issue, I also found an issue on the allocation side. If the objcg pointer array happen to come from the same slab or a circular dependency linkage is formed with multiple slabs, those affected slabs can never be freed again. This patch series addresses these two issues by introducing a new set of kmalloc-cg- caches split from kmalloc- caches. The new set will only contain non-reclaimable and non-dma objects that are accounted in memory cgroups whereas the old set are now for unaccounted objects only. By making this split, all the objcg pointer arrays will come from the kmalloc- caches, but those caches will never hold any objcg pointer array. As a result, deeply nested kfree() call and the unfreeable slab problems are now gone. Waiman Long (3): mm: memcg/slab: Properly set up gfp flags for objcg pointer array mm: memcg/slab: Create a new set of kmalloc-cg- caches mm: memcg/slab: Disable cache merging for KMALLOC_NORMAL caches include/linux/slab.h | 41 ++++++++++++++++++++++++++++++++--------- mm/memcontrol.c | 8 ++++++++ mm/slab.h | 1 - mm/slab_common.c | 32 ++++++++++++++++++++++++-------- 4 files changed, 64 insertions(+), 18 deletions(-) -- 2.18.1