Date: Wed, 14 Jun 2017 16:18:57 +0200
From: Martin Schwidefsky
To: "Kirill A. Shutemov"
Cc: Andrew Morton, Vlastimil Babka, Vineet Gupta, Russell King, Will Deacon,
 Catalin Marinas, Ralf Baechle, "David S. Miller", Heiko Carstens,
 "Aneesh Kumar K.V", Andrea Arcangeli, linux-arch@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] mm, thp: Do not loose dirty bit in __split_huge_pmd_locked()
In-Reply-To: <20170614135143.25068-4-kirill.shutemov@linux.intel.com>
References: <20170614135143.25068-1-kirill.shutemov@linux.intel.com>
 <20170614135143.25068-4-kirill.shutemov@linux.intel.com>
Message-Id: <20170614161857.69d54338@mschwideX1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 14 Jun 2017 16:51:43 +0300
"Kirill A. Shutemov" wrote:

> Until pmdp_invalidate() the pmd entry is present and the CPU can update
> it, setting the dirty bit. Currently we transfer the dirty bit to the
> page too early, so there is a window in which we can miss the dirty bit.
>
> Let's call SetPageDirty() after pmdp_invalidate().
>
> Signed-off-by: Kirill A. Shutemov
> ...
> @@ -2046,6 +2043,14 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  	 * pmd_populate.
>  	 */
>  	pmdp_invalidate(vma, haddr, pmd);
> +
> +	/*
> +	 * Transfer dirty bit to page after pmd invalidated, so CPU would not
> +	 * be able to set it under us.
> +	 */
> +	if (pmd_dirty(*pmd))
> +		SetPageDirty(page);
> +
>  	pmd_populate(mm, pmd, pgtable);
> 
>  	if (freeze) {

That won't work on s390. After pmdp_invalidate() the pmd entry is gone;
it has been replaced with _SEGMENT_ENTRY_EMPTY, which wipes the dirty
and referenced bits as well. The old scheme is:

	entry = *pmd;
	pmdp_invalidate(vma, addr, pmd);
	if (pmd_dirty(entry))
		...

Could we change pmdp_invalidate() to make it return the old pmd entry?
The pmdp_xchg_direct() function already returns it, so for s390 that
would be an easy change. The above code snippet would then become:

	entry = pmdp_invalidate(vma, addr, pmd);
	if (pmd_dirty(entry))
		...

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
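
A minimal sketch, for illustration only, of what the suggested change could
look like on s390, assuming pmdp_invalidate() is a thin wrapper around
pmdp_xchg_direct() as described above; the wrapper below and the old_pmd
variable in the caller are hypothetical, not taken from an actual patch:

	/*
	 * Hypothetical s390 variant: return the old pmd entry by forwarding
	 * the return value of pmdp_xchg_direct(), which already hands back
	 * the previous entry while installing _SEGMENT_ENTRY_EMPTY.
	 */
	static inline pmd_t pmdp_invalidate(struct vm_area_struct *vma,
					    unsigned long addr, pmd_t *pmdp)
	{
		return pmdp_xchg_direct(vma->vm_mm, addr, pmdp,
					__pmd(_SEGMENT_ENTRY_EMPTY));
	}

	/*
	 * The caller in __split_huge_pmd_locked() would then test the dirty
	 * bit of the returned entry instead of the already invalidated pmd:
	 */
	old_pmd = pmdp_invalidate(vma, haddr, pmd);
	if (pmd_dirty(old_pmd))
		SetPageDirty(page);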