This is v2 of [1] with changed subject:
"[PATCH v1 0/5] s390: page_mapcount(), page_has_private() and PG_arch_1"
Rebased on s390x/features which contains the page_mapcount() and
page_has_private() cleanups, and some PG_arch_1 cleanups from Willy. To
compensate, I added some more cleanups ;)
One "easy" fix upfront. Another issue I spotted is documented in [1].
Once this hits upstream, we can remove HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
from core-mm and s390x, so only the folio variant will remain.
Compile tested, but not runtime tested with UV. I'd appreciate some
testing help from people with UV access and experience.
[1] https://lkml.kernel.org/r/[email protected]
v1 -> v2:
* Rebased on s390x/features:
* "s390/hugetlb: convert PG_arch_1 code to work on folio->flags"
-> pmd_folio() not available on s390x/features
* "s390/uv: don't call folio_wait_writeback() without a folio reference"
-> Willy's folio conversion is in s390x/features
* "s390/uv: convert PG_arch_1 users to only work on small folios"
-> Add comments
* Rearrange code and handle split_folio() return values properly. New
patches to handle splitting:
-> "s390/uv: gmap_make_secure() cleanups for further changes"
-> "s390/uv: split large folios in gmap_make_secure()"
* Added more cleanups:
-> "s390/uv: make uv_convert_from_secure() a static function"
-> "s390/uv: convert uv_destroy_owned_page() to uv_destroy_(folio|pte)()"
-> "s390/uv: convert uv_convert_owned_from_secure() to
uv_convert_from_secure_(folio|pte)()"
-> "s390/mm: implement HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE"
Cc: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Janosch Frank <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Thomas Huth <[email protected]>
David Hildenbrand (10):
s390/uv: don't call folio_wait_writeback() without a folio reference
s390/uv: gmap_make_secure() cleanups for further changes
s390/uv: split large folios in gmap_make_secure()
s390/uv: convert PG_arch_1 users to only work on small folios
s390/uv: update PG_arch_1 comment
s390/uv: make uv_convert_from_secure() a static function
s390/uv: convert uv_destroy_owned_page() to uv_destroy_(folio|pte)()
s390/uv: convert uv_convert_owned_from_secure() to
uv_convert_from_secure_(folio|pte)()
s390/uv: implement HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE
s390/hugetlb: convert PG_arch_1 code to work on folio->flags
arch/s390/include/asm/page.h | 5 +
arch/s390/include/asm/pgtable.h | 8 +-
arch/s390/include/asm/uv.h | 12 +-
arch/s390/kernel/uv.c | 207 +++++++++++++++++++++-----------
arch/s390/mm/fault.c | 14 ++-
arch/s390/mm/gmap.c | 10 +-
arch/s390/mm/hugetlbpage.c | 8 +-
7 files changed, 172 insertions(+), 92 deletions(-)
--
2.44.0
Let's factor out handling of LRU cache draining and convert the if-else
chain to a switch-case.
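To make the two-invocation contract of the new helper concrete, here is
a minimal caller sketch (hypothetical; try_make_secure() is a stand-in,
not a function from this patch):

	bool drained = false;
	int rc;

	do {
		/* hypothetical UV operation that can fail with -EBUSY */
		rc = try_make_secure();
	} while (rc == -EBUSY && drain_lru(&drained));

The first -EBUSY drains the local LRU caches and retries immediately; a
second -EBUSY drains the caches of all CPUs and stops retrying, so the
caller backs off (e.g., returns -EAGAIN, as gmap_make_secure() does).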
Signed-off-by: David Hildenbrand <[email protected]>
---
arch/s390/kernel/uv.c | 66 ++++++++++++++++++++++++++-----------------
1 file changed, 40 insertions(+), 26 deletions(-)
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 016993e9eb72..25fe28d189df 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -266,6 +266,36 @@ static bool should_export_before_import(struct uv_cb_header *uvcb, struct mm_str
return atomic_read(&mm->context.protected_count) > 1;
}
+/*
+ * Drain LRU caches: the local one on first invocation and the ones of all
+ * CPUs on successive invocations. Returns "true" on the first invocation.
+ */
+static bool drain_lru(bool *drain_lru_called)
+{
+ /*
+ * If we have tried a local drain and the folio refcount
+ * still does not match our expected safe value, try with a
+ * system wide drain. This is needed if the pagevecs holding
+ * the page are on a different CPU.
+ */
+ if (*drain_lru_called) {
+ lru_add_drain_all();
+ /* We give up here, don't retry immediately. */
+ return false;
+ }
+ /*
+ * We are here if the folio refcount does not match the
+ * expected safe value. The main culprits are usually
+ * pagevecs. With lru_add_drain() we drain the pagevecs
+ * on the local CPU so that hopefully the refcount will
+ * reach the expected safe value.
+ */
+ lru_add_drain();
+ *drain_lru_called = true;
+ /* The caller should try again immediately */
+ return true;
+}
+
/*
* Requests the Ultravisor to make a page accessible to a guest.
* If it's brought in the first time, it will be cleared. If
@@ -275,7 +305,7 @@ static bool should_export_before_import(struct uv_cb_header *uvcb, struct mm_str
int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
{
struct vm_area_struct *vma;
- bool local_drain = false;
+ bool drain_lru_called = false;
spinlock_t *ptelock;
unsigned long uaddr;
struct folio *folio;
@@ -331,37 +361,21 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
out:
mmap_read_unlock(gmap->mm);
- if (rc == -EAGAIN) {
+ switch (rc) {
+ case -EAGAIN:
/*
* If we are here because the UVC returned busy or partial
* completion, this is just a useless check, but it is safe.
*/
folio_wait_writeback(folio);
folio_put(folio);
- } else if (rc == -EBUSY) {
- /*
- * If we have tried a local drain and the folio refcount
- * still does not match our expected safe value, try with a
- * system wide drain. This is needed if the pagevecs holding
- * the page are on a different CPU.
- */
- if (local_drain) {
- lru_add_drain_all();
- /* We give up here, and let the caller try again */
- return -EAGAIN;
- }
- /*
- * We are here if the folio refcount does not match the
- * expected safe value. The main culprits are usually
- * pagevecs. With lru_add_drain() we drain the pagevecs
- * on the local CPU so that hopefully the refcount will
- * reach the expected safe value.
- */
- lru_add_drain();
- local_drain = true;
- /* And now we try again immediately after draining */
- goto again;
- } else if (rc == -ENXIO) {
+ return -EAGAIN;
+ case -EBUSY:
+ /* Additional folio references. */
+ if (drain_lru(&drain_lru_called))
+ goto again;
+ return -EAGAIN;
+ case -ENXIO:
if (gmap_fault(gmap, gaddr, FAULT_FLAG_WRITE))
return -EFAULT;
return -EAGAIN;
--
2.44.0
It's not used outside of uv.c, so let's make it a static function.
Signed-off-by: David Hildenbrand <[email protected]>
---
arch/s390/include/asm/uv.h | 6 ------
arch/s390/kernel/uv.c | 2 +-
2 files changed, 1 insertion(+), 7 deletions(-)
diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index 0e7bd3873907..d2205ff97007 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -484,7 +484,6 @@ int uv_pin_shared(unsigned long paddr);
int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);
int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);
int uv_destroy_owned_page(unsigned long paddr);
-int uv_convert_from_secure(unsigned long paddr);
int uv_convert_owned_from_secure(unsigned long paddr);
int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);
@@ -503,11 +502,6 @@ static inline int uv_destroy_owned_page(unsigned long paddr)
return 0;
}
-static inline int uv_convert_from_secure(unsigned long paddr)
-{
- return 0;
-}
-
static inline int uv_convert_owned_from_secure(unsigned long paddr)
{
return 0;
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index ecfc08902215..3d3250b406a6 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -156,7 +156,7 @@ int uv_destroy_owned_page(unsigned long paddr)
*
* @paddr: Absolute host address of page to be exported
*/
-int uv_convert_from_secure(unsigned long paddr)
+static int uv_convert_from_secure(unsigned long paddr)
{
struct uv_cb_cfs uvcb = {
.header.cmd = UVC_CMD_CONV_FROM_SEC_STOR,
--
2.44.0
Now that make_folio_secure() may only set PG_arch_1 for small folios,
let's convert relevant remaining UV code to only work on (small) folios
and simply reject large folios early. This way, we'll never end up
touching PG_arch_1 on tail pages of a large folio in UV code.
The folio_get()/folio_put() for functions that are documented to already
hold a folio reference look weird; likely they are required to make
concurrent gmap_make_secure() back off because the caller might only hold
an implicit reference due to the page mapping. So leave that alone for now.
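A rough sketch of that suspected interplay (simplified; it assumes the
refcount check that make_folio_secure() performs against its expected
safe value):

	folio_get(folio);	/* temporary extra reference ... */
	rc = uv_convert_from_secure(folio_to_phys(folio));
	/*
	 * ... which makes a concurrent gmap_make_secure() see an
	 * unexpected folio refcount and back off with -EBUSY instead
	 * of racing with the conversion.
	 */
	folio_put(folio);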
Signed-off-by: David Hildenbrand <[email protected]>
---
arch/s390/include/asm/page.h | 2 ++
arch/s390/kernel/uv.c | 41 ++++++++++++++++++++++--------------
2 files changed, 27 insertions(+), 16 deletions(-)
diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
index 9381879f7ecf..b64384872c0f 100644
--- a/arch/s390/include/asm/page.h
+++ b/arch/s390/include/asm/page.h
@@ -214,7 +214,9 @@ static inline unsigned long __phys_addr(unsigned long x, bool is_31bit)
#define pfn_to_phys(pfn) ((pfn) << PAGE_SHIFT)
#define phys_to_page(phys) pfn_to_page(phys_to_pfn(phys))
+#define phys_to_folio(phys) page_folio(phys_to_page(phys))
#define page_to_phys(page) pfn_to_phys(page_to_pfn(page))
+#define folio_to_phys(folio)	pfn_to_phys(folio_pfn(folio))
static inline void *pfn_to_virt(unsigned long pfn)
{
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 3c6d86e3e828..914dcec27329 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -135,14 +135,18 @@ static int uv_destroy_page(unsigned long paddr)
*/
int uv_destroy_owned_page(unsigned long paddr)
{
- struct page *page = phys_to_page(paddr);
+ struct folio *folio = phys_to_folio(paddr);
int rc;
- get_page(page);
+ /* See gmap_make_secure(): large folios cannot be secure */
+ if (unlikely(folio_test_large(folio)))
+ return 0;
+
+ folio_get(folio);
rc = uv_destroy_page(paddr);
if (!rc)
- clear_bit(PG_arch_1, &page->flags);
- put_page(page);
+ clear_bit(PG_arch_1, &folio->flags);
+ folio_put(folio);
return rc;
}
@@ -170,14 +174,18 @@ int uv_convert_from_secure(unsigned long paddr)
*/
int uv_convert_owned_from_secure(unsigned long paddr)
{
- struct page *page = phys_to_page(paddr);
+ struct folio *folio = phys_to_folio(paddr);
int rc;
- get_page(page);
+ /* See gmap_make_secure(): large folios cannot be secure */
+ if (unlikely(folio_test_large(folio)))
+ return 0;
+
+ folio_get(folio);
rc = uv_convert_from_secure(paddr);
if (!rc)
- clear_bit(PG_arch_1, &page->flags);
- put_page(page);
+ clear_bit(PG_arch_1, &folio->flags);
+ folio_put(folio);
return rc;
}
@@ -479,33 +487,34 @@ EXPORT_SYMBOL_GPL(gmap_destroy_page);
*/
int arch_make_page_accessible(struct page *page)
{
+ struct folio *folio = page_folio(page);
int rc = 0;
- /* Hugepage cannot be protected, so nothing to do */
- if (PageHuge(page))
+ /* See gmap_make_secure(): large folios cannot be secure */
+ if (unlikely(folio_test_large(folio)))
return 0;
/*
* PG_arch_1 is used in 3 places:
* 1. for kernel page tables during early boot
* 2. for storage keys of huge pages and KVM
- * 3. As an indication that this page might be secure. This can
+ * 3. As an indication that this small folio might be secure. This can
* overindicate, e.g. we set the bit before calling
* convert_to_secure.
* As secure pages are never huge, all 3 variants can co-exists.
*/
- if (!test_bit(PG_arch_1, &page->flags))
+ if (!test_bit(PG_arch_1, &folio->flags))
return 0;
- rc = uv_pin_shared(page_to_phys(page));
+ rc = uv_pin_shared(folio_to_phys(folio));
if (!rc) {
- clear_bit(PG_arch_1, &page->flags);
+ clear_bit(PG_arch_1, &folio->flags);
return 0;
}
- rc = uv_convert_from_secure(page_to_phys(page));
+ rc = uv_convert_from_secure(folio_to_phys(folio));
if (!rc) {
- clear_bit(PG_arch_1, &page->flags);
+ clear_bit(PG_arch_1, &folio->flags);
return 0;
}
--
2.44.0
We removed the usage of PG_arch_1 for page tables in commit
a51324c430db ("s390/cmma: rework no-dat handling").
Let's update the comment in UV to reflect that.
Reviewed-by: Claudio Imbrenda <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
arch/s390/kernel/uv.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 914dcec27329..ecfc08902215 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -495,13 +495,12 @@ int arch_make_page_accessible(struct page *page)
return 0;
/*
- * PG_arch_1 is used in 3 places:
- * 1. for kernel page tables during early boot
- * 2. for storage keys of huge pages and KVM
- * 3. As an indication that this small folio might be secure. This can
+ * PG_arch_1 is used in 2 places:
+ * 1. for storage keys of hugetlb folios and KVM
+ * 2. As an indication that this small folio might be secure. This can
* overindicate, e.g. we set the bit before calling
* convert_to_secure.
- * As secure pages are never huge, all 3 variants can co-exists.
+ * As secure pages are never large folios, both variants can co-exist.
*/
if (!test_bit(PG_arch_1, &folio->flags))
return 0;
--
2.44.0
Let's have the following variants for destroying pages:
(1) uv_destroy(): Like uv_pin_shared() and uv_convert_from_secure(),
"low level" helper that operates on paddr and doesn't mess with folios.
(2) uv_destroy_folio(): Consumes a folio to which we hold a reference.
(3) uv_destroy_pte(): Consumes a PTE that holds a reference through the
mapping.
Unfortunately we need uv_destroy_pte(), because pfn_folio() and
friends are not available in pgtable.h.
Signed-off-by: David Hildenbrand <[email protected]>
---
arch/s390/include/asm/pgtable.h | 2 +-
arch/s390/include/asm/uv.h | 10 ++++++++--
arch/s390/kernel/uv.c | 24 +++++++++++++++++-------
arch/s390/mm/gmap.c | 6 ++++--
4 files changed, 30 insertions(+), 12 deletions(-)
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 60950e7a25f5..97e040617c29 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1199,7 +1199,7 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
* The notifier should have destroyed all protected vCPUs at this
* point, so the destroy should be successful.
*/
- if (full && !uv_destroy_owned_page(pte_val(res) & PAGE_MASK))
+ if (full && !uv_destroy_pte(res))
return res;
/*
* If something went wrong and the page could not be destroyed, or
diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index d2205ff97007..a1bef30066ef 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -483,7 +483,8 @@ static inline int is_prot_virt_host(void)
int uv_pin_shared(unsigned long paddr);
int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);
int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);
-int uv_destroy_owned_page(unsigned long paddr);
+int uv_destroy_folio(struct folio *folio);
+int uv_destroy_pte(pte_t pte);
int uv_convert_owned_from_secure(unsigned long paddr);
int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);
@@ -497,7 +498,12 @@ static inline int uv_pin_shared(unsigned long paddr)
return 0;
}
-static inline int uv_destroy_owned_page(unsigned long paddr)
+static inline int uv_destroy_folio(struct folio *folio)
+{
+ return 0;
+}
+
+static inline int uv_destroy_pte(pte_t pte)
{
return 0;
}
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 3d3250b406a6..61c1ce51c883 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -110,7 +110,7 @@ EXPORT_SYMBOL_GPL(uv_pin_shared);
*
* @paddr: Absolute host address of page to be destroyed
*/
-static int uv_destroy_page(unsigned long paddr)
+static int uv_destroy(unsigned long paddr)
{
struct uv_cb_cfs uvcb = {
.header.cmd = UVC_CMD_DESTR_SEC_STOR,
@@ -131,11 +131,10 @@ static int uv_destroy_page(unsigned long paddr)
}
/*
- * The caller must already hold a reference to the page
+ * The caller must already hold a reference to the folio
*/
-int uv_destroy_owned_page(unsigned long paddr)
+int uv_destroy_folio(struct folio *folio)
{
- struct folio *folio = phys_to_folio(paddr);
int rc;
/* See gmap_make_secure(): large folios cannot be secure */
@@ -143,13 +142,22 @@ int uv_destroy_owned_page(unsigned long paddr)
return 0;
folio_get(folio);
- rc = uv_destroy_page(paddr);
+ rc = uv_destroy(folio_to_phys(folio));
if (!rc)
clear_bit(PG_arch_1, &folio->flags);
folio_put(folio);
return rc;
}
+/*
+ * The present PTE still indirectly holds a folio reference through the mapping.
+ */
+int uv_destroy_pte(pte_t pte)
+{
+ VM_WARN_ON(!pte_present(pte));
+ return uv_destroy_folio(pfn_folio(pte_pfn(pte)));
+}
+
/*
* Requests the Ultravisor to encrypt a guest page and make it
* accessible to the host for paging (export).
@@ -437,6 +445,7 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
{
struct vm_area_struct *vma;
unsigned long uaddr;
+ struct folio *folio;
struct page *page;
int rc;
@@ -460,7 +469,8 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
page = follow_page(vma, uaddr, FOLL_WRITE | FOLL_GET);
if (IS_ERR_OR_NULL(page))
goto out;
- rc = uv_destroy_owned_page(page_to_phys(page));
+ folio = page_folio(page);
+ rc = uv_destroy_folio(folio);
/*
* Fault handlers can race; it is possible that two CPUs will fault
* on the same secure page. One CPU can destroy the page, reboot,
@@ -472,7 +482,7 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
*/
if (rc)
rc = uv_convert_owned_from_secure(page_to_phys(page));
- put_page(page);
+ folio_put(folio);
out:
mmap_read_unlock(gmap->mm);
return rc;
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 094b43b121cd..0351cb139df4 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2756,13 +2756,15 @@ static const struct mm_walk_ops gather_pages_ops = {
*/
void s390_uv_destroy_pfns(unsigned long count, unsigned long *pfns)
{
+ struct folio *folio;
unsigned long i;
for (i = 0; i < count; i++) {
+ folio = pfn_folio(pfns[i]);
/* we always have an extra reference */
- uv_destroy_owned_page(pfn_to_phys(pfns[i]));
+ uv_destroy_folio(folio);
/* get rid of the extra reference */
- put_page(pfn_to_page(pfns[i]));
+ folio_put(folio);
cond_resched();
}
}
--
2.44.0
Let's do the same as we did for uv_destroy_(folio|pte)() and
have the following variants:
(1) uv_convert_from_secure(): "low level" helper that operates on paddr
and does not mess with folios.
(2) uv_convert_from_secure_folio(): Consumes a folio to which we hold a
reference.
(3) uv_convert_from_secure_pte(): Consumes a PTE that holds a reference
through the mapping.
Unfortunately we need uv_convert_from_secure_pte(), because pfn_folio()
and friends are not available in pgtable.h.
Signed-off-by: David Hildenbrand <[email protected]>
---
arch/s390/include/asm/pgtable.h | 6 +++---
arch/s390/include/asm/uv.h | 4 ++--
arch/s390/kernel/uv.c | 18 +++++++++++++-----
3 files changed, 18 insertions(+), 10 deletions(-)
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 97e040617c29..5ffc4828c25a 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1149,7 +1149,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
res = ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID));
/* At this point the reference through the mapping is still present */
if (mm_is_protected(mm) && pte_present(res))
- uv_convert_owned_from_secure(pte_val(res) & PAGE_MASK);
+ uv_convert_from_secure_pte(res);
return res;
}
@@ -1167,7 +1167,7 @@ static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
res = ptep_xchg_direct(vma->vm_mm, addr, ptep, __pte(_PAGE_INVALID));
/* At this point the reference through the mapping is still present */
if (mm_is_protected(vma->vm_mm) && pte_present(res))
- uv_convert_owned_from_secure(pte_val(res) & PAGE_MASK);
+ uv_convert_from_secure_pte(res);
return res;
}
@@ -1206,7 +1206,7 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
* if this is not a mm teardown, the slower export is used as
* fallback instead.
*/
- uv_convert_owned_from_secure(pte_val(res) & PAGE_MASK);
+ uv_convert_from_secure_pte(res);
return res;
}
diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index a1bef30066ef..0679445cac0b 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -485,7 +485,7 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);
int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);
int uv_destroy_folio(struct folio *folio);
int uv_destroy_pte(pte_t pte);
-int uv_convert_owned_from_secure(unsigned long paddr);
+int uv_convert_from_secure_pte(pte_t pte);
int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);
void setup_uv(void);
@@ -508,7 +508,7 @@ static inline int uv_destroy_pte(pte_t pte)
return 0;
}
-static inline int uv_convert_owned_from_secure(unsigned long paddr)
+static inline int uv_convert_from_secure_pte(pte_t pte)
{
return 0;
}
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 61c1ce51c883..b456066d72da 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -178,11 +178,10 @@ static int uv_convert_from_secure(unsigned long paddr)
}
/*
- * The caller must already hold a reference to the page
+ * The caller must already hold a reference to the folio.
*/
-int uv_convert_owned_from_secure(unsigned long paddr)
+static int uv_convert_from_secure_folio(struct folio *folio)
{
- struct folio *folio = phys_to_folio(paddr);
int rc;
/* See gmap_make_secure(): large folios cannot be secure */
@@ -190,13 +189,22 @@ int uv_convert_owned_from_secure(unsigned long paddr)
return 0;
folio_get(folio);
- rc = uv_convert_from_secure(paddr);
+ rc = uv_convert_from_secure(folio_to_phys(folio));
if (!rc)
clear_bit(PG_arch_1, &folio->flags);
folio_put(folio);
return rc;
}
+/*
+ * The present PTE still indirectly holds a folio reference through the mapping.
+ */
+int uv_convert_from_secure_pte(pte_t pte)
+{
+ VM_WARN_ON(!pte_present(pte));
+ return uv_convert_from_secure_folio(pfn_folio(pte_pfn(pte)));
+}
+
/*
* Calculate the expected ref_count for a folio that would otherwise have no
* further pins. This was cribbed from similar functions in other places in
@@ -481,7 +489,7 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
* we instead try to export the page.
*/
if (rc)
- rc = uv_convert_owned_from_secure(page_to_phys(page));
+ rc = uv_convert_from_secure_folio(folio);
folio_put(folio);
out:
mmap_read_unlock(gmap->mm);
--
2.44.0
Let's make it clearer that we are always working on folio flags and
never page flags of tail pages.
Signed-off-by: David Hildenbrand <[email protected]>
---
arch/s390/mm/gmap.c | 4 ++--
arch/s390/mm/hugetlbpage.c | 8 ++++----
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 0351cb139df4..9eea05cd93b7 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2648,7 +2648,7 @@ static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr,
{
pmd_t *pmd = (pmd_t *)pte;
unsigned long start, end;
- struct page *page = pmd_page(*pmd);
+ struct folio *folio = page_folio(pmd_page(*pmd));
/*
* The write check makes sure we do not set a key on shared
@@ -2663,7 +2663,7 @@ static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr,
start = pmd_val(*pmd) & HPAGE_MASK;
end = start + HPAGE_SIZE - 1;
__storage_key_init_range(start, end);
- set_bit(PG_arch_1, &page->flags);
+ set_bit(PG_arch_1, &folio->flags);
cond_resched();
return 0;
}
diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index c2e8242bd15d..a32047315f9a 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -121,7 +121,7 @@ static inline pte_t __rste_to_pte(unsigned long rste)
static void clear_huge_pte_skeys(struct mm_struct *mm, unsigned long rste)
{
- struct page *page;
+ struct folio *folio;
unsigned long size, paddr;
if (!mm_uses_skeys(mm) ||
@@ -129,16 +129,16 @@ static void clear_huge_pte_skeys(struct mm_struct *mm, unsigned long rste)
return;
if ((rste & _REGION_ENTRY_TYPE_MASK) == _REGION_ENTRY_TYPE_R3) {
- page = pud_page(__pud(rste));
+ folio = page_folio(pud_page(__pud(rste)));
size = PUD_SIZE;
paddr = rste & PUD_MASK;
} else {
- page = pmd_page(__pmd(rste));
+ folio = page_folio(pmd_page(__pmd(rste)));
size = PMD_SIZE;
paddr = rste & PMD_MASK;
}
- if (!test_and_set_bit(PG_arch_1, &page->flags))
+ if (!test_and_set_bit(PG_arch_1, &folio->flags))
__storage_key_init_range(paddr, paddr + size - 1);
}
--
2.44.0
Let's also implement HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE, so we can convert
arch_make_page_accessible() to be a simple wrapper around
arch_make_folio_accessible(). Unfortunately, we cannot do that in the
header.
There are only two arch_make_page_accessible() calls remaining in gup.c.
We can now drop HAVE_ARCH_MAKE_PAGE_ACCESSIBLE completely from core-MM.
We'll handle that separately, once the s390x part has landed.
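As a sketch of that follow-up (an assumption about future core-MM code,
not part of this series), only the folio fallback would remain:

	/* include/linux/mm.h, hypothetical state after the removal */
	#ifndef HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE
	static inline int arch_make_folio_accessible(struct folio *folio)
	{
		return 0;
	}
	#endif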
Suggested-by: Matthew Wilcox <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
arch/s390/include/asm/page.h | 3 +++
arch/s390/kernel/uv.c | 18 +++++++++++-------
arch/s390/mm/fault.c | 14 ++++++++------
3 files changed, 22 insertions(+), 13 deletions(-)
diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
index b64384872c0f..03bbc782e286 100644
--- a/arch/s390/include/asm/page.h
+++ b/arch/s390/include/asm/page.h
@@ -162,6 +162,7 @@ static inline int page_reset_referenced(unsigned long addr)
#define _PAGE_ACC_BITS 0xf0 /* HW access control bits */
struct page;
+struct folio;
void arch_free_page(struct page *page, int order);
void arch_alloc_page(struct page *page, int order);
@@ -174,6 +175,8 @@ static inline int devmem_is_allowed(unsigned long pfn)
#define HAVE_ARCH_ALLOC_PAGE
#if IS_ENABLED(CONFIG_PGSTE)
+int arch_make_folio_accessible(struct folio *folio);
+#define HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE
int arch_make_page_accessible(struct page *page);
#define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
#endif
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index b456066d72da..fa62fa0e369f 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -498,14 +498,13 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
EXPORT_SYMBOL_GPL(gmap_destroy_page);
/*
- * To be called with the page locked or with an extra reference! This will
- * prevent gmap_make_secure from touching the page concurrently. Having 2
- * parallel make_page_accessible is fine, as the UV calls will become a
- * no-op if the page is already exported.
+ * To be called with the folio locked or with an extra reference! This will
+ * prevent gmap_make_secure from touching the folio concurrently. Having 2
+ * parallel arch_make_folio_accessible is fine, as the UV calls will become a
+ * no-op if the folio is already exported.
*/
-int arch_make_page_accessible(struct page *page)
+int arch_make_folio_accessible(struct folio *folio)
{
- struct folio *folio = page_folio(page);
int rc = 0;
/* See gmap_make_secure(): large folios cannot be secure */
@@ -537,8 +536,13 @@ int arch_make_page_accessible(struct page *page)
return rc;
}
-EXPORT_SYMBOL_GPL(arch_make_page_accessible);
+EXPORT_SYMBOL_GPL(arch_make_folio_accessible);
+int arch_make_page_accessible(struct page *page)
+{
+ return arch_make_folio_accessible(page_folio(page));
+}
+EXPORT_SYMBOL_GPL(arch_make_page_accessible);
#endif
#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) || IS_ENABLED(CONFIG_KVM)
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index c421dd44ffbe..a1ba58460593 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -491,6 +491,7 @@ void do_secure_storage_access(struct pt_regs *regs)
unsigned long addr = get_fault_address(regs);
struct vm_area_struct *vma;
struct mm_struct *mm;
+ struct folio *folio;
struct page *page;
struct gmap *gmap;
int rc;
@@ -538,17 +539,18 @@ void do_secure_storage_access(struct pt_regs *regs)
mmap_read_unlock(mm);
break;
}
- if (arch_make_page_accessible(page))
+ folio = page_folio(page);
+ if (arch_make_folio_accessible(folio))
send_sig(SIGSEGV, current, 0);
- put_page(page);
+ folio_put(folio);
mmap_read_unlock(mm);
break;
case KERNEL_FAULT:
- page = phys_to_page(addr);
- if (unlikely(!try_get_page(page)))
+ folio = phys_to_folio(addr);
+ if (unlikely(!folio_try_get(folio)))
break;
- rc = arch_make_page_accessible(page);
- put_page(page);
+ rc = arch_make_folio_accessible(folio);
+ folio_put(folio);
if (rc)
BUG();
break;
--
2.44.0
On 12.04.24 16:21, David Hildenbrand wrote:
> This is v2 of [1] with changed subject:
> "[PATCH v1 0/5] s390: page_mapcount(), page_has_private() and PG_arch_1"
>
> Rebased on s390x/features which contains the page_mapcount() and
> page_has_private() cleanups, and some PG_arch_1 cleanups from Willy. To
> compensate, I added some more cleanups ;)
>
> One "easy" fix upfront. Another issue I spotted is documented in [1].
>
> Once this hits upstream, we can remove HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
> from core-mm and s390x, so only the folio variant will remain.
Ping.
--
Cheers,
David / dhildenb
On Tue, Apr 30, 2024 at 08:49:31PM +0200, David Hildenbrand wrote:
> On 12.04.24 16:21, David Hildenbrand wrote:
> > This is v2 of [1] with changed subject:
> > "[PATCH v1 0/5] s390: page_mapcount(), page_has_private() and PG_arch_1"
> >
> > Rebased on s390x/features which contains the page_mapcount() and
> > page_has_private() cleanups, and some PG_arch_1 cleanups from Willy. To
> > compensate, I added some more cleanups ;)
> >
> > One "easy" fix upfront. Another issue I spotted is documented in [1].
> >
> > Once this hits upstream, we can remove HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
> > from core-mm and s390x, so only the folio variant will remain.
>
> Ping.
Claudio, Janosch, this series requires your review.
On Fri, 12 Apr 2024 16:21:14 +0200
David Hildenbrand <[email protected]> wrote:
> Now that make_folio_secure() may only set PG_arch_1 for small folios,
> let's convert relevant remaining UV code to only work on (small) folios
> and simply reject large folios early. This way, we'll never end up
> touching PG_arch_1 on tail pages of a large folio in UV code.
>
> The folio_get()/folio_put() for functions that are documented to already
> hold a folio reference look weird; likely they are required to make
> concurrent gmap_make_secure() back off because the caller might only hold
> an implicit reference due to the page mapping. So leave that alone for now.
>
> Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
> ---
> arch/s390/include/asm/page.h | 2 ++
> arch/s390/kernel/uv.c | 41 ++++++++++++++++++++++--------------
> 2 files changed, 27 insertions(+), 16 deletions(-)
>
> diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
> index 9381879f7ecf..b64384872c0f 100644
> --- a/arch/s390/include/asm/page.h
> +++ b/arch/s390/include/asm/page.h
> @@ -214,7 +214,9 @@ static inline unsigned long __phys_addr(unsigned long x, bool is_31bit)
> #define pfn_to_phys(pfn) ((pfn) << PAGE_SHIFT)
>
> #define phys_to_page(phys) pfn_to_page(phys_to_pfn(phys))
> +#define phys_to_folio(phys) page_folio(phys_to_page(phys))
> #define page_to_phys(page) pfn_to_phys(page_to_pfn(page))
> +#define folio_to_phys(folio)	pfn_to_phys(folio_pfn(folio))
>
> static inline void *pfn_to_virt(unsigned long pfn)
> {
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index 3c6d86e3e828..914dcec27329 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -135,14 +135,18 @@ static int uv_destroy_page(unsigned long paddr)
> */
> int uv_destroy_owned_page(unsigned long paddr)
> {
> - struct page *page = phys_to_page(paddr);
> + struct folio *folio = phys_to_folio(paddr);
> int rc;
>
> - get_page(page);
> + /* See gmap_make_secure(): large folios cannot be secure */
> + if (unlikely(folio_test_large(folio)))
> + return 0;
> +
> + folio_get(folio);
> rc = uv_destroy_page(paddr);
> if (!rc)
> - clear_bit(PG_arch_1, &page->flags);
> - put_page(page);
> + clear_bit(PG_arch_1, &folio->flags);
> + folio_put(folio);
> return rc;
> }
>
> @@ -170,14 +174,18 @@ int uv_convert_from_secure(unsigned long paddr)
> */
> int uv_convert_owned_from_secure(unsigned long paddr)
> {
> - struct page *page = phys_to_page(paddr);
> + struct folio *folio = phys_to_folio(paddr);
> int rc;
>
> - get_page(page);
> + /* See gmap_make_secure(): large folios cannot be secure */
> + if (unlikely(folio_test_large(folio)))
> + return 0;
> +
> + folio_get(folio);
> rc = uv_convert_from_secure(paddr);
> if (!rc)
> - clear_bit(PG_arch_1, &page->flags);
> - put_page(page);
> + clear_bit(PG_arch_1, &folio->flags);
> + folio_put(folio);
> return rc;
> }
>
> @@ -479,33 +487,34 @@ EXPORT_SYMBOL_GPL(gmap_destroy_page);
> */
> int arch_make_page_accessible(struct page *page)
> {
> + struct folio *folio = page_folio(page);
> int rc = 0;
>
> - /* Hugepage cannot be protected, so nothing to do */
> - if (PageHuge(page))
> + /* See gmap_make_secure(): large folios cannot be secure */
> + if (unlikely(folio_test_large(folio)))
> return 0;
>
> /*
> * PG_arch_1 is used in 3 places:
> * 1. for kernel page tables during early boot
> * 2. for storage keys of huge pages and KVM
> - * 3. As an indication that this page might be secure. This can
> + * 3. As an indication that this small folio might be secure. This can
> * overindicate, e.g. we set the bit before calling
> * convert_to_secure.
> * As secure pages are never huge, all 3 variants can co-exists.
> */
> - if (!test_bit(PG_arch_1, &page->flags))
> + if (!test_bit(PG_arch_1, &folio->flags))
> return 0;
>
> - rc = uv_pin_shared(page_to_phys(page));
> + rc = uv_pin_shared(folio_to_phys(folio));
> if (!rc) {
> - clear_bit(PG_arch_1, &page->flags);
> + clear_bit(PG_arch_1, &folio->flags);
> return 0;
> }
>
> - rc = uv_convert_from_secure(page_to_phys(page));
> + rc = uv_convert_from_secure(folio_to_phys(folio));
> if (!rc) {
> - clear_bit(PG_arch_1, &page->flags);
> + clear_bit(PG_arch_1, &folio->flags);
> return 0;
> }
>
On Fri, 12 Apr 2024 16:21:17 +0200
David Hildenbrand <[email protected]> wrote:
> Let's have the following variants for destroying pages:
>
> (1) uv_destroy(): Like uv_pin_shared() and uv_convert_from_secure(),
> "low level" helper that operates on paddr and doesn't mess with folios.
>
> (2) uv_destroy_folio(): Consumes a folio to which we hold a reference.
>
> (3) uv_destroy_pte(): Consumes a PTE that holds a reference through the
> mapping.
>
> Unfortunately we need uv_destroy_pte(), because pfn_folio() and
> friends are not available in pgtable.h.
>
> Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
> ---
> arch/s390/include/asm/pgtable.h | 2 +-
> arch/s390/include/asm/uv.h | 10 ++++++++--
> arch/s390/kernel/uv.c | 24 +++++++++++++++++-------
> arch/s390/mm/gmap.c | 6 ++++--
> 4 files changed, 30 insertions(+), 12 deletions(-)
>
> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> index 60950e7a25f5..97e040617c29 100644
> --- a/arch/s390/include/asm/pgtable.h
> +++ b/arch/s390/include/asm/pgtable.h
> @@ -1199,7 +1199,7 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
> * The notifier should have destroyed all protected vCPUs at this
> * point, so the destroy should be successful.
> */
> - if (full && !uv_destroy_owned_page(pte_val(res) & PAGE_MASK))
> + if (full && !uv_destroy_pte(res))
> return res;
> /*
> * If something went wrong and the page could not be destroyed, or
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index d2205ff97007..a1bef30066ef 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -483,7 +483,8 @@ static inline int is_prot_virt_host(void)
> int uv_pin_shared(unsigned long paddr);
> int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);
> int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);
> -int uv_destroy_owned_page(unsigned long paddr);
> +int uv_destroy_folio(struct folio *folio);
> +int uv_destroy_pte(pte_t pte);
> int uv_convert_owned_from_secure(unsigned long paddr);
> int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);
>
> @@ -497,7 +498,12 @@ static inline int uv_pin_shared(unsigned long paddr)
> return 0;
> }
>
> -static inline int uv_destroy_owned_page(unsigned long paddr)
> +static inline int uv_destroy_folio(struct folio *folio)
> +{
> + return 0;
> +}
> +
> +static inline int uv_destroy_pte(pte_t pte)
> {
> return 0;
> }
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index 3d3250b406a6..61c1ce51c883 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -110,7 +110,7 @@ EXPORT_SYMBOL_GPL(uv_pin_shared);
> *
> * @paddr: Absolute host address of page to be destroyed
> */
> -static int uv_destroy_page(unsigned long paddr)
> +static int uv_destroy(unsigned long paddr)
> {
> struct uv_cb_cfs uvcb = {
> .header.cmd = UVC_CMD_DESTR_SEC_STOR,
> @@ -131,11 +131,10 @@ static int uv_destroy_page(unsigned long paddr)
> }
>
> /*
> - * The caller must already hold a reference to the page
> + * The caller must already hold a reference to the folio
> */
> -int uv_destroy_owned_page(unsigned long paddr)
> +int uv_destroy_folio(struct folio *folio)
> {
> - struct folio *folio = phys_to_folio(paddr);
> int rc;
>
> /* See gmap_make_secure(): large folios cannot be secure */
> @@ -143,13 +142,22 @@ int uv_destroy_owned_page(unsigned long paddr)
> return 0;
>
> folio_get(folio);
> - rc = uv_destroy_page(paddr);
> + rc = uv_destroy(folio_to_phys(folio));
> if (!rc)
> clear_bit(PG_arch_1, &folio->flags);
> folio_put(folio);
> return rc;
> }
>
> +/*
> + * The present PTE still indirectly holds a folio reference through the mapping.
> + */
> +int uv_destroy_pte(pte_t pte)
> +{
> + VM_WARN_ON(!pte_present(pte));
> + return uv_destroy_folio(pfn_folio(pte_pfn(pte)));
> +}
> +
> /*
> * Requests the Ultravisor to encrypt a guest page and make it
> * accessible to the host for paging (export).
> @@ -437,6 +445,7 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
> {
> struct vm_area_struct *vma;
> unsigned long uaddr;
> + struct folio *folio;
> struct page *page;
> int rc;
>
> @@ -460,7 +469,8 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
> page = follow_page(vma, uaddr, FOLL_WRITE | FOLL_GET);
> if (IS_ERR_OR_NULL(page))
> goto out;
> - rc = uv_destroy_owned_page(page_to_phys(page));
> + folio = page_folio(page);
> + rc = uv_destroy_folio(folio);
> /*
> * Fault handlers can race; it is possible that two CPUs will fault
> * on the same secure page. One CPU can destroy the page, reboot,
> @@ -472,7 +482,7 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
> */
> if (rc)
> rc = uv_convert_owned_from_secure(page_to_phys(page));
> - put_page(page);
> + folio_put(folio);
> out:
> mmap_read_unlock(gmap->mm);
> return rc;
> diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
> index 094b43b121cd..0351cb139df4 100644
> --- a/arch/s390/mm/gmap.c
> +++ b/arch/s390/mm/gmap.c
> @@ -2756,13 +2756,15 @@ static const struct mm_walk_ops gather_pages_ops = {
> */
> void s390_uv_destroy_pfns(unsigned long count, unsigned long *pfns)
> {
> + struct folio *folio;
> unsigned long i;
>
> for (i = 0; i < count; i++) {
> + folio = pfn_folio(pfns[i]);
> /* we always have an extra reference */
> - uv_destroy_owned_page(pfn_to_phys(pfns[i]));
> + uv_destroy_folio(folio);
> /* get rid of the extra reference */
> - put_page(pfn_to_page(pfns[i]));
> + folio_put(folio);
> cond_resched();
> }
> }
On Fri, 12 Apr 2024 16:21:16 +0200
David Hildenbrand <[email protected]> wrote:
> It's not used outside of uv.c, so let's make it a static function.
>
> Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
> ---
> arch/s390/include/asm/uv.h | 6 ------
> arch/s390/kernel/uv.c | 2 +-
> 2 files changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index 0e7bd3873907..d2205ff97007 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -484,7 +484,6 @@ int uv_pin_shared(unsigned long paddr);
> int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);
> int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);
> int uv_destroy_owned_page(unsigned long paddr);
> -int uv_convert_from_secure(unsigned long paddr);
> int uv_convert_owned_from_secure(unsigned long paddr);
> int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);
>
> @@ -503,11 +502,6 @@ static inline int uv_destroy_owned_page(unsigned long paddr)
> return 0;
> }
>
> -static inline int uv_convert_from_secure(unsigned long paddr)
> -{
> - return 0;
> -}
> -
> static inline int uv_convert_owned_from_secure(unsigned long paddr)
> {
> return 0;
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index ecfc08902215..3d3250b406a6 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -156,7 +156,7 @@ int uv_destroy_owned_page(unsigned long paddr)
> *
> * @paddr: Absolute host address of page to be exported
> */
> -int uv_convert_from_secure(unsigned long paddr)
> +static int uv_convert_from_secure(unsigned long paddr)
> {
> struct uv_cb_cfs uvcb = {
> .header.cmd = UVC_CMD_CONV_FROM_SEC_STOR,
On Fri, 12 Apr 2024 16:21:20 +0200
David Hildenbrand <[email protected]> wrote:
> Let's make it clearer that we are always working on folio flags and
> never page flags of tail pages.
please be a little more verbose, and explain what you are doing (i.e.
converting usages of page flags to folio flags), not just why.
>
> Signed-off-by: David Hildenbrand <[email protected]>
with a few extra words in the description:
Reviewed-by: Claudio Imbrenda <[email protected]>
> ---
> arch/s390/mm/gmap.c | 4 ++--
> arch/s390/mm/hugetlbpage.c | 8 ++++----
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
> index 0351cb139df4..9eea05cd93b7 100644
> --- a/arch/s390/mm/gmap.c
> +++ b/arch/s390/mm/gmap.c
> @@ -2648,7 +2648,7 @@ static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr,
> {
> pmd_t *pmd = (pmd_t *)pte;
> unsigned long start, end;
> - struct page *page = pmd_page(*pmd);
> + struct folio *folio = page_folio(pmd_page(*pmd));
>
> /*
> * The write check makes sure we do not set a key on shared
> @@ -2663,7 +2663,7 @@ static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr,
> start = pmd_val(*pmd) & HPAGE_MASK;
> end = start + HPAGE_SIZE - 1;
> __storage_key_init_range(start, end);
> - set_bit(PG_arch_1, &page->flags);
> + set_bit(PG_arch_1, &folio->flags);
> cond_resched();
> return 0;
> }
> diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
> index c2e8242bd15d..a32047315f9a 100644
> --- a/arch/s390/mm/hugetlbpage.c
> +++ b/arch/s390/mm/hugetlbpage.c
> @@ -121,7 +121,7 @@ static inline pte_t __rste_to_pte(unsigned long rste)
>
> static void clear_huge_pte_skeys(struct mm_struct *mm, unsigned long rste)
> {
> - struct page *page;
> + struct folio *folio;
> unsigned long size, paddr;
>
> if (!mm_uses_skeys(mm) ||
> @@ -129,16 +129,16 @@ static void clear_huge_pte_skeys(struct mm_struct *mm, unsigned long rste)
> return;
>
> if ((rste & _REGION_ENTRY_TYPE_MASK) == _REGION_ENTRY_TYPE_R3) {
> - page = pud_page(__pud(rste));
> + folio = page_folio(pud_page(__pud(rste)));
> size = PUD_SIZE;
> paddr = rste & PUD_MASK;
> } else {
> - page = pmd_page(__pmd(rste));
> + folio = page_folio(pmd_page(__pmd(rste)));
> size = PMD_SIZE;
> paddr = rste & PMD_MASK;
> }
>
> - if (!test_and_set_bit(PG_arch_1, &page->flags))
> + if (!test_and_set_bit(PG_arch_1, &folio->flags))
> __storage_key_init_range(paddr, paddr + size - 1);
> }
>
On Fri, 12 Apr 2024 16:21:12 +0200
David Hildenbrand <[email protected]> wrote:
> Let's factor out handling of LRU cache draining and convert the if-else
> chain to a switch-case.
>
> Signed-off-by: David Hildenbrand <[email protected]>
it is a little ugly with the bool* parameter, but I can't think of
a better/nicer way to do it.
Reviewed-by: Claudio Imbrenda <[email protected]>
> ---
> arch/s390/kernel/uv.c | 66 ++++++++++++++++++++++++++-----------------
> 1 file changed, 40 insertions(+), 26 deletions(-)
>
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index 016993e9eb72..25fe28d189df 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -266,6 +266,36 @@ static bool should_export_before_import(struct uv_cb_header *uvcb, struct mm_str
> return atomic_read(&mm->context.protected_count) > 1;
> }
>
> +/*
> + * Drain LRU caches: the local one on first invocation and the ones of all
> + * CPUs on successive invocations. Returns "true" on the first invocation.
> + */
> +static bool drain_lru(bool *drain_lru_called)
> +{
> + /*
> + * If we have tried a local drain and the folio refcount
> + * still does not match our expected safe value, try with a
> + * system wide drain. This is needed if the pagevecs holding
> + * the page are on a different CPU.
> + */
> + if (*drain_lru_called) {
> + lru_add_drain_all();
> + /* We give up here, don't retry immediately. */
> + return false;
> + }
> + /*
> + * We are here if the folio refcount does not match the
> + * expected safe value. The main culprits are usually
> + * pagevecs. With lru_add_drain() we drain the pagevecs
> + * on the local CPU so that hopefully the refcount will
> + * reach the expected safe value.
> + */
> + lru_add_drain();
> + *drain_lru_called = true;
> + /* The caller should try again immediately */
> + return true;
> +}
> +
> /*
> * Requests the Ultravisor to make a page accessible to a guest.
> * If it's brought in the first time, it will be cleared. If
> @@ -275,7 +305,7 @@ static bool should_export_before_import(struct uv_cb_header *uvcb, struct mm_str
> int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
> {
> struct vm_area_struct *vma;
> - bool local_drain = false;
> + bool drain_lru_called = false;
> spinlock_t *ptelock;
> unsigned long uaddr;
> struct folio *folio;
> @@ -331,37 +361,21 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
> out:
> mmap_read_unlock(gmap->mm);
>
> - if (rc == -EAGAIN) {
> + switch (rc) {
> + case -EAGAIN:
> /*
> * If we are here because the UVC returned busy or partial
> * completion, this is just a useless check, but it is safe.
> */
> folio_wait_writeback(folio);
> folio_put(folio);
> - } else if (rc == -EBUSY) {
> - /*
> - * If we have tried a local drain and the folio refcount
> - * still does not match our expected safe value, try with a
> - * system wide drain. This is needed if the pagevecs holding
> - * the page are on a different CPU.
> - */
> - if (local_drain) {
> - lru_add_drain_all();
> - /* We give up here, and let the caller try again */
> - return -EAGAIN;
> - }
> - /*
> - * We are here if the folio refcount does not match the
> - * expected safe value. The main culprits are usually
> - * pagevecs. With lru_add_drain() we drain the pagevecs
> - * on the local CPU so that hopefully the refcount will
> - * reach the expected safe value.
> - */
> - lru_add_drain();
> - local_drain = true;
> - /* And now we try again immediately after draining */
> - goto again;
> - } else if (rc == -ENXIO) {
> + return -EAGAIN;
> + case -EBUSY:
> + /* Additional folio references. */
> + if (drain_lru(&drain_lru_called))
> + goto again;
> + return -EAGAIN;
> + case -ENXIO:
> if (gmap_fault(gmap, gaddr, FAULT_FLAG_WRITE))
> return -EFAULT;
> return -EAGAIN;
On Fri, 12 Apr 2024 16:21:19 +0200
David Hildenbrand <[email protected]> wrote:
> Let's also implement HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE, so we can convert
> arch_make_page_accessible() to be a simple wrapper around
> arch_make_folio_accessible(). Unfortunately, we cannot do that in the
> header.
>
> There are only two arch_make_page_accessible() calls remaining in gup.c.
> We can now drop HAVE_ARCH_MAKE_PAGE_ACCESSIBLE completely from core-MM.
> We'll handle that separately, once the s390x part has landed.
>
> Suggested-by: Matthew Wilcox <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
> ---
> arch/s390/include/asm/page.h | 3 +++
> arch/s390/kernel/uv.c | 18 +++++++++++-------
> arch/s390/mm/fault.c | 14 ++++++++------
> 3 files changed, 22 insertions(+), 13 deletions(-)
>
> diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
> index b64384872c0f..03bbc782e286 100644
> --- a/arch/s390/include/asm/page.h
> +++ b/arch/s390/include/asm/page.h
> @@ -162,6 +162,7 @@ static inline int page_reset_referenced(unsigned long addr)
> #define _PAGE_ACC_BITS 0xf0 /* HW access control bits */
>
> struct page;
> +struct folio;
> void arch_free_page(struct page *page, int order);
> void arch_alloc_page(struct page *page, int order);
>
> @@ -174,6 +175,8 @@ static inline int devmem_is_allowed(unsigned long pfn)
> #define HAVE_ARCH_ALLOC_PAGE
>
> #if IS_ENABLED(CONFIG_PGSTE)
> +int arch_make_folio_accessible(struct folio *folio);
> +#define HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE
> int arch_make_page_accessible(struct page *page);
> #define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
> #endif
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index b456066d72da..fa62fa0e369f 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -498,14 +498,13 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
> EXPORT_SYMBOL_GPL(gmap_destroy_page);
>
> /*
> - * To be called with the page locked or with an extra reference! This will
> - * prevent gmap_make_secure from touching the page concurrently. Having 2
> - * parallel make_page_accessible is fine, as the UV calls will become a
> - * no-op if the page is already exported.
> + * To be called with the folio locked or with an extra reference! This will
> + * prevent gmap_make_secure from touching the folio concurrently. Having 2
> + * parallel arch_make_folio_accessible is fine, as the UV calls will become a
> + * no-op if the folio is already exported.
> */
> -int arch_make_page_accessible(struct page *page)
> +int arch_make_folio_accessible(struct folio *folio)
> {
> - struct folio *folio = page_folio(page);
> int rc = 0;
>
> /* See gmap_make_secure(): large folios cannot be secure */
> @@ -537,8 +536,13 @@ int arch_make_page_accessible(struct page *page)
>
> return rc;
> }
> -EXPORT_SYMBOL_GPL(arch_make_page_accessible);
> +EXPORT_SYMBOL_GPL(arch_make_folio_accessible);
>
> +int arch_make_page_accessible(struct page *page)
> +{
> + return arch_make_folio_accessible(page_folio(page));
> +}
> +EXPORT_SYMBOL_GPL(arch_make_page_accessible);
> #endif
>
> #if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) || IS_ENABLED(CONFIG_KVM)
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index c421dd44ffbe..a1ba58460593 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -491,6 +491,7 @@ void do_secure_storage_access(struct pt_regs *regs)
> unsigned long addr = get_fault_address(regs);
> struct vm_area_struct *vma;
> struct mm_struct *mm;
> + struct folio *folio;
> struct page *page;
> struct gmap *gmap;
> int rc;
> @@ -538,17 +539,18 @@ void do_secure_storage_access(struct pt_regs *regs)
> mmap_read_unlock(mm);
> break;
> }
> - if (arch_make_page_accessible(page))
> + folio = page_folio(page);
> + if (arch_make_folio_accessible(folio))
> send_sig(SIGSEGV, current, 0);
> - put_page(page);
> + folio_put(folio);
> mmap_read_unlock(mm);
> break;
> case KERNEL_FAULT:
> - page = phys_to_page(addr);
> - if (unlikely(!try_get_page(page)))
> + folio = phys_to_folio(addr);
> + if (unlikely(!folio_try_get(folio)))
> break;
> - rc = arch_make_page_accessible(page);
> - put_page(page);
> + rc = arch_make_folio_accessible(folio);
> + folio_put(folio);
> if (rc)
> BUG();
> break;
On Fri, 12 Apr 2024 16:21:18 +0200
David Hildenbrand <[email protected]> wrote:
> Let's do the same as we did for uv_destroy_(folio|pte)() and
> have the following variants:
>
> (1) uv_convert_from_secure(): "low level" helper that operates on paddr
> and does not mess with folios.
>
> (2) uv_convert_from_secure_folio(): Consumes a folio to which we hold a
> reference.
>
> (3) uv_convert_from_secure_pte(): Consumes a PTE that holds a reference
> through the mapping.
>
> Unfortunately we need uv_convert_from_secure_pte(), because pfn_folio()
> and friends are not available in pgtable.h.
>
> Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
> ---
> arch/s390/include/asm/pgtable.h | 6 +++---
> arch/s390/include/asm/uv.h | 4 ++--
> arch/s390/kernel/uv.c | 18 +++++++++++++-----
> 3 files changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> index 97e040617c29..5ffc4828c25a 100644
> --- a/arch/s390/include/asm/pgtable.h
> +++ b/arch/s390/include/asm/pgtable.h
> @@ -1149,7 +1149,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
> res = ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID));
> /* At this point the reference through the mapping is still present */
> if (mm_is_protected(mm) && pte_present(res))
> - uv_convert_owned_from_secure(pte_val(res) & PAGE_MASK);
> + uv_convert_from_secure_pte(res);
> return res;
> }
>
> @@ -1167,7 +1167,7 @@ static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
> res = ptep_xchg_direct(vma->vm_mm, addr, ptep, __pte(_PAGE_INVALID));
> /* At this point the reference through the mapping is still present */
> if (mm_is_protected(vma->vm_mm) && pte_present(res))
> - uv_convert_owned_from_secure(pte_val(res) & PAGE_MASK);
> + uv_convert_from_secure_pte(res);
> return res;
> }
>
> @@ -1206,7 +1206,7 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
> * if this is not a mm teardown, the slower export is used as
> * fallback instead.
> */
> - uv_convert_owned_from_secure(pte_val(res) & PAGE_MASK);
> + uv_convert_from_secure_pte(res);
> return res;
> }
>
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index a1bef30066ef..0679445cac0b 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -485,7 +485,7 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);
> int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);
> int uv_destroy_folio(struct folio *folio);
> int uv_destroy_pte(pte_t pte);
> -int uv_convert_owned_from_secure(unsigned long paddr);
> +int uv_convert_from_secure_pte(pte_t pte);
> int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);
>
> void setup_uv(void);
> @@ -508,7 +508,7 @@ static inline int uv_destroy_pte(pte_t pte)
> return 0;
> }
>
> -static inline int uv_convert_owned_from_secure(unsigned long paddr)
> +static inline int uv_convert_from_secure_pte(pte_t pte)
> {
> return 0;
> }
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index 61c1ce51c883..b456066d72da 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -178,11 +178,10 @@ static int uv_convert_from_secure(unsigned long paddr)
> }
>
> /*
> - * The caller must already hold a reference to the page
> + * The caller must already hold a reference to the folio.
> */
> -int uv_convert_owned_from_secure(unsigned long paddr)
> +static int uv_convert_from_secure_folio(struct folio *folio)
> {
> - struct folio *folio = phys_to_folio(paddr);
> int rc;
>
> /* See gmap_make_secure(): large folios cannot be secure */
> @@ -190,13 +189,22 @@ int uv_convert_owned_from_secure(unsigned long paddr)
> return 0;
>
> folio_get(folio);
> - rc = uv_convert_from_secure(paddr);
> + rc = uv_convert_from_secure(folio_to_phys(folio));
> if (!rc)
> clear_bit(PG_arch_1, &folio->flags);
> folio_put(folio);
> return rc;
> }
>
> +/*
> + * The present PTE still indirectly holds a folio reference through the mapping.
> + */
> +int uv_convert_from_secure_pte(pte_t pte)
> +{
> + VM_WARN_ON(!pte_present(pte));
> + return uv_convert_from_secure_folio(pfn_folio(pte_pfn(pte)));
> +}
> +
> /*
> * Calculate the expected ref_count for a folio that would otherwise have no
> * further pins. This was cribbed from similar functions in other places in
> @@ -481,7 +489,7 @@ int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)
> * we instead try to export the page.
> */
> if (rc)
> - rc = uv_convert_owned_from_secure(page_to_phys(page));
> + rc = uv_convert_from_secure_folio(folio);
> folio_put(folio);
> out:
> mmap_read_unlock(gmap->mm);
On Mon, 6 May 2024 10:38:30 +0200
Heiko Carstens <[email protected]> wrote:
> On Tue, Apr 30, 2024 at 08:49:31PM +0200, David Hildenbrand wrote:
> > On 12.04.24 16:21, David Hildenbrand wrote:
> > > This is v2 of [1] with changed subject:
> > > "[PATCH v1 0/5] s390: page_mapcount(), page_has_private() and PG_arch_1"
> > >
> > > Rebased on s390x/features which contains the page_mapcount() and
> > > page_has_private() cleanups, and some PG_arch_1 cleanups from Willy. To
> > > compensate, I added some more cleanups ;)
> > >
> > > One "easy" fix upfront. Another issue I spotted is documented in [1].
> > >
> > > Once this hits upstream, we can remove HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
> > > from core-mm and s390x, so only the folio variant will remain.
> >
> > Ping.
>
> Claudio, Janosch, this series requires your review.
oops! I had started reviewing it, but then other things got in the
way...
On 07.05.24 18:33, Claudio Imbrenda wrote:
> On Fri, 12 Apr 2024 16:21:20 +0200
> David Hildenbrand <[email protected]> wrote:
>
>> Let's make it clearer that we are always working on folio flags and
>> never page flags of tail pages.
>
> please be a little more verbose, and explain what you are doing (i.e.
> converting usages of page flags to folio flags), not just why.
>
>>
>> Signed-off-by: David Hildenbrand <[email protected]>
>
> with a few extra words in the description:
Let's make it clearer that we are always working on folio flags and
never page flags of tail pages by converting remaining PG_arch_1 users
that modify page->flags to modify folio->flags instead.
No functional change intended, because we would always have worked with
the head page (where page->flags corresponds to folio->flags) and never
with tail pages.
>
> Reviewed-by: Claudio Imbrenda <[email protected]>
Thanks for all the review!
--
Cheers,
David / dhildenb
On Wed, 8 May 2024 20:08:07 +0200
David Hildenbrand <[email protected]> wrote:
> On 07.05.24 18:33, Claudio Imbrenda wrote:
> > On Fri, 12 Apr 2024 16:21:20 +0200
> > David Hildenbrand <[email protected]> wrote:
> >
> >> Let's make it clearer that we are always working on folio flags and
> >> never page flags of tail pages.
> >
> > please be a little more verbose, and explain what you are doing (i.e.
> > converting usages of page flags to folio flags), not just why.
> >
> >>
> >> Signed-off-by: David Hildenbrand <[email protected]>
> >
> > with a few extra words in the description:
>
> Let's make it clearer that we are always working on folio flags and
> never page flags of tail pages by converting remaining PG_arch_1 users
> that modify page->flags to modify folio->flags instead.
>
> No functional change intended, because we would always have worked with
> the head page (where page->flags corresponds to folio->flags) and never
> with tail pages.
this works, thanks!
>
>
> >
> > Reviewed-by: Claudio Imbrenda <[email protected]>
>
> Thanks for all the review!
>