2018-11-06 16:29:35

by Arun KS

Subject: [PATCH v2 0/4] mm: convert totalram_pages, totalhigh_pages and managed pages to atomic

This series converts totalram_pages, totalhigh_pages and
zone->managed_pages to atomic variables.

The series was compile-tested on x86 (x86_64_defconfig & i386_defconfig)
on 4.20-rc1. Memory hotplug was tested on arm64, but on an older kernel
version.

Arun KS (4):
mm: Fix multiple evaluations of totalram_pages and managed_pages
mm: Convert zone->managed_pages to atomic variable
mm: convert totalram_pages and totalhigh_pages variables to atomic
mm: Remove managed_page_count spinlock

arch/csky/mm/init.c | 4 +-
arch/powerpc/platforms/pseries/cmm.c | 10 ++--
arch/s390/mm/init.c | 2 +-
arch/um/kernel/mem.c | 3 +-
arch/x86/kernel/cpu/microcode/core.c | 5 +-
drivers/char/agp/backend.c | 4 +-
drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +-
drivers/gpu/drm/i915/i915_gem.c | 2 +-
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 4 +-
drivers/hv/hv_balloon.c | 19 +++----
drivers/md/dm-bufio.c | 2 +-
drivers/md/dm-crypt.c | 2 +-
drivers/md/dm-integrity.c | 2 +-
drivers/md/dm-stats.c | 2 +-
drivers/media/platform/mtk-vpu/mtk_vpu.c | 2 +-
drivers/misc/vmw_balloon.c | 2 +-
drivers/parisc/ccio-dma.c | 4 +-
drivers/parisc/sba_iommu.c | 4 +-
drivers/staging/android/ion/ion_system_heap.c | 2 +-
drivers/xen/xen-selfballoon.c | 6 +--
fs/ceph/super.h | 2 +-
fs/file_table.c | 7 +--
fs/fuse/inode.c | 2 +-
fs/nfs/write.c | 2 +-
fs/nfsd/nfscache.c | 2 +-
fs/ntfs/malloc.h | 2 +-
fs/proc/base.c | 2 +-
include/linux/highmem.h | 28 ++++++++++-
include/linux/mm.h | 27 +++++++++-
include/linux/mmzone.h | 15 +++---
include/linux/swap.h | 1 -
kernel/fork.c | 5 +-
kernel/kexec_core.c | 5 +-
kernel/power/snapshot.c | 2 +-
lib/show_mem.c | 2 +-
mm/highmem.c | 4 +-
mm/huge_memory.c | 2 +-
mm/kasan/quarantine.c | 2 +-
mm/memblock.c | 6 +--
mm/mm_init.c | 2 +-
mm/oom_kill.c | 2 +-
mm/page_alloc.c | 71 +++++++++++++--------------
mm/shmem.c | 7 +--
mm/slab.c | 2 +-
mm/swap.c | 2 +-
mm/util.c | 2 +-
mm/vmalloc.c | 4 +-
mm/vmstat.c | 4 +-
mm/workingset.c | 2 +-
mm/zswap.c | 4 +-
net/dccp/proto.c | 7 +--
net/decnet/dn_route.c | 2 +-
net/ipv4/tcp_metrics.c | 2 +-
net/netfilter/nf_conntrack_core.c | 7 +--
net/netfilter/xt_hashlimit.c | 5 +-
net/sctp/protocol.c | 7 +--
security/integrity/ima/ima_kexec.c | 2 +-
57 files changed, 193 insertions(+), 142 deletions(-)

--
1.9.1



2018-11-06 16:29:28

by Arun KS

Subject: [PATCH v2 4/4] mm: Remove managed_page_count spinlock

Now that totalram_pages and managed_pages are atomic variables, the
managed_page_count spinlock is no longer needed. Remove it.
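
As an illustration of why the lock can go (a sketch, not part of the
patch): every update to these counters is now a single atomic
read-modify-write, so two concurrent callers of
adjust_managed_page_count() cannot lose an update the way two plain
"+= count" writers could without the spinlock.

	/* hypothetical concurrent updates, e.g. hotplug vs. ballooning */
	atomic_long_add(nr_added, &page_zone(page)->managed_pages);
	atomic_long_add(-nr_removed, &page_zone(page)->managed_pages);
	/* the final managed_pages value always reflects both adjustments */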

Signed-off-by: Arun KS <[email protected]>
Reviewed-by: Konstantin Khlebnikov <[email protected]>
Acked-by: Michal Hocko <[email protected]>
---
include/linux/mmzone.h | 6 ------
mm/page_alloc.c | 5 -----
2 files changed, 11 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e73dc31..c71b4d9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -428,12 +428,6 @@ struct zone {
* Write access to present_pages at runtime should be protected by
* mem_hotplug_begin/end(). Any reader who can't tolerant drift of
* present_pages should get_online_mems() to get a stable value.
- *
- * Read access to managed_pages should be safe because it's unsigned
- * long. Write access to zone->managed_pages and totalram_pages are
- * protected by managed_page_count_lock at runtime. Idealy only
- * adjust_managed_page_count() should be used instead of directly
- * touching zone->managed_pages and totalram_pages.
*/
atomic_long_t managed_pages;
unsigned long spanned_pages;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2a42c3f..4d78bde 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -122,9 +122,6 @@
};
EXPORT_SYMBOL(node_states);

-/* Protect totalram_pages and zone->managed_pages */
-static DEFINE_SPINLOCK(managed_page_count_lock);
-
atomic_long_t _totalram_pages __read_mostly;
unsigned long totalreserve_pages __read_mostly;
unsigned long totalcma_pages __read_mostly;
@@ -7064,14 +7061,12 @@ static int __init cmdline_parse_movablecore(char *p)

void adjust_managed_page_count(struct page *page, long count)
{
- spin_lock(&managed_page_count_lock);
atomic_long_add(count, &page_zone(page)->managed_pages);
totalram_pages_add(count);
#ifdef CONFIG_HIGHMEM
if (PageHighMem(page))
totalhigh_pages_add(count);
#endif
- spin_unlock(&managed_page_count_lock);
}
EXPORT_SYMBOL(adjust_managed_page_count);

--
1.9.1


2018-11-06 16:29:29

by Arun KS

Subject: [PATCH v2 1/4] mm: Fix multiple evaluations of totalram_pages and managed_pages

This patch is in preparation for a later patch that converts totalram_pages
and zone->managed_pages to atomic variables. It caches totalram_pages and
zone->managed_pages in local variables so they are evaluated only once, and
does not introduce any functional changes.
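
A minimal sketch of the pattern applied below (the helper here is
hypothetical and not part of the patch): the global is read once into a
local so every comparison and calculation in the function uses the same
snapshot, instead of re-evaluating totalram_pages, which memory hotplug
may change between reads.

	static unsigned long example_limit(void)	/* hypothetical */
	{
		unsigned long totalram_pgs = totalram_pages;	/* single read */

		if (totalram_pgs >= (128 * 1024))
			return totalram_pgs >> 1;	/* same snapshot as the check */
		return totalram_pgs >> 2;
	}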

Signed-off-by: Arun KS <[email protected]>
Reviewed-by: Konstantin Khlebnikov <[email protected]>
---
arch/um/kernel/mem.c | 3 +--
arch/x86/kernel/cpu/microcode/core.c | 5 +++--
drivers/hv/hv_balloon.c | 19 ++++++++++---------
fs/file_table.c | 7 ++++---
kernel/fork.c | 5 +++--
kernel/kexec_core.c | 5 +++--
mm/page_alloc.c | 5 +++--
mm/shmem.c | 3 ++-
net/dccp/proto.c | 7 ++++---
net/netfilter/nf_conntrack_core.c | 7 ++++---
net/netfilter/xt_hashlimit.c | 5 +++--
net/sctp/protocol.c | 7 ++++---
12 files changed, 44 insertions(+), 34 deletions(-)

diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
index 1067469..134d3fd 100644
--- a/arch/um/kernel/mem.c
+++ b/arch/um/kernel/mem.c
@@ -51,8 +51,7 @@ void __init mem_init(void)

/* this will put all low memory onto the freelists */
memblock_free_all();
- max_low_pfn = totalram_pages;
- max_pfn = totalram_pages;
+ max_pfn = max_low_pfn = totalram_pages;
mem_init_print_info(NULL);
kmalloc_ok = 1;
}
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 2637ff0..99c67ca 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -434,9 +434,10 @@ static ssize_t microcode_write(struct file *file, const char __user *buf,
size_t len, loff_t *ppos)
{
ssize_t ret = -EINVAL;
+ unsigned long totalram_pgs = totalram_pages;

- if ((len >> PAGE_SHIFT) > totalram_pages) {
- pr_err("too much data (max %ld pages)\n", totalram_pages);
+ if ((len >> PAGE_SHIFT) > totalram_pgs) {
+ pr_err("too much data (max %ld pages)\n", totalram_pgs);
return ret;
}

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 4163151..cac4945 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -1090,6 +1090,7 @@ static void process_info(struct hv_dynmem_device *dm, struct dm_info_msg *msg)
static unsigned long compute_balloon_floor(void)
{
unsigned long min_pages;
+ unsigned long totalram_pgs = totalram_pages;
#define MB2PAGES(mb) ((mb) << (20 - PAGE_SHIFT))
/* Simple continuous piecewiese linear function:
* max MiB -> min MiB gradient
@@ -1102,16 +1103,16 @@ static unsigned long compute_balloon_floor(void)
* 8192 744 (1/16)
* 32768 1512 (1/32)
*/
- if (totalram_pages < MB2PAGES(128))
- min_pages = MB2PAGES(8) + (totalram_pages >> 1);
- else if (totalram_pages < MB2PAGES(512))
- min_pages = MB2PAGES(40) + (totalram_pages >> 2);
- else if (totalram_pages < MB2PAGES(2048))
- min_pages = MB2PAGES(104) + (totalram_pages >> 3);
- else if (totalram_pages < MB2PAGES(8192))
- min_pages = MB2PAGES(232) + (totalram_pages >> 4);
+ if (totalram_pgs < MB2PAGES(128))
+ min_pages = MB2PAGES(8) + (totalram_pgs >> 1);
+ else if (totalram_pgs < MB2PAGES(512))
+ min_pages = MB2PAGES(40) + (totalram_pgs >> 2);
+ else if (totalram_pgs < MB2PAGES(2048))
+ min_pages = MB2PAGES(104) + (totalram_pgs >> 3);
+ else if (totalram_pgs < MB2PAGES(8192))
+ min_pages = MB2PAGES(232) + (totalram_pgs >> 4);
else
- min_pages = MB2PAGES(488) + (totalram_pages >> 5);
+ min_pages = MB2PAGES(488) + (totalram_pgs >> 5);
#undef MB2PAGES
return min_pages;
}
diff --git a/fs/file_table.c b/fs/file_table.c
index e49af4c..6e3c088 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -380,10 +380,11 @@ void __init files_init(void)
void __init files_maxfiles_init(void)
{
unsigned long n;
- unsigned long memreserve = (totalram_pages - nr_free_pages()) * 3/2;
+ unsigned long totalram_pgs = totalram_pages;
+ unsigned long memreserve = (totalram_pgs - nr_free_pages()) * 3/2;

- memreserve = min(memreserve, totalram_pages - 1);
- n = ((totalram_pages - memreserve) * (PAGE_SIZE / 1024)) / 10;
+ memreserve = min(memreserve, totalram_pgs - 1);
+ n = ((totalram_pgs - memreserve) * (PAGE_SIZE / 1024)) / 10;

files_stat.max_files = max_t(unsigned long, n, NR_FILE);
}
diff --git a/kernel/fork.c b/kernel/fork.c
index 07cddff..7823f31 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -739,15 +739,16 @@ void __init __weak arch_task_cache_init(void) { }
static void set_max_threads(unsigned int max_threads_suggested)
{
u64 threads;
+ unsigned long totalram_pgs = totalram_pages;

/*
* The number of threads shall be limited such that the thread
* structures may only consume a small part of the available memory.
*/
- if (fls64(totalram_pages) + fls64(PAGE_SIZE) > 64)
+ if (fls64(totalram_pgs) + fls64(PAGE_SIZE) > 64)
threads = MAX_THREADS;
else
- threads = div64_u64((u64) totalram_pages * (u64) PAGE_SIZE,
+ threads = div64_u64((u64) totalram_pgs * (u64) PAGE_SIZE,
(u64) THREAD_SIZE * 8UL);

if (threads > max_threads_suggested)
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 86ef06d..dff217c 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -152,6 +152,7 @@ int sanity_check_segment_list(struct kimage *image)
int i;
unsigned long nr_segments = image->nr_segments;
unsigned long total_pages = 0;
+ unsigned long totalram_pgs = totalram_pages;

/*
* Verify we have good destination addresses. The caller is
@@ -217,13 +218,13 @@ int sanity_check_segment_list(struct kimage *image)
* wasted allocating pages, which can cause a soft lockup.
*/
for (i = 0; i < nr_segments; i++) {
- if (PAGE_COUNT(image->segment[i].memsz) > totalram_pages / 2)
+ if (PAGE_COUNT(image->segment[i].memsz) > totalram_pgs / 2)
return -EINVAL;

total_pages += PAGE_COUNT(image->segment[i].memsz);
}

- if (total_pages > totalram_pages / 2)
+ if (total_pages > totalram_pgs / 2)
return -EINVAL;

/*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a919ba5..173312b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7245,6 +7245,7 @@ static void calculate_totalreserve_pages(void)
for (i = 0; i < MAX_NR_ZONES; i++) {
struct zone *zone = pgdat->node_zones + i;
long max = 0;
+ unsigned long managed_pages = zone->managed_pages;

/* Find valid and maximum lowmem_reserve in the zone */
for (j = i; j < MAX_NR_ZONES; j++) {
@@ -7255,8 +7256,8 @@ static void calculate_totalreserve_pages(void)
/* we treat the high watermark as reserved pages. */
max += high_wmark_pages(zone);

- if (max > zone->managed_pages)
- max = zone->managed_pages;
+ if (max > managed_pages)
+ max = managed_pages;

pgdat->totalreserve_pages += max;

diff --git a/mm/shmem.c b/mm/shmem.c
index ea26d7a..6b91eab 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -114,7 +114,8 @@ static unsigned long shmem_default_max_blocks(void)

static unsigned long shmem_default_max_inodes(void)
{
- return min(totalram_pages - totalhigh_pages, totalram_pages / 2);
+ unsigned long totalram_pgs = totalram_pages;
+ return min(totalram_pgs - totalhigh_pages, totalram_pgs / 2);
}
#endif

diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 43733ac..f27daa1 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -1131,6 +1131,7 @@ static inline void dccp_mib_exit(void)
static int __init dccp_init(void)
{
unsigned long goal;
+ unsigned long totalram_pgs = totalram_pages;
int ehash_order, bhash_order, i;
int rc;

@@ -1154,10 +1155,10 @@ static int __init dccp_init(void)
*
* The methodology is similar to that of the buffer cache.
*/
- if (totalram_pages >= (128 * 1024))
- goal = totalram_pages >> (21 - PAGE_SHIFT);
+ if (totalram_pgs >= (128 * 1024))
+ goal = totalram_pgs >> (21 - PAGE_SHIFT);
else
- goal = totalram_pages >> (23 - PAGE_SHIFT);
+ goal = totalram_pgs >> (23 - PAGE_SHIFT);

if (thash_entries)
goal = (thash_entries *
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index ca1168d..0b1801e 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -2248,6 +2248,7 @@ static __always_inline unsigned int total_extension_size(void)

int nf_conntrack_init_start(void)
{
+ unsigned long totalram_pgs = totalram_pages;
int max_factor = 8;
int ret = -ENOMEM;
int i;
@@ -2267,11 +2268,11 @@ int nf_conntrack_init_start(void)
* >= 4GB machines have 65536 buckets.
*/
nf_conntrack_htable_size
- = (((totalram_pages << PAGE_SHIFT) / 16384)
+ = (((totalram_pgs << PAGE_SHIFT) / 16384)
/ sizeof(struct hlist_head));
- if (totalram_pages > (4 * (1024 * 1024 * 1024 / PAGE_SIZE)))
+ if (totalram_pgs > (4 * (1024 * 1024 * 1024 / PAGE_SIZE)))
nf_conntrack_htable_size = 65536;
- else if (totalram_pages > (1024 * 1024 * 1024 / PAGE_SIZE))
+ else if (totalram_pgs > (1024 * 1024 * 1024 / PAGE_SIZE))
nf_conntrack_htable_size = 16384;
if (nf_conntrack_htable_size < 32)
nf_conntrack_htable_size = 32;
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 3e7d259..6cb9a74 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -274,14 +274,15 @@ static int htable_create(struct net *net, struct hashlimit_cfg3 *cfg,
struct xt_hashlimit_htable *hinfo;
const struct seq_operations *ops;
unsigned int size, i;
+ unsigned long totalram_pgs = totalram_pages;
int ret;

if (cfg->size) {
size = cfg->size;
} else {
- size = (totalram_pages << PAGE_SHIFT) / 16384 /
+ size = (totalram_pgs << PAGE_SHIFT) / 16384 /
sizeof(struct hlist_head);
- if (totalram_pages > 1024 * 1024 * 1024 / PAGE_SIZE)
+ if (totalram_pgs > 1024 * 1024 * 1024 / PAGE_SIZE)
size = 8192;
if (size < 16)
size = 16;
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 9b277bd..7128f85 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1368,6 +1368,7 @@ static __init int sctp_init(void)
int status = -EINVAL;
unsigned long goal;
unsigned long limit;
+ unsigned long totalram_pgs = totalram_pages;
int max_share;
int order;
int num_entries;
@@ -1426,10 +1427,10 @@ static __init int sctp_init(void)
* The methodology is similar to that of the tcp hash tables.
* Though not identical. Start by getting a goal size
*/
- if (totalram_pages >= (128 * 1024))
- goal = totalram_pages >> (22 - PAGE_SHIFT);
+ if (totalram_pgs >= (128 * 1024))
+ goal = totalram_pgs >> (22 - PAGE_SHIFT);
else
- goal = totalram_pages >> (24 - PAGE_SHIFT);
+ goal = totalram_pgs >> (24 - PAGE_SHIFT);

/* Then compute the page order for said goal */
order = get_order(goal);
--
1.9.1


2018-11-06 16:30:17

by Arun KS

Subject: [PATCH v2 2/4] mm: Convert zone->managed_pages to atomic variable

totalram_pages, zone->managed_pages and totalhigh_pages updates
are protected by managed_page_count_lock, but readers never care
about it. Convert these variables to atomic to avoid readers
potentially seeing a store tear.

This patch converts zone->managed_pages. Subsequent patches will
convert totalram_pages and totalhigh_pages, and eventually
managed_page_count_lock will be removed.
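
For illustration (a sketch using the names introduced by this patch;
the surrounding code is hypothetical): writers keep using the
atomic_long_* helpers while readers go through zone_managed_pages(), so
both sides access the counter with single atomic operations and a
lockless reader cannot observe a torn value.

	/* writer side, e.g. when returning pages to the zone */
	atomic_long_add(nr_pages, &zone->managed_pages);

	/* reader side, always through the new accessor */
	if (nr_requested > zone_managed_pages(zone) / 2)
		return -ENOMEM;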

Suggested-by: Michal Hocko <[email protected]>
Suggested-by: Vlastimil Babka <[email protected]>
Signed-off-by: Arun KS <[email protected]>
Reviewed-by: Konstantin Khlebnikov <[email protected]>
Acked-by: Michal Hocko <[email protected]>

---
Most of the changes were done with the coccinelle script below,

@@
struct zone *z;
expression e1;
@@
(
- z->managed_pages = e1
+ atomic_long_set(&z->managed_pages, e1)
|
- e1->managed_pages++
+ atomic_long_inc(&e1->managed_pages)
|
- z->managed_pages
+ zone_managed_pages(z)
)

@@
expression e,e1;
@@
- e->managed_pages += e1
+ atomic_long_add(e1, &e->managed_pages)

@@
expression z;
@@
- z.managed_pages
+ zone_managed_pages(&z)

Then, manually apply the following change to include/linux/mmzone.h:

- unsigned long managed_pages;
+ atomic_long_t managed_pages;

+static inline unsigned long zone_managed_pages(struct zone *zone)
+{
+ return (unsigned long)atomic_long_read(&zone->managed_pages);
+}
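
For reference, a script like the one above would typically be run with
spatch; the exact invocation below is an assumption (file name and
flags may need adjusting for the tree):

	spatch --sp-file zone_managed_pages.cocci --in-place --dir .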

---
---
drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +-
include/linux/mmzone.h | 9 +++++--
lib/show_mem.c | 2 +-
mm/memblock.c | 2 +-
mm/page_alloc.c | 44 +++++++++++++++++------------------
mm/vmstat.c | 4 ++--
6 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 56412b0..c0e55bb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -848,7 +848,7 @@ static int kfd_fill_mem_info_for_cpu(int numa_node_id, int *avail_size,
*/
pgdat = NODE_DATA(numa_node_id);
for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++)
- mem_in_bytes += pgdat->node_zones[zone_type].managed_pages;
+ mem_in_bytes += zone_managed_pages(&pgdat->node_zones[zone_type]);
mem_in_bytes <<= PAGE_SHIFT;

sub_type_hdr->length_low = lower_32_bits(mem_in_bytes);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 847705a..e73dc31 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -435,7 +435,7 @@ struct zone {
* adjust_managed_page_count() should be used instead of directly
* touching zone->managed_pages and totalram_pages.
*/
- unsigned long managed_pages;
+ atomic_long_t managed_pages;
unsigned long spanned_pages;
unsigned long present_pages;

@@ -524,6 +524,11 @@ enum pgdat_flags {
PGDAT_RECLAIM_LOCKED, /* prevents concurrent reclaim */
};

+static inline unsigned long zone_managed_pages(struct zone *zone)
+{
+ return (unsigned long)atomic_long_read(&zone->managed_pages);
+}
+
static inline unsigned long zone_end_pfn(const struct zone *zone)
{
return zone->zone_start_pfn + zone->spanned_pages;
@@ -814,7 +819,7 @@ static inline bool is_dev_zone(const struct zone *zone)
*/
static inline bool managed_zone(struct zone *zone)
{
- return zone->managed_pages;
+ return zone_managed_pages(zone);
}

/* Returns true if a zone has memory */
diff --git a/lib/show_mem.c b/lib/show_mem.c
index 0beaa1d..eefe67d 100644
--- a/lib/show_mem.c
+++ b/lib/show_mem.c
@@ -28,7 +28,7 @@ void show_mem(unsigned int filter, nodemask_t *nodemask)
continue;

total += zone->present_pages;
- reserved += zone->present_pages - zone->managed_pages;
+ reserved += zone->present_pages - zone_managed_pages(zone);

if (is_highmem_idx(zoneid))
highmem += zone->present_pages;
diff --git a/mm/memblock.c b/mm/memblock.c
index 7df468c..bbd82ab 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1950,7 +1950,7 @@ void reset_node_managed_pages(pg_data_t *pgdat)
struct zone *z;

for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++)
- z->managed_pages = 0;
+ atomic_long_set(&z->managed_pages, 0);
}

void __init reset_all_zones_managed_pages(void)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 173312b..22e6645 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1279,7 +1279,7 @@ static void __init __free_pages_boot_core(struct page *page, unsigned int order)
__ClearPageReserved(p);
set_page_count(p, 0);

- page_zone(page)->managed_pages += nr_pages;
+ atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
set_page_refcounted(page);
__free_pages(page, order);
}
@@ -2258,7 +2258,7 @@ static void reserve_highatomic_pageblock(struct page *page, struct zone *zone,
* Limit the number reserved to 1 pageblock or roughly 1% of a zone.
* Check is race-prone but harmless.
*/
- max_managed = (zone->managed_pages / 100) + pageblock_nr_pages;
+ max_managed = (zone_managed_pages(zone) / 100) + pageblock_nr_pages;
if (zone->nr_reserved_highatomic >= max_managed)
return;

@@ -4662,7 +4662,7 @@ static unsigned long nr_free_zone_pages(int offset)
struct zonelist *zonelist = node_zonelist(numa_node_id(), GFP_KERNEL);

for_each_zone_zonelist(zone, z, zonelist, offset) {
- unsigned long size = zone->managed_pages;
+ unsigned long size = zone_managed_pages(zone);
unsigned long high = high_wmark_pages(zone);
if (size > high)
sum += size - high;
@@ -4769,7 +4769,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
pg_data_t *pgdat = NODE_DATA(nid);

for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++)
- managed_pages += pgdat->node_zones[zone_type].managed_pages;
+ managed_pages += zone_managed_pages(&pgdat->node_zones[zone_type]);
val->totalram = managed_pages;
val->sharedram = node_page_state(pgdat, NR_SHMEM);
val->freeram = sum_zone_node_page_state(nid, NR_FREE_PAGES);
@@ -4778,7 +4778,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
struct zone *zone = &pgdat->node_zones[zone_type];

if (is_highmem(zone)) {
- managed_highpages += zone->managed_pages;
+ managed_highpages += zone_managed_pages(zone);
free_highpages += zone_page_state(zone, NR_FREE_PAGES);
}
}
@@ -4985,7 +4985,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
K(zone_page_state(zone, NR_ZONE_UNEVICTABLE)),
K(zone_page_state(zone, NR_ZONE_WRITE_PENDING)),
K(zone->present_pages),
- K(zone->managed_pages),
+ K(zone_managed_pages(zone)),
K(zone_page_state(zone, NR_MLOCK)),
zone_page_state(zone, NR_KERNEL_STACK_KB),
K(zone_page_state(zone, NR_PAGETABLE)),
@@ -5645,7 +5645,7 @@ static int zone_batchsize(struct zone *zone)
* The per-cpu-pages pools are set to around 1000th of the
* size of the zone.
*/
- batch = zone->managed_pages / 1024;
+ batch = zone_managed_pages(zone) / 1024;
/* But no more than a meg. */
if (batch * PAGE_SIZE > 1024 * 1024)
batch = (1024 * 1024) / PAGE_SIZE;
@@ -5756,7 +5756,7 @@ static void pageset_set_high_and_batch(struct zone *zone,
{
if (percpu_pagelist_fraction)
pageset_set_high(pcp,
- (zone->managed_pages /
+ (zone_managed_pages(zone) /
percpu_pagelist_fraction));
else
pageset_set_batch(pcp, zone_batchsize(zone));
@@ -6311,7 +6311,7 @@ static void __meminit pgdat_init_internals(struct pglist_data *pgdat)
static void __meminit zone_init_internals(struct zone *zone, enum zone_type idx, int nid,
unsigned long remaining_pages)
{
- zone->managed_pages = remaining_pages;
+ atomic_long_set(&zone->managed_pages, remaining_pages);
zone_set_nid(zone, nid);
zone->name = zone_names[idx];
zone->zone_pgdat = NODE_DATA(nid);
@@ -7064,7 +7064,7 @@ static int __init cmdline_parse_movablecore(char *p)
void adjust_managed_page_count(struct page *page, long count)
{
spin_lock(&managed_page_count_lock);
- page_zone(page)->managed_pages += count;
+ atomic_long_add(count, &page_zone(page)->managed_pages);
totalram_pages += count;
#ifdef CONFIG_HIGHMEM
if (PageHighMem(page))
@@ -7112,7 +7112,7 @@ void free_highmem_page(struct page *page)
{
__free_reserved_page(page);
totalram_pages++;
- page_zone(page)->managed_pages++;
+ atomic_long_inc(&page_zone(page)->managed_pages);
totalhigh_pages++;
}
#endif
@@ -7245,7 +7245,7 @@ static void calculate_totalreserve_pages(void)
for (i = 0; i < MAX_NR_ZONES; i++) {
struct zone *zone = pgdat->node_zones + i;
long max = 0;
- unsigned long managed_pages = zone->managed_pages;
+ unsigned long managed_pages = zone_managed_pages(zone);

/* Find valid and maximum lowmem_reserve in the zone */
for (j = i; j < MAX_NR_ZONES; j++) {
@@ -7281,7 +7281,7 @@ static void setup_per_zone_lowmem_reserve(void)
for_each_online_pgdat(pgdat) {
for (j = 0; j < MAX_NR_ZONES; j++) {
struct zone *zone = pgdat->node_zones + j;
- unsigned long managed_pages = zone->managed_pages;
+ unsigned long managed_pages = zone_managed_pages(zone);

zone->lowmem_reserve[j] = 0;

@@ -7299,7 +7299,7 @@ static void setup_per_zone_lowmem_reserve(void)
lower_zone->lowmem_reserve[j] =
managed_pages / sysctl_lowmem_reserve_ratio[idx];
}
- managed_pages += lower_zone->managed_pages;
+ managed_pages += zone_managed_pages(lower_zone);
}
}
}
@@ -7318,14 +7318,14 @@ static void __setup_per_zone_wmarks(void)
/* Calculate total number of !ZONE_HIGHMEM pages */
for_each_zone(zone) {
if (!is_highmem(zone))
- lowmem_pages += zone->managed_pages;
+ lowmem_pages += zone_managed_pages(zone);
}

for_each_zone(zone) {
u64 tmp;

spin_lock_irqsave(&zone->lock, flags);
- tmp = (u64)pages_min * zone->managed_pages;
+ tmp = (u64)pages_min * zone_managed_pages(zone);
do_div(tmp, lowmem_pages);
if (is_highmem(zone)) {
/*
@@ -7339,7 +7339,7 @@ static void __setup_per_zone_wmarks(void)
*/
unsigned long min_pages;

- min_pages = zone->managed_pages / 1024;
+ min_pages = zone_managed_pages(zone) / 1024;
min_pages = clamp(min_pages, SWAP_CLUSTER_MAX, 128UL);
zone->watermark[WMARK_MIN] = min_pages;
} else {
@@ -7356,7 +7356,7 @@ static void __setup_per_zone_wmarks(void)
* ensure a minimum size on small systems.
*/
tmp = max_t(u64, tmp >> 2,
- mult_frac(zone->managed_pages,
+ mult_frac(zone_managed_pages(zone),
watermark_scale_factor, 10000));

zone->watermark[WMARK_LOW] = min_wmark_pages(zone) + tmp;
@@ -7486,8 +7486,8 @@ static void setup_min_unmapped_ratio(void)
pgdat->min_unmapped_pages = 0;

for_each_zone(zone)
- zone->zone_pgdat->min_unmapped_pages += (zone->managed_pages *
- sysctl_min_unmapped_ratio) / 100;
+ zone->zone_pgdat->min_unmapped_pages += (zone_managed_pages(zone) *
+ sysctl_min_unmapped_ratio) / 100;
}


@@ -7514,8 +7514,8 @@ static void setup_min_slab_ratio(void)
pgdat->min_slab_pages = 0;

for_each_zone(zone)
- zone->zone_pgdat->min_slab_pages += (zone->managed_pages *
- sysctl_min_slab_ratio) / 100;
+ zone->zone_pgdat->min_slab_pages += (zone_managed_pages(zone) *
+ sysctl_min_slab_ratio) / 100;
}

int sysctl_min_slab_ratio_sysctl_handler(struct ctl_table *table, int write,
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 6038ce5..9fee037 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -227,7 +227,7 @@ int calculate_normal_threshold(struct zone *zone)
* 125 1024 10 16-32 GB 9
*/

- mem = zone->managed_pages >> (27 - PAGE_SHIFT);
+ mem = zone_managed_pages(zone) >> (27 - PAGE_SHIFT);

threshold = 2 * fls(num_online_cpus()) * (1 + fls(mem));

@@ -1569,7 +1569,7 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
high_wmark_pages(zone),
zone->spanned_pages,
zone->present_pages,
- zone->managed_pages);
+ zone_managed_pages(zone));

seq_printf(m,
"\n protection: (%ld",
--
1.9.1


2018-11-06 16:30:39

by Arun KS

Subject: [PATCH v2 3/4] mm: convert totalram_pages and totalhigh_pages variables to atomic

totalram_pages and totalhigh_pages are converted to atomic variables
and are now accessed through static inline functions.
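
A short sketch of how call sites look after the conversion (the helpers
are the ones added in include/linux/mm.h; the surrounding context is
hypothetical):

	/* read side: totalram_pages() wraps atomic_long_read(&_totalram_pages) */
	if (nr_pages > totalram_pages() / 2)
		return -ENOMEM;

	/* write side: totalram_pages_add() wraps atomic_long_add() */
	totalram_pages_add(nr_pages);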

Suggested-by: Michal Hocko <[email protected]>
Suggested-by: Vlastimil Babka <[email protected]>
Signed-off-by: Arun KS <[email protected]>
Reviewed-by: Konstantin Khlebnikov <[email protected]>
Acked-by: Michal Hocko <[email protected]>

---
The following coccinelle script was used to make most of the changes,

@@
declarer name EXPORT_SYMBOL;
symbol totalram_pages;
expression e;
@@
(
EXPORT_SYMBOL(totalram_pages);
|
- totalram_pages = e
+ totalram_pages_set(e)
|
- totalram_pages += e
+ totalram_pages_add(e)
|
- totalram_pages++
+ totalram_pages_inc()
|
- totalram_pages--
+ totalram_pages_dec()
|
- totalram_pages
+ totalram_pages()
)

@@
symbol totalhigh_pages;
expression e;
@@
(
EXPORT_SYMBOL(totalhigh_pages);
|
- totalhigh_pages = e
+ totalhigh_pages_set(e)
|
- totalhigh_pages += e
+ totalhigh_pages_add(e)
|
- totalhigh_pages++
+ totalhigh_pages_inc()
|
- totalhigh_pages--
+ totalhigh_pages_dec()
|
- totalhigh_pages
+ totalhigh_pages()
)

Manually apply all changes in the following files,

include/linux/highmem.h
include/linux/mm.h
include/linux/swap.h
mm/highmem.c

and for mm/page_alloc.c manually apply only the changes below,

#include <linux/stddef.h>
#include <linux/mm.h>
+#include <linux/highmem.h>
#include <linux/swap.h>
#include <linux/interrupt.h>
#include <linux/pagemap.h>

/* Protect totalram_pages and zone->managed_pages */
static DEFINE_SPINLOCK(managed_page_count_lock);

-unsigned long totalram_pages __read_mostly;
+atomic_long_t _totalram_pages __read_mostly;
unsigned long totalreserve_pages __read_mostly;
unsigned long totalcma_pages __read_mostly;
---
---
arch/csky/mm/init.c | 4 ++--
arch/powerpc/platforms/pseries/cmm.c | 10 +++++-----
arch/s390/mm/init.c | 2 +-
arch/um/kernel/mem.c | 2 +-
arch/x86/kernel/cpu/microcode/core.c | 2 +-
drivers/char/agp/backend.c | 4 ++--
drivers/gpu/drm/i915/i915_gem.c | 2 +-
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 4 ++--
drivers/hv/hv_balloon.c | 2 +-
drivers/md/dm-bufio.c | 2 +-
drivers/md/dm-crypt.c | 2 +-
drivers/md/dm-integrity.c | 2 +-
drivers/md/dm-stats.c | 2 +-
drivers/media/platform/mtk-vpu/mtk_vpu.c | 2 +-
drivers/misc/vmw_balloon.c | 2 +-
drivers/parisc/ccio-dma.c | 4 ++--
drivers/parisc/sba_iommu.c | 4 ++--
drivers/staging/android/ion/ion_system_heap.c | 2 +-
drivers/xen/xen-selfballoon.c | 6 +++---
fs/ceph/super.h | 2 +-
fs/file_table.c | 2 +-
fs/fuse/inode.c | 2 +-
fs/nfs/write.c | 2 +-
fs/nfsd/nfscache.c | 2 +-
fs/ntfs/malloc.h | 2 +-
fs/proc/base.c | 2 +-
include/linux/highmem.h | 28 +++++++++++++++++++++++++--
include/linux/mm.h | 27 +++++++++++++++++++++++++-
include/linux/swap.h | 1 -
kernel/fork.c | 2 +-
kernel/kexec_core.c | 2 +-
kernel/power/snapshot.c | 2 +-
mm/highmem.c | 4 +---
mm/huge_memory.c | 2 +-
mm/kasan/quarantine.c | 2 +-
mm/memblock.c | 4 ++--
mm/mm_init.c | 2 +-
mm/oom_kill.c | 2 +-
mm/page_alloc.c | 19 +++++++++---------
mm/shmem.c | 8 ++++----
mm/slab.c | 2 +-
mm/swap.c | 2 +-
mm/util.c | 2 +-
mm/vmalloc.c | 4 ++--
mm/workingset.c | 2 +-
mm/zswap.c | 4 ++--
net/dccp/proto.c | 2 +-
net/decnet/dn_route.c | 2 +-
net/ipv4/tcp_metrics.c | 2 +-
net/netfilter/nf_conntrack_core.c | 2 +-
net/netfilter/xt_hashlimit.c | 2 +-
security/integrity/ima/ima_kexec.c | 2 +-
52 files changed, 127 insertions(+), 80 deletions(-)

diff --git a/arch/csky/mm/init.c b/arch/csky/mm/init.c
index dc07c07..66e5970 100644
--- a/arch/csky/mm/init.c
+++ b/arch/csky/mm/init.c
@@ -71,7 +71,7 @@ void free_initrd_mem(unsigned long start, unsigned long end)
ClearPageReserved(virt_to_page(start));
init_page_count(virt_to_page(start));
free_page(start);
- totalram_pages++;
+ totalram_pages_inc();
}
}
#endif
@@ -88,7 +88,7 @@ void free_initmem(void)
ClearPageReserved(virt_to_page(addr));
init_page_count(virt_to_page(addr));
free_page(addr);
- totalram_pages++;
+ totalram_pages_inc();
addr += PAGE_SIZE;
}

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 25427a4..e8d63a6 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -208,7 +208,7 @@ static long cmm_alloc_pages(long nr)

pa->page[pa->index++] = addr;
loaned_pages++;
- totalram_pages--;
+ totalram_pages_dec();
spin_unlock(&cmm_lock);
nr--;
}
@@ -247,7 +247,7 @@ static long cmm_free_pages(long nr)
free_page(addr);
loaned_pages--;
nr--;
- totalram_pages++;
+ totalram_pages_inc();
}
spin_unlock(&cmm_lock);
cmm_dbg("End request with %ld pages unfulfilled\n", nr);
@@ -291,7 +291,7 @@ static void cmm_get_mpp(void)
int rc;
struct hvcall_mpp_data mpp_data;
signed long active_pages_target, page_loan_request, target;
- signed long total_pages = totalram_pages + loaned_pages;
+ signed long total_pages = totalram_pages() + loaned_pages;
signed long min_mem_pages = (min_mem_mb * 1024 * 1024) / PAGE_SIZE;

rc = h_get_mpp(&mpp_data);
@@ -322,7 +322,7 @@ static void cmm_get_mpp(void)

cmm_dbg("delta = %ld, loaned = %lu, target = %lu, oom = %lu, totalram = %lu\n",
page_loan_request, loaned_pages, loaned_pages_target,
- oom_freed_pages, totalram_pages);
+ oom_freed_pages, totalram_pages());
}

static struct notifier_block cmm_oom_nb = {
@@ -581,7 +581,7 @@ static int cmm_mem_going_offline(void *arg)
free_page(pa_curr->page[idx]);
freed++;
loaned_pages--;
- totalram_pages++;
+ totalram_pages_inc();
pa_curr->page[idx] = pa_last->page[--pa_last->index];
if (pa_last->index == 0) {
if (pa_curr == pa_last)
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 76d0708..5038819 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -59,7 +59,7 @@ static void __init setup_zero_pages(void)
order = 7;

/* Limit number of empty zero pages for small memory sizes */
- while (order > 2 && (totalram_pages >> 10) < (1UL << order))
+ while (order > 2 && (totalram_pages() >> 10) < (1UL << order))
order--;

empty_zero_page = __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
index 134d3fd..64b62a8 100644
--- a/arch/um/kernel/mem.c
+++ b/arch/um/kernel/mem.c
@@ -51,7 +51,7 @@ void __init mem_init(void)

/* this will put all low memory onto the freelists */
memblock_free_all();
- max_pfn = max_low_pfn = totalram_pages;
+ max_pfn = max_low_pfn = totalram_pages();
mem_init_print_info(NULL);
kmalloc_ok = 1;
}
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 99c67ca..8594641 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -434,7 +434,7 @@ static ssize_t microcode_write(struct file *file, const char __user *buf,
size_t len, loff_t *ppos)
{
ssize_t ret = -EINVAL;
- unsigned long totalram_pgs = totalram_pages;
+ unsigned long totalram_pgs = totalram_pages();

if ((len >> PAGE_SHIFT) > totalram_pgs) {
pr_err("too much data (max %ld pages)\n", totalram_pgs);
diff --git a/drivers/char/agp/backend.c b/drivers/char/agp/backend.c
index 38ffb28..004a3ce 100644
--- a/drivers/char/agp/backend.c
+++ b/drivers/char/agp/backend.c
@@ -115,9 +115,9 @@ static int agp_find_max(void)
long memory, index, result;

#if PAGE_SHIFT < 20
- memory = totalram_pages >> (20 - PAGE_SHIFT);
+ memory = totalram_pages() >> (20 - PAGE_SHIFT);
#else
- memory = totalram_pages << (PAGE_SHIFT - 20);
+ memory = totalram_pages() << (PAGE_SHIFT - 20);
#endif
index = 1;

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0c8aa57..6ed0e75 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2539,7 +2539,7 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
* If there's no chance of allocating enough pages for the whole
* object, bail early.
*/
- if (page_count > totalram_pages)
+ if (page_count > totalram_pages())
return -ENOMEM;

st = kmalloc(sizeof(*st), GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 8e2e269..91a8fa4 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -170,7 +170,7 @@ static int igt_ppgtt_alloc(void *arg)
* This should ensure that we do not run into the oomkiller during
* the test and take down the machine wilfully.
*/
- limit = totalram_pages << PAGE_SHIFT;
+ limit = totalram_pages() << PAGE_SHIFT;
limit = min(ppgtt->vm.total, limit);

/* Check we can allocate the entire range */
@@ -1244,7 +1244,7 @@ static int exercise_mock(struct drm_i915_private *i915,
u64 hole_start, u64 hole_end,
unsigned long end_time))
{
- const u64 limit = totalram_pages << PAGE_SHIFT;
+ const u64 limit = totalram_pages() << PAGE_SHIFT;
struct i915_gem_context *ctx;
struct i915_hw_ppgtt *ppgtt;
IGT_TIMEOUT(end_time);
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index cac4945..99bd058 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -1090,7 +1090,7 @@ static void process_info(struct hv_dynmem_device *dm, struct dm_info_msg *msg)
static unsigned long compute_balloon_floor(void)
{
unsigned long min_pages;
- unsigned long totalram_pgs = totalram_pages;
+ unsigned long totalram_pgs = totalram_pages();
#define MB2PAGES(mb) ((mb) << (20 - PAGE_SHIFT))
/* Simple continuous piecewiese linear function:
* max MiB -> min MiB gradient
diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index dc385b7..8b0b628 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1887,7 +1887,7 @@ static int __init dm_bufio_init(void)
dm_bufio_allocated_vmalloc = 0;
dm_bufio_current_allocated = 0;

- mem = (__u64)mult_frac(totalram_pages - totalhigh_pages,
+ mem = (__u64)mult_frac(totalram_pages() - totalhigh_pages(),
DM_BUFIO_MEMORY_PERCENT, 100) << PAGE_SHIFT;

if (mem > ULONG_MAX)
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index b8eec51..f3f2ac0 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -2158,7 +2158,7 @@ static int crypt_wipe_key(struct crypt_config *cc)

static void crypt_calculate_pages_per_client(void)
{
- unsigned long pages = (totalram_pages - totalhigh_pages) * DM_CRYPT_MEMORY_PERCENT / 100;
+ unsigned long pages = (totalram_pages() - totalhigh_pages()) * DM_CRYPT_MEMORY_PERCENT / 100;

if (!dm_crypt_clients_n)
return;
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index bb3096b..c12fa01 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -2843,7 +2843,7 @@ static int create_journal(struct dm_integrity_c *ic, char **error)
journal_pages = roundup((__u64)ic->journal_sections * ic->journal_section_sectors,
PAGE_SIZE >> SECTOR_SHIFT) >> (PAGE_SHIFT - SECTOR_SHIFT);
journal_desc_size = journal_pages * sizeof(struct page_list);
- if (journal_pages >= totalram_pages - totalhigh_pages || journal_desc_size > ULONG_MAX) {
+ if (journal_pages >= totalram_pages() - totalhigh_pages() || journal_desc_size > ULONG_MAX) {
*error = "Journal doesn't fit into memory";
r = -ENOMEM;
goto bad;
diff --git a/drivers/md/dm-stats.c b/drivers/md/dm-stats.c
index 21de30b..45b92a3 100644
--- a/drivers/md/dm-stats.c
+++ b/drivers/md/dm-stats.c
@@ -85,7 +85,7 @@ static bool __check_shared_memory(size_t alloc_size)
a = shared_memory_amount + alloc_size;
if (a < shared_memory_amount)
return false;
- if (a >> PAGE_SHIFT > totalram_pages / DM_STATS_MEMORY_FACTOR)
+ if (a >> PAGE_SHIFT > totalram_pages() / DM_STATS_MEMORY_FACTOR)
return false;
#ifdef CONFIG_MMU
if (a > (VMALLOC_END - VMALLOC_START) / DM_STATS_VMALLOC_FACTOR)
diff --git a/drivers/media/platform/mtk-vpu/mtk_vpu.c b/drivers/media/platform/mtk-vpu/mtk_vpu.c
index 616f78b..b660249 100644
--- a/drivers/media/platform/mtk-vpu/mtk_vpu.c
+++ b/drivers/media/platform/mtk-vpu/mtk_vpu.c
@@ -855,7 +855,7 @@ static int mtk_vpu_probe(struct platform_device *pdev)
/* Set PTCM to 96K and DTCM to 32K */
vpu_cfg_writel(vpu, 0x2, VPU_TCM_CFG);

- vpu->enable_4GB = !!(totalram_pages > (SZ_2G >> PAGE_SHIFT));
+ vpu->enable_4GB = !!(totalram_pages() > (SZ_2G >> PAGE_SHIFT));
dev_info(dev, "4GB mode %u\n", vpu->enable_4GB);

if (vpu->enable_4GB) {
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 9b0b3fa..e6126a4 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -570,7 +570,7 @@ static int vmballoon_send_get_target(struct vmballoon *b)
unsigned long status;
unsigned long limit;

- limit = totalram_pages;
+ limit = totalram_pages();

/* Ensure limit fits in 32-bits */
if (limit != (u32)limit)
diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
index 701a7d6..358e380 100644
--- a/drivers/parisc/ccio-dma.c
+++ b/drivers/parisc/ccio-dma.c
@@ -1251,7 +1251,7 @@ void __init ccio_cujo20_fixup(struct parisc_device *cujo, u32 iovp)
** Hot-Plug/Removal of PCI cards. (aka PCI OLARD).
*/

- iova_space_size = (u32) (totalram_pages / count_parisc_driver(&ccio_driver));
+ iova_space_size = (u32) (totalram_pages() / count_parisc_driver(&ccio_driver));

/* limit IOVA space size to 1MB-1GB */

@@ -1290,7 +1290,7 @@ void __init ccio_cujo20_fixup(struct parisc_device *cujo, u32 iovp)

DBG_INIT("%s() hpa 0x%p mem %luMB IOV %dMB (%d bits)\n",
__func__, ioc->ioc_regs,
- (unsigned long) totalram_pages >> (20 - PAGE_SHIFT),
+ (unsigned long) totalram_pages() >> (20 - PAGE_SHIFT),
iova_space_size>>20,
iov_order + PAGE_SHIFT);

diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c
index c1e599a..e065594 100644
--- a/drivers/parisc/sba_iommu.c
+++ b/drivers/parisc/sba_iommu.c
@@ -1414,7 +1414,7 @@ static int setup_ibase_imask_callback(struct device *dev, void *data)
** for DMA hints - ergo only 30 bits max.
*/

- iova_space_size = (u32) (totalram_pages/global_ioc_cnt);
+ iova_space_size = (u32) (totalram_pages()/global_ioc_cnt);

/* limit IOVA space size to 1MB-1GB */
if (iova_space_size < (1 << (20 - PAGE_SHIFT))) {
@@ -1439,7 +1439,7 @@ static int setup_ibase_imask_callback(struct device *dev, void *data)
DBG_INIT("%s() hpa 0x%lx mem %ldMB IOV %dMB (%d bits)\n",
__func__,
ioc->ioc_hpa,
- (unsigned long) totalram_pages >> (20 - PAGE_SHIFT),
+ (unsigned long) totalram_pages() >> (20 - PAGE_SHIFT),
iova_space_size>>20,
iov_order + PAGE_SHIFT);

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 548bb02..6cb0eeb 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -110,7 +110,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
unsigned long size_remaining = PAGE_ALIGN(size);
unsigned int max_order = orders[0];

- if (size / PAGE_SIZE > totalram_pages / 2)
+ if (size / PAGE_SIZE > totalram_pages() / 2)
return -ENOMEM;

INIT_LIST_HEAD(&pages);
diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index 5165aa8..246f612 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -189,7 +189,7 @@ static void selfballoon_process(struct work_struct *work)
bool reset_timer = false;

if (xen_selfballooning_enabled) {
- cur_pages = totalram_pages;
+ cur_pages = totalram_pages();
tgt_pages = cur_pages; /* default is no change */
goal_pages = vm_memory_committed() +
totalreserve_pages +
@@ -227,7 +227,7 @@ static void selfballoon_process(struct work_struct *work)
if (tgt_pages < floor_pages)
tgt_pages = floor_pages;
balloon_set_new_target(tgt_pages +
- balloon_stats.current_pages - totalram_pages);
+ balloon_stats.current_pages - totalram_pages());
reset_timer = true;
}
#ifdef CONFIG_FRONTSWAP
@@ -569,7 +569,7 @@ int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
* much more reliably and response faster in some cases.
*/
if (!selfballoon_reserved_mb) {
- reserve_pages = totalram_pages / 10;
+ reserve_pages = totalram_pages() / 10;
selfballoon_reserved_mb = PAGES2MB(reserve_pages);
}
schedule_delayed_work(&selfballoon_worker, selfballoon_interval * HZ);
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index c005a54..9a2d861 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -808,7 +808,7 @@ static inline int default_congestion_kb(void)
* This allows larger machines to have larger/more transfers.
* Limit the default to 256M
*/
- congestion_kb = (16*int_sqrt(totalram_pages)) << (PAGE_SHIFT-10);
+ congestion_kb = (16*int_sqrt(totalram_pages())) << (PAGE_SHIFT-10);
if (congestion_kb > 256*1024)
congestion_kb = 256*1024;

diff --git a/fs/file_table.c b/fs/file_table.c
index 6e3c088..ee1bb23 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -380,7 +380,7 @@ void __init files_init(void)
void __init files_maxfiles_init(void)
{
unsigned long n;
- unsigned long totalram_pgs = totalram_pages;
+ unsigned long totalram_pgs = totalram_pages();
unsigned long memreserve = (totalram_pgs - nr_free_pages()) * 3/2;

memreserve = min(memreserve, totalram_pgs - 1);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 0b94b23..2121e71 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -824,7 +824,7 @@ static struct dentry *fuse_get_parent(struct dentry *child)
static void sanitize_global_limit(unsigned *limit)
{
if (*limit == 0)
- *limit = ((totalram_pages << PAGE_SHIFT) >> 13) /
+ *limit = ((totalram_pages() << PAGE_SHIFT) >> 13) /
sizeof(struct fuse_req);

if (*limit >= 1 << 16)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 586726a..4f15665 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -2121,7 +2121,7 @@ int __init nfs_init_writepagecache(void)
* This allows larger machines to have larger/more transfers.
* Limit the default to 256M
*/
- nfs_congestion_kb = (16*int_sqrt(totalram_pages)) << (PAGE_SHIFT-10);
+ nfs_congestion_kb = (16*int_sqrt(totalram_pages())) << (PAGE_SHIFT-10);
if (nfs_congestion_kb > 256*1024)
nfs_congestion_kb = 256*1024;

diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index e2fe0e9..da52b59 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -99,7 +99,7 @@ static unsigned long nfsd_reply_cache_scan(struct shrinker *shrink,
nfsd_cache_size_limit(void)
{
unsigned int limit;
- unsigned long low_pages = totalram_pages - totalhigh_pages;
+ unsigned long low_pages = totalram_pages() - totalhigh_pages();

limit = (16 * int_sqrt(low_pages)) << (PAGE_SHIFT-10);
return min_t(unsigned int, limit, 256*1024);
diff --git a/fs/ntfs/malloc.h b/fs/ntfs/malloc.h
index ab172e5..5becc8a 100644
--- a/fs/ntfs/malloc.h
+++ b/fs/ntfs/malloc.h
@@ -47,7 +47,7 @@ static inline void *__ntfs_malloc(unsigned long size, gfp_t gfp_mask)
return kmalloc(PAGE_SIZE, gfp_mask & ~__GFP_HIGHMEM);
/* return (void *)__get_free_page(gfp_mask); */
}
- if (likely((size >> PAGE_SHIFT) < totalram_pages))
+ if (likely((size >> PAGE_SHIFT) < totalram_pages()))
return __vmalloc(size, gfp_mask, PAGE_KERNEL);
return NULL;
}
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ce34654..d7fd1ca 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -530,7 +530,7 @@ static ssize_t lstats_write(struct file *file, const char __user *buf,
static int proc_oom_score(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
- unsigned long totalpages = totalram_pages + total_swap_pages;
+ unsigned long totalpages = totalram_pages() + total_swap_pages;
unsigned long points = 0;

points = oom_badness(task, NULL, NULL, totalpages) *
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 0690679..cea3a01 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -36,7 +36,31 @@ static inline void invalidate_kernel_vmap_range(void *vaddr, int size)

/* declarations for linux/mm/highmem.c */
unsigned int nr_free_highpages(void);
-extern unsigned long totalhigh_pages;
+extern atomic_long_t _totalhigh_pages;
+static inline unsigned long totalhigh_pages(void)
+{
+ return (unsigned long)atomic_long_read(&_totalhigh_pages);
+}
+
+static inline void totalhigh_pages_inc(void)
+{
+ atomic_long_inc(&_totalhigh_pages);
+}
+
+static inline void totalhigh_pages_dec(void)
+{
+ atomic_long_dec(&_totalhigh_pages);
+}
+
+static inline void totalhigh_pages_add(long count)
+{
+ atomic_long_add(count, &_totalhigh_pages);
+}
+
+static inline void totalhigh_pages_set(long val)
+{
+ atomic_long_set(&_totalhigh_pages, val);
+}

void kmap_flush_unused(void);

@@ -51,7 +75,7 @@ static inline struct page *kmap_to_page(void *addr)
return virt_to_page(addr);
}

-#define totalhigh_pages 0UL
+static inline unsigned long totalhigh_pages(void) { return 0UL; }

#ifndef ARCH_HAS_KMAP
static inline void *kmap(struct page *page)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index fcf9cc9..d2c1646 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -48,7 +48,32 @@ static inline void set_max_mapnr(unsigned long limit)
static inline void set_max_mapnr(unsigned long limit) { }
#endif

-extern unsigned long totalram_pages;
+extern atomic_long_t _totalram_pages;
+static inline unsigned long totalram_pages(void)
+{
+ return (unsigned long)atomic_long_read(&_totalram_pages);
+}
+
+static inline void totalram_pages_inc(void)
+{
+ atomic_long_inc(&_totalram_pages);
+}
+
+static inline void totalram_pages_dec(void)
+{
+ atomic_long_dec(&_totalram_pages);
+}
+
+static inline void totalram_pages_add(long count)
+{
+ atomic_long_add(count, &_totalram_pages);
+}
+
+static inline void totalram_pages_set(long val)
+{
+ atomic_long_set(&_totalram_pages, val);
+}
+
extern void * high_memory;
extern int page_cluster;

diff --git a/include/linux/swap.h b/include/linux/swap.h
index d8a07a4..ea66108 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -308,7 +308,6 @@ struct vma_swap_readahead {
} while (0)

/* linux/mm/page_alloc.c */
-extern unsigned long totalram_pages;
extern unsigned long totalreserve_pages;
extern unsigned long nr_free_buffer_pages(void);
extern unsigned long nr_free_pagecache_pages(void);
diff --git a/kernel/fork.c b/kernel/fork.c
index 7823f31..ba2c517 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -739,7 +739,7 @@ void __init __weak arch_task_cache_init(void) { }
static void set_max_threads(unsigned int max_threads_suggested)
{
u64 threads;
- unsigned long totalram_pgs = totalram_pages;
+ unsigned long totalram_pgs = totalram_pages();

/*
* The number of threads shall be limited such that the thread
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index dff217c..7c50f56 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -152,7 +152,7 @@ int sanity_check_segment_list(struct kimage *image)
int i;
unsigned long nr_segments = image->nr_segments;
unsigned long total_pages = 0;
- unsigned long totalram_pgs = totalram_pages;
+ unsigned long totalram_pgs = totalram_pages();

/*
* Verify we have good destination addresses. The caller is
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index b0308a2..640b203 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -105,7 +105,7 @@ void __init hibernate_reserved_size_init(void)

void __init hibernate_image_size_init(void)
{
- image_size = ((totalram_pages * 2) / 5) * PAGE_SIZE;
+ image_size = ((totalram_pages() * 2) / 5) * PAGE_SIZE;
}

/*
diff --git a/mm/highmem.c b/mm/highmem.c
index 59db322..02a9a4b 100644
--- a/mm/highmem.c
+++ b/mm/highmem.c
@@ -105,9 +105,7 @@ static inline wait_queue_head_t *get_pkmap_wait_queue_head(unsigned int color)
}
#endif

-unsigned long totalhigh_pages __read_mostly;
-EXPORT_SYMBOL(totalhigh_pages);
-
+atomic_long_t _totalhigh_pages __read_mostly;

EXPORT_PER_CPU_SYMBOL(__kmap_atomic_idx);

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 55478ab..6e88f72 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -420,7 +420,7 @@ static int __init hugepage_init(void)
* where the extra memory used could hurt more than TLB overhead
* is likely to save. The admin can still enable it through /sys.
*/
- if (totalram_pages < (512 << (20 - PAGE_SHIFT))) {
+ if (totalram_pages() < (512 << (20 - PAGE_SHIFT))) {
transparent_hugepage_flags = 0;
return 0;
}
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index b209dba..5be4639 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -236,7 +236,7 @@ void quarantine_reduce(void)
* Update quarantine size in case of hotplug. Allocate a fraction of
* the installed memory to quarantine minus per-cpu queue limits.
*/
- total_size = (READ_ONCE(totalram_pages) << PAGE_SHIFT) /
+ total_size = (READ_ONCE(totalram_pages()) << PAGE_SHIFT) /
QUARANTINE_FRACTION;
percpu_quarantines = QUARANTINE_PERCPU_SIZE * num_online_cpus();
new_quarantine_size = (total_size < percpu_quarantines) ?
diff --git a/mm/memblock.c b/mm/memblock.c
index bbd82ab..2aa1598 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1576,7 +1576,7 @@ void __init __memblock_free_late(phys_addr_t base, phys_addr_t size)

for (; cursor < end; cursor++) {
memblock_free_pages(pfn_to_page(cursor), cursor, 0);
- totalram_pages++;
+ totalram_pages_inc();
}
}

@@ -1978,7 +1978,7 @@ unsigned long __init memblock_free_all(void)
reset_all_zones_managed_pages();

pages = free_low_memory_core_early();
- totalram_pages += pages;
+ totalram_pages_add(pages);

return pages;
}
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 6838a53..3391710 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -146,7 +146,7 @@ static void __meminit mm_compute_batch(void)
s32 batch = max_t(s32, nr*2, 32);

/* batch size set to 0.4% of (total memory/#cpus), or max int32 */
- memsized_batch = min_t(u64, (totalram_pages/nr)/256, 0x7fffffff);
+ memsized_batch = min_t(u64, (totalram_pages()/nr)/256, 0x7fffffff);

vm_committed_as_batch = max_t(s32, memsized_batch, batch);
}
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 6589f60..21d4877 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -269,7 +269,7 @@ static enum oom_constraint constrained_alloc(struct oom_control *oc)
}

/* Default to all available memory */
- oc->totalpages = totalram_pages + total_swap_pages;
+ oc->totalpages = totalram_pages() + total_swap_pages;

if (!IS_ENABLED(CONFIG_NUMA))
return CONSTRAINT_NONE;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 22e6645..2a42c3f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -16,6 +16,7 @@

#include <linux/stddef.h>
#include <linux/mm.h>
+#include <linux/highmem.h>
#include <linux/swap.h>
#include <linux/interrupt.h>
#include <linux/pagemap.h>
@@ -124,7 +125,7 @@
/* Protect totalram_pages and zone->managed_pages */
static DEFINE_SPINLOCK(managed_page_count_lock);

-unsigned long totalram_pages __read_mostly;
+atomic_long_t _totalram_pages __read_mostly;
unsigned long totalreserve_pages __read_mostly;
unsigned long totalcma_pages __read_mostly;

@@ -4748,11 +4749,11 @@ long si_mem_available(void)

void si_meminfo(struct sysinfo *val)
{
- val->totalram = totalram_pages;
+ val->totalram = totalram_pages();
val->sharedram = global_node_page_state(NR_SHMEM);
val->freeram = global_zone_page_state(NR_FREE_PAGES);
val->bufferram = nr_blockdev_pages();
- val->totalhigh = totalhigh_pages;
+ val->totalhigh = totalhigh_pages();
val->freehigh = nr_free_highpages();
val->mem_unit = PAGE_SIZE;
}
@@ -7065,10 +7066,10 @@ void adjust_managed_page_count(struct page *page, long count)
{
spin_lock(&managed_page_count_lock);
atomic_long_add(count, &page_zone(page)->managed_pages);
- totalram_pages += count;
+ totalram_pages_add(count);
#ifdef CONFIG_HIGHMEM
if (PageHighMem(page))
- totalhigh_pages += count;
+ totalhigh_pages_add(count);
#endif
spin_unlock(&managed_page_count_lock);
}
@@ -7111,9 +7112,9 @@ unsigned long free_reserved_area(void *start, void *end, int poison, char *s)
void free_highmem_page(struct page *page)
{
__free_reserved_page(page);
- totalram_pages++;
+ totalram_pages_inc();
atomic_long_inc(&page_zone(page)->managed_pages);
- totalhigh_pages++;
+ totalhigh_pages_inc();
}
#endif

@@ -7162,10 +7163,10 @@ void __init mem_init_print_info(const char *str)
physpages << (PAGE_SHIFT - 10),
codesize >> 10, datasize >> 10, rosize >> 10,
(init_data_size + init_code_size) >> 10, bss_size >> 10,
- (physpages - totalram_pages - totalcma_pages) << (PAGE_SHIFT - 10),
+ (physpages - totalram_pages() - totalcma_pages) << (PAGE_SHIFT - 10),
totalcma_pages << (PAGE_SHIFT - 10),
#ifdef CONFIG_HIGHMEM
- totalhigh_pages << (PAGE_SHIFT - 10),
+ totalhigh_pages() << (PAGE_SHIFT - 10),
#endif
str ? ", " : "", str ? str : "");
}
diff --git a/mm/shmem.c b/mm/shmem.c
index 6b91eab..649a144 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -109,13 +109,13 @@ struct shmem_falloc {
#ifdef CONFIG_TMPFS
static unsigned long shmem_default_max_blocks(void)
{
- return totalram_pages / 2;
+ return totalram_pages() / 2;
}

static unsigned long shmem_default_max_inodes(void)
{
- unsigned long totalram_pgs = totalram_pages;
- return min(totalram_pgs - totalhigh_pages, totalram_pgs / 2);
+ unsigned long totalram_pgs = totalram_pages();
+ return min(totalram_pgs - totalhigh_pages(), totalram_pgs / 2);
}
#endif

@@ -3275,7 +3275,7 @@ static int shmem_parse_options(char *options, struct shmem_sb_info *sbinfo,
size = memparse(value,&rest);
if (*rest == '%') {
size <<= PAGE_SHIFT;
- size *= totalram_pages;
+ size *= totalram_pages();
do_div(size, 100);
rest++;
}
diff --git a/mm/slab.c b/mm/slab.c
index 2a5654b..bc3de2f 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,7 +1248,7 @@ void __init kmem_cache_init(void)
* page orders on machines with more than 32MB of memory if
* not overridden on the command line.
*/
- if (!slab_max_order_set && totalram_pages > (32 << 20) >> PAGE_SHIFT)
+ if (!slab_max_order_set && totalram_pages() > (32 << 20) >> PAGE_SHIFT)
slab_max_order = SLAB_MAX_ORDER_HI;

/* Bootstrap is tricky, because several objects are allocated
diff --git a/mm/swap.c b/mm/swap.c
index aa48371..a87bd4c 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -1023,7 +1023,7 @@ unsigned pagevec_lookup_range_nr_tag(struct pagevec *pvec,
*/
void __init swap_setup(void)
{
- unsigned long megs = totalram_pages >> (20 - PAGE_SHIFT);
+ unsigned long megs = totalram_pages() >> (20 - PAGE_SHIFT);

/* Use a smaller cluster for small-memory machines */
if (megs < 16)
diff --git a/mm/util.c b/mm/util.c
index 8bf08b5..4df23d6 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -593,7 +593,7 @@ unsigned long vm_commit_limit(void)
if (sysctl_overcommit_kbytes)
allowed = sysctl_overcommit_kbytes >> (PAGE_SHIFT - 10);
else
- allowed = ((totalram_pages - hugetlb_total_pages())
+ allowed = ((totalram_pages() - hugetlb_total_pages())
* sysctl_overcommit_ratio / 100);
allowed += total_swap_pages;

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 97d4b25..871e41c 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1634,7 +1634,7 @@ void *vmap(struct page **pages, unsigned int count,

might_sleep();

- if (count > totalram_pages)
+ if (count > totalram_pages())
return NULL;

size = (unsigned long)count << PAGE_SHIFT;
@@ -1739,7 +1739,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
unsigned long real_size = size;

size = PAGE_ALIGN(size);
- if (!size || (size >> PAGE_SHIFT) > totalram_pages)
+ if (!size || (size >> PAGE_SHIFT) > totalram_pages())
goto fail;

area = __get_vm_area_node(size, align, VM_ALLOC | VM_UNINITIALIZED |
diff --git a/mm/workingset.c b/mm/workingset.c
index d46f8c9..dcb994f 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -549,7 +549,7 @@ static int __init workingset_init(void)
* double the initial memory by using totalram_pages as-is.
*/
timestamp_bits = BITS_PER_LONG - EVICTION_SHIFT;
- max_order = fls_long(totalram_pages - 1);
+ max_order = fls_long(totalram_pages() - 1);
if (max_order > timestamp_bits)
bucket_order = max_order - timestamp_bits;
pr_info("workingset: timestamp_bits=%d max_order=%d bucket_order=%u\n",
diff --git a/mm/zswap.c b/mm/zswap.c
index cd91fd9..a4e4d36 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -219,8 +219,8 @@ struct zswap_tree {

static bool zswap_is_full(void)
{
- return totalram_pages * zswap_max_pool_percent / 100 <
- DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
+ return totalram_pages() * zswap_max_pool_percent / 100 <
+ DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
}

static void zswap_update_total_size(void)
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index f27daa1..1b4d39b 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -1131,7 +1131,7 @@ static inline void dccp_mib_exit(void)
static int __init dccp_init(void)
{
unsigned long goal;
- unsigned long totalram_pgs = totalram_pages;
+ unsigned long totalram_pgs = totalram_pages();
int ehash_order, bhash_order, i;
int rc;

diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index 1c002c0..950613e 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -1866,7 +1866,7 @@ void __init dn_route_init(void)
dn_route_timer.expires = jiffies + decnet_dst_gc_interval * HZ;
add_timer(&dn_route_timer);

- goal = totalram_pages >> (26 - PAGE_SHIFT);
+ goal = totalram_pages() >> (26 - PAGE_SHIFT);

for(order = 0; (1UL << order) < goal; order++)
/* NOTHING */;
diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
index 03b51cd..b467a7c 100644
--- a/net/ipv4/tcp_metrics.c
+++ b/net/ipv4/tcp_metrics.c
@@ -1000,7 +1000,7 @@ static int __net_init tcp_net_metrics_init(struct net *net)

slots = tcpmhash_entries;
if (!slots) {
- if (totalram_pages >= 128 * 1024)
+ if (totalram_pages() >= 128 * 1024)
slots = 16 * 1024;
else
slots = 8 * 1024;
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 0b1801e..edc83f2 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -2248,7 +2248,7 @@ static __always_inline unsigned int total_extension_size(void)

int nf_conntrack_init_start(void)
{
- unsigned long totalram_pgs = totalram_pages;
+ unsigned long totalram_pgs = totalram_pages();
int max_factor = 8;
int ret = -ENOMEM;
int i;
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 6cb9a74..2df06c4f 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -274,7 +274,7 @@ static int htable_create(struct net *net, struct hashlimit_cfg3 *cfg,
struct xt_hashlimit_htable *hinfo;
const struct seq_operations *ops;
unsigned int size, i;
- unsigned long totalram_pgs = totalram_pages;
+ unsigned long totalram_pgs = totalram_pages();
int ret;

if (cfg->size) {
diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
index 16bd187..d6f3280 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -106,7 +106,7 @@ void ima_add_kexec_buffer(struct kimage *image)
kexec_segment_size = ALIGN(ima_get_binary_runtime_size() +
PAGE_SIZE / 2, PAGE_SIZE);
if ((kexec_segment_size == ULONG_MAX) ||
- ((kexec_segment_size >> PAGE_SHIFT) > totalram_pages / 2)) {
+ ((kexec_segment_size >> PAGE_SHIFT) > totalram_pages() / 2)) {
pr_err("Binary measurement list too large.\n");
return;
}
--
1.9.1
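
The hunks above rely on the accessor helpers that patch 3/4 introduces in
include/linux/mm.h and include/linux/highmem.h. As a reading aid, a minimal
sketch of their presumable shape (simplified, not the exact patch text; the
underscore-prefixed atomic variables are the ones discussed later in this
thread):

    /* include/linux/mm.h (sketch) */
    extern atomic_long_t _totalram_pages;

    static inline unsigned long totalram_pages(void)
    {
            return (unsigned long)atomic_long_read(&_totalram_pages);
    }

    static inline void totalram_pages_add(long count)
    {
            atomic_long_add(count, &_totalram_pages);
    }

    static inline void totalram_pages_inc(void)
    {
            atomic_long_inc(&_totalram_pages);
    }

include/linux/highmem.h would carry the analogous totalhigh_pages(),
totalhigh_pages_add() and totalhigh_pages_inc() helpers around
_totalhigh_pages.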


2018-11-07 08:21:23

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] mm: Fix multiple evaluvations of totalram_pages and managed_pages

On Tue 06-11-18 21:51:47, Arun KS wrote:
> This patch is in preparation to a later patch which converts totalram_pages
> and zone->managed_pages to atomic variables. This patch does not introduce
> any functional changes.

I forgot to comment on this one. The patch makes a lot of sense. But I
would be a little bit more conservative and wouldn't claim "no functional
changes". As things stand now multiple reads in the same function are
racy (without holding the lock). I do not see any example of an
obviously harmful case but claiming the above is too strong of a
statement. I would simply go with something like "Please note that
re-reading the value might lead to a different value and as such it
could lead to unexpected behavior. There are no known bugs as a result
of the current code but it is better to prevent them in principle."
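
To make the concern concrete, a minimal sketch of the kind of double read
being discussed (hypothetical caller and variables, not code from the
series):

    /* Two independent reads: memory hotplug may change the global in
     * between, so the check and the computation can disagree. */
    if (nr_pages > totalram_pages / 2)      /* first read */
            return -EINVAL;
    limit = totalram_pages / 4;             /* second read, may differ */

    /* Caching the value once, as patch 1/4 does, pins one snapshot: */
    unsigned long totalram_pgs = totalram_pages;
    if (nr_pages > totalram_pgs / 2)
            return -EINVAL;
    limit = totalram_pgs / 4;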

> Signed-off-by: Arun KS <[email protected]>
> Reviewed-by: Konstantin Khlebnikov <[email protected]>

Other than that
Acked-by: Michal Hocko <[email protected]>

> ---
> arch/um/kernel/mem.c | 3 +--
> arch/x86/kernel/cpu/microcode/core.c | 5 +++--
> drivers/hv/hv_balloon.c | 19 ++++++++++---------
> fs/file_table.c | 7 ++++---
> kernel/fork.c | 5 +++--
> kernel/kexec_core.c | 5 +++--
> mm/page_alloc.c | 5 +++--
> mm/shmem.c | 3 ++-
> net/dccp/proto.c | 7 ++++---
> net/netfilter/nf_conntrack_core.c | 7 ++++---
> net/netfilter/xt_hashlimit.c | 5 +++--
> net/sctp/protocol.c | 7 ++++---
> 12 files changed, 44 insertions(+), 34 deletions(-)
>
> diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
> index 1067469..134d3fd 100644
> --- a/arch/um/kernel/mem.c
> +++ b/arch/um/kernel/mem.c
> @@ -51,8 +51,7 @@ void __init mem_init(void)
>
> /* this will put all low memory onto the freelists */
> memblock_free_all();
> - max_low_pfn = totalram_pages;
> - max_pfn = totalram_pages;
> + max_pfn = max_low_pfn = totalram_pages;
> mem_init_print_info(NULL);
> kmalloc_ok = 1;
> }
> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index 2637ff0..99c67ca 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -434,9 +434,10 @@ static ssize_t microcode_write(struct file *file, const char __user *buf,
> size_t len, loff_t *ppos)
> {
> ssize_t ret = -EINVAL;
> + unsigned long totalram_pgs = totalram_pages;
>
> - if ((len >> PAGE_SHIFT) > totalram_pages) {
> - pr_err("too much data (max %ld pages)\n", totalram_pages);
> + if ((len >> PAGE_SHIFT) > totalram_pgs) {
> + pr_err("too much data (max %ld pages)\n", totalram_pgs);
> return ret;
> }
>
> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> index 4163151..cac4945 100644
> --- a/drivers/hv/hv_balloon.c
> +++ b/drivers/hv/hv_balloon.c
> @@ -1090,6 +1090,7 @@ static void process_info(struct hv_dynmem_device *dm, struct dm_info_msg *msg)
> static unsigned long compute_balloon_floor(void)
> {
> unsigned long min_pages;
> + unsigned long totalram_pgs = totalram_pages;
> #define MB2PAGES(mb) ((mb) << (20 - PAGE_SHIFT))
> /* Simple continuous piecewiese linear function:
> * max MiB -> min MiB gradient
> @@ -1102,16 +1103,16 @@ static unsigned long compute_balloon_floor(void)
> * 8192 744 (1/16)
> * 32768 1512 (1/32)
> */
> - if (totalram_pages < MB2PAGES(128))
> - min_pages = MB2PAGES(8) + (totalram_pages >> 1);
> - else if (totalram_pages < MB2PAGES(512))
> - min_pages = MB2PAGES(40) + (totalram_pages >> 2);
> - else if (totalram_pages < MB2PAGES(2048))
> - min_pages = MB2PAGES(104) + (totalram_pages >> 3);
> - else if (totalram_pages < MB2PAGES(8192))
> - min_pages = MB2PAGES(232) + (totalram_pages >> 4);
> + if (totalram_pgs < MB2PAGES(128))
> + min_pages = MB2PAGES(8) + (totalram_pgs >> 1);
> + else if (totalram_pgs < MB2PAGES(512))
> + min_pages = MB2PAGES(40) + (totalram_pgs >> 2);
> + else if (totalram_pgs < MB2PAGES(2048))
> + min_pages = MB2PAGES(104) + (totalram_pgs >> 3);
> + else if (totalram_pgs < MB2PAGES(8192))
> + min_pages = MB2PAGES(232) + (totalram_pgs >> 4);
> else
> - min_pages = MB2PAGES(488) + (totalram_pages >> 5);
> + min_pages = MB2PAGES(488) + (totalram_pgs >> 5);
> #undef MB2PAGES
> return min_pages;
> }
> diff --git a/fs/file_table.c b/fs/file_table.c
> index e49af4c..6e3c088 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -380,10 +380,11 @@ void __init files_init(void)
> void __init files_maxfiles_init(void)
> {
> unsigned long n;
> - unsigned long memreserve = (totalram_pages - nr_free_pages()) * 3/2;
> + unsigned long totalram_pgs = totalram_pages;
> + unsigned long memreserve = (totalram_pgs - nr_free_pages()) * 3/2;
>
> - memreserve = min(memreserve, totalram_pages - 1);
> - n = ((totalram_pages - memreserve) * (PAGE_SIZE / 1024)) / 10;
> + memreserve = min(memreserve, totalram_pgs - 1);
> + n = ((totalram_pgs - memreserve) * (PAGE_SIZE / 1024)) / 10;
>
> files_stat.max_files = max_t(unsigned long, n, NR_FILE);
> }
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 07cddff..7823f31 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -739,15 +739,16 @@ void __init __weak arch_task_cache_init(void) { }
> static void set_max_threads(unsigned int max_threads_suggested)
> {
> u64 threads;
> + unsigned long totalram_pgs = totalram_pages;
>
> /*
> * The number of threads shall be limited such that the thread
> * structures may only consume a small part of the available memory.
> */
> - if (fls64(totalram_pages) + fls64(PAGE_SIZE) > 64)
> + if (fls64(totalram_pgs) + fls64(PAGE_SIZE) > 64)
> threads = MAX_THREADS;
> else
> - threads = div64_u64((u64) totalram_pages * (u64) PAGE_SIZE,
> + threads = div64_u64((u64) totalram_pgs * (u64) PAGE_SIZE,
> (u64) THREAD_SIZE * 8UL);
>
> if (threads > max_threads_suggested)
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 86ef06d..dff217c 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -152,6 +152,7 @@ int sanity_check_segment_list(struct kimage *image)
> int i;
> unsigned long nr_segments = image->nr_segments;
> unsigned long total_pages = 0;
> + unsigned long totalram_pgs = totalram_pages;
>
> /*
> * Verify we have good destination addresses. The caller is
> @@ -217,13 +218,13 @@ int sanity_check_segment_list(struct kimage *image)
> * wasted allocating pages, which can cause a soft lockup.
> */
> for (i = 0; i < nr_segments; i++) {
> - if (PAGE_COUNT(image->segment[i].memsz) > totalram_pages / 2)
> + if (PAGE_COUNT(image->segment[i].memsz) > totalram_pgs / 2)
> return -EINVAL;
>
> total_pages += PAGE_COUNT(image->segment[i].memsz);
> }
>
> - if (total_pages > totalram_pages / 2)
> + if (total_pages > totalram_pgs / 2)
> return -EINVAL;
>
> /*
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a919ba5..173312b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7245,6 +7245,7 @@ static void calculate_totalreserve_pages(void)
> for (i = 0; i < MAX_NR_ZONES; i++) {
> struct zone *zone = pgdat->node_zones + i;
> long max = 0;
> + unsigned long managed_pages = zone->managed_pages;
>
> /* Find valid and maximum lowmem_reserve in the zone */
> for (j = i; j < MAX_NR_ZONES; j++) {
> @@ -7255,8 +7256,8 @@ static void calculate_totalreserve_pages(void)
> /* we treat the high watermark as reserved pages. */
> max += high_wmark_pages(zone);
>
> - if (max > zone->managed_pages)
> - max = zone->managed_pages;
> + if (max > managed_pages)
> + max = managed_pages;
>
> pgdat->totalreserve_pages += max;
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index ea26d7a..6b91eab 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -114,7 +114,8 @@ static unsigned long shmem_default_max_blocks(void)
>
> static unsigned long shmem_default_max_inodes(void)
> {
> - return min(totalram_pages - totalhigh_pages, totalram_pages / 2);
> + unsigned long totalram_pgs = totalram_pages;
> + return min(totalram_pgs - totalhigh_pages, totalram_pgs / 2);
> }
> #endif
>
> diff --git a/net/dccp/proto.c b/net/dccp/proto.c
> index 43733ac..f27daa1 100644
> --- a/net/dccp/proto.c
> +++ b/net/dccp/proto.c
> @@ -1131,6 +1131,7 @@ static inline void dccp_mib_exit(void)
> static int __init dccp_init(void)
> {
> unsigned long goal;
> + unsigned long totalram_pgs = totalram_pages;
> int ehash_order, bhash_order, i;
> int rc;
>
> @@ -1154,10 +1155,10 @@ static int __init dccp_init(void)
> *
> * The methodology is similar to that of the buffer cache.
> */
> - if (totalram_pages >= (128 * 1024))
> - goal = totalram_pages >> (21 - PAGE_SHIFT);
> + if (totalram_pgs >= (128 * 1024))
> + goal = totalram_pgs >> (21 - PAGE_SHIFT);
> else
> - goal = totalram_pages >> (23 - PAGE_SHIFT);
> + goal = totalram_pgs >> (23 - PAGE_SHIFT);
>
> if (thash_entries)
> goal = (thash_entries *
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index ca1168d..0b1801e 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -2248,6 +2248,7 @@ static __always_inline unsigned int total_extension_size(void)
>
> int nf_conntrack_init_start(void)
> {
> + unsigned long totalram_pgs = totalram_pages;
> int max_factor = 8;
> int ret = -ENOMEM;
> int i;
> @@ -2267,11 +2268,11 @@ int nf_conntrack_init_start(void)
> * >= 4GB machines have 65536 buckets.
> */
> nf_conntrack_htable_size
> - = (((totalram_pages << PAGE_SHIFT) / 16384)
> + = (((totalram_pgs << PAGE_SHIFT) / 16384)
> / sizeof(struct hlist_head));
> - if (totalram_pages > (4 * (1024 * 1024 * 1024 / PAGE_SIZE)))
> + if (totalram_pgs > (4 * (1024 * 1024 * 1024 / PAGE_SIZE)))
> nf_conntrack_htable_size = 65536;
> - else if (totalram_pages > (1024 * 1024 * 1024 / PAGE_SIZE))
> + else if (totalram_pgs > (1024 * 1024 * 1024 / PAGE_SIZE))
> nf_conntrack_htable_size = 16384;
> if (nf_conntrack_htable_size < 32)
> nf_conntrack_htable_size = 32;
> diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
> index 3e7d259..6cb9a74 100644
> --- a/net/netfilter/xt_hashlimit.c
> +++ b/net/netfilter/xt_hashlimit.c
> @@ -274,14 +274,15 @@ static int htable_create(struct net *net, struct hashlimit_cfg3 *cfg,
> struct xt_hashlimit_htable *hinfo;
> const struct seq_operations *ops;
> unsigned int size, i;
> + unsigned long totalram_pgs = totalram_pages;
> int ret;
>
> if (cfg->size) {
> size = cfg->size;
> } else {
> - size = (totalram_pages << PAGE_SHIFT) / 16384 /
> + size = (totalram_pgs << PAGE_SHIFT) / 16384 /
> sizeof(struct hlist_head);
> - if (totalram_pages > 1024 * 1024 * 1024 / PAGE_SIZE)
> + if (totalram_pgs > 1024 * 1024 * 1024 / PAGE_SIZE)
> size = 8192;
> if (size < 16)
> size = 16;
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index 9b277bd..7128f85 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -1368,6 +1368,7 @@ static __init int sctp_init(void)
> int status = -EINVAL;
> unsigned long goal;
> unsigned long limit;
> + unsigned long totalram_pages;
> int max_share;
> int order;
> int num_entries;
> @@ -1426,10 +1427,10 @@ static __init int sctp_init(void)
> * The methodology is similar to that of the tcp hash tables.
> * Though not identical. Start by getting a goal size
> */
> - if (totalram_pages >= (128 * 1024))
> - goal = totalram_pages >> (22 - PAGE_SHIFT);
> + if (totalram_pgs >= (128 * 1024))
> + goal = totalram_pgs >> (22 - PAGE_SHIFT);
> else
> - goal = totalram_pages >> (24 - PAGE_SHIFT);
> + goal = totalram_pgs >> (24 - PAGE_SHIFT);
>
> /* Then compute the page order for said goal */
> order = get_order(goal);
> --
> 1.9.1

--
Michal Hocko
SUSE Labs

2018-11-07 08:44:45

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] mm: Fix multiple evaluvations of totalram_pages and managed_pages

On 11/7/18 9:20 AM, Michal Hocko wrote:
> On Tue 06-11-18 21:51:47, Arun KS wrote:

Hi,

there's a typo in the subject: evaluvations -> evaluations.

However, "fix" is also misleading (more below), so I'd suggest something
like:

mm: reference totalram_pages and managed_pages once per function

>> This patch is in preparation to a later patch which converts totalram_pages
>> and zone->managed_pages to atomic variables. This patch does not introduce
>> any functional changes.
>
> I forgot to comment on this one. The patch makes a lot of sense. But I
> would be a little bit more conservative and wouldn't claim "no functional
> changes". As things stand now multiple reads in the same function are
> racy (without holding the lock). I do not see any example of an
> obviously harmful case but claiming the above is too strong of a
> statement. I would simply go with something like "Please note that
> re-reading the value might lead to a different value and as such it
> could lead to unexpected behavior. There are no known bugs as a result
> of the current code but it is better to prevent them in principle."

However, the new code doesn't use READ_ONCE(), so the compiler is free
to read the value multiple times, and before the patch it was free to
read it just once, as the variables are not volatile. So strictly
speaking this is indeed not a functional change (if the compiler decides
differently based on the patch, it's an implementation detail).

So even in my suggested subject above, 'reference' is meant as a source
code reference, not really a memory read reference. Couldn't think of a
better word though.
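
For illustration, a hedged sketch of the distinction (hypothetical snippet,
not from the series):

    /* Plain load: the compiler may legally emit one read or several. */
    unsigned long pgs = totalram_pages;

    /* READ_ONCE() would force exactly one load at this point. */
    unsigned long pgs_once = READ_ONCE(totalram_pages);

Once patch 3/4 turns totalram_pages into a static inline accessor,
READ_ONCE() no longer applies to it at all (a function call is not an
lvalue), which is what the build reports further down in the thread run
into; the accessor's atomic_long_read() already performs a single read.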

2018-11-07 08:59:46

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] mm: Convert zone->managed_pages to atomic variable

On 11/6/18 5:21 PM, Arun KS wrote:
> totalram_pages, zone->managed_pages and totalhigh_pages updates
> are protected by managed_page_count_lock, but readers never care
> about it. Convert these variables to atomic to avoid readers
> potentially seeing a store tear.
>
> This patch converts zone->managed_pages. Subsequent patches will
> convert totalram_panges, totalhigh_pages and eventually
> managed_page_count_lock will be removed.
>
> Suggested-by: Michal Hocko <[email protected]>
> Suggested-by: Vlastimil Babka <[email protected]>
> Signed-off-by: Arun KS <[email protected]>
> Reviewed-by: Konstantin Khlebnikov <[email protected]>
> Acked-by: Michal Hocko <[email protected]>

Acked-by: Vlastimil Babka <[email protected]>

2018-11-07 09:05:44

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] mm: convert totalram_pages and totalhigh_pages variables to atomic

On 11/6/18 5:21 PM, Arun KS wrote:
> totalram_pages and totalhigh_pages are made static inline function.
>
> Suggested-by: Michal Hocko <[email protected]>
> Suggested-by: Vlastimil Babka <[email protected]>
> Signed-off-by: Arun KS <[email protected]>
> Reviewed-by: Konstantin Khlebnikov <[email protected]>
> Acked-by: Michal Hocko <[email protected]>

Acked-by: Vlastimil Babka <[email protected]>

One bug (probably) below:

> diff --git a/mm/highmem.c b/mm/highmem.c
> index 59db322..02a9a4b 100644
> --- a/mm/highmem.c
> +++ b/mm/highmem.c
> @@ -105,9 +105,7 @@ static inline wait_queue_head_t *get_pkmap_wait_queue_head(unsigned int color)
> }
> #endif
>
> -unsigned long totalhigh_pages __read_mostly;
> -EXPORT_SYMBOL(totalhigh_pages);

I think you still need to export _totalhigh_pages so that modules can
use the inline accessors.

> -
> +atomic_long_t _totalhigh_pages __read_mostly;
>
> EXPORT_PER_CPU_SYMBOL(__kmap_atomic_idx);
>

2018-11-07 09:24:33

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] mm: Fix multiple evaluvations of totalram_pages and managed_pages

On Wed 07-11-18 09:44:00, Vlastimil Babka wrote:
> On 11/7/18 9:20 AM, Michal Hocko wrote:
> > On Tue 06-11-18 21:51:47, Arun KS wrote:
>
> Hi,
>
> there's a typo in the subject: evaluvations -> evaluations.
>
> However, "fix" is also misleading (more below), so I'd suggest something
> like:
>
> mm: reference totalram_pages and managed_pages once per function
>
> >> This patch is in preparation to a later patch which converts totalram_pages
> >> and zone->managed_pages to atomic variables. This patch does not introduce
> >> any functional changes.
> >
> > I forgot to comment on this one. The patch makes a lot of sense. But I
> > would be a little bit more conservative and wouldn't claim "no functional
> > changes". As things stand now multiple reads in the same function are
> > racy (without holding the lock). I do not see any example of an
> > obviously harmful case but claiming the above is too strong of a
> > statement. I would simply go with something like "Please note that
> > re-reading the value might lead to a different value and as such it
> > could lead to unexpected behavior. There are no known bugs as a result
> > of the current code but it is better to prevent them in principle."
>
> However, the new code doesn't use READ_ONCE(), so the compiler is free
> to read the value multiple times, and before the patch it was free to
> read it just once, as the variables are not volatile. So strictly
> speaking this is indeed not a functional change (if the compiler decides
> differently based on the patch, it's an implementation detail).

Yes, the compiler is allowed to optimize this either way without READ_ONCE,
but it is allowed to do two reads, so claiming no functional change is a
bit problematic. Not that this would be a reason to discuss this at
length...
--
Michal Hocko
SUSE Labs

2018-11-07 19:55:24

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] mm: Fix multiple evaluvations of totalram_pages and managed_pages

Hi Arun,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.20-rc1 next-20181107]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Arun-KS/mm-Fix-multiple-evaluvations-of-totalram_pages-and-managed_pages/20181108-025657
config: i386-randconfig-x014-201844 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All warnings (new ones prefixed by >>):

In file included from include/linux/export.h:45:0,
from include/linux/linkage.h:7,
from include/linux/kernel.h:7,
from include/linux/list.h:9,
from include/linux/module.h:9,
from net/sctp/protocol.c:44:
net/sctp/protocol.c: In function 'sctp_init':
net/sctp/protocol.c:1430:6: error: 'totalram_pgs' undeclared (first use in this function); did you mean 'totalram_pages'?
if (totalram_pgs >= (128 * 1024))
^
include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> net/sctp/protocol.c:1430:2: note: in expansion of macro 'if'
if (totalram_pgs >= (128 * 1024))
^~
net/sctp/protocol.c:1430:6: note: each undeclared identifier is reported only once for each function it appears in
if (totalram_pgs >= (128 * 1024))
^
include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> net/sctp/protocol.c:1430:2: note: in expansion of macro 'if'
if (totalram_pgs >= (128 * 1024))
^~
net/sctp/protocol.c:1371:16: warning: unused variable 'totalram_pages' [-Wunused-variable]
unsigned long totalram_pages;
^~~~~~~~~~~~~~

vim +/if +1430 net/sctp/protocol.c

1363
1364 /* Initialize the universe into something sensible. */
1365 static __init int sctp_init(void)
1366 {
1367 int i;
1368 int status = -EINVAL;
1369 unsigned long goal;
1370 unsigned long limit;
1371 unsigned long totalram_pages;
1372 int max_share;
1373 int order;
1374 int num_entries;
1375 int max_entry_order;
1376
1377 sock_skb_cb_check_size(sizeof(struct sctp_ulpevent));
1378
1379 /* Allocate bind_bucket and chunk caches. */
1380 status = -ENOBUFS;
1381 sctp_bucket_cachep = kmem_cache_create("sctp_bind_bucket",
1382 sizeof(struct sctp_bind_bucket),
1383 0, SLAB_HWCACHE_ALIGN,
1384 NULL);
1385 if (!sctp_bucket_cachep)
1386 goto out;
1387
1388 sctp_chunk_cachep = kmem_cache_create("sctp_chunk",
1389 sizeof(struct sctp_chunk),
1390 0, SLAB_HWCACHE_ALIGN,
1391 NULL);
1392 if (!sctp_chunk_cachep)
1393 goto err_chunk_cachep;
1394
1395 status = percpu_counter_init(&sctp_sockets_allocated, 0, GFP_KERNEL);
1396 if (status)
1397 goto err_percpu_counter_init;
1398
1399 /* Implementation specific variables. */
1400
1401 /* Initialize default stream count setup information. */
1402 sctp_max_instreams = SCTP_DEFAULT_INSTREAMS;
1403 sctp_max_outstreams = SCTP_DEFAULT_OUTSTREAMS;
1404
1405 /* Initialize handle used for association ids. */
1406 idr_init(&sctp_assocs_id);
1407
1408 limit = nr_free_buffer_pages() / 8;
1409 limit = max(limit, 128UL);
1410 sysctl_sctp_mem[0] = limit / 4 * 3;
1411 sysctl_sctp_mem[1] = limit;
1412 sysctl_sctp_mem[2] = sysctl_sctp_mem[0] * 2;
1413
1414 /* Set per-socket limits to no more than 1/128 the pressure threshold*/
1415 limit = (sysctl_sctp_mem[1]) << (PAGE_SHIFT - 7);
1416 max_share = min(4UL*1024*1024, limit);
1417
1418 sysctl_sctp_rmem[0] = SK_MEM_QUANTUM; /* give each asoc 1 page min */
1419 sysctl_sctp_rmem[1] = 1500 * SKB_TRUESIZE(1);
1420 sysctl_sctp_rmem[2] = max(sysctl_sctp_rmem[1], max_share);
1421
1422 sysctl_sctp_wmem[0] = SK_MEM_QUANTUM;
1423 sysctl_sctp_wmem[1] = 16*1024;
1424 sysctl_sctp_wmem[2] = max(64*1024, max_share);
1425
1426 /* Size and allocate the association hash table.
1427 * The methodology is similar to that of the tcp hash tables.
1428 * Though not identical. Start by getting a goal size
1429 */
> 1430 if (totalram_pgs >= (128 * 1024))
1431 goal = totalram_pgs >> (22 - PAGE_SHIFT);
1432 else
1433 goal = totalram_pgs >> (24 - PAGE_SHIFT);
1434
1435 /* Then compute the page order for said goal */
1436 order = get_order(goal);
1437
1438 /* Now compute the required page order for the maximum sized table we
1439 * want to create
1440 */
1441 max_entry_order = get_order(MAX_SCTP_PORT_HASH_ENTRIES *
1442 sizeof(struct sctp_bind_hashbucket));
1443
1444 /* Limit the page order by that maximum hash table size */
1445 order = min(order, max_entry_order);
1446
1447 /* Allocate and initialize the endpoint hash table. */
1448 sctp_ep_hashsize = 64;
1449 sctp_ep_hashtable =
1450 kmalloc_array(64, sizeof(struct sctp_hashbucket), GFP_KERNEL);
1451 if (!sctp_ep_hashtable) {
1452 pr_err("Failed endpoint_hash alloc\n");
1453 status = -ENOMEM;
1454 goto err_ehash_alloc;
1455 }
1456 for (i = 0; i < sctp_ep_hashsize; i++) {
1457 rwlock_init(&sctp_ep_hashtable[i].lock);
1458 INIT_HLIST_HEAD(&sctp_ep_hashtable[i].chain);
1459 }
1460
1461 /* Allocate and initialize the SCTP port hash table.
1462 * Note that order is initalized to start at the max sized
1463 * table we want to support. If we can't get that many pages
1464 * reduce the order and try again
1465 */
1466 do {
1467 sctp_port_hashtable = (struct sctp_bind_hashbucket *)
1468 __get_free_pages(GFP_KERNEL | __GFP_NOWARN, order);
1469 } while (!sctp_port_hashtable && --order > 0);
1470
1471 if (!sctp_port_hashtable) {
1472 pr_err("Failed bind hash alloc\n");
1473 status = -ENOMEM;
1474 goto err_bhash_alloc;
1475 }
1476
1477 /* Now compute the number of entries that will fit in the
1478 * port hash space we allocated
1479 */
1480 num_entries = (1UL << order) * PAGE_SIZE /
1481 sizeof(struct sctp_bind_hashbucket);
1482
1483 /* And finish by rounding it down to the nearest power of two
1484 * this wastes some memory of course, but its needed because
1485 * the hash function operates based on the assumption that
1486 * that the number of entries is a power of two
1487 */
1488 sctp_port_hashsize = rounddown_pow_of_two(num_entries);
1489
1490 for (i = 0; i < sctp_port_hashsize; i++) {
1491 spin_lock_init(&sctp_port_hashtable[i].lock);
1492 INIT_HLIST_HEAD(&sctp_port_hashtable[i].chain);
1493 }
1494
1495 status = sctp_transport_hashtable_init();
1496 if (status)
1497 goto err_thash_alloc;
1498
1499 pr_info("Hash tables configured (bind %d/%d)\n", sctp_port_hashsize,
1500 num_entries);
1501
1502 sctp_sysctl_register();
1503
1504 INIT_LIST_HEAD(&sctp_address_families);
1505 sctp_v4_pf_init();
1506 sctp_v6_pf_init();
1507 sctp_sched_ops_init();
1508
1509 status = register_pernet_subsys(&sctp_defaults_ops);
1510 if (status)
1511 goto err_register_defaults;
1512
1513 status = sctp_v4_protosw_init();
1514 if (status)
1515 goto err_protosw_init;
1516
1517 status = sctp_v6_protosw_init();
1518 if (status)
1519 goto err_v6_protosw_init;
1520
1521 status = register_pernet_subsys(&sctp_ctrlsock_ops);
1522 if (status)
1523 goto err_register_ctrlsock;
1524
1525 status = sctp_v4_add_protocol();
1526 if (status)
1527 goto err_add_protocol;
1528
1529 /* Register SCTP with inet6 layer. */
1530 status = sctp_v6_add_protocol();
1531 if (status)
1532 goto err_v6_add_protocol;
1533
1534 if (sctp_offload_init() < 0)
1535 pr_crit("%s: Cannot add SCTP protocol offload\n", __func__);
1536
1537 out:
1538 return status;
1539 err_v6_add_protocol:
1540 sctp_v4_del_protocol();
1541 err_add_protocol:
1542 unregister_pernet_subsys(&sctp_ctrlsock_ops);
1543 err_register_ctrlsock:
1544 sctp_v6_protosw_exit();
1545 err_v6_protosw_init:
1546 sctp_v4_protosw_exit();
1547 err_protosw_init:
1548 unregister_pernet_subsys(&sctp_defaults_ops);
1549 err_register_defaults:
1550 sctp_v4_pf_exit();
1551 sctp_v6_pf_exit();
1552 sctp_sysctl_unregister();
1553 free_pages((unsigned long)sctp_port_hashtable,
1554 get_order(sctp_port_hashsize *
1555 sizeof(struct sctp_bind_hashbucket)));
1556 err_bhash_alloc:
1557 sctp_transport_hashtable_destroy();
1558 err_thash_alloc:
1559 kfree(sctp_ep_hashtable);
1560 err_ehash_alloc:
1561 percpu_counter_destroy(&sctp_sockets_allocated);
1562 err_percpu_counter_init:
1563 kmem_cache_destroy(sctp_chunk_cachep);
1564 err_chunk_cachep:
1565 kmem_cache_destroy(sctp_bucket_cachep);
1566 goto out;
1567 }
1568

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


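For context, the warning above stems from patch 1/4 declaring the new local
in sctp_init() as "unsigned long totalram_pages;" (which shadows the global
and is never initialized) while the code below it uses totalram_pgs. A
minimal sketch of the presumable fix, in the pre-conversion context of
patch 1/4 (not taken from a posted revision):

    unsigned long totalram_pgs = totalram_pages;    /* cache the global once */
    ...
    if (totalram_pgs >= (128 * 1024))
            goal = totalram_pgs >> (22 - PAGE_SHIFT);
    else
            goal = totalram_pgs >> (24 - PAGE_SHIFT);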

2018-11-07 20:08:51

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] mm: Fix multiple evaluvations of totalram_pages and managed_pages

Hi Arun,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20-rc1 next-20181107]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Arun-KS/mm-Fix-multiple-evaluvations-of-totalram_pages-and-managed_pages/20181108-025657
config: i386-randconfig-x004-201844 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All errors (new ones prefixed by >>):

net//sctp/protocol.c: In function 'sctp_init':
>> net//sctp/protocol.c:1430:6: error: 'totalram_pgs' undeclared (first use in this function); did you mean 'totalram_pages'?
if (totalram_pgs >= (128 * 1024))
^~~~~~~~~~~~
totalram_pages
net//sctp/protocol.c:1430:6: note: each undeclared identifier is reported only once for each function it appears in
net//sctp/protocol.c:1371:16: warning: unused variable 'totalram_pages' [-Wunused-variable]
unsigned long totalram_pages;
^~~~~~~~~~~~~~

vim +1430 net//sctp/protocol.c

(sctp_init() context listing identical to the one quoted in the previous report)

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation



2018-11-07 20:09:27

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] mm: convert totalram_pages and totalhigh_pages variables to atomic

Hi Arun,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20-rc1 next-20181107]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Arun-KS/mm-Fix-multiple-evaluvations-of-totalram_pages-and-managed_pages/20181108-025657
config: x86_64-randconfig-x018-201844 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64

All error/warnings (new ones prefixed by >>):

In file included from include/asm-generic/bug.h:5:0,
from arch/x86/include/asm/bug.h:47,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/gfp.h:5,
from mm/kasan/quarantine.c:20:
mm/kasan/quarantine.c: In function 'quarantine_reduce':
>> include/linux/compiler.h:246:20: error: lvalue required as unary '&' operand
__read_once_size(&(x), __u.__c, sizeof(x)); \
^
>> include/linux/compiler.h:252:22: note: in expansion of macro '__READ_ONCE'
#define READ_ONCE(x) __READ_ONCE(x, 1)
^~~~~~~~~~~
>> mm/kasan/quarantine.c:239:16: note: in expansion of macro 'READ_ONCE'
total_size = (READ_ONCE(totalram_pages()) << PAGE_SHIFT) /
^~~~~~~~~
include/linux/compiler.h:248:28: error: lvalue required as unary '&' operand
__read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
^
>> include/linux/compiler.h:252:22: note: in expansion of macro '__READ_ONCE'
#define READ_ONCE(x) __READ_ONCE(x, 1)
^~~~~~~~~~~
>> mm/kasan/quarantine.c:239:16: note: in expansion of macro 'READ_ONCE'
total_size = (READ_ONCE(totalram_pages()) << PAGE_SHIFT) /
^~~~~~~~~
--
In file included from include/asm-generic/bug.h:5:0,
from arch/x86/include/asm/bug.h:47,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/gfp.h:5,
from mm//kasan/quarantine.c:20:
mm//kasan/quarantine.c: In function 'quarantine_reduce':
>> include/linux/compiler.h:246:20: error: lvalue required as unary '&' operand
__read_once_size(&(x), __u.__c, sizeof(x)); \
^
>> include/linux/compiler.h:252:22: note: in expansion of macro '__READ_ONCE'
#define READ_ONCE(x) __READ_ONCE(x, 1)
^~~~~~~~~~~
mm//kasan/quarantine.c:239:16: note: in expansion of macro 'READ_ONCE'
total_size = (READ_ONCE(totalram_pages()) << PAGE_SHIFT) /
^~~~~~~~~
include/linux/compiler.h:248:28: error: lvalue required as unary '&' operand
__read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
^
>> include/linux/compiler.h:252:22: note: in expansion of macro '__READ_ONCE'
#define READ_ONCE(x) __READ_ONCE(x, 1)
^~~~~~~~~~~
mm//kasan/quarantine.c:239:16: note: in expansion of macro 'READ_ONCE'
total_size = (READ_ONCE(totalram_pages()) << PAGE_SHIFT) /
^~~~~~~~~

vim +/READ_ONCE +239 mm/kasan/quarantine.c

211
212 void quarantine_reduce(void)
213 {
214 size_t total_size, new_quarantine_size, percpu_quarantines;
215 unsigned long flags;
216 int srcu_idx;
217 struct qlist_head to_free = QLIST_INIT;
218
219 if (likely(READ_ONCE(quarantine_size) <=
220 READ_ONCE(quarantine_max_size)))
221 return;
222
223 /*
224 * srcu critical section ensures that quarantine_remove_cache()
225 * will not miss objects belonging to the cache while they are in our
226 * local to_free list. srcu is chosen because (1) it gives us private
227 * grace period domain that does not interfere with anything else,
228 * and (2) it allows synchronize_srcu() to return without waiting
229 * if there are no pending read critical sections (which is the
230 * expected case).
231 */
232 srcu_idx = srcu_read_lock(&remove_cache_srcu);
233 raw_spin_lock_irqsave(&quarantine_lock, flags);
234
235 /*
236 * Update quarantine size in case of hotplug. Allocate a fraction of
237 * the installed memory to quarantine minus per-cpu queue limits.
238 */
> 239 total_size = (READ_ONCE(totalram_pages()) << PAGE_SHIFT) /
240 QUARANTINE_FRACTION;
241 percpu_quarantines = QUARANTINE_PERCPU_SIZE * num_online_cpus();
242 new_quarantine_size = (total_size < percpu_quarantines) ?
243 0 : total_size - percpu_quarantines;
244 WRITE_ONCE(quarantine_max_size, new_quarantine_size);
245 /* Aim at consuming at most 1/2 of slots in quarantine. */
246 WRITE_ONCE(quarantine_batch_size, max((size_t)QUARANTINE_PERCPU_SIZE,
247 2 * total_size / QUARANTINE_BATCHES));
248
249 if (likely(quarantine_size > quarantine_max_size)) {
250 qlist_move_all(&global_quarantine[quarantine_head], &to_free);
251 WRITE_ONCE(quarantine_size, quarantine_size - to_free.bytes);
252 quarantine_head++;
253 if (quarantine_head == QUARANTINE_BATCHES)
254 quarantine_head = 0;
255 }
256
257 raw_spin_unlock_irqrestore(&quarantine_lock, flags);
258
259 qlist_free_all(&to_free, NULL);
260 srcu_read_unlock(&remove_cache_srcu, srcu_idx);
261 }
262

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


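The error above follows from patch 3/4 turning totalram_pages into a static
inline accessor: READ_ONCE() takes the address of its argument, and a
function call is not an lvalue. A hedged sketch of how the quarantine code
could read the value instead (the accessor already does a single
atomic_long_read() internally; this is not a fix posted in this thread):

    total_size = (totalram_pages() << PAGE_SHIFT) / QUARANTINE_FRACTION;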

2018-11-07 20:26:13

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] mm: convert totalram_pages and totalhigh_pages variables to atomic

Hi Arun,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20-rc1 next-20181107]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Arun-KS/mm-Fix-multiple-evaluvations-of-totalram_pages-and-managed_pages/20181108-025657
config: i386-randconfig-s2-201844 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All errors (new ones prefixed by >>):

>> ERROR: "_totalram_pages" [drivers/gpu/drm/i915/i915.ko] undefined!
ERROR: "_totalram_pages" [drivers/char/agp/agpgart.ko] undefined!

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation



2018-11-07 21:29:06

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] mm: Remove managed_page_count spinlock

Hi Arun,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20-rc1 next-20181107]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Arun-KS/mm-Fix-multiple-evaluvations-of-totalram_pages-and-managed_pages/20181108-025657
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64

All errors (new ones prefixed by >>):

>> mm/kasan/quarantine.c:239:23: error: not addressable
>> mm/kasan/quarantine.c:239:23: error: not addressable
In file included from include/asm-generic/bug.h:5:0,
from arch/x86/include/asm/bug.h:47,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/gfp.h:5,
from mm/kasan/quarantine.c:20:
mm/kasan/quarantine.c: In function 'quarantine_reduce':
include/linux/compiler.h:246:20: error: lvalue required as unary '&' operand
__read_once_size(&(x), __u.__c, sizeof(x)); \
^
include/linux/compiler.h:252:22: note: in expansion of macro '__READ_ONCE'
#define READ_ONCE(x) __READ_ONCE(x, 1)
^~~~~~~~~~~
mm/kasan/quarantine.c:239:16: note: in expansion of macro 'READ_ONCE'
total_size = (READ_ONCE(totalram_pages()) << PAGE_SHIFT) /
^~~~~~~~~
include/linux/compiler.h:248:28: error: lvalue required as unary '&' operand
__read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
^
include/linux/compiler.h:252:22: note: in expansion of macro '__READ_ONCE'
#define READ_ONCE(x) __READ_ONCE(x, 1)
^~~~~~~~~~~
mm/kasan/quarantine.c:239:16: note: in expansion of macro 'READ_ONCE'
total_size = (READ_ONCE(totalram_pages()) << PAGE_SHIFT) /
^~~~~~~~~
--
include/linux/slab.h:332:43: warning: dubious: x & !y
include/linux/slab.h:332:43: warning: dubious: x & !y
>> net/sctp/protocol.c:1430:13: error: undefined identifier 'totalram_pgs'
net/sctp/protocol.c:1431:24: error: undefined identifier 'totalram_pgs'
net/sctp/protocol.c:1433:24: error: undefined identifier 'totalram_pgs'
>> /bin/bash: line 1: 74457 Segmentation fault sparse -D__linux__ -Dlinux -D__STDC__ -Dunix -D__unix__ -Wbitwise -Wno-return-void -Wno-unknown-attribute -D__CHECK_ENDIAN__ -D__x86_64__ -mlittle-endian -m64 -Wp,-MD,net/sctp/.protocol.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-linux-gnu/7/include -Iarch/x86/include -I./arch/x86/include/generated -Iinclude -I./include -Iarch/x86/include/uapi -I./arch/x86/include/generated/uapi -Iinclude/uapi -I./include/generated/uapi -include include/linux/kconfig.h -include include/linux/compiler_types.h -Inet/sctp -Inet/sctp -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Wno-format-security -std=gnu89 -fno-PIE -DCC_HAVE_ASM_GOTO -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1 -falign-loops=1 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_AVX512=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mindirect-branch=thunk-extern -mindirect-branch-register -DRETPOLINE -Wa,arch/x86/kernel/macros.s -Wa,- -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-int-in-bool-context -O2 --param=allow-store-data-races=0 -fplugin=./scripts/gcc-plugins/latent_entropy_plugin.so -fplugin=./scripts/gcc-plugins/structleak_plugin.so -fplugin=./scripts/gcc-plugins/randomize_layout_plugin.so -fplugin=./scripts/gcc-plugins/stackleak_plugin.so -DLATENT_ENTROPY_PLUGIN -DSTRUCTLEAK_PLUGIN -DRANDSTRUCT_PLUGIN -DSTACKLEAK_PLUGIN -fplugin-arg-stackleak_plugin-track-min-size=100 -fno-reorder-blocks -fno-ipa-cp-clone -fno-partial-inlining -Wframe-larger-than=8192 -fstack-protector-strong -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-var-tracking-assignments -pg -mrecord-mcount -mfentry -DCC_USING_FENTRY -fno-inline-functions-called-once -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -fno-strict-overflow -fno-merge-all-constants -fmerge-constants -fno-stack-check -fconserve-stack -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -fsanitize=kernel-address -fasan-shadow-offset=0xdffffc0000000000 --param asan-globals=1 --param asan-instrumentation-with-call-threshold=0 --param asan-stack=1 -fsanitize-coverage=trace-pc -DMODULE -DKBUILD_BASENAME='"protocol"' -DKBUILD_MODNAME='"sctp"' net/sctp/protocol.c

vim +239 mm/kasan/quarantine.c

55834c59 Alexander Potapenko 2016-05-20 211
55834c59 Alexander Potapenko 2016-05-20 212 void quarantine_reduce(void)
55834c59 Alexander Potapenko 2016-05-20 213 {
64abdcb2 Dmitry Vyukov 2016-12-12 214 size_t total_size, new_quarantine_size, percpu_quarantines;
55834c59 Alexander Potapenko 2016-05-20 215 unsigned long flags;
ce5bec54 Dmitry Vyukov 2017-03-09 216 int srcu_idx;
55834c59 Alexander Potapenko 2016-05-20 217 struct qlist_head to_free = QLIST_INIT;
55834c59 Alexander Potapenko 2016-05-20 218
64abdcb2 Dmitry Vyukov 2016-12-12 219 if (likely(READ_ONCE(quarantine_size) <=
64abdcb2 Dmitry Vyukov 2016-12-12 220 READ_ONCE(quarantine_max_size)))
55834c59 Alexander Potapenko 2016-05-20 221 return;
55834c59 Alexander Potapenko 2016-05-20 222
ce5bec54 Dmitry Vyukov 2017-03-09 223 /*
ce5bec54 Dmitry Vyukov 2017-03-09 224 * srcu critical section ensures that quarantine_remove_cache()
ce5bec54 Dmitry Vyukov 2017-03-09 225 * will not miss objects belonging to the cache while they are in our
ce5bec54 Dmitry Vyukov 2017-03-09 226 * local to_free list. srcu is chosen because (1) it gives us private
ce5bec54 Dmitry Vyukov 2017-03-09 227 * grace period domain that does not interfere with anything else,
ce5bec54 Dmitry Vyukov 2017-03-09 228 * and (2) it allows synchronize_srcu() to return without waiting
ce5bec54 Dmitry Vyukov 2017-03-09 229 * if there are no pending read critical sections (which is the
ce5bec54 Dmitry Vyukov 2017-03-09 230 * expected case).
ce5bec54 Dmitry Vyukov 2017-03-09 231 */
ce5bec54 Dmitry Vyukov 2017-03-09 232 srcu_idx = srcu_read_lock(&remove_cache_srcu);
026d1eaf Clark Williams 2018-10-26 233 raw_spin_lock_irqsave(&quarantine_lock, flags);
55834c59 Alexander Potapenko 2016-05-20 234
55834c59 Alexander Potapenko 2016-05-20 235 /*
55834c59 Alexander Potapenko 2016-05-20 236 * Update quarantine size in case of hotplug. Allocate a fraction of
55834c59 Alexander Potapenko 2016-05-20 237 * the installed memory to quarantine minus per-cpu queue limits.
55834c59 Alexander Potapenko 2016-05-20 238 */
a399c534 Arun KS 2018-11-06 @239 total_size = (READ_ONCE(totalram_pages()) << PAGE_SHIFT) /
55834c59 Alexander Potapenko 2016-05-20 240 QUARANTINE_FRACTION;
c3cee372 Alexander Potapenko 2016-08-02 241 percpu_quarantines = QUARANTINE_PERCPU_SIZE * num_online_cpus();
64abdcb2 Dmitry Vyukov 2016-12-12 242 new_quarantine_size = (total_size < percpu_quarantines) ?
64abdcb2 Dmitry Vyukov 2016-12-12 243 0 : total_size - percpu_quarantines;
64abdcb2 Dmitry Vyukov 2016-12-12 244 WRITE_ONCE(quarantine_max_size, new_quarantine_size);
64abdcb2 Dmitry Vyukov 2016-12-12 245 /* Aim at consuming at most 1/2 of slots in quarantine. */
64abdcb2 Dmitry Vyukov 2016-12-12 246 WRITE_ONCE(quarantine_batch_size, max((size_t)QUARANTINE_PERCPU_SIZE,
64abdcb2 Dmitry Vyukov 2016-12-12 247 2 * total_size / QUARANTINE_BATCHES));
64abdcb2 Dmitry Vyukov 2016-12-12 248
64abdcb2 Dmitry Vyukov 2016-12-12 249 if (likely(quarantine_size > quarantine_max_size)) {
64abdcb2 Dmitry Vyukov 2016-12-12 250 qlist_move_all(&global_quarantine[quarantine_head], &to_free);
64abdcb2 Dmitry Vyukov 2016-12-12 251 WRITE_ONCE(quarantine_size, quarantine_size - to_free.bytes);
64abdcb2 Dmitry Vyukov 2016-12-12 252 quarantine_head++;
64abdcb2 Dmitry Vyukov 2016-12-12 253 if (quarantine_head == QUARANTINE_BATCHES)
64abdcb2 Dmitry Vyukov 2016-12-12 254 quarantine_head = 0;
55834c59 Alexander Potapenko 2016-05-20 255 }
55834c59 Alexander Potapenko 2016-05-20 256
026d1eaf Clark Williams 2018-10-26 257 raw_spin_unlock_irqrestore(&quarantine_lock, flags);
55834c59 Alexander Potapenko 2016-05-20 258
55834c59 Alexander Potapenko 2016-05-20 259 qlist_free_all(&to_free, NULL);
ce5bec54 Dmitry Vyukov 2017-03-09 260 srcu_read_unlock(&remove_cache_srcu, srcu_idx);
55834c59 Alexander Potapenko 2016-05-20 261 }
55834c59 Alexander Potapenko 2016-05-20 262

:::::: The code at line 239 was first introduced by commit
:::::: a399c534492723c9d2f175bc2b66aa930abd895f mm: convert totalram_pages and totalhigh_pages variables to atomic

:::::: TO: Arun KS <[email protected]>
:::::: CC: 0day robot <[email protected]>

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation



2018-11-08 07:24:21

by Arun KS

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] mm: convert totalram_pages and totalhigh_pages variables to atomic

On 2018-11-07 14:34, Vlastimil Babka wrote:
> On 11/6/18 5:21 PM, Arun KS wrote:
>> totalram_pages and totalhigh_pages are made static inline function.
>>
>> Suggested-by: Michal Hocko <[email protected]>
>> Suggested-by: Vlastimil Babka <[email protected]>
>> Signed-off-by: Arun KS <[email protected]>
>> Reviewed-by: Konstantin Khlebnikov <[email protected]>
>> Acked-by: Michal Hocko <[email protected]>
>
> Acked-by: Vlastimil Babka <[email protected]>
>
> One bug (probably) below:
>
>> diff --git a/mm/highmem.c b/mm/highmem.c
>> index 59db322..02a9a4b 100644
>> --- a/mm/highmem.c
>> +++ b/mm/highmem.c
>> @@ -105,9 +105,7 @@ static inline wait_queue_head_t
>> *get_pkmap_wait_queue_head(unsigned int color)
>> }
>> #endif
>>
>> -unsigned long totalhigh_pages __read_mostly;
>> -EXPORT_SYMBOL(totalhigh_pages);
>
> I think you still need to export _totalhigh_pages so that modules can
> use the inline accessors.

Thanks for pointing this out. I missed that. Will do the same for
_totalram_pages.

Regards,
Arun

>
>> -
>> +atomic_long_t _totalhigh_pages __read_mostly;
>>
>> EXPORT_PER_CPU_SYMBOL(__kmap_atomic_idx);
>>
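
A minimal sketch of the re-added exports being discussed (variable names
from patch 3/4; the exact files are an assumption, not a posted revision):

    /* mm/highmem.c */
    atomic_long_t _totalhigh_pages __read_mostly;
    EXPORT_SYMBOL(_totalhigh_pages);

    /* mm/page_alloc.c */
    atomic_long_t _totalram_pages __read_mostly;
    EXPORT_SYMBOL(_totalram_pages);

With the underlying atomic_long_t symbols exported, modules such as i915.ko
and agpgart.ko can keep using the inline totalram_pages() and
totalhigh_pages() accessors, which would resolve the link errors reported
by the build robot earlier in the thread.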