Date: 2021-07-09 13:22:58
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 00/34] 4.19.197-rc1 review

This is the start of the stable review cycle for the 4.19.197 release.
There are 34 patches in this series, all of which will be posted as
responses to this one. If anyone has any issues with these being
applied, please let me know.

Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.197-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
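
For anyone wanting to give the series a spin, one possible way to fetch
and build it is sketched below (an illustrative workflow, not part of
the original announcement; substitute your own .config as needed):

	git clone -b linux-4.19.y \
	    git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
	cd linux-stable-rc
	make olddefconfig
	make -j"$(nproc)"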

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.19.197-rc1

Tony Lindgren <[email protected]>
clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940

Tony Lindgren <[email protected]>
clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue

Tony Lindgren <[email protected]>
clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support

afzal mohammed <[email protected]>
ARM: OMAP: replace setup_irq() by request_irq()

Alper Gun <[email protected]>
KVM: SVM: Call SEV Guest Decommission if ASID binding fails

Juergen Gross <[email protected]>
xen/events: reset active flag for lateeoi events later

Petr Mladek <[email protected]>
kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()

Petr Mladek <[email protected]>
kthread_worker: split code for canceling the delayed work timer

Anson Huang <[email protected]>
ARM: dts: imx6qdl-sabresd: Remove incorrect power supply assignment

David Rientjes <[email protected]>
KVM: SVM: Periodically schedule when unregistering regions on destroy

Tahsin Erdogan <[email protected]>
ext4: eliminate bogus error in ext4_data_block_valid_rcu()

Christian König <[email protected]>
drm/nouveau: fix dma_address check for CPU/GPU sync

ManYi Li <[email protected]>
scsi: sr: Return appropriate error code when disk is ejected

Hugh Dickins <[email protected]>
mm, futex: fix shared futex pgoff on shmem huge page

Hugh Dickins <[email protected]>
mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()

Hugh Dickins <[email protected]>
mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes

Hugh Dickins <[email protected]>
mm: page_vma_mapped_walk(): get vma_address_end() earlier

Hugh Dickins <[email protected]>
mm: page_vma_mapped_walk(): use goto instead of while (1)

Hugh Dickins <[email protected]>
mm: page_vma_mapped_walk(): add a level of indentation

Hugh Dickins <[email protected]>
mm: page_vma_mapped_walk(): crossing page table boundary

Hugh Dickins <[email protected]>
mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block

Hugh Dickins <[email protected]>
mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd

Hugh Dickins <[email protected]>
mm: page_vma_mapped_walk(): settle PageHuge on entry

Hugh Dickins <[email protected]>
mm: page_vma_mapped_walk(): use page for pvmw->page

Yang Shi <[email protected]>
mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split

Hugh Dickins <[email protected]>
mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()

Jue Wang <[email protected]>
mm/thp: fix page_address_in_vma() on file THP tails

Hugh Dickins <[email protected]>
mm/thp: fix vma_address() if virtual address below file offset

Hugh Dickins <[email protected]>
mm/thp: try_to_unmap() use TTU_SYNC for safe splitting

Hugh Dickins <[email protected]>
mm/thp: make is_huge_zero_pmd() safe and quicker

Hugh Dickins <[email protected]>
mm/thp: fix __split_huge_pmd_locked() on shmem migration entry

Miaohe Lin <[email protected]>
mm/rmap: use page_not_mapped in try_to_unmap()

Miaohe Lin <[email protected]>
mm/rmap: remove unneeded semicolon in page_not_mapped()

Alex Shi <[email protected]>
mm: add VM_WARN_ON_ONCE_PAGE() macro


-------------

Diffstat:

Makefile | 4 +-
arch/arm/boot/dts/dra7.dtsi | 11 ++
arch/arm/boot/dts/imx6qdl-sabresd.dtsi | 4 -
arch/arm/mach-omap1/pm.c | 13 ++-
arch/arm/mach-omap1/time.c | 10 +-
arch/arm/mach-omap1/timer32k.c | 10 +-
arch/arm/mach-omap2/board-generic.c | 4 +-
arch/arm/mach-omap2/timer.c | 181 +++++++++++++++++++++++----------
arch/x86/kvm/svm.c | 33 ++++--
drivers/clk/ti/clk-7xx.c | 1 +
drivers/gpu/drm/nouveau/nouveau_bo.c | 4 +-
drivers/scsi/sr.c | 2 +
drivers/xen/events/events_base.c | 23 ++++-
fs/ext4/block_validity.c | 4 +-
include/linux/cpuhotplug.h | 1 +
include/linux/huge_mm.h | 8 +-
include/linux/hugetlb.h | 16 ---
include/linux/mm.h | 3 +
include/linux/mmdebug.h | 13 +++
include/linux/pagemap.h | 13 +--
include/linux/rmap.h | 3 +-
kernel/futex.c | 2 +-
kernel/kthread.c | 77 +++++++++-----
mm/huge_memory.c | 56 +++++-----
mm/hugetlb.c | 5 +-
mm/internal.h | 53 +++++++---
mm/memory.c | 41 ++++++++
mm/page_vma_mapped.c | 160 ++++++++++++++++++-----------
mm/pgtable-generic.c | 4 +-
mm/rmap.c | 48 +++++----
mm/truncate.c | 43 ++++----
31 files changed, 546 insertions(+), 304 deletions(-)



Date: 2021-07-09 13:23:07
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 22/34] scsi: sr: Return appropriate error code when disk is ejected

From: ManYi Li <[email protected]>

[ Upstream commit 7dd753ca59d6c8cc09aa1ed24f7657524803c7f3 ]

Handle a reported media event code of 3. This indicates that the media has
been removed from the drive and user intervention is required to proceed.
Return DISK_EVENT_EJECT_REQUEST in that case.
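
(Background, for reference - this comes from the MMC spec, not from the
original changelog: in the GET EVENT STATUS NOTIFICATION media event
class, code 1 is EjectRequest, code 2 is NewMedia and code 3 is
MediaRemoval, which is why code 3 is now reported to the block layer
the same way as code 1.)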

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: ManYi Li <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/sr.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 45c8bf39ad23..acf0c244141f 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -216,6 +216,8 @@ static unsigned int sr_get_events(struct scsi_device *sdev)
return DISK_EVENT_EJECT_REQUEST;
else if (med->media_event_code == 2)
return DISK_EVENT_MEDIA_CHANGE;
+ else if (med->media_event_code == 3)
+ return DISK_EVENT_EJECT_REQUEST;
return 0;
}

--
2.30.2



Date: 2021-07-09 13:23:09
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 33/34] clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue

From: Tony Lindgren <[email protected]>

commit 3efe7a878a11c13b5297057bfc1e5639ce1241ce upstream.

There is a timer wrap issue on dra7 for the ARM architected timer.
In a typical clock configuration the timer fails to wrap after 388 days.

To work around the issue, we need to use timer-ti-dm timers instead.

Let's prepare for adding support for percpu timers by adding a common
dmtimer_clkevt_init_common() and calling it from __omap_sync32k_timer_init().
This patch makes no intentional functional changes.

Cc: Daniel Lezcano <[email protected]>
Cc: Keerthy <[email protected]>
Cc: Tero Kristo <[email protected]>
[[email protected]: backported to 4.19.y]
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/mach-omap2/timer.c | 34 +++++++++++++++++++---------------
1 file changed, 19 insertions(+), 15 deletions(-)

--- a/arch/arm/mach-omap2/timer.c
+++ b/arch/arm/mach-omap2/timer.c
@@ -368,18 +368,21 @@ void tick_broadcast(const struct cpumask
}
#endif

-static void __init omap2_gp_clockevent_init(int gptimer_id,
- const char *fck_source,
- const char *property)
+static void __init dmtimer_clkevt_init_common(struct dmtimer_clockevent *clkevt,
+ int gptimer_id,
+ const char *fck_source,
+ unsigned int features,
+ const struct cpumask *cpumask,
+ const char *property,
+ int rating, const char *name)
{
- struct dmtimer_clockevent *clkevt = &clockevent;
struct omap_dm_timer *timer = &clkevt->timer;
int res;

timer->id = gptimer_id;
timer->errata = omap_dm_timer_get_errata();
- clkevt->dev.features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT;
- clkevt->dev.rating = 300;
+ clkevt->dev.features = features;
+ clkevt->dev.rating = rating;
clkevt->dev.set_next_event = omap2_gp_timer_set_next_event;
clkevt->dev.set_state_shutdown = omap2_gp_timer_shutdown;
clkevt->dev.set_state_periodic = omap2_gp_timer_set_periodic;
@@ -397,19 +400,15 @@ static void __init omap2_gp_clockevent_i
&clkevt->dev.name, OMAP_TIMER_POSTED);
BUG_ON(res);

- clkevt->dev.cpumask = cpu_possible_mask;
+ clkevt->dev.cpumask = cpumask;
clkevt->dev.irq = omap_dm_timer_get_irq(timer);

- if (request_irq(timer->irq, omap2_gp_timer_interrupt,
- IRQF_TIMER | IRQF_IRQPOLL, "gp_timer", clkevt))
- pr_err("Failed to request irq %d (gp_timer)\n", timer->irq);
+ if (request_irq(clkevt->dev.irq, omap2_gp_timer_interrupt,
+ IRQF_TIMER | IRQF_IRQPOLL, name, clkevt))
+ pr_err("Failed to request irq %d (gp_timer)\n", clkevt->dev.irq);

__omap_dm_timer_int_enable(timer, OMAP_TIMER_INT_OVERFLOW);

- clockevents_config_and_register(&clkevt->dev, timer->rate,
- 3, /* Timer internal resynch latency */
- 0xffffffff);
-
if (soc_is_am33xx() || soc_is_am43xx()) {
clkevt->dev.suspend = omap_clkevt_idle;
clkevt->dev.resume = omap_clkevt_unidle;
@@ -559,7 +558,12 @@ static void __init __omap_sync32k_timer_
{
omap_clk_init();
omap_dmtimer_init();
- omap2_gp_clockevent_init(clkev_nr, clkev_src, clkev_prop);
+ dmtimer_clkevt_init_common(&clockevent, clkev_nr, clkev_src,
+ CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT,
+ cpu_possible_mask, clkev_prop, 300, "clockevent");
+ clockevents_config_and_register(&clockevent.dev, clockevent.timer.rate,
+ 3, /* Timer internal resynch latency */
+ 0xffffffff);

/* Enable the use of clocksource="gp_timer" kernel parameter */
if (use_gptimer_clksrc || gptimer)


Date: 2021-07-09 13:23:20
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 32/34] clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support

From: Tony Lindgren <[email protected]>

commit 52762fbd1c4778ac9b173624ca0faacd22ef4724 upstream.

We can move the TI dmtimer clockevent and clocksource to live under
drivers/clocksource if we rely only on the clock framework, and handle
the module configuration directly in the clocksource driver based on the
device tree data.

This removes the system timers' early dependency on the interconnect
related code, and we can probe pretty much everything else later on at
the module_init level.

Let's first add a new driver for timer-ti-dm-systimer based on existing
arch/arm/mach-omap2/timer.c. Then let's start moving SoCs to probe with
device tree data while still keeping the old timer.c. And eventually we
can just drop the old timer.c.

Let's take the opportunity to switch to using readl/writel as pointed out
by Daniel Lezcano <[email protected]>. This allows further
clean-up of the timer-ti-dm code, as a lot of the shared helpers can
just become static to the non-systimer related code.

Note the boards can optionally configure different timer source clocks
if needed with assigned-clocks and assigned-clock-parents.
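
For example, a board could reparent a timer's functional clock with a
device tree fragment along these lines (purely illustrative - the node
and clock names are made up, not taken from this series):

	&timer1 {
		assigned-clocks = <&timer1_fck>;
		assigned-clock-parents = <&sys_32k_ck>;
	};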

Cc: Daniel Lezcano <[email protected]>
Cc: Keerthy <[email protected]>
Cc: Tero Kristo <[email protected]>
[[email protected]: backported to 4.19.y]
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/mach-omap2/timer.c | 109 ++++++++++++++++++++++++++------------------
1 file changed, 65 insertions(+), 44 deletions(-)

--- a/arch/arm/mach-omap2/timer.c
+++ b/arch/arm/mach-omap2/timer.c
@@ -64,15 +64,28 @@

/* Clockevent code */

-static struct omap_dm_timer clkev;
-static struct clock_event_device clockevent_gpt;
-
/* Clockevent hwmod for am335x and am437x suspend */
static struct omap_hwmod *clockevent_gpt_hwmod;

/* Clockesource hwmod for am437x suspend */
static struct omap_hwmod *clocksource_gpt_hwmod;

+struct dmtimer_clockevent {
+ struct clock_event_device dev;
+ struct omap_dm_timer timer;
+};
+
+static struct dmtimer_clockevent clockevent;
+
+static struct omap_dm_timer *to_dmtimer(struct clock_event_device *clockevent)
+{
+ struct dmtimer_clockevent *clkevt =
+ container_of(clockevent, struct dmtimer_clockevent, dev);
+ struct omap_dm_timer *timer = &clkevt->timer;
+
+ return timer;
+}
+
#ifdef CONFIG_SOC_HAS_REALTIME_COUNTER
static unsigned long arch_timer_freq;

@@ -84,10 +97,11 @@ void set_cntfreq(void)

static irqreturn_t omap2_gp_timer_interrupt(int irq, void *dev_id)
{
- struct clock_event_device *evt = &clockevent_gpt;
-
- __omap_dm_timer_write_status(&clkev, OMAP_TIMER_INT_OVERFLOW);
+ struct dmtimer_clockevent *clkevt = dev_id;
+ struct clock_event_device *evt = &clkevt->dev;
+ struct omap_dm_timer *timer = &clkevt->timer;

+ __omap_dm_timer_write_status(timer, OMAP_TIMER_INT_OVERFLOW);
evt->event_handler(evt);
return IRQ_HANDLED;
}
@@ -95,7 +109,9 @@ static irqreturn_t omap2_gp_timer_interr
static int omap2_gp_timer_set_next_event(unsigned long cycles,
struct clock_event_device *evt)
{
- __omap_dm_timer_load_start(&clkev, OMAP_TIMER_CTRL_ST,
+ struct omap_dm_timer *timer = to_dmtimer(evt);
+
+ __omap_dm_timer_load_start(timer, OMAP_TIMER_CTRL_ST,
0xffffffff - cycles, OMAP_TIMER_POSTED);

return 0;
@@ -103,22 +119,26 @@ static int omap2_gp_timer_set_next_event

static int omap2_gp_timer_shutdown(struct clock_event_device *evt)
{
- __omap_dm_timer_stop(&clkev, OMAP_TIMER_POSTED, clkev.rate);
+ struct omap_dm_timer *timer = to_dmtimer(evt);
+
+ __omap_dm_timer_stop(timer, OMAP_TIMER_POSTED, timer->rate);
+
return 0;
}

static int omap2_gp_timer_set_periodic(struct clock_event_device *evt)
{
+ struct omap_dm_timer *timer = to_dmtimer(evt);
u32 period;

- __omap_dm_timer_stop(&clkev, OMAP_TIMER_POSTED, clkev.rate);
+ __omap_dm_timer_stop(timer, OMAP_TIMER_POSTED, timer->rate);

- period = clkev.rate / HZ;
+ period = timer->rate / HZ;
period -= 1;
/* Looks like we need to first set the load value separately */
- __omap_dm_timer_write(&clkev, OMAP_TIMER_LOAD_REG, 0xffffffff - period,
+ __omap_dm_timer_write(timer, OMAP_TIMER_LOAD_REG, 0xffffffff - period,
OMAP_TIMER_POSTED);
- __omap_dm_timer_load_start(&clkev,
+ __omap_dm_timer_load_start(timer,
OMAP_TIMER_CTRL_AR | OMAP_TIMER_CTRL_ST,
0xffffffff - period, OMAP_TIMER_POSTED);
return 0;
@@ -132,26 +152,17 @@ static void omap_clkevt_idle(struct cloc
omap_hwmod_idle(clockevent_gpt_hwmod);
}

-static void omap_clkevt_unidle(struct clock_event_device *unused)
+static void omap_clkevt_unidle(struct clock_event_device *evt)
{
+ struct omap_dm_timer *timer = to_dmtimer(evt);
+
if (!clockevent_gpt_hwmod)
return;

omap_hwmod_enable(clockevent_gpt_hwmod);
- __omap_dm_timer_int_enable(&clkev, OMAP_TIMER_INT_OVERFLOW);
+ __omap_dm_timer_int_enable(timer, OMAP_TIMER_INT_OVERFLOW);
}

-static struct clock_event_device clockevent_gpt = {
- .features = CLOCK_EVT_FEAT_PERIODIC |
- CLOCK_EVT_FEAT_ONESHOT,
- .rating = 300,
- .set_next_event = omap2_gp_timer_set_next_event,
- .set_state_shutdown = omap2_gp_timer_shutdown,
- .set_state_periodic = omap2_gp_timer_set_periodic,
- .set_state_oneshot = omap2_gp_timer_shutdown,
- .tick_resume = omap2_gp_timer_shutdown,
-};
-
static const struct of_device_id omap_timer_match[] __initconst = {
{ .compatible = "ti,omap2420-timer", },
{ .compatible = "ti,omap3430-timer", },
@@ -361,44 +372,54 @@ static void __init omap2_gp_clockevent_i
const char *fck_source,
const char *property)
{
+ struct dmtimer_clockevent *clkevt = &clockevent;
+ struct omap_dm_timer *timer = &clkevt->timer;
int res;

- clkev.id = gptimer_id;
- clkev.errata = omap_dm_timer_get_errata();
+ timer->id = gptimer_id;
+ timer->errata = omap_dm_timer_get_errata();
+ clkevt->dev.features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT;
+ clkevt->dev.rating = 300;
+ clkevt->dev.set_next_event = omap2_gp_timer_set_next_event;
+ clkevt->dev.set_state_shutdown = omap2_gp_timer_shutdown;
+ clkevt->dev.set_state_periodic = omap2_gp_timer_set_periodic;
+ clkevt->dev.set_state_oneshot = omap2_gp_timer_shutdown;
+ clkevt->dev.tick_resume = omap2_gp_timer_shutdown;

/*
* For clock-event timers we never read the timer counter and
* so we are not impacted by errata i103 and i767. Therefore,
* we can safely ignore this errata for clock-event timers.
*/
- __omap_dm_timer_override_errata(&clkev, OMAP_TIMER_ERRATA_I103_I767);
+ __omap_dm_timer_override_errata(timer, OMAP_TIMER_ERRATA_I103_I767);

- res = omap_dm_timer_init_one(&clkev, fck_source, property,
- &clockevent_gpt.name, OMAP_TIMER_POSTED);
+ res = omap_dm_timer_init_one(timer, fck_source, property,
+ &clkevt->dev.name, OMAP_TIMER_POSTED);
BUG_ON(res);

- if (request_irq(clkev.irq, omap2_gp_timer_interrupt,
- IRQF_TIMER | IRQF_IRQPOLL, "gp_timer", &clkev))
- pr_err("Failed to request irq %d (gp_timer)\n", clkev.irq);
-
- __omap_dm_timer_int_enable(&clkev, OMAP_TIMER_INT_OVERFLOW);
-
- clockevent_gpt.cpumask = cpu_possible_mask;
- clockevent_gpt.irq = omap_dm_timer_get_irq(&clkev);
- clockevents_config_and_register(&clockevent_gpt, clkev.rate,
+ clkevt->dev.cpumask = cpu_possible_mask;
+ clkevt->dev.irq = omap_dm_timer_get_irq(timer);
+
+ if (request_irq(timer->irq, omap2_gp_timer_interrupt,
+ IRQF_TIMER | IRQF_IRQPOLL, "gp_timer", clkevt))
+ pr_err("Failed to request irq %d (gp_timer)\n", timer->irq);
+
+ __omap_dm_timer_int_enable(timer, OMAP_TIMER_INT_OVERFLOW);
+
+ clockevents_config_and_register(&clkevt->dev, timer->rate,
3, /* Timer internal resynch latency */
0xffffffff);

if (soc_is_am33xx() || soc_is_am43xx()) {
- clockevent_gpt.suspend = omap_clkevt_idle;
- clockevent_gpt.resume = omap_clkevt_unidle;
+ clkevt->dev.suspend = omap_clkevt_idle;
+ clkevt->dev.resume = omap_clkevt_unidle;

clockevent_gpt_hwmod =
- omap_hwmod_lookup(clockevent_gpt.name);
+ omap_hwmod_lookup(clkevt->dev.name);
}

- pr_info("OMAP clockevent source: %s at %lu Hz\n", clockevent_gpt.name,
- clkev.rate);
+ pr_info("OMAP clockevent source: %s at %lu Hz\n", clkevt->dev.name,
+ timer->rate);
}

/* Clocksource code */


Date: 2021-07-09 13:23:21
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 09/34] mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()

From: Hugh Dickins <[email protected]>

[ Upstream commit 22061a1ffabdb9c3385de159c5db7aac3a4df1cc ]

There is a race between THP unmapping and truncation, when truncate sees
pmd_none() and skips the entry, after munmap's zap_huge_pmd() cleared
it, but before its page_remove_rmap() gets to decrement
compound_mapcount: generating false "BUG: Bad page cache" reports that
the page is still mapped when deleted. This commit fixes that, but not
in the way I hoped.

The first attempt used try_to_unmap(page, TTU_SYNC|TTU_IGNORE_MLOCK)
instead of unmap_mapping_range() in truncate_cleanup_page(): it has
often been an annoyance that we usually call unmap_mapping_range() with
no pages locked, but here we would be applying it to a single locked page.
try_to_unmap() looks more suitable for a single locked page.

However, try_to_unmap_one() contains a VM_BUG_ON_PAGE(!pvmw.pte,page):
it is used to insert THP migration entries, but not used to unmap THPs.
Copy zap_huge_pmd() and add THP handling now? Perhaps, but their TLB
needs are different, I'm too ignorant of the DAX cases, and couldn't
decide how far to go for anon+swap. Set that aside.

The second attempt took a different tack: make no change in truncate.c,
but modify zap_huge_pmd() to insert an invalidated huge pmd instead of
clearing it initially, then pmd_clear() between page_remove_rmap() and
unlocking at the end. Nice. But powerpc blows that approach out of the
water, with its serialize_against_pte_lookup(), and interesting pgtable
usage. It would need serious help to get working on powerpc (with a
minor optimization issue on s390 too). Set that aside.

Just add an "if (page_mapped(page)) synchronize_rcu();" or other such
delay, after unmapping in truncate_cleanup_page()? Perhaps, but though
that's likely to reduce or eliminate the number of incidents, it would
give less assurance of whether we had identified the problem correctly.

This successful iteration introduces "unmap_mapping_page(page)" instead
of try_to_unmap(), and goes the usual unmap_mapping_range_tree() route,
with an addition to details. Then zap_pmd_range() watches for this
case, and does spin_unlock(pmd_lock) if so - just like
page_vma_mapped_walk() now does in the PVMW_SYNC case. Not pretty, but
safe.

Note that unmap_mapping_page() is doing a VM_BUG_ON(!PageLocked) to
assert its interface; but currently that's only used to make sure that
page->mapping is stable, and zap_pmd_range() doesn't care if the page is
locked or not. Along these lines, in invalidate_inode_pages2_range()
move the initial unmap_mapping_range() out from under page lock, before
then calling unmap_mapping_page() under page lock if still mapped.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: fc127da085c2 ("truncate: handle file thp")
Signed-off-by: Hugh Dickins <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jue Wang <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Wang Yugui <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Note on stable backport: fixed up call to truncate_cleanup_page()
in truncate_inode_pages_range(). Use hpage_nr_pages() in
unmap_mapping_page().

Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
include/linux/mm.h | 3 +++
mm/memory.c | 41 +++++++++++++++++++++++++++++++++++++++++
mm/truncate.c | 43 +++++++++++++++++++------------------------
3 files changed, 63 insertions(+), 24 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f6ecf41aea83..c736c677b876 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1338,6 +1338,7 @@ struct zap_details {
struct address_space *check_mapping; /* Check page->mapping if set */
pgoff_t first_index; /* Lowest page->index to unmap */
pgoff_t last_index; /* Highest page->index to unmap */
+ struct page *single_page; /* Locked page to be unmapped */
};

struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
@@ -1428,6 +1429,7 @@ extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
unsigned long address, unsigned int fault_flags,
bool *unlocked);
+void unmap_mapping_page(struct page *page);
void unmap_mapping_pages(struct address_space *mapping,
pgoff_t start, pgoff_t nr, bool even_cows);
void unmap_mapping_range(struct address_space *mapping,
@@ -1448,6 +1450,7 @@ static inline int fixup_user_fault(struct task_struct *tsk,
BUG();
return -EFAULT;
}
+static inline void unmap_mapping_page(struct page *page) { }
static inline void unmap_mapping_pages(struct address_space *mapping,
pgoff_t start, pgoff_t nr, bool even_cows) { }
static inline void unmap_mapping_range(struct address_space *mapping,
diff --git a/mm/memory.c b/mm/memory.c
index c2011c51f15d..49b546cdce0d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1439,7 +1439,18 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb,
else if (zap_huge_pmd(tlb, vma, pmd, addr))
goto next;
/* fall through */
+ } else if (details && details->single_page &&
+ PageTransCompound(details->single_page) &&
+ next - addr == HPAGE_PMD_SIZE && pmd_none(*pmd)) {
+ spinlock_t *ptl = pmd_lock(tlb->mm, pmd);
+ /*
+ * Take and drop THP pmd lock so that we cannot return
+ * prematurely, while zap_huge_pmd() has cleared *pmd,
+ * but not yet decremented compound_mapcount().
+ */
+ spin_unlock(ptl);
}
+
/*
* Here there can be other concurrent MADV_DONTNEED or
* trans huge page faults running, and if the pmd is
@@ -2924,6 +2935,36 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root,
}
}

+/**
+ * unmap_mapping_page() - Unmap single page from processes.
+ * @page: The locked page to be unmapped.
+ *
+ * Unmap this page from any userspace process which still has it mmaped.
+ * Typically, for efficiency, the range of nearby pages has already been
+ * unmapped by unmap_mapping_pages() or unmap_mapping_range(). But once
+ * truncation or invalidation holds the lock on a page, it may find that
+ * the page has been remapped again: and then uses unmap_mapping_page()
+ * to unmap it finally.
+ */
+void unmap_mapping_page(struct page *page)
+{
+ struct address_space *mapping = page->mapping;
+ struct zap_details details = { };
+
+ VM_BUG_ON(!PageLocked(page));
+ VM_BUG_ON(PageTail(page));
+
+ details.check_mapping = mapping;
+ details.first_index = page->index;
+ details.last_index = page->index + hpage_nr_pages(page) - 1;
+ details.single_page = page;
+
+ i_mmap_lock_write(mapping);
+ if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)))
+ unmap_mapping_range_tree(&mapping->i_mmap, &details);
+ i_mmap_unlock_write(mapping);
+}
+
/**
* unmap_mapping_pages() - Unmap pages from processes.
* @mapping: The address space containing pages to be unmapped.
diff --git a/mm/truncate.c b/mm/truncate.c
index 71b65aab8077..43c73db17a0a 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -175,13 +175,10 @@ void do_invalidatepage(struct page *page, unsigned int offset,
* its lock, b) when a concurrent invalidate_mapping_pages got there first and
* c) when tmpfs swizzles a page between a tmpfs inode and swapper_space.
*/
-static void
-truncate_cleanup_page(struct address_space *mapping, struct page *page)
+static void truncate_cleanup_page(struct page *page)
{
- if (page_mapped(page)) {
- pgoff_t nr = PageTransHuge(page) ? HPAGE_PMD_NR : 1;
- unmap_mapping_pages(mapping, page->index, nr, false);
- }
+ if (page_mapped(page))
+ unmap_mapping_page(page);

if (page_has_private(page))
do_invalidatepage(page, 0, PAGE_SIZE);
@@ -226,7 +223,7 @@ int truncate_inode_page(struct address_space *mapping, struct page *page)
if (page->mapping != mapping)
return -EIO;

- truncate_cleanup_page(mapping, page);
+ truncate_cleanup_page(page);
delete_from_page_cache(page);
return 0;
}
@@ -364,7 +361,7 @@ void truncate_inode_pages_range(struct address_space *mapping,
pagevec_add(&locked_pvec, page);
}
for (i = 0; i < pagevec_count(&locked_pvec); i++)
- truncate_cleanup_page(mapping, locked_pvec.pages[i]);
+ truncate_cleanup_page(locked_pvec.pages[i]);
delete_from_page_cache_batch(mapping, &locked_pvec);
for (i = 0; i < pagevec_count(&locked_pvec); i++)
unlock_page(locked_pvec.pages[i]);
@@ -703,6 +700,16 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
continue;
}

+ if (!did_range_unmap && page_mapped(page)) {
+ /*
+ * If page is mapped, before taking its lock,
+ * zap the rest of the file in one hit.
+ */
+ unmap_mapping_pages(mapping, index,
+ (1 + end - index), false);
+ did_range_unmap = 1;
+ }
+
lock_page(page);
WARN_ON(page_to_index(page) != index);
if (page->mapping != mapping) {
@@ -710,23 +717,11 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
continue;
}
wait_on_page_writeback(page);
- if (page_mapped(page)) {
- if (!did_range_unmap) {
- /*
- * Zap the rest of the file in one hit.
- */
- unmap_mapping_pages(mapping, index,
- (1 + end - index), false);
- did_range_unmap = 1;
- } else {
- /*
- * Just zap this page
- */
- unmap_mapping_pages(mapping, index,
- 1, false);
- }
- }
+
+ if (page_mapped(page))
+ unmap_mapping_page(page);
BUG_ON(page_mapped(page));
+
ret2 = do_launder_page(mapping, page);
if (ret2 == 0) {
if (!invalidate_complete_page2(mapping, page))
--
2.30.2



Date: 2021-07-09 13:23:23
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 31/34] ARM: OMAP: replace setup_irq() by request_irq()

From: afzal mohammed <[email protected]>

commit b75ca5217743e4d7076cf65e044e88389e44318d upstream.

request_irq() is preferred over setup_irq(). Invocations of setup_irq()
occur after memory allocators are ready.

Per tglx[1], setup_irq() existed in olden days when allocators were not
ready by the time early interrupts were initialized.

Hence replace setup_irq() by request_irq().
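
The conversion pattern used throughout is roughly the following
(a condensed sketch with hypothetical names, not code from this patch):

	/* Before: a static struct irqaction registered via setup_irq() */
	static struct irqaction foo_irq = {
		.name		= "foo",
		.flags		= IRQF_TIMER,
		.handler	= foo_interrupt,
	};
	setup_irq(FOO_IRQ, &foo_irq);

	/* After: a plain request_irq(), with an error message on failure */
	if (request_irq(FOO_IRQ, foo_interrupt, IRQF_TIMER, "foo", NULL))
		pr_err("Failed to request irq %d (foo)\n", FOO_IRQ);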

[1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanos

Cc: Daniel Lezcano <[email protected]>
Cc: Keerthy <[email protected]>
Cc: Tero Kristo <[email protected]>
Signed-off-by: afzal mohammed <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/mach-omap1/pm.c | 13 ++++++-------
arch/arm/mach-omap1/time.c | 10 +++-------
arch/arm/mach-omap1/timer32k.c | 10 +++-------
arch/arm/mach-omap2/timer.c | 11 +++--------
4 files changed, 15 insertions(+), 29 deletions(-)

--- a/arch/arm/mach-omap1/pm.c
+++ b/arch/arm/mach-omap1/pm.c
@@ -610,11 +610,6 @@ static irqreturn_t omap_wakeup_interrupt
return IRQ_HANDLED;
}

-static struct irqaction omap_wakeup_irq = {
- .name = "peripheral wakeup",
- .handler = omap_wakeup_interrupt
-};
-


static const struct platform_suspend_ops omap_pm_ops = {
@@ -627,6 +622,7 @@ static const struct platform_suspend_ops
static int __init omap_pm_init(void)
{
int error = 0;
+ int irq;

if (!cpu_class_is_omap1())
return -ENODEV;
@@ -670,9 +666,12 @@ static int __init omap_pm_init(void)
arm_pm_idle = omap1_pm_idle;

if (cpu_is_omap7xx())
- setup_irq(INT_7XX_WAKE_UP_REQ, &omap_wakeup_irq);
+ irq = INT_7XX_WAKE_UP_REQ;
else if (cpu_is_omap16xx())
- setup_irq(INT_1610_WAKE_UP_REQ, &omap_wakeup_irq);
+ irq = INT_1610_WAKE_UP_REQ;
+ if (request_irq(irq, omap_wakeup_interrupt, 0, "peripheral wakeup",
+ NULL))
+ pr_err("Failed to request irq %d (peripheral wakeup)\n", irq);

/* Program new power ramp-up time
* (0 for most boards since we don't lower voltage when in deep sleep)
--- a/arch/arm/mach-omap1/time.c
+++ b/arch/arm/mach-omap1/time.c
@@ -155,15 +155,11 @@ static irqreturn_t omap_mpu_timer1_inter
return IRQ_HANDLED;
}

-static struct irqaction omap_mpu_timer1_irq = {
- .name = "mpu_timer1",
- .flags = IRQF_TIMER | IRQF_IRQPOLL,
- .handler = omap_mpu_timer1_interrupt,
-};
-
static __init void omap_init_mpu_timer(unsigned long rate)
{
- setup_irq(INT_TIMER1, &omap_mpu_timer1_irq);
+ if (request_irq(INT_TIMER1, omap_mpu_timer1_interrupt,
+ IRQF_TIMER | IRQF_IRQPOLL, "mpu_timer1", NULL))
+ pr_err("Failed to request irq %d (mpu_timer1)\n", INT_TIMER1);
omap_mpu_timer_start(0, (rate / HZ) - 1, 1);

clockevent_mpu_timer1.cpumask = cpumask_of(0);
--- a/arch/arm/mach-omap1/timer32k.c
+++ b/arch/arm/mach-omap1/timer32k.c
@@ -148,15 +148,11 @@ static irqreturn_t omap_32k_timer_interr
return IRQ_HANDLED;
}

-static struct irqaction omap_32k_timer_irq = {
- .name = "32KHz timer",
- .flags = IRQF_TIMER | IRQF_IRQPOLL,
- .handler = omap_32k_timer_interrupt,
-};
-
static __init void omap_init_32k_timer(void)
{
- setup_irq(INT_OS_TIMER, &omap_32k_timer_irq);
+ if (request_irq(INT_OS_TIMER, omap_32k_timer_interrupt,
+ IRQF_TIMER | IRQF_IRQPOLL, "32KHz timer", NULL))
+ pr_err("Failed to request irq %d(32KHz timer)\n", INT_OS_TIMER);

clockevent_32k_timer.cpumask = cpumask_of(0);
clockevents_config_and_register(&clockevent_32k_timer,
--- a/arch/arm/mach-omap2/timer.c
+++ b/arch/arm/mach-omap2/timer.c
@@ -92,12 +92,6 @@ static irqreturn_t omap2_gp_timer_interr
return IRQ_HANDLED;
}

-static struct irqaction omap2_gp_timer_irq = {
- .name = "gp_timer",
- .flags = IRQF_TIMER | IRQF_IRQPOLL,
- .handler = omap2_gp_timer_interrupt,
-};
-
static int omap2_gp_timer_set_next_event(unsigned long cycles,
struct clock_event_device *evt)
{
@@ -383,8 +377,9 @@ static void __init omap2_gp_clockevent_i
&clockevent_gpt.name, OMAP_TIMER_POSTED);
BUG_ON(res);

- omap2_gp_timer_irq.dev_id = &clkev;
- setup_irq(clkev.irq, &omap2_gp_timer_irq);
+ if (request_irq(clkev.irq, omap2_gp_timer_interrupt,
+ IRQF_TIMER | IRQF_IRQPOLL, "gp_timer", &clkev))
+ pr_err("Failed to request irq %d (gp_timer)\n", clkev.irq);

__omap_dm_timer_int_enable(&clkev, OMAP_TIMER_INT_OVERFLOW);



Date: 2021-07-09 13:23:23
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 07/34] mm/thp: fix vma_address() if virtual address below file offset

From: Hugh Dickins <[email protected]>

[ Upstream commit 494334e43c16d63b878536a26505397fce6ff3a2 ]

Running certain tests with a DEBUG_VM kernel would crash within hours,
on the total_mapcount BUG() in split_huge_page_to_list(), while trying
to free up some memory by punching a hole in a shmem huge page: split's
try_to_unmap() was unable to find all the mappings of the page (which,
on a !DEBUG_VM kernel, would then keep the huge page pinned in memory).

When that BUG() was changed to a WARN(), it would later crash on the
VM_BUG_ON_VMA(end < vma->vm_start || start >= vma->vm_end, vma) in
mm/internal.h:vma_address(), used by rmap_walk_file() for
try_to_unmap().

vma_address() is usually correct, but there's a wraparound case when the
vm_start address is unusually low, but vm_pgoff not so low:
vma_address() chooses max(start, vma->vm_start), but that decides on the
wrong address, because start has become almost ULONG_MAX.
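
To make the wraparound concrete, here is a worked example (the numbers
are invented for illustration; 64-bit, 4kB pages):

	vma->vm_start = 0x1000, vma->vm_pgoff = 0x200,
	page_to_pgoff(page) = 0x100

	start = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT)
	      = 0x1000 + ((0x100 - 0x200) << 12)	/* pgoff_t underflows */
	      = 0x1000 + 0xfffffffffff00000
	      = 0xfffffffffff01000

	max(start, vma->vm_start) = 0xfffffffffff01000	/* nearly ULONG_MAX */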

Rewrite vma_address() to be more careful about vm_pgoff; move the
VM_BUG_ON_VMA() out of it, returning -EFAULT for errors, so that it can
be safely used from page_mapped_in_vma() and page_address_in_vma() too.

Add vma_address_end() to apply similar care to end address calculation,
in page_vma_mapped_walk() and page_mkclean_one() and try_to_unmap_one();
though it raises a question of whether callers would do better to supply
pvmw->end to page_vma_mapped_walk() - I chose not to, for a smaller patch.

An irritation is that their apparent generality breaks down on KSM
pages, which cannot be located by the page->index that page_to_pgoff()
uses: as commit 4b0ece6fa016 ("mm: migrate: fix remove_migration_pte()
for ksm pages") once discovered. I dithered over the best thing to do
about that, and have ended up with a VM_BUG_ON_PAGE(PageKsm) in both
vma_address() and vma_address_end(); though the only place in danger of
using it on them was try_to_unmap_one().

Sidenote: vma_address() and vma_address_end() now use compound_nr() on a
head page, instead of thp_size(): to make the right calculation on a
hugetlbfs page, whether or not THPs are configured. try_to_unmap() is
used on hugetlbfs pages, but perhaps the wrong calculation never
mattered.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: a8fa41ad2f6f ("mm, rmap: check all VMAs that PTE-mapped THP can be part of")
Signed-off-by: Hugh Dickins <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jue Wang <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Wang Yugui <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Note on stable backport: fixed up conflicts on intervening thp_size(),
and mmu_notifier_range initializations; substitute for compound_nr().

Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
mm/internal.h | 53 ++++++++++++++++++++++++++++++++------------
mm/page_vma_mapped.c | 16 +++++--------
mm/rmap.c | 14 ++++++------
3 files changed, 52 insertions(+), 31 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 397183c8fe47..3a2e973138d3 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -331,27 +331,52 @@ static inline void mlock_migrate_page(struct page *newpage, struct page *page)
extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);

/*
- * At what user virtual address is page expected in @vma?
+ * At what user virtual address is page expected in vma?
+ * Returns -EFAULT if all of the page is outside the range of vma.
+ * If page is a compound head, the entire compound page is considered.
*/
static inline unsigned long
-__vma_address(struct page *page, struct vm_area_struct *vma)
+vma_address(struct page *page, struct vm_area_struct *vma)
{
- pgoff_t pgoff = page_to_pgoff(page);
- return vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+ pgoff_t pgoff;
+ unsigned long address;
+
+ VM_BUG_ON_PAGE(PageKsm(page), page); /* KSM page->index unusable */
+ pgoff = page_to_pgoff(page);
+ if (pgoff >= vma->vm_pgoff) {
+ address = vma->vm_start +
+ ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+ /* Check for address beyond vma (or wrapped through 0?) */
+ if (address < vma->vm_start || address >= vma->vm_end)
+ address = -EFAULT;
+ } else if (PageHead(page) &&
+ pgoff + (1UL << compound_order(page)) - 1 >= vma->vm_pgoff) {
+ /* Test above avoids possibility of wrap to 0 on 32-bit */
+ address = vma->vm_start;
+ } else {
+ address = -EFAULT;
+ }
+ return address;
}

+/*
+ * Then at what user virtual address will none of the page be found in vma?
+ * Assumes that vma_address() already returned a good starting address.
+ * If page is a compound head, the entire compound page is considered.
+ */
static inline unsigned long
-vma_address(struct page *page, struct vm_area_struct *vma)
+vma_address_end(struct page *page, struct vm_area_struct *vma)
{
- unsigned long start, end;
-
- start = __vma_address(page, vma);
- end = start + PAGE_SIZE * (hpage_nr_pages(page) - 1);
-
- /* page should be within @vma mapping range */
- VM_BUG_ON_VMA(end < vma->vm_start || start >= vma->vm_end, vma);
-
- return max(start, vma->vm_start);
+ pgoff_t pgoff;
+ unsigned long address;
+
+ VM_BUG_ON_PAGE(PageKsm(page), page); /* KSM page->index unusable */
+ pgoff = page_to_pgoff(page) + (1UL << compound_order(page));
+ address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+ /* Check for address beyond vma (or wrapped through 0?) */
+ if (address < vma->vm_start || address > vma->vm_end)
+ address = vma->vm_end;
+ return address;
}

#else /* !CONFIG_MMU */
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 08e283ad4660..bb7488e86bea 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -224,18 +224,18 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
if (!map_pte(pvmw))
goto next_pte;
while (1) {
+ unsigned long end;
+
if (check_pte(pvmw))
return true;
next_pte:
/* Seek to next pte only makes sense for THP */
if (!PageTransHuge(pvmw->page) || PageHuge(pvmw->page))
return not_found(pvmw);
+ end = vma_address_end(pvmw->page, pvmw->vma);
do {
pvmw->address += PAGE_SIZE;
- if (pvmw->address >= pvmw->vma->vm_end ||
- pvmw->address >=
- __vma_address(pvmw->page, pvmw->vma) +
- hpage_nr_pages(pvmw->page) * PAGE_SIZE)
+ if (pvmw->address >= end)
return not_found(pvmw);
/* Did we cross page table boundary? */
if (pvmw->address % PMD_SIZE == 0) {
@@ -273,14 +273,10 @@ int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma)
.vma = vma,
.flags = PVMW_SYNC,
};
- unsigned long start, end;
-
- start = __vma_address(page, vma);
- end = start + PAGE_SIZE * (hpage_nr_pages(page) - 1);

- if (unlikely(end < vma->vm_start || start >= vma->vm_end))
+ pvmw.address = vma_address(page, vma);
+ if (pvmw.address == -EFAULT)
return 0;
- pvmw.address = max(start, vma->vm_start);
if (!page_vma_mapped_walk(&pvmw))
return 0;
page_vma_mapped_walk_done(&pvmw);
diff --git a/mm/rmap.c b/mm/rmap.c
index 5df055654e63..ff56c460af53 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -686,7 +686,6 @@ static bool should_defer_flush(struct mm_struct *mm, enum ttu_flags flags)
*/
unsigned long page_address_in_vma(struct page *page, struct vm_area_struct *vma)
{
- unsigned long address;
if (PageAnon(page)) {
struct anon_vma *page__anon_vma = page_anon_vma(page);
/*
@@ -701,10 +700,8 @@ unsigned long page_address_in_vma(struct page *page, struct vm_area_struct *vma)
return -EFAULT;
} else
return -EFAULT;
- address = __vma_address(page, vma);
- if (unlikely(address < vma->vm_start || address >= vma->vm_end))
- return -EFAULT;
- return address;
+
+ return vma_address(page, vma);
}

pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
@@ -896,7 +893,7 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
* We have to assume the worse case ie pmd for invalidation. Note that
* the page can not be free from this function.
*/
- end = min(vma->vm_end, start + (PAGE_SIZE << compound_order(page)));
+ end = vma_address_end(page, vma);
mmu_notifier_invalidate_range_start(vma->vm_mm, start, end);

while (page_vma_mapped_walk(&pvmw)) {
@@ -1378,7 +1375,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
* Note that the page can not be free in this function as call of
* try_to_unmap() must hold a reference on the page.
*/
- end = min(vma->vm_end, start + (PAGE_SIZE << compound_order(page)));
+ end = PageKsm(page) ?
+ address + PAGE_SIZE : vma_address_end(page, vma);
if (PageHuge(page)) {
/*
* If sharing is possible, start and end will be adjusted
@@ -1835,6 +1833,7 @@ static void rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc,
struct vm_area_struct *vma = avc->vma;
unsigned long address = vma_address(page, vma);

+ VM_BUG_ON_VMA(address == -EFAULT, vma);
cond_resched();

if (rwc->invalid_vma && rwc->invalid_vma(vma, rwc->arg))
@@ -1889,6 +1888,7 @@ static void rmap_walk_file(struct page *page, struct rmap_walk_control *rwc,
pgoff_start, pgoff_end) {
unsigned long address = vma_address(page, vma);

+ VM_BUG_ON_VMA(address == -EFAULT, vma);
cond_resched();

if (rwc->invalid_vma && rwc->invalid_vma(vma, rwc->arg))
--
2.30.2



Date: 2021-07-09 13:23:34
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 29/34] xen/events: reset active flag for lateeoi events later

From: Juergen Gross <[email protected]>

commit 3de218ff39b9e3f0d453fe3154f12a174de44b25 upstream.

In order to avoid a race condition for user events when changing
cpu affinity, reset the active flag only when EOI-ing the event.

This is working fine as all user events are lateeoi events. Note that
lateeoi_ack_mask_dynirq() is not modified as there is no explicit call
to xen_irq_lateeoi() expected later.
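
For context, a lateeoi consumer follows roughly this pattern (a sketch
with made-up names; the in-tree user affected here is the evtchn
device):

	/* bind as lateeoi: the EOI is sent explicitly, not by the ack */
	irq = bind_evtchn_to_irqhandler_lateeoi(evtchn, handler, 0,
						"foo", dev);
	...
	/* later, once the event has really been handled: */
	xen_irq_lateeoi(irq, 0);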

Cc: [email protected]
Reported-by: Julien Grall <[email protected]>
Fixes: b6622798bc50b62 ("xen/events: avoid handling the same event on two cpus at the same time")
Tested-by: Julien Grall <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Juergen Gross <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/xen/events/events_base.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)

--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -524,6 +524,9 @@ static void xen_irq_lateeoi_locked(struc
}

info->eoi_time = 0;
+
+ /* is_active hasn't been reset yet, do it now. */
+ smp_store_release(&info->is_active, 0);
do_unmask(info, EVT_MASK_REASON_EOI_PENDING);
}

@@ -1780,10 +1783,22 @@ static void lateeoi_ack_dynirq(struct ir
struct irq_info *info = info_for_irq(data->irq);
evtchn_port_t evtchn = info ? info->evtchn : 0;

- if (VALID_EVTCHN(evtchn)) {
- do_mask(info, EVT_MASK_REASON_EOI_PENDING);
- ack_dynirq(data);
- }
+ if (!VALID_EVTCHN(evtchn))
+ return;
+
+ do_mask(info, EVT_MASK_REASON_EOI_PENDING);
+
+ if (unlikely(irqd_is_setaffinity_pending(data)) &&
+ likely(!irqd_irq_disabled(data))) {
+ do_mask(info, EVT_MASK_REASON_TEMPORARY);
+
+ clear_evtchn(evtchn);
+
+ irq_move_masked_irq(data);
+
+ do_unmask(info, EVT_MASK_REASON_TEMPORARY);
+ } else
+ clear_evtchn(evtchn);
}

static void lateeoi_mask_ack_dynirq(struct irq_data *data)


Date: 2021-07-09 13:23:42
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 16/34] mm: page_vma_mapped_walk(): add a level of indentation

From: Hugh Dickins <[email protected]>

[ Upstream commit b3807a91aca7d21c05d5790612e49969117a72b9 ]

page_vma_mapped_walk() cleanup: add a level of indentation to much of
the body, making no functional change in this commit, but reducing the
later diff when this is all converted to a loop.

[[email protected]: page_vma_mapped_walk(): add a level of indentation fix]
Link: https://lkml.kernel.org/r/[email protected]

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Hugh Dickins <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Wang Yugui <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
mm/page_vma_mapped.c | 105 ++++++++++++++++++++++---------------------
1 file changed, 55 insertions(+), 50 deletions(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 1fccadd0eb6d..819945678264 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -169,62 +169,67 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
if (pvmw->pte)
goto next_pte;
restart:
- pgd = pgd_offset(mm, pvmw->address);
- if (!pgd_present(*pgd))
- return false;
- p4d = p4d_offset(pgd, pvmw->address);
- if (!p4d_present(*p4d))
- return false;
- pud = pud_offset(p4d, pvmw->address);
- if (!pud_present(*pud))
- return false;
- pvmw->pmd = pmd_offset(pud, pvmw->address);
- /*
- * Make sure the pmd value isn't cached in a register by the
- * compiler and used as a stale value after we've observed a
- * subsequent update.
- */
- pmde = READ_ONCE(*pvmw->pmd);
- if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
- pvmw->ptl = pmd_lock(mm, pvmw->pmd);
- pmde = *pvmw->pmd;
- if (likely(pmd_trans_huge(pmde))) {
- if (pvmw->flags & PVMW_MIGRATION)
- return not_found(pvmw);
- if (pmd_page(pmde) != page)
- return not_found(pvmw);
- return true;
- }
- if (!pmd_present(pmde)) {
- swp_entry_t entry;
+ {
+ pgd = pgd_offset(mm, pvmw->address);
+ if (!pgd_present(*pgd))
+ return false;
+ p4d = p4d_offset(pgd, pvmw->address);
+ if (!p4d_present(*p4d))
+ return false;
+ pud = pud_offset(p4d, pvmw->address);
+ if (!pud_present(*pud))
+ return false;

- if (!thp_migration_supported() ||
- !(pvmw->flags & PVMW_MIGRATION))
- return not_found(pvmw);
- entry = pmd_to_swp_entry(pmde);
- if (!is_migration_entry(entry) ||
- migration_entry_to_page(entry) != page)
- return not_found(pvmw);
- return true;
- }
- /* THP pmd was split under us: handle on pte level */
- spin_unlock(pvmw->ptl);
- pvmw->ptl = NULL;
- } else if (!pmd_present(pmde)) {
+ pvmw->pmd = pmd_offset(pud, pvmw->address);
/*
- * If PVMW_SYNC, take and drop THP pmd lock so that we
- * cannot return prematurely, while zap_huge_pmd() has
- * cleared *pmd but not decremented compound_mapcount().
+ * Make sure the pmd value isn't cached in a register by the
+ * compiler and used as a stale value after we've observed a
+ * subsequent update.
*/
- if ((pvmw->flags & PVMW_SYNC) && PageTransCompound(page)) {
- spinlock_t *ptl = pmd_lock(mm, pvmw->pmd);
+ pmde = READ_ONCE(*pvmw->pmd);

- spin_unlock(ptl);
+ if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
+ pvmw->ptl = pmd_lock(mm, pvmw->pmd);
+ pmde = *pvmw->pmd;
+ if (likely(pmd_trans_huge(pmde))) {
+ if (pvmw->flags & PVMW_MIGRATION)
+ return not_found(pvmw);
+ if (pmd_page(pmde) != page)
+ return not_found(pvmw);
+ return true;
+ }
+ if (!pmd_present(pmde)) {
+ swp_entry_t entry;
+
+ if (!thp_migration_supported() ||
+ !(pvmw->flags & PVMW_MIGRATION))
+ return not_found(pvmw);
+ entry = pmd_to_swp_entry(pmde);
+ if (!is_migration_entry(entry) ||
+ migration_entry_to_page(entry) != page)
+ return not_found(pvmw);
+ return true;
+ }
+ /* THP pmd was split under us: handle on pte level */
+ spin_unlock(pvmw->ptl);
+ pvmw->ptl = NULL;
+ } else if (!pmd_present(pmde)) {
+ /*
+ * If PVMW_SYNC, take and drop THP pmd lock so that we
+ * cannot return prematurely, while zap_huge_pmd() has
+ * cleared *pmd but not decremented compound_mapcount().
+ */
+ if ((pvmw->flags & PVMW_SYNC) &&
+ PageTransCompound(page)) {
+ spinlock_t *ptl = pmd_lock(mm, pvmw->pmd);
+
+ spin_unlock(ptl);
+ }
+ return false;
}
- return false;
+ if (!map_pte(pvmw))
+ goto next_pte;
}
- if (!map_pte(pvmw))
- goto next_pte;
while (1) {
unsigned long end;

--
2.30.2



Date: 2021-07-09 13:23:48
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 18/34] mm: page_vma_mapped_walk(): get vma_address_end() earlier

From: Hugh Dickins <[email protected]>

[ Upstream commit a765c417d876cc635f628365ec9aa6f09470069a ]

page_vma_mapped_walk() cleanup: get THP's vma_address_end() at the
start, rather than later at next_pte.

It's a little unnecessary overhead on the first call, but makes for a
simpler loop in the following commit.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Hugh Dickins <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Wang Yugui <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
mm/page_vma_mapped.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 470f19c917ad..7239c38e99e2 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -167,6 +167,15 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
return true;
}

+ /*
+ * Seek to next pte only makes sense for THP.
+ * But more important than that optimization, is to filter out
+ * any PageKsm page: whose page->index misleads vma_address()
+ * and vma_address_end() to disaster.
+ */
+ end = PageTransCompound(page) ?
+ vma_address_end(page, pvmw->vma) :
+ pvmw->address + PAGE_SIZE;
if (pvmw->pte)
goto next_pte;
restart:
@@ -234,10 +243,6 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
if (check_pte(pvmw))
return true;
next_pte:
- /* Seek to next pte only makes sense for THP */
- if (!PageTransHuge(page))
- return not_found(pvmw);
- end = vma_address_end(page, pvmw->vma);
do {
pvmw->address += PAGE_SIZE;
if (pvmw->address >= end)
--
2.30.2



Date: 2021-07-09 13:23:49
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 23/34] drm/nouveau: fix dma_address check for CPU/GPU sync

From: Christian König <[email protected]>

[ Upstream commit d330099115597bbc238d6758a4930e72b49ea9ba ]

AGP for example doesn't have a dma_address array.

Signed-off-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_bo.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 7214022dfb91..d230536e7086 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -512,7 +512,7 @@ nouveau_bo_sync_for_device(struct nouveau_bo *nvbo)
struct ttm_dma_tt *ttm_dma = (struct ttm_dma_tt *)nvbo->bo.ttm;
int i;

- if (!ttm_dma)
+ if (!ttm_dma || !ttm_dma->dma_address)
return;

/* Don't waste time looping if the object is coherent */
@@ -532,7 +532,7 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo)
struct ttm_dma_tt *ttm_dma = (struct ttm_dma_tt *)nvbo->bo.ttm;
int i;

- if (!ttm_dma)
+ if (!ttm_dma || !ttm_dma->dma_address)
return;

/* Don't waste time looping if the object is coherent */
--
2.30.2



Date: 2021-07-09 13:23:52
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 20/34] mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()

From: Hugh Dickins <[email protected]>

[ Upstream commit a7a69d8ba88d8dcee7ef00e91d413a4bd003a814 ]

Aha! Shouldn't that quick scan over pte_none()s make sure that it holds
ptlock in the PVMW_SYNC case? That too might have been responsible for
BUGs or WARNs in split_huge_page_to_list() or its unmap_page(), though
I've never seen any.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/linux-mm/[email protected]/
Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
Signed-off-by: Hugh Dickins <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Tested-by: Wang Yugui <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
mm/page_vma_mapped.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index b7a8009c1549..edca78609318 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -272,6 +272,10 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
goto restart;
}
pvmw->pte++;
+ if ((pvmw->flags & PVMW_SYNC) && !pvmw->ptl) {
+ pvmw->ptl = pte_lockptr(mm, pvmw->pmd);
+ spin_lock(pvmw->ptl);
+ }
} while (pte_none(*pvmw->pte));

if (!pvmw->ptl) {
--
2.30.2



Date: 2021-07-09 13:23:59
From: Greg Kroah-Hartman

Subject: [PATCH 4.19 30/34] KVM: SVM: Call SEV Guest Decommission if ASID binding fails

From: Alper Gun <[email protected]>

commit 934002cd660b035b926438244b4294e647507e13 upstream.

Send SEV_CMD_DECOMMISSION command to PSP firmware if ASID binding
fails. If a failure happens after a successful LAUNCH_START command,
a decommission command should be executed. Otherwise, the guest context
is left allocated inside the AMD SP. Once the firmware no longer has
memory to allocate more SEV guest contexts, the LAUNCH_START command
will begin to fail with the SEV_RET_RESOURCE_LIMIT error.

The existing code calls decommission inside sev_unbind_asid, but it is
not called if a failure happens before guest activation succeeds. If
sev_bind_asid fails, decommission is never called. PSP firmware has a
limit for the number of guests. If sev_bind_asid fails many times,
PSP firmware will not have resources to create another guest context.

Cc: [email protected]
Fixes: 59414c989220 ("KVM: SVM: Add support for KVM_SEV_LAUNCH_START command")
Reported-by: Peter Gonda <[email protected]>
Signed-off-by: Alper Gun <[email protected]>
Reviewed-by: Marc Orr <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/svm.c | 32 +++++++++++++++++++++-----------
1 file changed, 21 insertions(+), 11 deletions(-)

--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1791,9 +1791,25 @@ static void sev_asid_free(struct kvm *kv
__sev_asid_free(sev->asid);
}

-static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
+static void sev_decommission(unsigned int handle)
{
struct sev_data_decommission *decommission;
+
+ if (!handle)
+ return;
+
+ decommission = kzalloc(sizeof(*decommission), GFP_KERNEL);
+ if (!decommission)
+ return;
+
+ decommission->handle = handle;
+ sev_guest_decommission(decommission, NULL);
+
+ kfree(decommission);
+}
+
+static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
+{
struct sev_data_deactivate *data;

if (!handle)
@@ -1811,15 +1827,7 @@ static void sev_unbind_asid(struct kvm *
sev_guest_df_flush(NULL);
kfree(data);

- decommission = kzalloc(sizeof(*decommission), GFP_KERNEL);
- if (!decommission)
- return;
-
- /* decommission handle */
- decommission->handle = handle;
- sev_guest_decommission(decommission, NULL);
-
- kfree(decommission);
+ sev_decommission(handle);
}

static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
@@ -6469,8 +6477,10 @@ static int sev_launch_start(struct kvm *

/* Bind ASID to this guest */
ret = sev_bind_asid(kvm, start->handle, error);
- if (ret)
+ if (ret) {
+ sev_decommission(start->handle);
goto e_free_session;
+ }

/* return handle to userspace */
params.handle = start->handle;
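
The ordering matters here: the handle returned by LAUNCH_START already
pins a guest context in firmware, so the bind-failure path has to
decommission that handle before bailing out. A minimal userspace
sketch of the error path follows; the fw_*() stubs are hypothetical
stand-ins for the PSP commands (LAUNCH_START, ACTIVATE/bind,
DECOMMISSION), not the real KVM or PSP APIs:

        #include <stdio.h>

        static int fw_launch_start(unsigned int *handle)
        {
                *handle = 42;   /* firmware allocates a guest context */
                return 0;
        }

        static int fw_bind_asid(unsigned int handle)
        {
                (void)handle;
                return -1;      /* simulate the failing ACTIVATE/bind */
        }

        static void fw_decommission(unsigned int handle)
        {
                if (!handle)
                        return;
                printf("decommission handle %u (context freed in firmware)\n",
                       handle);
        }

        int main(void)
        {
                unsigned int handle = 0;

                if (fw_launch_start(&handle))
                        return 1;

                if (fw_bind_asid(handle)) {
                        /* Without this call the firmware-side context
                         * leaks until the resource limit is hit. */
                        fw_decommission(handle);
                        return 1;
                }
                return 0;
        }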


2021-07-09 13:24:11

by Greg Kroah-Hartman

Subject: [PATCH 4.19 27/34] kthread_worker: split code for canceling the delayed work timer

From: Petr Mladek <[email protected]>

commit 34b3d5344719d14fd2185b2d9459b3abcb8cf9d8 upstream.

Patch series "kthread_worker: Fix race between kthread_mod_delayed_work()
and kthread_cancel_delayed_work_sync()".

This patchset fixes the race between kthread_mod_delayed_work() and
kthread_cancel_delayed_work_sync(), including proper return value
handling.

This patch (of 2):

Simple code refactoring as a preparation step for fixing a race between
kthread_mod_delayed_work() and kthread_cancel_delayed_work_sync().

It does not modify the existing behavior.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Petr Mladek <[email protected]>
Cc: <[email protected]>
Cc: Martin Liu <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/kthread.c | 46 +++++++++++++++++++++++++++++-----------------
1 file changed, 29 insertions(+), 17 deletions(-)

--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1010,6 +1010,33 @@ void kthread_flush_work(struct kthread_w
EXPORT_SYMBOL_GPL(kthread_flush_work);

/*
+ * Make sure that the timer is neither set nor running and could
+ * not manipulate the work list_head any longer.
+ *
+ * The function is called under worker->lock. The lock is temporary
+ * released but the timer can't be set again in the meantime.
+ */
+static void kthread_cancel_delayed_work_timer(struct kthread_work *work,
+ unsigned long *flags)
+{
+ struct kthread_delayed_work *dwork =
+ container_of(work, struct kthread_delayed_work, work);
+ struct kthread_worker *worker = work->worker;
+
+ /*
+ * del_timer_sync() must be called to make sure that the timer
+ * callback is not running. The lock must be temporary released
+ * to avoid a deadlock with the callback. In the meantime,
+ * any queuing is blocked by setting the canceling counter.
+ */
+ work->canceling++;
+ spin_unlock_irqrestore(&worker->lock, *flags);
+ del_timer_sync(&dwork->timer);
+ spin_lock_irqsave(&worker->lock, *flags);
+ work->canceling--;
+}
+
+/*
* This function removes the work from the worker queue. Also it makes sure
* that it won't get queued later via the delayed work's timer.
*
@@ -1023,23 +1050,8 @@ static bool __kthread_cancel_work(struct
unsigned long *flags)
{
/* Try to cancel the timer if exists. */
- if (is_dwork) {
- struct kthread_delayed_work *dwork =
- container_of(work, struct kthread_delayed_work, work);
- struct kthread_worker *worker = work->worker;
-
- /*
- * del_timer_sync() must be called to make sure that the timer
- * callback is not running. The lock must be temporary released
- * to avoid a deadlock with the callback. In the meantime,
- * any queuing is blocked by setting the canceling counter.
- */
- work->canceling++;
- spin_unlock_irqrestore(&worker->lock, *flags);
- del_timer_sync(&dwork->timer);
- spin_lock_irqsave(&worker->lock, *flags);
- work->canceling--;
- }
+ if (is_dwork)
+ kthread_cancel_delayed_work_timer(work, flags);

/*
* Try to remove the work from a worker list. It might either
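
The factored-out helper encodes a general pattern: a synchronous timer
cancel must not run under the lock the timer callback itself takes, so
the lock is dropped around the cancel while a canceling counter keeps
anyone from re-arming the timer in the window. A compressed userspace
sketch of the same shape, assuming a hypothetical timer_cancel_sync()
in place of del_timer_sync():

        #include <pthread.h>

        struct work {
                pthread_mutex_t lock;
                int canceling;  /* non-zero: re-arming is refused */
                int armed;
        };

        static void timer_cancel_sync(struct work *w)
        {
                /* Stand-in for del_timer_sync(): waits until any running
                 * callback has finished and disarms the timer. Must not
                 * be called with w->lock held, or a callback taking the
                 * same lock could deadlock against us. */
                w->armed = 0;
        }

        /* Callback path: refuses to re-arm while a cancel is in flight. */
        static void timer_fn(struct work *w)
        {
                pthread_mutex_lock(&w->lock);
                if (!w->canceling)
                        w->armed = 1;
                pthread_mutex_unlock(&w->lock);
        }

        static void cancel_work_timer(struct work *w)
        {
                pthread_mutex_lock(&w->lock);
                w->canceling++;
                /* Drop the lock so the callback cannot deadlock on it. */
                pthread_mutex_unlock(&w->lock);
                timer_cancel_sync(w);
                pthread_mutex_lock(&w->lock);
                w->canceling--;
                pthread_mutex_unlock(&w->lock);
        }

        int main(void)
        {
                struct work w = { .lock = PTHREAD_MUTEX_INITIALIZER };

                timer_fn(&w);           /* arm the "timer" */
                cancel_work_timer(&w);  /* cancel it safely */
                return w.armed;         /* 0: cancel succeeded */
        }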


2021-07-09 13:24:30

by Greg Kroah-Hartman

Subject: [PATCH 4.19 34/34] clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940

From: Tony Lindgren <[email protected]>

commit 25de4ce5ed02994aea8bc111d133308f6fd62566 upstream.

There is a timer wrap issue on dra7 for the ARM architected timer.
In a typical clock configuration the timer fails to wrap after 388 days.

To work around the issue, we need to use timer-ti-dm percpu timers instead.

Let's configure dmtimer3 and 4 as percpu timers by default, and warn about
the issue if the dtb is not configured properly.

For more information, please see the errata for "AM572x Sitara Processors
Silicon Revisions 1.1, 2.0":

https://www.ti.com/lit/er/sprz429m/sprz429m.pdf

The concept is based on earlier reference patches done by Tero Kristo and
Keerthy.

Cc: Daniel Lezcano <[email protected]>
Cc: Keerthy <[email protected]>
Cc: Tero Kristo <[email protected]>
[[email protected]: backported to 4.19.y]
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/boot/dts/dra7.dtsi | 11 +++++++
arch/arm/mach-omap2/board-generic.c | 4 +-
arch/arm/mach-omap2/timer.c | 53 +++++++++++++++++++++++++++++++++++-
drivers/clk/ti/clk-7xx.c | 1
include/linux/cpuhotplug.h | 1
5 files changed, 67 insertions(+), 3 deletions(-)

--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -48,6 +48,7 @@

timer {
compatible = "arm,armv7-timer";
+ status = "disabled"; /* See ARM architected timer wrap erratum i940 */
interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(2) | IRQ_TYPE_LEVEL_LOW)>,
<GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(2) | IRQ_TYPE_LEVEL_LOW)>,
<GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(2) | IRQ_TYPE_LEVEL_LOW)>,
@@ -910,6 +911,8 @@
reg = <0x48032000 0x80>;
interrupts = <GIC_SPI 33 IRQ_TYPE_LEVEL_HIGH>;
ti,hwmods = "timer2";
+ clock-names = "fck";
+ clocks = <&l4per_clkctrl DRA7_TIMER2_CLKCTRL 24>;
};

timer3: timer@48034000 {
@@ -917,6 +920,10 @@
reg = <0x48034000 0x80>;
interrupts = <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>;
ti,hwmods = "timer3";
+ clock-names = "fck";
+ clocks = <&l4per_clkctrl DRA7_TIMER3_CLKCTRL 24>;
+ assigned-clocks = <&l4per_clkctrl DRA7_TIMER3_CLKCTRL 24>;
+ assigned-clock-parents = <&timer_sys_clk_div>;
};

timer4: timer@48036000 {
@@ -924,6 +931,10 @@
reg = <0x48036000 0x80>;
interrupts = <GIC_SPI 35 IRQ_TYPE_LEVEL_HIGH>;
ti,hwmods = "timer4";
+ clock-names = "fck";
+ clocks = <&l4per_clkctrl DRA7_TIMER4_CLKCTRL 24>;
+ assigned-clocks = <&l4per_clkctrl DRA7_TIMER4_CLKCTRL 24>;
+ assigned-clock-parents = <&timer_sys_clk_div>;
};

timer5: timer@48820000 {
--- a/arch/arm/mach-omap2/board-generic.c
+++ b/arch/arm/mach-omap2/board-generic.c
@@ -330,7 +330,7 @@ DT_MACHINE_START(DRA74X_DT, "Generic DRA
.init_late = dra7xx_init_late,
.init_irq = omap_gic_of_init,
.init_machine = omap_generic_init,
- .init_time = omap5_realtime_timer_init,
+ .init_time = omap3_gptimer_timer_init,
.dt_compat = dra74x_boards_compat,
.restart = omap44xx_restart,
MACHINE_END
@@ -353,7 +353,7 @@ DT_MACHINE_START(DRA72X_DT, "Generic DRA
.init_late = dra7xx_init_late,
.init_irq = omap_gic_of_init,
.init_machine = omap_generic_init,
- .init_time = omap5_realtime_timer_init,
+ .init_time = omap3_gptimer_timer_init,
.dt_compat = dra72x_boards_compat,
.restart = omap44xx_restart,
MACHINE_END
--- a/arch/arm/mach-omap2/timer.c
+++ b/arch/arm/mach-omap2/timer.c
@@ -42,6 +42,7 @@
#include <linux/platform_device.h>
#include <linux/platform_data/dmtimer-omap.h>
#include <linux/sched_clock.h>
+#include <linux/cpu.h>

#include <asm/mach/time.h>
#include <asm/smp_twd.h>
@@ -421,6 +422,53 @@ static void __init dmtimer_clkevt_init_c
timer->rate);
}

+static DEFINE_PER_CPU(struct dmtimer_clockevent, dmtimer_percpu_timer);
+
+static int omap_gptimer_starting_cpu(unsigned int cpu)
+{
+ struct dmtimer_clockevent *clkevt = per_cpu_ptr(&dmtimer_percpu_timer, cpu);
+ struct clock_event_device *dev = &clkevt->dev;
+ struct omap_dm_timer *timer = &clkevt->timer;
+
+ clockevents_config_and_register(dev, timer->rate, 3, ULONG_MAX);
+ irq_force_affinity(dev->irq, cpumask_of(cpu));
+
+ return 0;
+}
+
+static int __init dmtimer_percpu_quirk_init(void)
+{
+ struct dmtimer_clockevent *clkevt;
+ struct clock_event_device *dev;
+ struct device_node *arm_timer;
+ struct omap_dm_timer *timer;
+ int cpu = 0;
+
+ arm_timer = of_find_compatible_node(NULL, NULL, "arm,armv7-timer");
+ if (of_device_is_available(arm_timer)) {
+ pr_warn_once("ARM architected timer wrap issue i940 detected\n");
+ return 0;
+ }
+
+ for_each_possible_cpu(cpu) {
+ clkevt = per_cpu_ptr(&dmtimer_percpu_timer, cpu);
+ dev = &clkevt->dev;
+ timer = &clkevt->timer;
+
+ dmtimer_clkevt_init_common(clkevt, 0, "timer_sys_ck",
+ CLOCK_EVT_FEAT_ONESHOT,
+ cpumask_of(cpu),
+ "assigned-clock-parents",
+ 500, "percpu timer");
+ }
+
+ cpuhp_setup_state(CPUHP_AP_OMAP_DM_TIMER_STARTING,
+ "clockevents/omap/gptimer:starting",
+ omap_gptimer_starting_cpu, NULL);
+
+ return 0;
+}
+
/* Clocksource code */
static struct omap_dm_timer clksrc;
static bool use_gptimer_clksrc __initdata;
@@ -565,6 +613,9 @@ static void __init __omap_sync32k_timer_
3, /* Timer internal resynch latency */
0xffffffff);

+ if (soc_is_dra7xx())
+ dmtimer_percpu_quirk_init();
+
/* Enable the use of clocksource="gp_timer" kernel parameter */
if (use_gptimer_clksrc || gptimer)
omap2_gptimer_clocksource_init(clksrc_nr, clksrc_src,
@@ -592,7 +643,7 @@ void __init omap3_secure_sync32k_timer_i
#endif /* CONFIG_ARCH_OMAP3 */

#if defined(CONFIG_ARCH_OMAP3) || defined(CONFIG_SOC_AM33XX) || \
- defined(CONFIG_SOC_AM43XX)
+ defined(CONFIG_SOC_AM43XX) || defined(CONFIG_SOC_DRA7XX)
void __init omap3_gptimer_timer_init(void)
{
__omap_sync32k_timer_init(2, "timer_sys_ck", NULL,
--- a/drivers/clk/ti/clk-7xx.c
+++ b/drivers/clk/ti/clk-7xx.c
@@ -733,6 +733,7 @@ const struct omap_clkctrl_data dra7_clkc
static struct ti_dt_clk dra7xx_clks[] = {
DT_CLK(NULL, "timer_32k_ck", "sys_32k_ck"),
DT_CLK(NULL, "sys_clkin_ck", "timer_sys_clk_div"),
+ DT_CLK(NULL, "timer_sys_ck", "timer_sys_clk_div"),
DT_CLK(NULL, "sys_clkin", "sys_clkin1"),
DT_CLK(NULL, "atl_dpll_clk_mux", "atl_cm:0000:24"),
DT_CLK(NULL, "atl_gfclk_mux", "atl_cm:0000:26"),
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -118,6 +118,7 @@ enum cpuhp_state {
CPUHP_AP_ARM_L2X0_STARTING,
CPUHP_AP_EXYNOS4_MCT_TIMER_STARTING,
CPUHP_AP_ARM_ARCH_TIMER_STARTING,
+ CPUHP_AP_OMAP_DM_TIMER_STARTING,
CPUHP_AP_ARM_GLOBAL_TIMER_STARTING,
CPUHP_AP_JCORE_TIMER_STARTING,
CPUHP_AP_ARM_TWD_STARTING,
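
The backport follows the usual per-CPU clockevent shape: every
possible CPU gets its own timer descriptor up front, and the CPUHP
"starting" callback then registers the instance for the CPU that is
coming online. A rough userspace analogue of that shape, using one
thread per CPU -- all names and values here are hypothetical, and
real registration happens via cpuhp_setup_state() as in the patch:

        #include <pthread.h>
        #include <stdio.h>

        #define NCPUS 2

        struct percpu_timer {
                unsigned long rate;
                int cpu;
        };

        static struct percpu_timer timers[NCPUS];

        /* Analogue of omap_gptimer_starting_cpu(): runs per CPU. */
        static void *starting_cpu(void *arg)
        {
                struct percpu_timer *t = arg;

                printf("cpu %d: clockevent registered at %lu Hz\n",
                       t->cpu, t->rate);
                return NULL;
        }

        int main(void)
        {
                pthread_t tid[NCPUS];

                /* Analogue of dmtimer_percpu_quirk_init(): set up every
                 * possible CPU's descriptor before any CPU comes up. */
                for (int cpu = 0; cpu < NCPUS; cpu++)
                        timers[cpu] = (struct percpu_timer){
                                .rate = 32768,  /* hypothetical rate */
                                .cpu = cpu,
                        };

                for (int cpu = 0; cpu < NCPUS; cpu++)
                        pthread_create(&tid[cpu], NULL, starting_cpu,
                                       &timers[cpu]);
                for (int cpu = 0; cpu < NCPUS; cpu++)
                        pthread_join(tid[cpu], NULL);
                return 0;
        }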


2021-07-09 21:43:47

by Shuah Khan

Subject: Re: [PATCH 4.19 00/34] 4.19.197-rc1 review

On 7/9/21 7:20 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.197 release.
> There are 34 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.197-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <[email protected]>

thanks,
-- Shuah

2021-07-10 10:38:07

by Sudip Mukherjee

Subject: Re: [PATCH 4.19 00/34] 4.19.197-rc1 review

Hi Greg,

On Fri, Jul 09, 2021 at 03:20:16PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.197 release.
> There are 34 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
> Anything received after that time might be too late.

Build test:
mips (gcc version 11.1.1 20210702): 63 configs -> no failure
arm (gcc version 11.1.1 20210702): 116 configs -> no new failure
arm64 (gcc version 11.1.1 20210702): 2 configs -> no failure
x86_64 (gcc version 10.2.1 20210110): 2 configs -> no failure

Boot test:
x86_64: Booted on my test laptop. No regression.
x86_64: Booted on qemu. No regression.


Tested-by: Sudip Mukherjee <[email protected]>

--
Regards
Sudip

2021-07-10 13:49:17

by Naresh Kamboju

Subject: Re: [PATCH 4.19 00/34] 4.19.197-rc1 review

On Fri, 9 Jul 2021 at 18:51, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 4.19.197 release.
> There are 34 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.197-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <[email protected]>

## Build
* kernel: 4.19.197-rc1
* git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
* git branch: linux-4.19.y
* git commit: df520a4397b2a281e4b2ac05b0c85f8875c3ea11
* git describe: v4.19.196-35-gdf520a4397b2
* test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-4.19.y/build/v4.19.196-35-gdf520a4397b2

## No regressions (compared to v4.19.196-25-g7bbc9654862c)

## No fixes (compared to v4.19.196-25-g7bbc9654862c)

## Test result summary
total: 82675, pass: 62074, fail: 2967, skip: 15186, xfail: 2448

## Build Summary
* arm: 97 total, 97 passed, 0 failed
* arm64: 25 total, 25 passed, 0 failed
* dragonboard-410c: 2 total, 2 passed, 0 failed
* hi6220-hikey: 2 total, 2 passed, 0 failed
* i386: 15 total, 15 passed, 0 failed
* juno-r2: 2 total, 2 passed, 0 failed
* mips: 39 total, 39 passed, 0 failed
* s390: 9 total, 9 passed, 0 failed
* sparc: 9 total, 9 passed, 0 failed
* x15: 2 total, 2 passed, 0 failed
* x86: 2 total, 2 passed, 0 failed
* x86_64: 15 total, 15 passed, 0 failed

## Test suites summary
* fwts
* igt-gpu-tools
* install-android-platform-tools-r2600
* kselftest-
* kselftest-android
* kselftest-bpf
* kselftest-breakpoints
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-drivers
* kselftest-efivarfs
* kselftest-filesystems
* kselftest-firmware
* kselftest-fpu
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kexec
* kselftest-kvm
* kselftest-lib
* kselftest-livepatch
* kselftest-lkdtm
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mount
* kselftest-mqueue
* kselftest-net
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sync
* kselftest-sysctl
* kselftest-tc-testing
* kselftest-timens
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-vm
* kselftest-vsyscall-mode-native-
* kselftest-vsyscall-mode-none-
* kselftest-x86
* kselftest-zram
* kvm-unit-tests
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-controllers-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-open-posix-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-tracing-tests
* network-basic-tests
* packetdrill
* perf
* rcutorture
* ssuite
* v4l2-compliance

--
Linaro LKFT
https://lkft.linaro.org

2021-07-10 19:54:05

by Guenter Roeck

Subject: Re: [PATCH 4.19 00/34] 4.19.197-rc1 review

On Fri, Jul 09, 2021 at 03:20:16PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.197 release.
> There are 34 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
> Anything received after that time might be too late.
>

Build results:
total: 155 pass: 155 fail: 0
Qemu test results:
total: 424 pass: 424 fail: 0

Tested-by: Guenter Roeck <[email protected]>

Guenter

2021-07-11 08:02:55

by Pavel Machek

Subject: Re: [PATCH 4.19 00/34] 4.19.197-rc1 review

Hi!

> This is the start of the stable review cycle for the 4.19.197 release.
> There are 34 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.

CIP testing did not find any problems here:

https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-4.19.y

Tested-by: Pavel Machek (CIP) <[email protected]>

Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2021-07-12 01:00:16

by Zou Wei

Subject: Re: [PATCH 4.19 00/34] 4.19.197-rc1 review



On 2021/7/9 21:20, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.197 release.
> There are 34 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.197-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Tested on arm64 and x86 for 4.19.197-rc1,

Kernel repo:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Branch: linux-4.19.y
Version: 4.19.197-rc1
Commit: df520a4397b2a281e4b2ac05b0c85f8875c3ea11
Compiler: gcc version 7.3.0 (GCC)

arm64:
--------------------------------------------------------------------
Testcase Result Summary:
total: 8858
passed: 8858
failed: 0
timeout: 0
--------------------------------------------------------------------

x86:
--------------------------------------------------------------------
Testcase Result Summary:
total: 8858
passed: 8858
failed: 0
timeout: 0
--------------------------------------------------------------------

Tested-by: Hulk Robot <[email protected]>