2019-02-04 10:45:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 00/30] 4.9.155-stable review

This is the start of the stable review cycle for the 4.9.155 release.
There are 30 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Feb 6 10:35:37 UTC 2019.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.155-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.9.155-rc1

Amir Goldstein <[email protected]>
fanotify: fix handling of events on child sub-directory

Dave Chinner <[email protected]>
fs: don't scan the inode cache before SB_BORN is set

Benjamin Herrenschmidt <[email protected]>
drivers: core: Remove glue dirs from sysfs earlier

Paulo Alcantara <[email protected]>
cifs: Always resolve hostname before reconnecting

David Hildenbrand <[email protected]>
mm: migrate: don't rely on __PageMovable() of newpage after unlocking it

Naoya Horiguchi <[email protected]>
mm: hwpoison: use do_send_sig_info() instead of force_sig()

Shakeel Butt <[email protected]>
mm, oom: fix use-after-free in oom_kill_process

Andrei Vagin <[email protected]>
kernel/exit.c: release ptraced tasks before zap_pid_ns_processes

Stefan Wahren <[email protected]>
mmc: sdhci-iproc: handle mmc_of_parse() errors during probe

João Paulo Rechi Vita <[email protected]>
platform/x86: asus-nb-wmi: Drop mapping of 0x33 and 0x34 scan codes

João Paulo Rechi Vita <[email protected]>
platform/x86: asus-nb-wmi: Map 0x35 to KEY_SCREENLOCK

Andreas Gruenbacher <[email protected]>
gfs2: Revert "Fix loop in gfs2_rbm_find"

James Morse <[email protected]>
arm64: hibernate: Clean the __hyp_text to PoC after resume

James Morse <[email protected]>
arm64: hyp-stub: Forbid kprobing of the hyp-stub

Ard Biesheuvel <[email protected]>
arm64: kaslr: ensure randomized quantities are clean also when kaslr is off

Koen Vandeputte <[email protected]>
ARM: cns3xxx: Fix writing to wrong PCI config registers after alignment

Waiman Long <[email protected]>
fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb()

Pavel Shilovsky <[email protected]>
CIFS: Do not count -ENODATA as failure for query directory

Daniel Borkmann <[email protected]>
ipvlan, l3mdev: fix broken l3s mode wrt local routes

Jacob Wen <[email protected]>
l2tp: fix reading optional fields of L2TPv3

Lorenzo Bianconi <[email protected]>
l2tp: remove l2specific_len dependency in l2tp_core

Aya Levin <[email protected]>
net/mlx5e: Allow MAC invalidation while spoofchk is ON

Mathias Thore <[email protected]>
ucc_geth: Reset BQL queue when stopping device

Bernard Pidoux <[email protected]>
net/rose: fix NULL ax25_cb kernel panic

Cong Wang <[email protected]>
netrom: switch to sock timer API

Aya Levin <[email protected]>
net/mlx4_core: Add masking for a few queries on HCA caps

Jacob Wen <[email protected]>
l2tp: copy 4 more bytes to linear part if necessary

David Ahern <[email protected]>
ipv6: Consider sk_bound_dev_if when binding a socket to an address

Jimmy Durand Wesolowski <[email protected]>
fs: add the fsnotify call to vfs_iter_write

Greg Kroah-Hartman <[email protected]>
Fix "net: ipv4: do not handle duplicate fragments as overlapping"


-------------

Diffstat:

Makefile | 4 +-
arch/arm/mach-cns3xxx/pcie.c | 2 +-
arch/arm64/kernel/hibernate.c | 4 +-
arch/arm64/kernel/hyp-stub.S | 2 +
arch/arm64/kernel/kaslr.c | 1 +
drivers/base/core.c | 2 +
drivers/mmc/host/sdhci-iproc.c | 5 +-
drivers/net/ethernet/freescale/ucc_geth.c | 2 +
drivers/net/ethernet/mellanox/mlx4/fw.c | 75 ++++++++++++++---------
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 19 ++----
drivers/net/ipvlan/ipvlan_main.c | 6 +-
drivers/platform/x86/asus-nb-wmi.c | 3 +-
fs/cifs/connect.c | 53 ++++++++++++++++
fs/cifs/smb2pdu.c | 4 +-
fs/dcache.c | 6 +-
fs/gfs2/rgrp.c | 2 +-
fs/notify/fsnotify.c | 8 ++-
fs/read_write.c | 4 +-
fs/super.c | 30 +++++++--
include/linux/kobject.h | 17 +++++
include/linux/netdevice.h | 8 +++
include/net/l3mdev.h | 3 +-
kernel/exit.c | 12 +++-
mm/memory-failure.c | 3 +-
mm/migrate.c | 7 ++-
mm/oom_kill.c | 8 +++
net/ipv4/ip_fragment.c | 2 +-
net/ipv6/af_inet6.c | 3 +
net/l2tp/l2tp_core.c | 43 ++++++-------
net/l2tp/l2tp_core.h | 31 ++++++++++
net/l2tp/l2tp_ip.c | 3 +
net/l2tp/l2tp_ip6.c | 3 +
net/netrom/nr_timer.c | 20 +++---
net/rose/rose_route.c | 5 ++
34 files changed, 293 insertions(+), 107 deletions(-)




2019-02-04 10:44:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 13/30] CIFS: Do not count -ENODATA as failure for query directory

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Pavel Shilovsky <[email protected]>

commit 8e6e72aeceaaed5aeeb1cb43d3085de7ceb14f79 upstream.

Signed-off-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Steve French <[email protected]>
CC: Stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/cifs/smb2pdu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -2686,8 +2686,8 @@ SMB2_query_directory(const unsigned int
if (rc == -ENODATA && rsp->hdr.Status == STATUS_NO_MORE_FILES) {
srch_inf->endOfSearch = true;
rc = 0;
- }
- cifs_stats_fail_inc(tcon, SMB2_QUERY_DIRECTORY_HE);
+ } else
+ cifs_stats_fail_inc(tcon, SMB2_QUERY_DIRECTORY_HE);
goto qdir_exit;
}




2019-02-04 10:44:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 19/30] gfs2: Revert "Fix loop in gfs2_rbm_find"

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andreas Gruenbacher <[email protected]>

commit e74c98ca2d6ae4376cc15fa2a22483430909d96b upstream.

This reverts commit 2d29f6b96d8f80322ed2dd895bca590491c38d34.

It turns out that the fix can lead to a ~20 percent performance regression
in initial writes to the page cache according to iozone. Let's revert this
for now to have more time for a proper fix.

Cc: [email protected] # v3.13+
Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/gfs2/rgrp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1705,9 +1705,9 @@ static int gfs2_rbm_find(struct gfs2_rbm
goto next_iter;
}
if (ret == -E2BIG) {
- n += rbm->bii - initial_bii;
rbm->bii = 0;
rbm->offset = 0;
+ n += (rbm->bii - initial_bii);
goto res_covered_end_of_rgrp;
}
return ret;



2019-02-04 10:44:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 20/30] platform/x86: asus-nb-wmi: Map 0x35 to KEY_SCREENLOCK

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit b3f2f3799a972d3863d0fdc2ab6287aef6ca631f ]

When the OS registers to handle events from the display off hotkey the
EC will send a notification with 0x35 for every key press, independent
of the backlight state.

The behavior of this key on Windows, with the ATKACPI driver from Asus
installed, is turning off the backlight of all connected displays with a
fading effect, and any cursor input or key press turning the backlight
back on. The key press or cursor input that wakes up the display is also
passed through to the application under the cursor or under focus.

The key that matches this behavior the closest is KEY_SCREENLOCK.

Signed-off-by: João Paulo Rechi Vita <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/platform/x86/asus-nb-wmi.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -479,6 +479,7 @@ static const struct key_entry asus_nb_wm
{ KE_KEY, 0x32, { KEY_MUTE } },
{ KE_KEY, 0x33, { KEY_DISPLAYTOGGLE } }, /* LCD on */
{ KE_KEY, 0x34, { KEY_DISPLAY_OFF } }, /* LCD off */
+ { KE_KEY, 0x35, { KEY_SCREENLOCK } },
{ KE_KEY, 0x40, { KEY_PREVIOUSSONG } },
{ KE_KEY, 0x41, { KEY_NEXTSONG } },
{ KE_KEY, 0x43, { KEY_STOPCD } }, /* Stop/Eject */



2019-02-04 10:44:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 04/30] l2tp: copy 4 more bytes to linear part if necessary

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jacob Wen <[email protected]>

[ Upstream commit 91c524708de6207f59dd3512518d8a1c7b434ee3 ]

The size of L2TPv2 header with all optional fields is 14 bytes.
l2tp_udp_recv_core only moves 10 bytes to the linear part of a
skb. This may lead to l2tp_recv_common read data outside of a skb.

This patch make sure that there is at least 14 bytes in the linear
part of a skb to meet the maximum need of l2tp_udp_recv_core and
l2tp_recv_common. The minimum size of both PPP HDLC-like frame and
Ethernet frame is larger than 14 bytes, so we are safe to do so.

Also remove L2TP_HDR_SIZE_NOSEQ, it is unused now.

Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Suggested-by: Guillaume Nault <[email protected]>
Signed-off-by: Jacob Wen <[email protected]>
Acked-by: Guillaume Nault <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/l2tp/l2tp_core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -83,8 +83,7 @@
#define L2TP_SLFLAG_S 0x40000000
#define L2TP_SL_SEQ_MASK 0x00ffffff

-#define L2TP_HDR_SIZE_SEQ 10
-#define L2TP_HDR_SIZE_NOSEQ 6
+#define L2TP_HDR_SIZE_MAX 14

/* Default trace flags */
#define L2TP_DEFAULT_DEBUG_FLAGS 0
@@ -944,7 +943,7 @@ static int l2tp_udp_recv_core(struct l2t
__skb_pull(skb, sizeof(struct udphdr));

/* Short packet? */
- if (!pskb_may_pull(skb, L2TP_HDR_SIZE_SEQ)) {
+ if (!pskb_may_pull(skb, L2TP_HDR_SIZE_MAX)) {
l2tp_info(tunnel, L2TP_MSG_DATA,
"%s: recv short packet (len=%d)\n",
tunnel->name, skb->len);



2019-02-04 10:44:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 06/30] netrom: switch to sock timer API

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Cong Wang <[email protected]>

[ Upstream commit 63346650c1a94a92be61a57416ac88c0a47c4327 ]

sk_reset_timer() and sk_stop_timer() properly handle
sock refcnt for timer function. Switching to them
could fix a refcounting bug reported by syzbot.

Reported-and-tested-by: [email protected]
Cc: Ralf Baechle <[email protected]>
Cc: [email protected]
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netrom/nr_timer.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

--- a/net/netrom/nr_timer.c
+++ b/net/netrom/nr_timer.c
@@ -53,21 +53,21 @@ void nr_start_t1timer(struct sock *sk)
{
struct nr_sock *nr = nr_sk(sk);

- mod_timer(&nr->t1timer, jiffies + nr->t1);
+ sk_reset_timer(sk, &nr->t1timer, jiffies + nr->t1);
}

void nr_start_t2timer(struct sock *sk)
{
struct nr_sock *nr = nr_sk(sk);

- mod_timer(&nr->t2timer, jiffies + nr->t2);
+ sk_reset_timer(sk, &nr->t2timer, jiffies + nr->t2);
}

void nr_start_t4timer(struct sock *sk)
{
struct nr_sock *nr = nr_sk(sk);

- mod_timer(&nr->t4timer, jiffies + nr->t4);
+ sk_reset_timer(sk, &nr->t4timer, jiffies + nr->t4);
}

void nr_start_idletimer(struct sock *sk)
@@ -75,37 +75,37 @@ void nr_start_idletimer(struct sock *sk)
struct nr_sock *nr = nr_sk(sk);

if (nr->idle > 0)
- mod_timer(&nr->idletimer, jiffies + nr->idle);
+ sk_reset_timer(sk, &nr->idletimer, jiffies + nr->idle);
}

void nr_start_heartbeat(struct sock *sk)
{
- mod_timer(&sk->sk_timer, jiffies + 5 * HZ);
+ sk_reset_timer(sk, &sk->sk_timer, jiffies + 5 * HZ);
}

void nr_stop_t1timer(struct sock *sk)
{
- del_timer(&nr_sk(sk)->t1timer);
+ sk_stop_timer(sk, &nr_sk(sk)->t1timer);
}

void nr_stop_t2timer(struct sock *sk)
{
- del_timer(&nr_sk(sk)->t2timer);
+ sk_stop_timer(sk, &nr_sk(sk)->t2timer);
}

void nr_stop_t4timer(struct sock *sk)
{
- del_timer(&nr_sk(sk)->t4timer);
+ sk_stop_timer(sk, &nr_sk(sk)->t4timer);
}

void nr_stop_idletimer(struct sock *sk)
{
- del_timer(&nr_sk(sk)->idletimer);
+ sk_stop_timer(sk, &nr_sk(sk)->idletimer);
}

void nr_stop_heartbeat(struct sock *sk)
{
- del_timer(&sk->sk_timer);
+ sk_stop_timer(sk, &sk->sk_timer);
}

int nr_t1timer_running(struct sock *sk)



2019-02-04 10:44:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 07/30] net/rose: fix NULL ax25_cb kernel panic

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bernard Pidoux <[email protected]>

[ Upstream commit b0cf029234f9b18e10703ba5147f0389c382bccc ]

When an internally generated frame is handled by rose_xmit(),
rose_route_frame() is called:

if (!rose_route_frame(skb, NULL)) {
dev_kfree_skb(skb);
stats->tx_errors++;
return NETDEV_TX_OK;
}

We have the same code sequence in Net/Rom where an internally generated
frame is handled by nr_xmit() calling nr_route_frame(skb, NULL).
However, in this function NULL argument is tested while it is not in
rose_route_frame().
Then kernel panic occurs later on when calling ax25cmp() with a NULL
ax25_cb argument as reported many times and recently with syzbot.

We need to test if ax25 is NULL before using it.

Testing:
Built kernel with CONFIG_ROSE=y.

Signed-off-by: Bernard Pidoux <[email protected]>
Acked-by: Dmitry Vyukov <[email protected]>
Reported-by: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Bernard Pidoux <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/rose/rose_route.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/net/rose/rose_route.c
+++ b/net/rose/rose_route.c
@@ -848,6 +848,7 @@ void rose_link_device_down(struct net_de

/*
* Route a frame to an appropriate AX.25 connection.
+ * A NULL ax25_cb indicates an internally generated frame.
*/
int rose_route_frame(struct sk_buff *skb, ax25_cb *ax25)
{
@@ -865,6 +866,10 @@ int rose_route_frame(struct sk_buff *skb

if (skb->len < ROSE_MIN_LEN)
return res;
+
+ if (!ax25)
+ return rose_loopback_queue(skb, NULL);
+
frametype = skb->data[2];
lci = ((skb->data[0] << 8) & 0xF00) + ((skb->data[1] << 0) & 0x0FF);
if (frametype == ROSE_CALL_REQUEST &&



2019-02-04 10:45:09

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 01/30] Fix "net: ipv4: do not handle duplicate fragments as overlapping"

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Greg Kroah-Hartman <[email protected]>

ade446403bfb ("net: ipv4: do not handle duplicate fragments as
overlapping") was backported to many stable trees, but it had a problem
that was "accidentally" fixed by the upstream commit 0ff89efb5246 ("ip:
fail fast on IP defrag errors")

This is the fixup for that problem as we do not want the larger patch in
the older stable trees.

Fixes: ade446403bfb ("net: ipv4: do not handle duplicate fragments as overlapping")
Reported-by: Ivan Babrou <[email protected]>
Reported-by: Eric Dumazet <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/ipv4/ip_fragment.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -423,6 +423,7 @@ static int ip_frag_queue(struct ipq *qp,
* fragment.
*/

+ err = -EINVAL;
/* Find out where to put this fragment. */
prev_tail = qp->q.fragments_tail;
if (!prev_tail)
@@ -499,7 +500,6 @@ static int ip_frag_queue(struct ipq *qp,

discard_qp:
inet_frag_kill(&qp->q);
- err = -EINVAL;
__IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
err:
kfree_skb(skb);



2019-02-04 10:45:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 29/30] fs: dont scan the inode cache before SB_BORN is set

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dave Chinner <[email protected]>

commit 79f546a696bff2590169fb5684e23d65f4d9f591 upstream.

We recently had an oops reported on a 4.14 kernel in
xfs_reclaim_inodes_count() where sb->s_fs_info pointed to garbage
and so the m_perag_tree lookup walked into lala land. It produces
an oops down this path during the failed mount:

radix_tree_gang_lookup_tag+0xc4/0x130
xfs_perag_get_tag+0x37/0xf0
xfs_reclaim_inodes_count+0x32/0x40
xfs_fs_nr_cached_objects+0x11/0x20
super_cache_count+0x35/0xc0
shrink_slab.part.66+0xb1/0x370
shrink_node+0x7e/0x1a0
try_to_free_pages+0x199/0x470
__alloc_pages_slowpath+0x3a1/0xd20
__alloc_pages_nodemask+0x1c3/0x200
cache_grow_begin+0x20b/0x2e0
fallback_alloc+0x160/0x200
kmem_cache_alloc+0x111/0x4e0

The problem is that the superblock shrinker is running before the
filesystem structures it depends on have been fully set up. i.e.
the shrinker is registered in sget(), before ->fill_super() has been
called, and the shrinker can call into the filesystem before
fill_super() does it's setup work. Essentially we are exposed to
both use-after-free and use-before-initialisation bugs here.

To fix this, add a check for the SB_BORN flag in super_cache_count.
In general, this flag is not set until ->fs_mount() completes
successfully, so we know that it is set after the filesystem
setup has completed. This matches the trylock_super() behaviour
which will not let super_cache_scan() run if SB_BORN is not set, and
hence will not allow the superblock shrinker from entering the
filesystem while it is being set up or after it has failed setup
and is being torn down.

Cc: [email protected]
Signed-Off-By: Dave Chinner <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Aaron Lu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/super.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)

--- a/fs/super.c
+++ b/fs/super.c
@@ -119,13 +119,23 @@ static unsigned long super_cache_count(s
sb = container_of(shrink, struct super_block, s_shrink);

/*
- * Don't call trylock_super as it is a potential
- * scalability bottleneck. The counts could get updated
- * between super_cache_count and super_cache_scan anyway.
- * Call to super_cache_count with shrinker_rwsem held
- * ensures the safety of call to list_lru_shrink_count() and
- * s_op->nr_cached_objects().
+ * We don't call trylock_super() here as it is a scalability bottleneck,
+ * so we're exposed to partial setup state. The shrinker rwsem does not
+ * protect filesystem operations backing list_lru_shrink_count() or
+ * s_op->nr_cached_objects(). Counts can change between
+ * super_cache_count and super_cache_scan, so we really don't need locks
+ * here.
+ *
+ * However, if we are currently mounting the superblock, the underlying
+ * filesystem might be in a state of partial construction and hence it
+ * is dangerous to access it. trylock_super() uses a MS_BORN check to
+ * avoid this situation, so do the same here. The memory barrier is
+ * matched with the one in mount_fs() as we don't hold locks here.
*/
+ if (!(sb->s_flags & MS_BORN))
+ return 0;
+ smp_rmb();
+
if (sb->s_op && sb->s_op->nr_cached_objects)
total_objects = sb->s_op->nr_cached_objects(sb, sc);

@@ -1193,6 +1203,14 @@ mount_fs(struct file_system_type *type,
sb = root->d_sb;
BUG_ON(!sb);
WARN_ON(!sb->s_bdi);
+
+ /*
+ * Write barrier is for super_cache_count(). We place it before setting
+ * MS_BORN as the data dependency between the two functions is the
+ * superblock structure contents that we just set up, not the MS_BORN
+ * flag.
+ */
+ smp_wmb();
sb->s_flags |= MS_BORN;

error = security_sb_kern_mount(sb, flags, secdata);



2019-02-04 10:45:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 30/30] fanotify: fix handling of events on child sub-directory

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Amir Goldstein <[email protected]>

commit b469e7e47c8a075cc08bcd1e85d4365134bdcdd5 upstream.

When an event is reported on a sub-directory and the parent inode has
a mark mask with FS_EVENT_ON_CHILD|FS_ISDIR, the event will be sent to
fsnotify() even if the event type is not in the parent mark mask
(e.g. FS_OPEN).

Further more, if that event happened on a mount or a filesystem with
a mount/sb mark that does have that event type in their mask, the "on
child" event will be reported on the mount/sb mark. That is not
desired, because user will get a duplicate event for the same action.

Note that the event reported on the victim inode is never merged with
the event reported on the parent inode, because of the check in
should_merge(): old_fsn->inode == new_fsn->inode.

Fix this by looking for a match of an actual event type (i.e. not just
FS_ISDIR) in parent's inode mark mask and by not reporting an "on child"
event to group if event type is only found on mount/sb marks.

[backport hint: The bug seems to have always been in fanotify, but this
patch will only apply cleanly to v4.19.y]

Cc: <[email protected]> # v4.19
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
[amir: backport to v4.9]
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/notify/fsnotify.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -101,9 +101,9 @@ int __fsnotify_parent(struct path *path,
parent = dget_parent(dentry);
p_inode = parent->d_inode;

- if (unlikely(!fsnotify_inode_watches_children(p_inode)))
+ if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
__fsnotify_update_child_dentry_flags(p_inode);
- else if (p_inode->i_fsnotify_mask & mask) {
+ } else if (p_inode->i_fsnotify_mask & mask & ~FS_EVENT_ON_CHILD) {
struct name_snapshot name;

/* we are notifying a parent so come up with the new mask which
@@ -207,6 +207,10 @@ int fsnotify(struct inode *to_tell, __u3
else
mnt = NULL;

+ /* An event "on child" is not intended for a mount mark */
+ if (mask & FS_EVENT_ON_CHILD)
+ mnt = NULL;
+
/*
* Optimization: srcu_read_lock() has a memory barrier which can
* be expensive. It protects walking the *_fsnotify_marks lists.



2019-02-04 10:45:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 16/30] arm64: kaslr: ensure randomized quantities are clean also when kaslr is off

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ard Biesheuvel <[email protected]>

commit 8ea235932314311f15ea6cf65c1393ed7e31af70 upstream.

Commit 1598ecda7b23 ("arm64: kaslr: ensure randomized quantities are
clean to the PoC") added cache maintenance to ensure that global
variables set by the kaslr init routine are not wiped clean due to
cache invalidation occurring during the second round of page table
creation.

However, if kaslr_early_init() exits early with no randomization
being applied (either due to the lack of a seed, or because the user
has disabled kaslr explicitly), no cache maintenance is performed,
leading to the same issue we attempted to fix earlier, as far as the
module_alloc_base variable is concerned.

Note that module_alloc_base cannot be initialized statically, because
that would cause it to be subject to a R_AARCH64_RELATIVE relocation,
causing it to be overwritten by the second round of KASLR relocation
processing.

Fixes: f80fb3a3d508 ("arm64: add support for kernel ASLR")
Cc: <[email protected]> # v4.6+
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm64/kernel/kaslr.c | 1 +
1 file changed, 1 insertion(+)

--- a/arch/arm64/kernel/kaslr.c
+++ b/arch/arm64/kernel/kaslr.c
@@ -88,6 +88,7 @@ u64 __init kaslr_early_init(u64 dt_phys,
* we end up running with module randomization disabled.
*/
module_alloc_base = (u64)_etext - MODULES_VSIZE;
+ __flush_dcache_area(&module_alloc_base, sizeof(module_alloc_base));

/*
* Try to map the FDT early. If this fails, we simply bail,



2019-02-04 10:45:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 21/30] platform/x86: asus-nb-wmi: Drop mapping of 0x33 and 0x34 scan codes

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit 71b12beaf12f21a53bfe100795d0797f1035b570 ]

According to Asus firmware engineers, the meaning of these codes is only
to notify the OS that the screen brightness has been turned on/off by
the EC. This does not match the meaning of KEY_DISPLAYTOGGLE /
KEY_DISPLAY_OFF, where userspace is expected to change the display
brightness.

Signed-off-by: João Paulo Rechi Vita <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/platform/x86/asus-nb-wmi.c | 2 --
1 file changed, 2 deletions(-)

--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -477,8 +477,6 @@ static const struct key_entry asus_nb_wm
{ KE_KEY, 0x30, { KEY_VOLUMEUP } },
{ KE_KEY, 0x31, { KEY_VOLUMEDOWN } },
{ KE_KEY, 0x32, { KEY_MUTE } },
- { KE_KEY, 0x33, { KEY_DISPLAYTOGGLE } }, /* LCD on */
- { KE_KEY, 0x34, { KEY_DISPLAY_OFF } }, /* LCD off */
{ KE_KEY, 0x35, { KEY_SCREENLOCK } },
{ KE_KEY, 0x40, { KEY_PREVIOUSSONG } },
{ KE_KEY, 0x41, { KEY_NEXTSONG } },



2019-02-04 10:46:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 27/30] cifs: Always resolve hostname before reconnecting

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Paulo Alcantara <[email protected]>

commit 28eb24ff75c5ac130eb326b3b4d0dcecfc0f427d upstream.

In case a hostname resolves to a different IP address (e.g. long
running mounts), make sure to resolve it every time prior to calling
generic_ip_connect() in reconnect.

Suggested-by: Steve French <[email protected]>
Signed-off-by: Paulo Alcantara <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/cifs/connect.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 53 insertions(+)

--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -48,6 +48,7 @@
#include "cifs_unicode.h"
#include "cifs_debug.h"
#include "cifs_fs_sb.h"
+#include "dns_resolve.h"
#include "ntlmssp.h"
#include "nterr.h"
#include "rfc1002pdu.h"
@@ -307,6 +308,53 @@ static int cifs_setup_volume_info(struct
const char *devname);

/*
+ * Resolve hostname and set ip addr in tcp ses. Useful for hostnames that may
+ * get their ip addresses changed at some point.
+ *
+ * This should be called with server->srv_mutex held.
+ */
+#ifdef CONFIG_CIFS_DFS_UPCALL
+static int reconn_set_ipaddr(struct TCP_Server_Info *server)
+{
+ int rc;
+ int len;
+ char *unc, *ipaddr = NULL;
+
+ if (!server->hostname)
+ return -EINVAL;
+
+ len = strlen(server->hostname) + 3;
+
+ unc = kmalloc(len, GFP_KERNEL);
+ if (!unc) {
+ cifs_dbg(FYI, "%s: failed to create UNC path\n", __func__);
+ return -ENOMEM;
+ }
+ snprintf(unc, len, "\\\\%s", server->hostname);
+
+ rc = dns_resolve_server_name_to_ip(unc, &ipaddr);
+ kfree(unc);
+
+ if (rc < 0) {
+ cifs_dbg(FYI, "%s: failed to resolve server part of %s to IP: %d\n",
+ __func__, server->hostname, rc);
+ return rc;
+ }
+
+ rc = cifs_convert_address((struct sockaddr *)&server->dstaddr, ipaddr,
+ strlen(ipaddr));
+ kfree(ipaddr);
+
+ return !rc ? -1 : 0;
+}
+#else
+static inline int reconn_set_ipaddr(struct TCP_Server_Info *server)
+{
+ return 0;
+}
+#endif
+
+/*
* cifs tcp session reconnection
*
* mark tcp session as reconnecting so temporarily locked
@@ -403,6 +451,11 @@ cifs_reconnect(struct TCP_Server_Info *s
rc = generic_ip_connect(server);
if (rc) {
cifs_dbg(FYI, "reconnect error %d\n", rc);
+ rc = reconn_set_ipaddr(server);
+ if (rc) {
+ cifs_dbg(FYI, "%s: failed to resolve hostname: %d\n",
+ __func__, rc);
+ }
mutex_unlock(&server->srv_mutex);
msleep(3000);
} else {



2019-02-04 10:50:36

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH 4.9 30/30] fanotify: fix handling of events on child sub-directory

On Mon, Feb 4, 2019 at 12:44 PM Greg Kroah-Hartman
<[email protected]> wrote:
>
> 4.9-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Amir Goldstein <[email protected]>
>
> commit b469e7e47c8a075cc08bcd1e85d4365134bdcdd5 upstream.
>
> When an event is reported on a sub-directory and the parent inode has
> a mark mask with FS_EVENT_ON_CHILD|FS_ISDIR, the event will be sent to
> fsnotify() even if the event type is not in the parent mark mask
> (e.g. FS_OPEN).
>
> Further more, if that event happened on a mount or a filesystem with
> a mount/sb mark that does have that event type in their mask, the "on
> child" event will be reported on the mount/sb mark. That is not
> desired, because user will get a duplicate event for the same action.
>
> Note that the event reported on the victim inode is never merged with
> the event reported on the parent inode, because of the check in
> should_merge(): old_fsn->inode == new_fsn->inode.
>
> Fix this by looking for a match of an actual event type (i.e. not just
> FS_ISDIR) in parent's inode mark mask and by not reporting an "on child"
> event to group if event type is only found on mount/sb marks.
>
> [backport hint: The bug seems to have always been in fanotify, but this
> patch will only apply cleanly to v4.19.y]
>

Same comment about this backport hint being misleading in the
context of the backport patch.

> Cc: <[email protected]> # v4.19
> Signed-off-by: Amir Goldstein <[email protected]>
> Signed-off-by: Jan Kara <[email protected]>
> [amir: backport to v4.9]
> Signed-off-by: Amir Goldstein <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>
> ---
> fs/notify/fsnotify.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -101,9 +101,9 @@ int __fsnotify_parent(struct path *path,
> parent = dget_parent(dentry);
> p_inode = parent->d_inode;
>
> - if (unlikely(!fsnotify_inode_watches_children(p_inode)))
> + if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
> __fsnotify_update_child_dentry_flags(p_inode);
> - else if (p_inode->i_fsnotify_mask & mask) {
> + } else if (p_inode->i_fsnotify_mask & mask & ~FS_EVENT_ON_CHILD) {
> struct name_snapshot name;
>
> /* we are notifying a parent so come up with the new mask which
> @@ -207,6 +207,10 @@ int fsnotify(struct inode *to_tell, __u3
> else
> mnt = NULL;
>
> + /* An event "on child" is not intended for a mount mark */
> + if (mask & FS_EVENT_ON_CHILD)
> + mnt = NULL;
> +
> /*
> * Optimization: srcu_read_lock() has a memory barrier which can
> * be expensive. It protects walking the *_fsnotify_marks lists.
>
>

2019-02-04 11:06:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 23/30] kernel/exit.c: release ptraced tasks before zap_pid_ns_processes

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrei Vagin <[email protected]>

commit 8fb335e078378c8426fabeed1ebee1fbf915690c upstream.

Currently, exit_ptrace() adds all ptraced tasks in a dead list, then
zap_pid_ns_processes() waits on all tasks in a current pidns, and only
then are tasks from the dead list released.

zap_pid_ns_processes() can get stuck on waiting tasks from the dead
list. In this case, we will have one unkillable process with one or
more dead children.

Thanks to Oleg for the advice to release tasks in find_child_reaper().

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 7c8bd2322c7f ("exit: ptrace: shift "reap dead" code from exit_ptrace() to forget_original_parent()")
Signed-off-by: Andrei Vagin <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/exit.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -525,12 +525,14 @@ static struct task_struct *find_alive_th
return NULL;
}

-static struct task_struct *find_child_reaper(struct task_struct *father)
+static struct task_struct *find_child_reaper(struct task_struct *father,
+ struct list_head *dead)
__releases(&tasklist_lock)
__acquires(&tasklist_lock)
{
struct pid_namespace *pid_ns = task_active_pid_ns(father);
struct task_struct *reaper = pid_ns->child_reaper;
+ struct task_struct *p, *n;

if (likely(reaper != father))
return reaper;
@@ -546,6 +548,12 @@ static struct task_struct *find_child_re
panic("Attempted to kill init! exitcode=0x%08x\n",
father->signal->group_exit_code ?: father->exit_code);
}
+
+ list_for_each_entry_safe(p, n, dead, ptrace_entry) {
+ list_del_init(&p->ptrace_entry);
+ release_task(p);
+ }
+
zap_pid_ns_processes(pid_ns);
write_lock_irq(&tasklist_lock);

@@ -632,7 +640,7 @@ static void forget_original_parent(struc
exit_ptrace(father, dead);

/* Can drop and reacquire tasklist_lock */
- reaper = find_child_reaper(father);
+ reaper = find_child_reaper(father, dead);
if (list_empty(&father->children))
return;




2019-02-04 11:06:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 28/30] drivers: core: Remove glue dirs from sysfs earlier

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Benjamin Herrenschmidt <[email protected]>

commit 726e41097920a73e4c7c33385dcc0debb1281e18 upstream.

For devices with a class, we create a "glue" directory between
the parent device and the new device with the class name.

This directory is never "explicitely" removed when empty however,
this is left to the implicit sysfs removal done by kobject_release()
when the object loses its last reference via kobject_put().

This is problematic because as long as it's not been removed from
sysfs, it is still present in the class kset and in sysfs directory
structure.

The presence in the class kset exposes a use after free bug fixed
by the previous patch, but the presence in sysfs means that until
the kobject is released, which can take a while (especially with
kobject debugging), any attempt at re-creating such as binding a
new device for that class/parent pair, will result in a sysfs
duplicate file name error.

This fixes it by instead doing an explicit kobject_del() when
the glue dir is empty, by keeping track of the number of
child devices of the gluedir.

This is made easy by the fact that all glue dir operations are
done with a global mutex, and there's already a function
(cleanup_glue_dir) called in all the right places taking that
mutex that can be enhanced for this. It appears that this was
in fact the intent of the function, but the implementation was
wrong.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Signed-off-by: Zubin Mithra <[email protected]>
Cc: Guenter Roeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/base/core.c | 2 ++
include/linux/kobject.h | 17 +++++++++++++++++
2 files changed, 19 insertions(+)

--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -862,6 +862,8 @@ static void cleanup_glue_dir(struct devi
return;

mutex_lock(&gdp_mutex);
+ if (!kobject_has_children(glue_dir))
+ kobject_del(glue_dir);
kobject_put(glue_dir);
mutex_unlock(&gdp_mutex);
}
--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -113,6 +113,23 @@ extern void kobject_put(struct kobject *
extern const void *kobject_namespace(struct kobject *kobj);
extern char *kobject_get_path(struct kobject *kobj, gfp_t flag);

+/**
+ * kobject_has_children - Returns whether a kobject has children.
+ * @kobj: the object to test
+ *
+ * This will return whether a kobject has other kobjects as children.
+ *
+ * It does NOT account for the presence of attribute files, only sub
+ * directories. It also assumes there is no concurrent addition or
+ * removal of such children, and thus relies on external locking.
+ */
+static inline bool kobject_has_children(struct kobject *kobj)
+{
+ WARN_ON_ONCE(atomic_read(&kobj->kref.refcount) == 0);
+
+ return kobj->sd && kobj->sd->dir.subdirs;
+}
+
struct kobj_type {
void (*release)(struct kobject *kobj);
const struct sysfs_ops *sysfs_ops;



2019-02-04 11:07:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 25/30] mm: hwpoison: use do_send_sig_info() instead of force_sig()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Naoya Horiguchi <[email protected]>

commit 6376360ecbe525a9c17b3d081dfd88ba3e4ed65b upstream.

Currently memory_failure() is racy against process's exiting, which
results in kernel crash by null pointer dereference.

The root cause is that memory_failure() uses force_sig() to forcibly
kill asynchronous (meaning not in the current context) processes. As
discussed in thread https://lkml.org/lkml/2010/6/8/236 years ago for OOM
fixes, this is not a right thing to do. OOM solves this issue by using
do_send_sig_info() as done in commit d2d393099de2 ("signal:
oom_kill_task: use SEND_SIG_FORCED instead of force_sig()"), so this
patch is suggesting to do the same for hwpoison. do_send_sig_info()
properly accesses to siglock with lock_task_sighand(), so is free from
the reported race.

I confirmed that the reported bug reproduces with inserting some delay
in kill_procs(), and it never reproduces with this patch.

Note that memory_failure() can send another type of signal using
force_sig_mceerr(), and the reported race shouldn't happen on it because
force_sig_mceerr() is called only for synchronous processes (i.e.
BUS_MCEERR_AR happens only when some process accesses to the corrupted
memory.)

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reported-by: Jane Chu <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/memory-failure.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -336,7 +336,8 @@ static void kill_procs(struct list_head
if (fail || tk->addr_valid == 0) {
pr_err("Memory failure: %#lx: forcibly killing %s:%d because of failure to unmap corrupted page\n",
pfn, tk->tsk->comm, tk->tsk->pid);
- force_sig(SIGKILL, tk->tsk);
+ do_send_sig_info(SIGKILL, SEND_SIG_PRIV,
+ tk->tsk, PIDTYPE_PID);
}

/*



2019-02-04 11:07:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 09/30] net/mlx5e: Allow MAC invalidation while spoofchk is ON

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Aya Levin <[email protected]>

[ Upstream commit 9d2cbdc5d334967c35b5f58c7bf3208e17325647 ]

Prior to this patch the driver prohibited spoof checking on invalid MAC.
Now the user can set this configuration if it wishes to.

This is required since libvirt might invalidate the VF Mac by setting it
to zero, while spoofcheck is ON.

Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore")
Signed-off-by: Aya Levin <[email protected]>
Reviewed-by: Eran Ben Elisha <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 19 ++++++-------------
1 file changed, 6 insertions(+), 13 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1216,14 +1216,6 @@ static int esw_vport_ingress_config(stru
int err = 0;
u8 *smac_v;

- if (vport->info.spoofchk && !is_valid_ether_addr(vport->info.mac)) {
- mlx5_core_warn(esw->dev,
- "vport[%d] configure ingress rules failed, illegal mac with spoofchk\n",
- vport->vport);
- return -EPERM;
-
- }
-
esw_vport_cleanup_ingress_rules(esw, vport);

if (!vport->info.vlan && !vport->info.qos && !vport->info.spoofchk) {
@@ -1709,13 +1701,10 @@ int mlx5_eswitch_set_vport_mac(struct ml
mutex_lock(&esw->state_lock);
evport = &esw->vports[vport];

- if (evport->info.spoofchk && !is_valid_ether_addr(mac)) {
+ if (evport->info.spoofchk && !is_valid_ether_addr(mac))
mlx5_core_warn(esw->dev,
- "MAC invalidation is not allowed when spoofchk is on, vport(%d)\n",
+ "Set invalid MAC while spoofchk is on, vport(%d)\n",
vport);
- err = -EPERM;
- goto unlock;
- }

err = mlx5_modify_nic_vport_mac_address(esw->dev, vport, mac);
if (err) {
@@ -1859,6 +1848,10 @@ int mlx5_eswitch_set_vport_spoofchk(stru
evport = &esw->vports[vport];
pschk = evport->info.spoofchk;
evport->info.spoofchk = spoofchk;
+ if (pschk && !is_valid_ether_addr(evport->info.mac))
+ mlx5_core_warn(esw->dev,
+ "Spoofchk in set while MAC is invalid, vport(%d)\n",
+ evport->vport);
if (evport->enabled && esw->mode == SRIOV_LEGACY)
err = esw_vport_ingress_config(esw, evport);
if (err)



2019-02-04 11:07:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 08/30] ucc_geth: Reset BQL queue when stopping device

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mathias Thore <[email protected]>

[ Upstream commit e15aa3b2b1388c399c1a2ce08550d2cc4f7e3e14 ]

After a timeout event caused by for example a broadcast storm, when
the MAC and PHY are reset, the BQL TX queue needs to be reset as
well. Otherwise, the device will exhibit severe performance issues
even after the storm has ended.

Co-authored-by: David Gounaris <[email protected]>
Signed-off-by: Mathias Thore <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/freescale/ucc_geth.c | 2 ++
1 file changed, 2 insertions(+)

--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -1888,6 +1888,8 @@ static void ucc_geth_free_tx(struct ucc_
u16 i, j;
u8 __iomem *bd;

+ netdev_reset_queue(ugeth->ndev);
+
ug_info = ugeth->ug_info;
uf_info = &ug_info->uf_info;




2019-02-04 11:07:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 05/30] net/mlx4_core: Add masking for a few queries on HCA caps

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Aya Levin <[email protected]>

[ Upstream commit a40ded6043658444ee4dd6ee374119e4e98b33fc ]

Driver reads the query HCA capabilities without the corresponding masks.
Without the correct masks, the base addresses of the queues are
unaligned. In addition some reserved bits were wrongly read. Using the
correct masks, ensures alignment of the base addresses and allows future
firmware versions safe use of the reserved bits.

Fixes: ab9c17a009ee ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet")
Fixes: 0ff1fb654bec ("{NET, IB}/mlx4: Add device managed flow steering firmware API")
Signed-off-by: Aya Levin <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/fw.c | 75 +++++++++++++++++++-------------
1 file changed, 46 insertions(+), 29 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -2037,9 +2037,11 @@ int mlx4_QUERY_HCA(struct mlx4_dev *dev,
{
struct mlx4_cmd_mailbox *mailbox;
__be32 *outbox;
+ u64 qword_field;
u32 dword_field;
- int err;
+ u16 word_field;
u8 byte_field;
+ int err;
static const u8 a0_dmfs_query_hw_steering[] = {
[0] = MLX4_STEERING_DMFS_A0_DEFAULT,
[1] = MLX4_STEERING_DMFS_A0_DYNAMIC,
@@ -2067,19 +2069,32 @@ int mlx4_QUERY_HCA(struct mlx4_dev *dev,

/* QPC/EEC/CQC/EQC/RDMARC attributes */

- MLX4_GET(param->qpc_base, outbox, INIT_HCA_QPC_BASE_OFFSET);
- MLX4_GET(param->log_num_qps, outbox, INIT_HCA_LOG_QP_OFFSET);
- MLX4_GET(param->srqc_base, outbox, INIT_HCA_SRQC_BASE_OFFSET);
- MLX4_GET(param->log_num_srqs, outbox, INIT_HCA_LOG_SRQ_OFFSET);
- MLX4_GET(param->cqc_base, outbox, INIT_HCA_CQC_BASE_OFFSET);
- MLX4_GET(param->log_num_cqs, outbox, INIT_HCA_LOG_CQ_OFFSET);
- MLX4_GET(param->altc_base, outbox, INIT_HCA_ALTC_BASE_OFFSET);
- MLX4_GET(param->auxc_base, outbox, INIT_HCA_AUXC_BASE_OFFSET);
- MLX4_GET(param->eqc_base, outbox, INIT_HCA_EQC_BASE_OFFSET);
- MLX4_GET(param->log_num_eqs, outbox, INIT_HCA_LOG_EQ_OFFSET);
- MLX4_GET(param->num_sys_eqs, outbox, INIT_HCA_NUM_SYS_EQS_OFFSET);
- MLX4_GET(param->rdmarc_base, outbox, INIT_HCA_RDMARC_BASE_OFFSET);
- MLX4_GET(param->log_rd_per_qp, outbox, INIT_HCA_LOG_RD_OFFSET);
+ MLX4_GET(qword_field, outbox, INIT_HCA_QPC_BASE_OFFSET);
+ param->qpc_base = qword_field & ~((u64)0x1f);
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_QP_OFFSET);
+ param->log_num_qps = byte_field & 0x1f;
+ MLX4_GET(qword_field, outbox, INIT_HCA_SRQC_BASE_OFFSET);
+ param->srqc_base = qword_field & ~((u64)0x1f);
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_SRQ_OFFSET);
+ param->log_num_srqs = byte_field & 0x1f;
+ MLX4_GET(qword_field, outbox, INIT_HCA_CQC_BASE_OFFSET);
+ param->cqc_base = qword_field & ~((u64)0x1f);
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_CQ_OFFSET);
+ param->log_num_cqs = byte_field & 0x1f;
+ MLX4_GET(qword_field, outbox, INIT_HCA_ALTC_BASE_OFFSET);
+ param->altc_base = qword_field;
+ MLX4_GET(qword_field, outbox, INIT_HCA_AUXC_BASE_OFFSET);
+ param->auxc_base = qword_field;
+ MLX4_GET(qword_field, outbox, INIT_HCA_EQC_BASE_OFFSET);
+ param->eqc_base = qword_field & ~((u64)0x1f);
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_EQ_OFFSET);
+ param->log_num_eqs = byte_field & 0x1f;
+ MLX4_GET(word_field, outbox, INIT_HCA_NUM_SYS_EQS_OFFSET);
+ param->num_sys_eqs = word_field & 0xfff;
+ MLX4_GET(qword_field, outbox, INIT_HCA_RDMARC_BASE_OFFSET);
+ param->rdmarc_base = qword_field & ~((u64)0x1f);
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_RD_OFFSET);
+ param->log_rd_per_qp = byte_field & 0x7;

MLX4_GET(dword_field, outbox, INIT_HCA_FLAGS_OFFSET);
if (dword_field & (1 << INIT_HCA_DEVICE_MANAGED_FLOW_STEERING_EN)) {
@@ -2098,22 +2113,21 @@ int mlx4_QUERY_HCA(struct mlx4_dev *dev,
/* steering attributes */
if (param->steering_mode == MLX4_STEERING_MODE_DEVICE_MANAGED) {
MLX4_GET(param->mc_base, outbox, INIT_HCA_FS_BASE_OFFSET);
- MLX4_GET(param->log_mc_entry_sz, outbox,
- INIT_HCA_FS_LOG_ENTRY_SZ_OFFSET);
- MLX4_GET(param->log_mc_table_sz, outbox,
- INIT_HCA_FS_LOG_TABLE_SZ_OFFSET);
- MLX4_GET(byte_field, outbox,
- INIT_HCA_FS_A0_OFFSET);
+ MLX4_GET(byte_field, outbox, INIT_HCA_FS_LOG_ENTRY_SZ_OFFSET);
+ param->log_mc_entry_sz = byte_field & 0x1f;
+ MLX4_GET(byte_field, outbox, INIT_HCA_FS_LOG_TABLE_SZ_OFFSET);
+ param->log_mc_table_sz = byte_field & 0x1f;
+ MLX4_GET(byte_field, outbox, INIT_HCA_FS_A0_OFFSET);
param->dmfs_high_steer_mode =
a0_dmfs_query_hw_steering[(byte_field >> 6) & 3];
} else {
MLX4_GET(param->mc_base, outbox, INIT_HCA_MC_BASE_OFFSET);
- MLX4_GET(param->log_mc_entry_sz, outbox,
- INIT_HCA_LOG_MC_ENTRY_SZ_OFFSET);
- MLX4_GET(param->log_mc_hash_sz, outbox,
- INIT_HCA_LOG_MC_HASH_SZ_OFFSET);
- MLX4_GET(param->log_mc_table_sz, outbox,
- INIT_HCA_LOG_MC_TABLE_SZ_OFFSET);
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_MC_ENTRY_SZ_OFFSET);
+ param->log_mc_entry_sz = byte_field & 0x1f;
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_MC_HASH_SZ_OFFSET);
+ param->log_mc_hash_sz = byte_field & 0x1f;
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_MC_TABLE_SZ_OFFSET);
+ param->log_mc_table_sz = byte_field & 0x1f;
}

/* CX3 is capable of extending CQEs/EQEs from 32 to 64 bytes */
@@ -2137,15 +2151,18 @@ int mlx4_QUERY_HCA(struct mlx4_dev *dev,
/* TPT attributes */

MLX4_GET(param->dmpt_base, outbox, INIT_HCA_DMPT_BASE_OFFSET);
- MLX4_GET(param->mw_enabled, outbox, INIT_HCA_TPT_MW_OFFSET);
- MLX4_GET(param->log_mpt_sz, outbox, INIT_HCA_LOG_MPT_SZ_OFFSET);
+ MLX4_GET(byte_field, outbox, INIT_HCA_TPT_MW_OFFSET);
+ param->mw_enabled = byte_field >> 7;
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_MPT_SZ_OFFSET);
+ param->log_mpt_sz = byte_field & 0x3f;
MLX4_GET(param->mtt_base, outbox, INIT_HCA_MTT_BASE_OFFSET);
MLX4_GET(param->cmpt_base, outbox, INIT_HCA_CMPT_BASE_OFFSET);

/* UAR attributes */

MLX4_GET(param->uar_page_sz, outbox, INIT_HCA_UAR_PAGE_SZ_OFFSET);
- MLX4_GET(param->log_uar_sz, outbox, INIT_HCA_LOG_UAR_SZ_OFFSET);
+ MLX4_GET(byte_field, outbox, INIT_HCA_LOG_UAR_SZ_OFFSET);
+ param->log_uar_sz = byte_field & 0xf;

/* phv_check enable */
MLX4_GET(byte_field, outbox, INIT_HCA_CACHELINE_SZ_OFFSET);



2019-02-04 11:07:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 22/30] mmc: sdhci-iproc: handle mmc_of_parse() errors during probe

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Stefan Wahren <[email protected]>

commit 2bd44dadd5bfb4135162322fd0b45a174d4ad5bf upstream.

We need to handle mmc_of_parse() errors during probe.

This finally fixes the wifi regression on Raspberry Pi 3 series.
In error case the wifi chip was permanently in reset because of
the power sequence depending on the deferred probe of the GPIO expander.

Fixes: b580c52d58d9 ("mmc: sdhci-iproc: add IPROC SDHCI driver")
Cc: [email protected]
Signed-off-by: Stefan Wahren <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


---
drivers/mmc/host/sdhci-iproc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/mmc/host/sdhci-iproc.c
+++ b/drivers/mmc/host/sdhci-iproc.c
@@ -242,7 +242,10 @@ static int sdhci_iproc_probe(struct plat

iproc_host->data = iproc_data;

- mmc_of_parse(host->mmc);
+ ret = mmc_of_parse(host->mmc);
+ if (ret)
+ goto err;
+
sdhci_get_of_property(pdev);

host->mmc->caps |= iproc_host->data->mmc_caps;



2019-02-04 11:07:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 24/30] mm, oom: fix use-after-free in oom_kill_process

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Shakeel Butt <[email protected]>

commit cefc7ef3c87d02fc9307835868ff721ea12cc597 upstream.

Syzbot instance running on upstream kernel found a use-after-free bug in
oom_kill_process. On further inspection it seems like the process
selected to be oom-killed has exited even before reaching
read_lock(&tasklist_lock) in oom_kill_process(). More specifically the
tsk->usage is 1 which is due to get_task_struct() in oom_evaluate_task()
and the put_task_struct within for_each_thread() frees the tsk and
for_each_thread() tries to access the tsk. The easiest fix is to do
get/put across the for_each_thread() on the selected task.

Now the next question is should we continue with the oom-kill as the
previously selected task has exited? However before adding more
complexity and heuristics, let's answer why we even look at the children
of oom-kill selected task? The select_bad_process() has already selected
the worst process in the system/memcg. Due to race, the selected
process might not be the worst at the kill time but does that matter?
The userspace can use the oom_score_adj interface to prefer children to
be killed before the parent. I looked at the history but it seems like
this is there before git history.

Link: http://lkml.kernel.org/r/[email protected]
Reported-by: [email protected]
Fixes: 6b0c81b3be11 ("mm, oom: reduce dependency on tasklist_lock")
Signed-off-by: Shakeel Butt <[email protected]>
Reviewed-by: Roman Gushchin <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/oom_kill.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -861,6 +861,13 @@ static void oom_kill_process(struct oom_
* still freeing memory.
*/
read_lock(&tasklist_lock);
+
+ /*
+ * The task 'p' might have already exited before reaching here. The
+ * put_task_struct() will free task_struct 'p' while the loop still try
+ * to access the field of 'p', so, get an extra reference.
+ */
+ get_task_struct(p);
for_each_thread(p, t) {
list_for_each_entry(child, &t->children, sibling) {
unsigned int child_points;
@@ -880,6 +887,7 @@ static void oom_kill_process(struct oom_
}
}
}
+ put_task_struct(p);
read_unlock(&tasklist_lock);

p = find_lock_task_mm(victim);



2019-02-04 11:07:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 03/30] ipv6: Consider sk_bound_dev_if when binding a socket to an address

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Ahern <[email protected]>

[ Upstream commit c5ee066333ebc322a24a00a743ed941a0c68617e ]

IPv6 does not consider if the socket is bound to a device when binding
to an address. The result is that a socket can be bound to eth0 and then
bound to the address of eth1. If the device is a VRF, the result is that
a socket can only be bound to an address in the default VRF.

Resolve by considering the device if sk_bound_dev_if is set.

This problem exists from the beginning of git history.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/af_inet6.c | 3 +++
1 file changed, 3 insertions(+)

--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -359,6 +359,9 @@ int inet6_bind(struct socket *sock, stru
err = -EINVAL;
goto out_unlock;
}
+ }
+
+ if (sk->sk_bound_dev_if) {
dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
if (!dev) {
err = -ENODEV;



2019-02-04 11:07:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 18/30] arm64: hibernate: Clean the __hyp_text to PoC after resume

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: James Morse <[email protected]>

commit f7daa9c8fd191724b9ab9580a7be55cd1a67d799 upstream.

During resume hibernate restores all physical memory. Any memory
that is accessed with the MMU disabled needs to be cleaned to the
PoC.

KVMs __hyp_text was previously ommitted as it runs with the MMU
enabled, but now that the hyp-stub is located in this section,
we must clean __hyp_text too.

This ensures secondary CPUs that come online after hibernate
has finished resuming, and load KVM via the freshly written
hyp-stub see the correct instructions.

Signed-off-by: James Morse <[email protected]>
Cc: [email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm64/kernel/hibernate.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -297,8 +297,10 @@ int swsusp_arch_suspend(void)
dcache_clean_range(__idmap_text_start, __idmap_text_end);

/* Clean kvm setup code to PoC? */
- if (el2_reset_needed())
+ if (el2_reset_needed()) {
dcache_clean_range(__hyp_idmap_text_start, __hyp_idmap_text_end);
+ dcache_clean_range(__hyp_text_start, __hyp_text_end);
+ }

/*
* Tell the hibernation core that we've just restored



2019-02-04 11:08:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 12/30] ipvlan, l3mdev: fix broken l3s mode wrt local routes

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <[email protected]>

[ Upstream commit d5256083f62e2720f75bb3c5a928a0afe47d6bc3 ]

While implementing ipvlan l3 and l3s mode for kubernetes CNI plugin,
I ran into the issue that while l3 mode is working fine, l3s mode
does not have any connectivity to kube-apiserver and hence all pods
end up in Error state as well. The ipvlan master device sits on
top of a bond device and hostns traffic to kube-apiserver (also running
in hostns) is DNATed from 10.152.183.1:443 to 139.178.29.207:37573
where the latter is the address of the bond0. While in l3 mode, a
curl to https://10.152.183.1:443 or to https://139.178.29.207:37573
works fine from hostns, neither of them do in case of l3s. In the
latter only a curl to https://127.0.0.1:37573 appeared to work where
for local addresses of bond0 I saw kernel suddenly starting to emit
ARP requests to query HW address of bond0 which remained unanswered
and neighbor entries in INCOMPLETE state. These ARP requests only
happen while in l3s.

Debugging this further, I found the issue is that l3s mode is piggy-
backing on l3 master device, and in this case local routes are using
l3mdev_master_dev_rcu(dev) instead of net->loopback_dev as per commit
f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev
if relevant") and 5f02ce24c269 ("net: l3mdev: Allow the l3mdev to be
a loopback"). I found that reverting them back into using the
net->loopback_dev fixed ipvlan l3s connectivity and got everything
working for the CNI.

Now judging from 4fbae7d83c98 ("ipvlan: Introduce l3s mode") and the
l3mdev paper in [0] the only sole reason why ipvlan l3s is relying
on l3 master device is to get the l3mdev_ip_rcv() receive hook for
setting the dst entry of the input route without adding its own
ipvlan specific hacks into the receive path, however, any l3 domain
semantics beyond just that are breaking l3s operation. Note that
ipvlan also has the ability to dynamically switch its internal
operation from l3 to l3s for all ports via ipvlan_set_port_mode()
at runtime. In any case, l3 vs l3s soley distinguishes itself by
'de-confusing' netfilter through switching skb->dev to ipvlan slave
device late in NF_INET_LOCAL_IN before handing the skb to L4.

Minimal fix taken here is to add a IFF_L3MDEV_RX_HANDLER flag which,
if set from ipvlan setup, gets us only the wanted l3mdev_l3_rcv() hook
without any additional l3mdev semantics on top. This should also have
minimal impact since dev->priv_flags is already hot in cache. With
this set, l3s mode is working fine and I also get things like
masquerading pod traffic on the ipvlan master properly working.

[0] https://netdevconf.org/1.2/papers/ahern-what-is-l3mdev-paper.pdf

Fixes: f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev if relevant")
Fixes: 5f02ce24c269 ("net: l3mdev: Allow the l3mdev to be a loopback")
Fixes: 4fbae7d83c98 ("ipvlan: Introduce l3s mode")
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Mahesh Bandewar <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Florian Westphal <[email protected]>
Cc: Martynas Pumputis <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ipvlan/ipvlan_main.c | 6 +++---
include/linux/netdevice.h | 8 ++++++++
include/net/l3mdev.h | 3 ++-
3 files changed, 13 insertions(+), 4 deletions(-)

--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -85,12 +85,12 @@ static int ipvlan_set_port_mode(struct i
err = ipvlan_register_nf_hook();
if (!err) {
mdev->l3mdev_ops = &ipvl_l3mdev_ops;
- mdev->priv_flags |= IFF_L3MDEV_MASTER;
+ mdev->priv_flags |= IFF_L3MDEV_RX_HANDLER;
} else
goto fail;
} else if (port->mode == IPVLAN_MODE_L3S) {
/* Old mode was L3S */
- mdev->priv_flags &= ~IFF_L3MDEV_MASTER;
+ mdev->priv_flags &= ~IFF_L3MDEV_RX_HANDLER;
ipvlan_unregister_nf_hook();
mdev->l3mdev_ops = NULL;
}
@@ -158,7 +158,7 @@ static void ipvlan_port_destroy(struct n

dev->priv_flags &= ~IFF_IPVLAN_MASTER;
if (port->mode == IPVLAN_MODE_L3S) {
- dev->priv_flags &= ~IFF_L3MDEV_MASTER;
+ dev->priv_flags &= ~IFF_L3MDEV_RX_HANDLER;
ipvlan_unregister_nf_hook();
dev->l3mdev_ops = NULL;
}
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1368,6 +1368,7 @@ struct net_device_ops {
* @IFF_PHONY_HEADROOM: the headroom value is controlled by an external
* entity (i.e. the master device for bridged veth)
* @IFF_MACSEC: device is a MACsec device
+ * @IFF_L3MDEV_RX_HANDLER: only invoke the rx handler of L3 master device
*/
enum netdev_priv_flags {
IFF_802_1Q_VLAN = 1<<0,
@@ -1398,6 +1399,7 @@ enum netdev_priv_flags {
IFF_RXFH_CONFIGURED = 1<<25,
IFF_PHONY_HEADROOM = 1<<26,
IFF_MACSEC = 1<<27,
+ IFF_L3MDEV_RX_HANDLER = 1<<28,
};

#define IFF_802_1Q_VLAN IFF_802_1Q_VLAN
@@ -1427,6 +1429,7 @@ enum netdev_priv_flags {
#define IFF_TEAM IFF_TEAM
#define IFF_RXFH_CONFIGURED IFF_RXFH_CONFIGURED
#define IFF_MACSEC IFF_MACSEC
+#define IFF_L3MDEV_RX_HANDLER IFF_L3MDEV_RX_HANDLER

/**
* struct net_device - The DEVICE structure.
@@ -4244,6 +4247,11 @@ static inline bool netif_supports_nofcs(
return dev->priv_flags & IFF_SUPP_NOFCS;
}

+static inline bool netif_has_l3_rx_handler(const struct net_device *dev)
+{
+ return dev->priv_flags & IFF_L3MDEV_RX_HANDLER;
+}
+
static inline bool netif_is_l3_master(const struct net_device *dev)
{
return dev->priv_flags & IFF_L3MDEV_MASTER;
--- a/include/net/l3mdev.h
+++ b/include/net/l3mdev.h
@@ -142,7 +142,8 @@ struct sk_buff *l3mdev_l3_rcv(struct sk_

if (netif_is_l3_slave(skb->dev))
master = netdev_master_upper_dev_get_rcu(skb->dev);
- else if (netif_is_l3_master(skb->dev))
+ else if (netif_is_l3_master(skb->dev) ||
+ netif_has_l3_rx_handler(skb->dev))
master = skb->dev;

if (master && master->l3mdev_ops->l3mdev_l3_rcv)



2019-02-04 11:08:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 26/30] mm: migrate: dont rely on __PageMovable() of newpage after unlocking it

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Hildenbrand <[email protected]>

commit e0a352fabce61f730341d119fbedf71ffdb8663f upstream.

We had a race in the old balloon compaction code before b1123ea6d3b3
("mm: balloon: use general non-lru movable page feature") refactored it
that became visible after backporting 195a8c43e93d ("virtio-balloon:
deflate via a page list") without the refactoring.

The bug existed from commit d6d86c0a7f8d ("mm/balloon_compaction:
redesign ballooned pages management") till b1123ea6d3b3 ("mm: balloon:
use general non-lru movable page feature"). d6d86c0a7f8d
("mm/balloon_compaction: redesign ballooned pages management") was
backported to 3.12, so the broken kernels are stable kernels [3.12 -
4.7].

There was a subtle race between dropping the page lock of the newpage in
__unmap_and_move() and checking for __is_movable_balloon_page(newpage).

Just after dropping this page lock, virtio-balloon could go ahead and
deflate the newpage, effectively dequeueing it and clearing PageBalloon,
in turn making __is_movable_balloon_page(newpage) fail.

This resulted in dropping the reference of the newpage via
putback_lru_page(newpage) instead of put_page(newpage), leading to
page->lru getting modified and a !LRU page ending up in the LRU lists.
With 195a8c43e93d ("virtio-balloon: deflate via a page list")
backported, one would suddenly get corrupted lists in
release_pages_balloon():

- WARNING: CPU: 13 PID: 6586 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
- list_del corruption. prev->next should be ffffe253961090a0, but was dead000000000100

Nowadays this race is no longer possible, but it is hidden behind very
ugly handling of __ClearPageMovable() and __PageMovable().

__ClearPageMovable() will not make __PageMovable() fail, only
PageMovable(). So the new check (__PageMovable(newpage)) will still
hold even after newpage was dequeued by virtio-balloon.

If anybody would ever change that special handling, the BUG would be
introduced again. So instead, make it explicit and use the information
of the original isolated page before migration.

This patch can be backported fairly easy to stable kernels (in contrast
to the refactoring).

Link: http://lkml.kernel.org/r/[email protected]
Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management")
Signed-off-by: David Hildenbrand <[email protected]>
Reported-by: Vratislav Bendel <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Rafael Aquini <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Vratislav Bendel <[email protected]>
Cc: Rafael Aquini <[email protected]>
Cc: Konstantin Khlebnikov <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: <[email protected]> [3.12 - 4.7]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/migrate.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1044,10 +1044,13 @@ out:
* If migration is successful, decrease refcount of the newpage
* which will not free the page because new page owner increased
* refcounter. As well, if it is LRU page, add the page to LRU
- * list in here.
+ * list in here. Use the old state of the isolated source page to
+ * determine if we migrated a LRU page. newpage was already unlocked
+ * and possibly modified by its owner - don't rely on the page
+ * state.
*/
if (rc == MIGRATEPAGE_SUCCESS) {
- if (unlikely(__PageMovable(newpage)))
+ if (unlikely(!is_lru))
put_page(newpage);
else
putback_lru_page(newpage);



2019-02-04 11:08:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 02/30] fs: add the fsnotify call to vfs_iter_write

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jimmy Durand Wesolowski <[email protected]>

A bug has been discovered when redirecting splice output to regular files
on EXT4 and tmpfs. Other filesystems might be affected.
This commit fixes the issue for stable series kernel, using one of the
change introduced during the rewrite and refactoring of vfs_iter_write in
4.13, specifically in the
commit abbb65899aec ("fs: implement vfs_iter_write using do_iter_write").

This issue affects v4.4 and v4.9 stable series of kernels.

Without this fix for v4.4 and v4.9 stable, the following upstream commits
(and their dependencies would need to be backported):
* commit abbb65899aec ("fs: implement vfs_iter_write using do_iter_write")
* commit 18e9710ee59c ("fs: implement vfs_iter_read using do_iter_read")
* commit edab5fe38c2c
("fs: move more code into do_iter_read/do_iter_write")
* commit 19c735868dd0 ("fs: remove __do_readv_writev")
* commit 26c87fb7d10d ("fs: remove do_compat_readv_writev")
* commit 251b42a1dc64 ("fs: remove do_readv_writev")

as well as the following dependencies:
* commit bb7462b6fd64
("vfs: use helpers for calling f_op->{read,write}_iter()")
* commit 0f78d06ac1e9
("vfs: pass type instead of fn to do_{loop,iter}_readv_writev()")
* commit 7687a7a4435f
("vfs: extract common parts of {compat_,}do_readv_writev()")

In order to reduce the changes, this commit uses only the part of
commit abbb65899aec ("fs: implement vfs_iter_write using do_iter_write")
that fixes the issue.

This issue and the reproducer can be found on
https://bugzilla.kernel.org/show_bug.cgi?id=85381

Reported-by: Richard Li <[email protected]>
Reported-by: Chad Miller <[email protected]>
Reviewed-by: Stefan Nuernberger <[email protected]>
Reviewed-by: Frank Becker <[email protected]>
Signed-off-by: Jimmy Durand Wesolowski <[email protected]>
---
fs/read_write.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -392,8 +392,10 @@ ssize_t vfs_iter_write(struct file *file
iter->type |= WRITE;
ret = file->f_op->write_iter(&kiocb, iter);
BUG_ON(ret == -EIOCBQUEUED);
- if (ret > 0)
+ if (ret > 0) {
*ppos = kiocb.ki_pos;
+ fsnotify_modify(file);
+ }
return ret;
}
EXPORT_SYMBOL(vfs_iter_write);



2019-02-04 11:08:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 10/30] l2tp: remove l2specific_len dependency in l2tp_core

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Lorenzo Bianconi <[email protected]>

commit 62e7b6a57c7b9bf3c6fd99418eeec05b08a85c38 upstream.

Remove l2specific_len dependency while building l2tpv3 header or
parsing the received frame since default L2-Specific Sublayer is
always four bytes long and we don't need to rely on a user supplied
value.
Moreover in l2tp netlink code there are no sanity checks to
enforce the relation between l2specific_len and l2specific_type,
so sending a malformed netlink message is possible to set
l2specific_type to L2TP_L2SPECTYPE_DEFAULT (or even
L2TP_L2SPECTYPE_NONE) and set l2specific_len to a value greater than
4 leaking memory on the wire and sending corrupted frames.

Reviewed-by: Guillaume Nault <[email protected]>
Tested-by: Guillaume Nault <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/l2tp/l2tp_core.c | 34 ++++++++++++++++------------------
net/l2tp/l2tp_core.h | 11 +++++++++++
2 files changed, 27 insertions(+), 18 deletions(-)

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -795,11 +795,9 @@ void l2tp_recv_common(struct l2tp_sessio
"%s: recv data ns=%u, session nr=%u\n",
session->name, ns, session->nr);
}
+ ptr += 4;
}

- /* Advance past L2-specific header, if present */
- ptr += session->l2specific_len;
-
if (L2TP_SKB_CB(skb)->has_seq) {
/* Received a packet with sequence numbers. If we're the LNS,
* check if we sre sending sequence numbers and if not,
@@ -1121,21 +1119,20 @@ static int l2tp_build_l2tpv3_header(stru
memcpy(bufp, &session->cookie[0], session->cookie_len);
bufp += session->cookie_len;
}
- if (session->l2specific_len) {
- if (session->l2specific_type == L2TP_L2SPECTYPE_DEFAULT) {
- u32 l2h = 0;
- if (session->send_seq) {
- l2h = 0x40000000 | session->ns;
- session->ns++;
- session->ns &= 0xffffff;
- l2tp_dbg(session, L2TP_MSG_SEQ,
- "%s: updated ns to %u\n",
- session->name, session->ns);
- }
+ if (session->l2specific_type == L2TP_L2SPECTYPE_DEFAULT) {
+ u32 l2h = 0;

- *((__be32 *) bufp) = htonl(l2h);
+ if (session->send_seq) {
+ l2h = 0x40000000 | session->ns;
+ session->ns++;
+ session->ns &= 0xffffff;
+ l2tp_dbg(session, L2TP_MSG_SEQ,
+ "%s: updated ns to %u\n",
+ session->name, session->ns);
}
- bufp += session->l2specific_len;
+
+ *((__be32 *)bufp) = htonl(l2h);
+ bufp += 4;
}

return bufp - optr;
@@ -1812,7 +1809,7 @@ int l2tp_session_delete(struct l2tp_sess
EXPORT_SYMBOL_GPL(l2tp_session_delete);

/* We come here whenever a session's send_seq, cookie_len or
- * l2specific_len parameters are set.
+ * l2specific_type parameters are set.
*/
void l2tp_session_set_header_len(struct l2tp_session *session, int version)
{
@@ -1821,7 +1818,8 @@ void l2tp_session_set_header_len(struct
if (session->send_seq)
session->hdr_len += 4;
} else {
- session->hdr_len = 4 + session->cookie_len + session->l2specific_len;
+ session->hdr_len = 4 + session->cookie_len;
+ session->hdr_len += l2tp_get_l2specific_len(session);
if (session->tunnel->encap == L2TP_ENCAPTYPE_UDP)
session->hdr_len += 4;
}
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -314,6 +314,17 @@ do { \
#define l2tp_session_dec_refcount(s) l2tp_session_dec_refcount_1(s)
#endif

+static inline int l2tp_get_l2specific_len(struct l2tp_session *session)
+{
+ switch (session->l2specific_type) {
+ case L2TP_L2SPECTYPE_DEFAULT:
+ return 4;
+ case L2TP_L2SPECTYPE_NONE:
+ default:
+ return 0;
+ }
+}
+
#define l2tp_printk(ptr, type, func, fmt, ...) \
do { \
if (((ptr)->debug) & (type)) \



2019-02-04 11:09:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 14/30] fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Waiman Long <[email protected]>

commit 1dbd449c9943e3145148cc893c2461b72ba6fef0 upstream.

The nr_dentry_unused per-cpu counter tracks dentries in both the LRU
lists and the shrink lists where the DCACHE_LRU_LIST bit is set.

The shrink_dcache_sb() function moves dentries from the LRU list to a
shrink list and subtracts the dentry count from nr_dentry_unused. This
is incorrect as the nr_dentry_unused count will also be decremented in
shrink_dentry_list() via d_shrink_del().

To fix this double decrement, the decrement in the shrink_dcache_sb()
function is taken out.

Fixes: 4e717f5c1083 ("list_lru: remove special case function list_lru_dispose_all."
Cc: [email protected]
Signed-off-by: Waiman Long <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/dcache.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1164,15 +1164,11 @@ static enum lru_status dentry_lru_isolat
*/
void shrink_dcache_sb(struct super_block *sb)
{
- long freed;
-
do {
LIST_HEAD(dispose);

- freed = list_lru_walk(&sb->s_dentry_lru,
+ list_lru_walk(&sb->s_dentry_lru,
dentry_lru_isolate_shrink, &dispose, 1024);
-
- this_cpu_sub(nr_dentry_unused, freed);
shrink_dentry_list(&dispose);
cond_resched();
} while (list_lru_count(&sb->s_dentry_lru) > 0);



2019-02-04 11:09:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 17/30] arm64: hyp-stub: Forbid kprobing of the hyp-stub

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: James Morse <[email protected]>

commit 8fac5cbdfe0f01254d9d265c6aa1a95f94f58595 upstream.

The hyp-stub is loaded by the kernel's early startup code at EL2
during boot, before KVM takes ownership later. The hyp-stub's
text is part of the regular kernel text, meaning it can be kprobed.

A breakpoint in the hyp-stub causes the CPU to spin in el2_sync_invalid.

Add it to the __hyp_text.

Signed-off-by: James Morse <[email protected]>
Cc: [email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm64/kernel/hyp-stub.S | 2 ++
1 file changed, 2 insertions(+)

--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -28,6 +28,8 @@
#include <asm/virt.h>

.text
+ .pushsection .hyp.text, "ax"
+
.align 11

ENTRY(__hyp_stub_vectors)



2019-02-04 11:09:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 15/30] ARM: cns3xxx: Fix writing to wrong PCI config registers after alignment

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Koen Vandeputte <[email protected]>

commit 65dbb423cf28232fed1732b779249d6164c5999b upstream.

Originally, cns3xxx used its own functions for mapping, reading and
writing config registers.

Commit 802b7c06adc7 ("ARM: cns3xxx: Convert PCI to use generic config
accessors") removed the internal PCI config write function in favor of
the generic one:

cns3xxx_pci_write_config() --> pci_generic_config_write()

cns3xxx_pci_write_config() expected aligned addresses, being produced by
cns3xxx_pci_map_bus() while the generic one pci_generic_config_write()
actually expects the real address as both the function and hardware are
capable of byte-aligned writes.

This currently leads to pci_generic_config_write() writing to the wrong
registers.

For instance, upon ath9k module loading:

- driver ath9k gets loaded
- The driver wants to write value 0xA8 to register PCI_LATENCY_TIMER,
located at 0x0D
- cns3xxx_pci_map_bus() aligns the address to 0x0C
- pci_generic_config_write() effectively writes 0xA8 into register 0x0C
(CACHE_LINE_SIZE)

Fix the bug by removing the alignment in the cns3xxx mapping function.

Fixes: 802b7c06adc7 ("ARM: cns3xxx: Convert PCI to use generic config accessors")
Signed-off-by: Koen Vandeputte <[email protected]>
[[email protected]: updated commit log]
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Acked-by: Krzysztof Halasa <[email protected]>
Acked-by: Tim Harvey <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
CC: [email protected] # v4.0+
CC: Bjorn Helgaas <[email protected]>
CC: Olof Johansson <[email protected]>
CC: Robin Leblon <[email protected]>
CC: Rob Herring <[email protected]>
CC: Russell King <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/mach-cns3xxx/pcie.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/mach-cns3xxx/pcie.c
+++ b/arch/arm/mach-cns3xxx/pcie.c
@@ -83,7 +83,7 @@ static void __iomem *cns3xxx_pci_map_bus
} else /* remote PCI bus */
base = cnspci->cfg1_regs + ((busno & 0xf) << 20);

- return base + (where & 0xffc) + (devfn << 12);
+ return base + where + (devfn << 12);
}

static int cns3xxx_pci_read_config(struct pci_bus *bus, unsigned int devfn,



2019-02-04 11:09:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 11/30] l2tp: fix reading optional fields of L2TPv3

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jacob Wen <[email protected]>

[ Upstream commit 4522a70db7aa5e77526a4079628578599821b193 ]

Use pskb_may_pull() to make sure the optional fields are in skb linear
parts, so we can safely read them later.

It's easy to reproduce the issue with a net driver that supports paged
skb data. Just create a L2TPv3 over IP tunnel and then generates some
network traffic.
Once reproduced, rx err in /sys/kernel/debug/l2tp/tunnels will increase.

Changes in v4:
1. s/l2tp_v3_pull_opt/l2tp_v3_ensure_opt_in_linear/
2. s/tunnel->version != L2TP_HDR_VER_2/tunnel->version == L2TP_HDR_VER_3/
3. Add 'Fixes' in commit messages.

Changes in v3:
1. To keep consistency, move the code out of l2tp_recv_common.
2. Use "net" instead of "net-next", since this is a bug fix.

Changes in v2:
1. Only fix L2TPv3 to make code simple.
To fix both L2TPv3 and L2TPv2, we'd better refactor l2tp_recv_common.
It's complicated to do so.
2. Reloading pointers after pskb_may_pull

Fixes: f7faffa3ff8e ("l2tp: Add L2TPv3 protocol support")
Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support")
Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
Signed-off-by: Jacob Wen <[email protected]>
Acked-by: Guillaume Nault <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/l2tp/l2tp_core.c | 4 ++++
net/l2tp/l2tp_core.h | 20 ++++++++++++++++++++
net/l2tp/l2tp_ip.c | 3 +++
net/l2tp/l2tp_ip6.c | 3 +++
4 files changed, 30 insertions(+)

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1020,6 +1020,10 @@ static int l2tp_udp_recv_core(struct l2t
goto error;
}

+ if (tunnel->version == L2TP_HDR_VER_3 &&
+ l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr))
+ goto error;
+
l2tp_recv_common(session, skb, ptr, optr, hdrflags, length, payload_hook);
l2tp_session_dec_refcount(session);

--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -325,6 +325,26 @@ static inline int l2tp_get_l2specific_le
}
}

+static inline int l2tp_v3_ensure_opt_in_linear(struct l2tp_session *session, struct sk_buff *skb,
+ unsigned char **ptr, unsigned char **optr)
+{
+ int opt_len = session->peer_cookie_len + l2tp_get_l2specific_len(session);
+
+ if (opt_len > 0) {
+ int off = *ptr - *optr;
+
+ if (!pskb_may_pull(skb, off + opt_len))
+ return -1;
+
+ if (skb->data != *optr) {
+ *optr = skb->data;
+ *ptr = skb->data + off;
+ }
+ }
+
+ return 0;
+}
+
#define l2tp_printk(ptr, type, func, fmt, ...) \
do { \
if (((ptr)->debug) & (type)) \
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -157,6 +157,9 @@ static int l2tp_ip_recv(struct sk_buff *
print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, ptr, length);
}

+ if (l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr))
+ goto discard_sess;
+
l2tp_recv_common(session, skb, ptr, optr, 0, skb->len, tunnel->recv_payload_hook);
l2tp_session_dec_refcount(session);

--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -169,6 +169,9 @@ static int l2tp_ip6_recv(struct sk_buff
print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, ptr, length);
}

+ if (l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr))
+ goto discard_sess;
+
l2tp_recv_common(session, skb, ptr, optr, 0, skb->len,
tunnel->recv_payload_hook);
l2tp_session_dec_refcount(session);



2019-02-04 11:15:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.9 30/30] fanotify: fix handling of events on child sub-directory

On Mon, Feb 04, 2019 at 12:48:00PM +0200, Amir Goldstein wrote:
> On Mon, Feb 4, 2019 at 12:44 PM Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > 4.9-stable review patch. If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Amir Goldstein <[email protected]>
> >
> > commit b469e7e47c8a075cc08bcd1e85d4365134bdcdd5 upstream.
> >
> > When an event is reported on a sub-directory and the parent inode has
> > a mark mask with FS_EVENT_ON_CHILD|FS_ISDIR, the event will be sent to
> > fsnotify() even if the event type is not in the parent mark mask
> > (e.g. FS_OPEN).
> >
> > Further more, if that event happened on a mount or a filesystem with
> > a mount/sb mark that does have that event type in their mask, the "on
> > child" event will be reported on the mount/sb mark. That is not
> > desired, because user will get a duplicate event for the same action.
> >
> > Note that the event reported on the victim inode is never merged with
> > the event reported on the parent inode, because of the check in
> > should_merge(): old_fsn->inode == new_fsn->inode.
> >
> > Fix this by looking for a match of an actual event type (i.e. not just
> > FS_ISDIR) in parent's inode mark mask and by not reporting an "on child"
> > event to group if event type is only found on mount/sb marks.
> >
> > [backport hint: The bug seems to have always been in fanotify, but this
> > patch will only apply cleanly to v4.19.y]
> >
>
> Same comment about this backport hint being misleading in the
> context of the backport patch.

I try to leave the text in the changelog identical to what is upstream
to make it easier to track things.

thanks,

greg k-h

2019-02-04 21:57:49

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/30] 4.9.155-stable review

On Mon, Feb 04, 2019 at 11:36:38AM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.155 release.
> There are 30 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Feb 6 10:35:37 UTC 2019.
> Anything received after that time might be too late.
>

Build results:
total: 172 pass: 172 fail: 0
Qemu test results:
total: 315 pass: 315 fail: 0

Guenter

2019-02-05 06:42:00

by Naresh Kamboju

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/30] 4.9.155-stable review

On Mon, 4 Feb 2019 at 16:12, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 4.9.155 release.
> There are 30 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Feb 6 10:35:37 UTC 2019.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.155-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary
------------------------------------------------------------------------

kernel: 4.9.155-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.9.y
git commit: 09c6f7d1841f28f385a0082252c5c041dad35ec1
git describe: v4.9.154-31-g09c6f7d1841f
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.154-31-g09c6f7d1841f

No regressions (compared to build v4.9.153-46-gac8774c5cafe)

No fixes (compared to build v4.9.153-46-gac8774c5cafe)

Ran 22030 total tests in the following environments and test suites.

Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
-----------
* boot
* install-android-platform-tools-r2600
* kselftest
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* spectre-meltdown-checker-test
* ltp-open-posix-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none

--
Linaro LKFT
https://lkft.linaro.org

2019-02-05 11:41:58

by Jon Hunter

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/30] 4.9.155-stable review


On 04/02/2019 10:36, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.155 release.
> There are 30 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Feb 6 10:35:37 UTC 2019.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.155-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
All tests are passing for Tegra ...

Test results for stable-v4.9:
8 builds: 8 pass, 0 fail
16 boots: 16 pass, 0 fail
14 tests: 14 pass, 0 fail

Linux version: 4.9.155-rc1-g09c6f7d
Boards tested: tegra124-jetson-tk1, tegra20-ventana,
tegra210-p2371-2180, tegra30-cardhu-a04

Cheers
Jon

--
nvpublic