This is the start of the stable review cycle for the 5.4.87 release.
There are 47 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 06 Jan 2021 15:56:52 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.87-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <[email protected]>
Linux 5.4.87-rc1
Hyeongseok Kim <[email protected]>
dm verity: skip verity work if I/O error when system is shutting down
Takashi Iwai <[email protected]>
ALSA: pcm: Clear the full allocated memory at hw_params
Thomas Gleixner <[email protected]>
tick/sched: Remove bogus boot "safety" check
Gabriel Krisman Bertazi <[email protected]>
um: ubd: Submit all data segments atomically
Eric Biggers <[email protected]>
fs/namespace.c: WARN if mnt_count has become negative
Jessica Yu <[email protected]>
module: delay kobject uevent until after module init call
Jaegeuk Kim <[email protected]>
f2fs: avoid race condition for shrinker count
Trond Myklebust <[email protected]>
NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode
Qinglang Miao <[email protected]>
i3c master: fix missing destroy_workqueue() on error in i3c_master_register
Qinglang Miao <[email protected]>
powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe()
Zheng Liang <[email protected]>
rtc: pl031: fix resource leak in pl031_probe
Jan Kara <[email protected]>
quota: Don't overflow quota file offsets
Miroslav Benes <[email protected]>
module: set MODULE_STATE_GOING state when a module fails to load
Dinghao Liu <[email protected]>
rtc: sun6i: Fix memleak in sun6i_rtc_clk_init
Boqun Feng <[email protected]>
fcntl: Fix potential deadlock in send_sig{io, urg}()
Randy Dunlap <[email protected]>
bfs: don't use WARNING: string when it's just info.
Takashi Iwai <[email protected]>
ALSA: rawmidi: Access runtime->avail always in spinlock
Takashi Iwai <[email protected]>
ALSA: seq: Use bool for snd_seq_queue internal flags
Chao Yu <[email protected]>
f2fs: fix shift-out-of-bounds in sanity_check_raw_super()
Mauro Carvalho Chehab <[email protected]>
media: gp8psk: initialize stats at power control logic
Anant Thazhemadam <[email protected]>
misc: vmw_vmci: fix kernel info-leak by initializing dbells in vmci_ctx_get_chkpt_doorbells()
Rustam Kovhaev <[email protected]>
reiserfs: add check for an invalid ih_entry_count
Anant Thazhemadam <[email protected]>
Bluetooth: hci_h5: close serdev device and free hu in h5_close
Randy Dunlap <[email protected]>
scsi: cxgb4i: Fix TLS dependency
Qinglang Miao <[email protected]>
cgroup: Fix memory leak when parsing multiple source parameters
Johan Hovold <[email protected]>
of: fix linker-section match-table corruption
Damien Le Moal <[email protected]>
null_blk: Fix zone size initialization
Arnaldo Carvalho de Melo <[email protected]>
tools headers UAPI: Sync linux/const.h with the kernel headers
Petr Vorel <[email protected]>
uapi: move constants from <linux/kernel.h> to <linux/const.h>
Bart Van Assche <[email protected]>
scsi: block: Fix a race in the runtime power management code
Jamie Iles <[email protected]>
jffs2: Fix NULL pointer dereference in rp_size fs option parsing
lizhe <[email protected]>
jffs2: Allow setting rp_size to zero during remounting
Christophe Leroy <[email protected]>
powerpc/bitops: Fix possible undefined behaviour with fls() and fls64()
Paolo Bonzini <[email protected]>
KVM: x86: reinstate vendor-agnostic check on SPEC_CTRL cpuid bits
Paolo Bonzini <[email protected]>
KVM: SVM: relax conditions for allowing MSR_IA32_SPEC_CTRL accesses
Paolo Bonzini <[email protected]>
KVM: x86: avoid incorrect writes to host MSR_IA32_SPEC_CTRL
Jan Kara <[email protected]>
ext4: don't remount read-only with errors=continue on reboot
Filipe Manana <[email protected]>
btrfs: fix race when defragmenting leads to unnecessary IO
Eric Auger <[email protected]>
vfio/pci: Move dummy_resources_list init in vfio_pci_probe()
Eric Biggers <[email protected]>
fscrypt: remove kernel-internal constants from UAPI header
Eric Biggers <[email protected]>
fscrypt: add fscrypt_is_nokey_name()
Eric Biggers <[email protected]>
f2fs: prevent creating duplicate encrypted filenames
Eric Biggers <[email protected]>
ubifs: prevent creating duplicate encrypted filenames
Eric Biggers <[email protected]>
ext4: prevent creating duplicate encrypted filenames
Zhuguangqing <[email protected]>
thermal/drivers/cpufreq_cooling: Update cpufreq_state only if state has changed
Kevin Vigor <[email protected]>
md/raid10: initialize r10_bio->read_slot before use.
Davide Caratti <[email protected]>
net/sched: sch_taprio: reset child qdiscs before freeing them
-------------
Diffstat:
Makefile | 4 +-
arch/powerpc/include/asm/bitops.h | 23 +++-
arch/powerpc/sysdev/mpic_msgr.c | 2 +-
arch/um/drivers/ubd_kern.c | 191 +++++++++++++++++++-------------
arch/x86/kvm/cpuid.h | 14 +++
arch/x86/kvm/svm.c | 17 +--
arch/x86/kvm/vmx/vmx.c | 13 +--
arch/x86/kvm/x86.c | 22 ++++
arch/x86/kvm/x86.h | 1 +
block/blk-pm.c | 15 ++-
drivers/block/null_blk_zoned.c | 19 ++--
drivers/bluetooth/hci_h5.c | 8 +-
drivers/i3c/master.c | 5 +-
drivers/md/dm-verity-target.c | 12 +-
drivers/md/raid10.c | 3 +-
drivers/media/usb/dvb-usb/gp8psk.c | 2 +-
drivers/misc/vmw_vmci/vmci_context.c | 2 +-
drivers/rtc/rtc-pl031.c | 6 +-
drivers/rtc/rtc-sun6i.c | 8 +-
drivers/scsi/cxgbi/cxgb4i/Kconfig | 1 +
drivers/thermal/cpu_cooling.c | 9 +-
drivers/vfio/pci/vfio_pci.c | 3 +-
fs/bfs/inode.c | 2 +-
fs/btrfs/ioctl.c | 39 +++++++
fs/crypto/fscrypt_private.h | 5 +-
fs/crypto/hooks.c | 10 +-
fs/crypto/keysetup.c | 2 +
fs/crypto/policy.c | 6 +-
fs/ext4/namei.c | 3 +
fs/ext4/super.c | 14 +--
fs/f2fs/checkpoint.c | 2 +-
fs/f2fs/debug.c | 11 +-
fs/f2fs/f2fs.h | 12 +-
fs/f2fs/node.c | 29 +++--
fs/f2fs/node.h | 4 +-
fs/f2fs/shrinker.c | 4 +-
fs/f2fs/super.c | 9 +-
fs/fcntl.c | 10 +-
fs/jffs2/jffs2_fs_sb.h | 1 +
fs/jffs2/super.c | 17 +--
fs/namespace.c | 9 +-
fs/nfs/nfs4super.c | 2 +-
fs/nfs/pnfs.c | 33 +++++-
fs/nfs/pnfs.h | 5 +
fs/pnode.h | 2 +-
fs/quota/quota_tree.c | 8 +-
fs/reiserfs/stree.c | 6 +
fs/ubifs/dir.c | 17 ++-
include/linux/fscrypt.h | 34 ++++++
include/linux/of.h | 1 +
include/uapi/linux/const.h | 5 +
include/uapi/linux/ethtool.h | 2 +-
include/uapi/linux/fscrypt.h | 5 +-
include/uapi/linux/kernel.h | 9 +-
include/uapi/linux/lightnvm.h | 2 +-
include/uapi/linux/mroute6.h | 2 +-
include/uapi/linux/netfilter/x_tables.h | 2 +-
include/uapi/linux/netlink.h | 2 +-
include/uapi/linux/sysctl.h | 2 +-
kernel/cgroup/cgroup-v1.c | 2 +
kernel/module.c | 6 +-
kernel/time/tick-sched.c | 7 --
net/sched/sch_taprio.c | 17 ++-
sound/core/pcm_native.c | 9 +-
sound/core/rawmidi.c | 49 +++++---
sound/core/seq/seq_queue.h | 8 +-
tools/include/uapi/linux/const.h | 5 +
67 files changed, 563 insertions(+), 248 deletions(-)
From: Jaegeuk Kim <[email protected]>
[ Upstream commit a95ba66ac1457b76fe472c8e092ab1006271f16c ]
Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
wrong do_shinker work. Let's avoid to return insane overflowed value by adding
single tracking value.
Reported-by: Light Hsieh <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/f2fs/checkpoint.c | 2 +-
fs/f2fs/debug.c | 11 ++++++-----
fs/f2fs/f2fs.h | 10 ++++++++--
fs/f2fs/node.c | 29 ++++++++++++++++++-----------
fs/f2fs/node.h | 4 ++--
fs/f2fs/shrinker.c | 4 +---
6 files changed, 36 insertions(+), 24 deletions(-)
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index c966ccc44c157..a57219c51c01a 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1596,7 +1596,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
goto out;
}
- if (NM_I(sbi)->dirty_nat_cnt == 0 &&
+ if (NM_I(sbi)->nat_cnt[DIRTY_NAT] == 0 &&
SIT_I(sbi)->dirty_sentries == 0 &&
prefree_segments(sbi) == 0) {
f2fs_flush_sit_entries(sbi, cpc);
diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index 9b0bedd82581b..d8d64447bc947 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -107,8 +107,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
si->node_pages = NODE_MAPPING(sbi)->nrpages;
if (sbi->meta_inode)
si->meta_pages = META_MAPPING(sbi)->nrpages;
- si->nats = NM_I(sbi)->nat_cnt;
- si->dirty_nats = NM_I(sbi)->dirty_nat_cnt;
+ si->nats = NM_I(sbi)->nat_cnt[TOTAL_NAT];
+ si->dirty_nats = NM_I(sbi)->nat_cnt[DIRTY_NAT];
si->sits = MAIN_SEGS(sbi);
si->dirty_sits = SIT_I(sbi)->dirty_sentries;
si->free_nids = NM_I(sbi)->nid_cnt[FREE_NID];
@@ -254,9 +254,10 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
si->cache_mem += (NM_I(sbi)->nid_cnt[FREE_NID] +
NM_I(sbi)->nid_cnt[PREALLOC_NID]) *
sizeof(struct free_nid);
- si->cache_mem += NM_I(sbi)->nat_cnt * sizeof(struct nat_entry);
- si->cache_mem += NM_I(sbi)->dirty_nat_cnt *
- sizeof(struct nat_entry_set);
+ si->cache_mem += NM_I(sbi)->nat_cnt[TOTAL_NAT] *
+ sizeof(struct nat_entry);
+ si->cache_mem += NM_I(sbi)->nat_cnt[DIRTY_NAT] *
+ sizeof(struct nat_entry_set);
si->cache_mem += si->inmem_pages * sizeof(struct inmem_pages);
for (i = 0; i < MAX_INO_ENTRY; i++)
si->cache_mem += sbi->im[i].ino_num * sizeof(struct ino_entry);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 0ddc4a74b9d43..4ca3c2a0a0f5b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -797,6 +797,13 @@ enum nid_state {
MAX_NID_STATE,
};
+enum nat_state {
+ TOTAL_NAT,
+ DIRTY_NAT,
+ RECLAIMABLE_NAT,
+ MAX_NAT_STATE,
+};
+
struct f2fs_nm_info {
block_t nat_blkaddr; /* base disk address of NAT */
nid_t max_nid; /* maximum possible node ids */
@@ -812,8 +819,7 @@ struct f2fs_nm_info {
struct rw_semaphore nat_tree_lock; /* protect nat_tree_lock */
struct list_head nat_entries; /* cached nat entry list (clean) */
spinlock_t nat_list_lock; /* protect clean nat entry list */
- unsigned int nat_cnt; /* the # of cached nat entries */
- unsigned int dirty_nat_cnt; /* total num of nat entries in set */
+ unsigned int nat_cnt[MAX_NAT_STATE]; /* the # of cached nat entries */
unsigned int nat_blocks; /* # of nat blocks */
/* free node ids management */
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 3ac2a4b32375d..7ce33698ae381 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -62,8 +62,8 @@ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type)
sizeof(struct free_nid)) >> PAGE_SHIFT;
res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 2);
} else if (type == NAT_ENTRIES) {
- mem_size = (nm_i->nat_cnt * sizeof(struct nat_entry)) >>
- PAGE_SHIFT;
+ mem_size = (nm_i->nat_cnt[TOTAL_NAT] *
+ sizeof(struct nat_entry)) >> PAGE_SHIFT;
res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 2);
if (excess_cached_nats(sbi))
res = false;
@@ -177,7 +177,8 @@ static struct nat_entry *__init_nat_entry(struct f2fs_nm_info *nm_i,
list_add_tail(&ne->list, &nm_i->nat_entries);
spin_unlock(&nm_i->nat_list_lock);
- nm_i->nat_cnt++;
+ nm_i->nat_cnt[TOTAL_NAT]++;
+ nm_i->nat_cnt[RECLAIMABLE_NAT]++;
return ne;
}
@@ -207,7 +208,8 @@ static unsigned int __gang_lookup_nat_cache(struct f2fs_nm_info *nm_i,
static void __del_from_nat_cache(struct f2fs_nm_info *nm_i, struct nat_entry *e)
{
radix_tree_delete(&nm_i->nat_root, nat_get_nid(e));
- nm_i->nat_cnt--;
+ nm_i->nat_cnt[TOTAL_NAT]--;
+ nm_i->nat_cnt[RECLAIMABLE_NAT]--;
__free_nat_entry(e);
}
@@ -253,7 +255,8 @@ static void __set_nat_cache_dirty(struct f2fs_nm_info *nm_i,
if (get_nat_flag(ne, IS_DIRTY))
goto refresh_list;
- nm_i->dirty_nat_cnt++;
+ nm_i->nat_cnt[DIRTY_NAT]++;
+ nm_i->nat_cnt[RECLAIMABLE_NAT]--;
set_nat_flag(ne, IS_DIRTY, true);
refresh_list:
spin_lock(&nm_i->nat_list_lock);
@@ -273,7 +276,8 @@ static void __clear_nat_cache_dirty(struct f2fs_nm_info *nm_i,
set_nat_flag(ne, IS_DIRTY, false);
set->entry_cnt--;
- nm_i->dirty_nat_cnt--;
+ nm_i->nat_cnt[DIRTY_NAT]--;
+ nm_i->nat_cnt[RECLAIMABLE_NAT]++;
}
static unsigned int __gang_lookup_nat_set(struct f2fs_nm_info *nm_i,
@@ -2881,14 +2885,17 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
LIST_HEAD(sets);
int err = 0;
- /* during unmount, let's flush nat_bits before checking dirty_nat_cnt */
+ /*
+ * during unmount, let's flush nat_bits before checking
+ * nat_cnt[DIRTY_NAT].
+ */
if (enabled_nat_bits(sbi, cpc)) {
down_write(&nm_i->nat_tree_lock);
remove_nats_in_journal(sbi);
up_write(&nm_i->nat_tree_lock);
}
- if (!nm_i->dirty_nat_cnt)
+ if (!nm_i->nat_cnt[DIRTY_NAT])
return 0;
down_write(&nm_i->nat_tree_lock);
@@ -2899,7 +2906,8 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
* into nat entry set.
*/
if (enabled_nat_bits(sbi, cpc) ||
- !__has_cursum_space(journal, nm_i->dirty_nat_cnt, NAT_JOURNAL))
+ !__has_cursum_space(journal,
+ nm_i->nat_cnt[DIRTY_NAT], NAT_JOURNAL))
remove_nats_in_journal(sbi);
while ((found = __gang_lookup_nat_set(nm_i,
@@ -3023,7 +3031,6 @@ static int init_node_manager(struct f2fs_sb_info *sbi)
F2FS_RESERVED_NODE_NUM;
nm_i->nid_cnt[FREE_NID] = 0;
nm_i->nid_cnt[PREALLOC_NID] = 0;
- nm_i->nat_cnt = 0;
nm_i->ram_thresh = DEF_RAM_THRESHOLD;
nm_i->ra_nid_pages = DEF_RA_NID_PAGES;
nm_i->dirty_nats_ratio = DEF_DIRTY_NAT_RATIO_THRESHOLD;
@@ -3160,7 +3167,7 @@ void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
__del_from_nat_cache(nm_i, natvec[idx]);
}
}
- f2fs_bug_on(sbi, nm_i->nat_cnt);
+ f2fs_bug_on(sbi, nm_i->nat_cnt[TOTAL_NAT]);
/* destroy nat set cache */
nid = 0;
diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
index e05af5df56485..4a2e7eaf2b028 100644
--- a/fs/f2fs/node.h
+++ b/fs/f2fs/node.h
@@ -123,13 +123,13 @@ static inline void raw_nat_from_node_info(struct f2fs_nat_entry *raw_ne,
static inline bool excess_dirty_nats(struct f2fs_sb_info *sbi)
{
- return NM_I(sbi)->dirty_nat_cnt >= NM_I(sbi)->max_nid *
+ return NM_I(sbi)->nat_cnt[DIRTY_NAT] >= NM_I(sbi)->max_nid *
NM_I(sbi)->dirty_nats_ratio / 100;
}
static inline bool excess_cached_nats(struct f2fs_sb_info *sbi)
{
- return NM_I(sbi)->nat_cnt >= DEF_NAT_CACHE_THRESHOLD;
+ return NM_I(sbi)->nat_cnt[TOTAL_NAT] >= DEF_NAT_CACHE_THRESHOLD;
}
static inline bool excess_dirty_nodes(struct f2fs_sb_info *sbi)
diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c
index a467aca29cfef..3ceebaaee3840 100644
--- a/fs/f2fs/shrinker.c
+++ b/fs/f2fs/shrinker.c
@@ -18,9 +18,7 @@ static unsigned int shrinker_run_no;
static unsigned long __count_nat_entries(struct f2fs_sb_info *sbi)
{
- long count = NM_I(sbi)->nat_cnt - NM_I(sbi)->dirty_nat_cnt;
-
- return count > 0 ? count : 0;
+ return NM_I(sbi)->nat_cnt[RECLAIMABLE_NAT];
}
static unsigned long __count_free_nids(struct f2fs_sb_info *sbi)
--
2.27.0
From: Trond Myklebust <[email protected]>
[ Upstream commit b6d49ecd1081740b6e632366428b960461f8158b ]
When returning the layout in nfs4_evict_inode(), we need to ensure that
the layout is actually done being freed before we can proceed to free the
inode itself.
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/nfs/nfs4super.c | 2 +-
fs/nfs/pnfs.c | 33 +++++++++++++++++++++++++++++++--
fs/nfs/pnfs.h | 5 +++++
3 files changed, 37 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/nfs4super.c b/fs/nfs/nfs4super.c
index 04c57066a11af..b90642b022eb9 100644
--- a/fs/nfs/nfs4super.c
+++ b/fs/nfs/nfs4super.c
@@ -96,7 +96,7 @@ static void nfs4_evict_inode(struct inode *inode)
nfs_inode_return_delegation_noreclaim(inode);
/* Note that above delegreturn would trigger pnfs return-on-close */
pnfs_return_layout(inode);
- pnfs_destroy_layout(NFS_I(inode));
+ pnfs_destroy_layout_final(NFS_I(inode));
/* First call standard NFS clear_inode() code */
nfs_clear_inode(inode);
}
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 9c2b07ce57b27..9fd115c4d0a2f 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -294,6 +294,7 @@ void
pnfs_put_layout_hdr(struct pnfs_layout_hdr *lo)
{
struct inode *inode;
+ unsigned long i_state;
if (!lo)
return;
@@ -304,8 +305,12 @@ pnfs_put_layout_hdr(struct pnfs_layout_hdr *lo)
if (!list_empty(&lo->plh_segs))
WARN_ONCE(1, "NFS: BUG unfreed layout segments.\n");
pnfs_detach_layout_hdr(lo);
+ i_state = inode->i_state;
spin_unlock(&inode->i_lock);
pnfs_free_layout_hdr(lo);
+ /* Notify pnfs_destroy_layout_final() that we're done */
+ if (i_state & (I_FREEING | I_CLEAR))
+ wake_up_var(lo);
}
}
@@ -723,8 +728,7 @@ pnfs_free_lseg_list(struct list_head *free_me)
}
}
-void
-pnfs_destroy_layout(struct nfs_inode *nfsi)
+static struct pnfs_layout_hdr *__pnfs_destroy_layout(struct nfs_inode *nfsi)
{
struct pnfs_layout_hdr *lo;
LIST_HEAD(tmp_list);
@@ -742,9 +746,34 @@ pnfs_destroy_layout(struct nfs_inode *nfsi)
pnfs_put_layout_hdr(lo);
} else
spin_unlock(&nfsi->vfs_inode.i_lock);
+ return lo;
+}
+
+void pnfs_destroy_layout(struct nfs_inode *nfsi)
+{
+ __pnfs_destroy_layout(nfsi);
}
EXPORT_SYMBOL_GPL(pnfs_destroy_layout);
+static bool pnfs_layout_removed(struct nfs_inode *nfsi,
+ struct pnfs_layout_hdr *lo)
+{
+ bool ret;
+
+ spin_lock(&nfsi->vfs_inode.i_lock);
+ ret = nfsi->layout != lo;
+ spin_unlock(&nfsi->vfs_inode.i_lock);
+ return ret;
+}
+
+void pnfs_destroy_layout_final(struct nfs_inode *nfsi)
+{
+ struct pnfs_layout_hdr *lo = __pnfs_destroy_layout(nfsi);
+
+ if (lo)
+ wait_var_event(lo, pnfs_layout_removed(nfsi, lo));
+}
+
static bool
pnfs_layout_add_bulk_destroy_list(struct inode *inode,
struct list_head *layout_list)
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index f8a38065c7e47..63da33a92d831 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -255,6 +255,7 @@ struct pnfs_layout_segment *pnfs_layout_process(struct nfs4_layoutget *lgp);
void pnfs_layoutget_free(struct nfs4_layoutget *lgp);
void pnfs_free_lseg_list(struct list_head *tmp_list);
void pnfs_destroy_layout(struct nfs_inode *);
+void pnfs_destroy_layout_final(struct nfs_inode *);
void pnfs_destroy_all_layouts(struct nfs_client *);
int pnfs_destroy_layouts_byfsid(struct nfs_client *clp,
struct nfs_fsid *fsid,
@@ -651,6 +652,10 @@ static inline void pnfs_destroy_layout(struct nfs_inode *nfsi)
{
}
+static inline void pnfs_destroy_layout_final(struct nfs_inode *nfsi)
+{
+}
+
static inline struct pnfs_layout_segment *
pnfs_get_lseg(struct pnfs_layout_segment *lseg)
{
--
2.27.0
From: Damien Le Moal <[email protected]>
commit 0ebcdd702f49aeb0ad2e2d894f8c124a0acc6e23 upstream.
For a null_blk device with zoned mode enabled is currently initialized
with a number of zones equal to the device capacity divided by the zone
size, without considering if the device capacity is a multiple of the
zone size. If the zone size is not a divisor of the capacity, the zones
end up not covering the entire capacity, potentially resulting is out
of bounds accesses to the zone array.
Fix this by adding one last smaller zone with a size equal to the
remainder of the disk capacity divided by the zone size if the capacity
is not a multiple of the zone size. For such smaller last zone, the zone
capacity is also checked so that it does not exceed the smaller zone
size.
Reported-by: Naohiro Aota <[email protected]>
Fixes: ca4b2a011948 ("null_blk: add zone support")
Cc: [email protected]
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/block/null_blk_zoned.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
--- a/drivers/block/null_blk_zoned.c
+++ b/drivers/block/null_blk_zoned.c
@@ -2,8 +2,7 @@
#include <linux/vmalloc.h>
#include "null_blk.h"
-/* zone_size in MBs to sectors. */
-#define ZONE_SIZE_SHIFT 11
+#define MB_TO_SECTS(mb) (((sector_t)mb * SZ_1M) >> SECTOR_SHIFT)
static inline unsigned int null_zone_no(struct nullb_device *dev, sector_t sect)
{
@@ -12,7 +11,7 @@ static inline unsigned int null_zone_no(
int null_zone_init(struct nullb_device *dev)
{
- sector_t dev_size = (sector_t)dev->size * 1024 * 1024;
+ sector_t dev_capacity_sects;
sector_t sector = 0;
unsigned int i;
@@ -25,9 +24,12 @@ int null_zone_init(struct nullb_device *
return -EINVAL;
}
- dev->zone_size_sects = dev->zone_size << ZONE_SIZE_SHIFT;
- dev->nr_zones = dev_size >>
- (SECTOR_SHIFT + ilog2(dev->zone_size_sects));
+ dev_capacity_sects = MB_TO_SECTS(dev->size);
+ dev->zone_size_sects = MB_TO_SECTS(dev->zone_size);
+ dev->nr_zones = dev_capacity_sects >> ilog2(dev->zone_size_sects);
+ if (dev_capacity_sects & (dev->zone_size_sects - 1))
+ dev->nr_zones++;
+
dev->zones = kvmalloc_array(dev->nr_zones, sizeof(struct blk_zone),
GFP_KERNEL | __GFP_ZERO);
if (!dev->zones)
@@ -55,7 +57,10 @@ int null_zone_init(struct nullb_device *
struct blk_zone *zone = &dev->zones[i];
zone->start = zone->wp = sector;
- zone->len = dev->zone_size_sects;
+ if (zone->start + dev->zone_size_sects > dev_capacity_sects)
+ zone->len = dev_capacity_sects - zone->start;
+ else
+ zone->len = dev->zone_size_sects;
zone->type = BLK_ZONE_TYPE_SEQWRITE_REQ;
zone->cond = BLK_ZONE_COND_EMPTY;
From: Johan Hovold <[email protected]>
commit 5812b32e01c6d86ba7a84110702b46d8a8531fe9 upstream.
Specify type alignment when declaring linker-section match-table entries
to prevent gcc from increasing alignment and corrupting the various
tables with padding (e.g. timers, irqchips, clocks, reserved memory).
This is specifically needed on x86 where gcc (typically) aligns larger
objects like struct of_device_id with static extent on 32-byte
boundaries which at best prevents matching on anything but the first
entry. Specifying alignment when declaring variables suppresses this
optimisation.
Here's a 64-bit example where all entries are corrupt as 16 bytes of
padding has been inserted before the first entry:
ffffffff8266b4b0 D __clk_of_table
ffffffff8266b4c0 d __of_table_fixed_factor_clk
ffffffff8266b5a0 d __of_table_fixed_clk
ffffffff8266b680 d __clk_of_table_sentinel
And here's a 32-bit example where the 8-byte-aligned table happens to be
placed on a 32-byte boundary so that all but the first entry are corrupt
due to the 28 bytes of padding inserted between entries:
812b3ec0 D __irqchip_of_table
812b3ec0 d __of_table_irqchip1
812b3fa0 d __of_table_irqchip2
812b4080 d __of_table_irqchip3
812b4160 d irqchip_of_match_end
Verified on x86 using gcc-9.3 and gcc-4.9 (which uses 64-byte
alignment), and on arm using gcc-7.2.
Note that there are no in-tree users of these tables on x86 currently
(even if they are included in the image).
Fixes: 54196ccbe0ba ("of: consolidate linker section OF match table declarations")
Fixes: f6e916b82022 ("irqchip: add basic infrastructure")
Cc: stable <[email protected]> # 3.9
Signed-off-by: Johan Hovold <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[ johan: adjust context to 5.4 ]
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/of.h | 1 +
1 file changed, 1 insertion(+)
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -1282,6 +1282,7 @@ static inline int of_get_available_child
#define _OF_DECLARE(table, name, compat, fn, fn_type) \
static const struct of_device_id __of_table_##name \
__used __section(__##table##_of_table) \
+ __aligned(__alignof__(struct of_device_id)) \
= { .compatible = compat, \
.data = (fn == (fn_type)NULL) ? fn : fn }
#else
From: Christophe Leroy <[email protected]>
[ Upstream commit 1891ef21d92c4801ea082ee8ed478e304ddc6749 ]
fls() and fls64() are using __builtin_ctz() and _builtin_ctzll().
On powerpc, those builtins trivially use ctlzw and ctlzd power
instructions.
Allthough those instructions provide the expected result with
input argument 0, __builtin_ctz() and __builtin_ctzll() are
documented as undefined for value 0.
The easiest fix would be to use fls() and fls64() functions
defined in include/asm-generic/bitops/builtin-fls.h and
include/asm-generic/bitops/fls64.h, but GCC output is not optimal:
00000388 <testfls>:
388: 2c 03 00 00 cmpwi r3,0
38c: 41 82 00 10 beq 39c <testfls+0x14>
390: 7c 63 00 34 cntlzw r3,r3
394: 20 63 00 20 subfic r3,r3,32
398: 4e 80 00 20 blr
39c: 38 60 00 00 li r3,0
3a0: 4e 80 00 20 blr
000003b0 <testfls64>:
3b0: 2c 03 00 00 cmpwi r3,0
3b4: 40 82 00 1c bne 3d0 <testfls64+0x20>
3b8: 2f 84 00 00 cmpwi cr7,r4,0
3bc: 38 60 00 00 li r3,0
3c0: 4d 9e 00 20 beqlr cr7
3c4: 7c 83 00 34 cntlzw r3,r4
3c8: 20 63 00 20 subfic r3,r3,32
3cc: 4e 80 00 20 blr
3d0: 7c 63 00 34 cntlzw r3,r3
3d4: 20 63 00 40 subfic r3,r3,64
3d8: 4e 80 00 20 blr
When the input of fls(x) is a constant, just check x for nullity and
return either 0 or __builtin_clz(x). Otherwise, use cntlzw instruction
directly.
For fls64() on PPC64, do the same but with __builtin_clzll() and
cntlzd instruction. On PPC32, lets take the generic fls64() which
will use our fls(). The result is as expected:
00000388 <testfls>:
388: 7c 63 00 34 cntlzw r3,r3
38c: 20 63 00 20 subfic r3,r3,32
390: 4e 80 00 20 blr
000003a0 <testfls64>:
3a0: 2c 03 00 00 cmpwi r3,0
3a4: 40 82 00 10 bne 3b4 <testfls64+0x14>
3a8: 7c 83 00 34 cntlzw r3,r4
3ac: 20 63 00 20 subfic r3,r3,32
3b0: 4e 80 00 20 blr
3b4: 7c 63 00 34 cntlzw r3,r3
3b8: 20 63 00 40 subfic r3,r3,64
3bc: 4e 80 00 20 blr
Fixes: 2fcff790dcb4 ("powerpc: Use builtin functions for fls()/__fls()/fls64()")
Cc: [email protected]
Signed-off-by: Christophe Leroy <[email protected]>
Acked-by: Segher Boessenkool <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/348c2d3f19ffcff8abe50d52513f989c4581d000.1603375524.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <[email protected]>
---
arch/powerpc/include/asm/bitops.h | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/bitops.h b/arch/powerpc/include/asm/bitops.h
index 603aed229af78..46338f2360046 100644
--- a/arch/powerpc/include/asm/bitops.h
+++ b/arch/powerpc/include/asm/bitops.h
@@ -217,15 +217,34 @@ static __inline__ void __clear_bit_unlock(int nr, volatile unsigned long *addr)
*/
static __inline__ int fls(unsigned int x)
{
- return 32 - __builtin_clz(x);
+ int lz;
+
+ if (__builtin_constant_p(x))
+ return x ? 32 - __builtin_clz(x) : 0;
+ asm("cntlzw %0,%1" : "=r" (lz) : "r" (x));
+ return 32 - lz;
}
#include <asm-generic/bitops/builtin-__fls.h>
+/*
+ * 64-bit can do this using one cntlzd (count leading zeroes doubleword)
+ * instruction; for 32-bit we use the generic version, which does two
+ * 32-bit fls calls.
+ */
+#ifdef CONFIG_PPC64
static __inline__ int fls64(__u64 x)
{
- return 64 - __builtin_clzll(x);
+ int lz;
+
+ if (__builtin_constant_p(x))
+ return x ? 64 - __builtin_clzll(x) : 0;
+ asm("cntlzd %0,%1" : "=r" (lz) : "r" (x));
+ return 64 - lz;
}
+#else
+#include <asm-generic/bitops/fls64.h>
+#endif
#ifdef CONFIG_PPC64
unsigned int __arch_hweight8(unsigned int w);
--
2.27.0
From: Randy Dunlap <[email protected]>
commit cb5253198f10a4cd79b7523c581e6173c7d49ddb upstream.
SCSI_CXGB4_ISCSI selects CHELSIO_T4. The latter depends on TLS || TLS=n, so
since 'select' does not check dependencies of the selected symbol,
SCSI_CXGB4_ISCSI should also depend on TLS || TLS=n.
This prevents the following kconfig warning and restricts SCSI_CXGB4_ISCSI
to 'm' whenever TLS=m.
WARNING: unmet direct dependencies detected for CHELSIO_T4
Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_CHELSIO [=y] && PCI [=y] && (IPV6 [=y] || IPV6 [=y]=n) && (TLS [=m] || TLS [=m]=n)
Selected by [y]:
- SCSI_CXGB4_ISCSI [=y] && SCSI_LOWLEVEL [=y] && SCSI [=y] && PCI [=y] && INET [=y] && (IPV6 [=y] || IPV6 [=y]=n) && ETHERNET [=y]
Link: https://lore.kernel.org/r/[email protected]
Fixes: 7b36b6e03b0d ("[SCSI] cxgb4i v5: iscsi driver")
Cc: Karen Xie <[email protected]>
Cc: [email protected]
Cc: "James E.J. Bottomley" <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/cxgbi/cxgb4i/Kconfig | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/scsi/cxgbi/cxgb4i/Kconfig
+++ b/drivers/scsi/cxgbi/cxgb4i/Kconfig
@@ -4,6 +4,7 @@ config SCSI_CXGB4_ISCSI
depends on PCI && INET && (IPV6 || IPV6=n)
depends on THERMAL || !THERMAL
depends on ETHERNET
+ depends on TLS || TLS=n
select NET_VENDOR_CHELSIO
select CHELSIO_T4
select CHELSIO_LIB
From: Takashi Iwai <[email protected]>
commit 4ebd47037027c4beae99680bff3b20fdee5d7c1e upstream.
The snd_seq_queue struct contains various flags in the bit fields.
Those are categorized to two different use cases, both of which are
protected by different spinlocks. That implies that there are still
potential risks of the bad operations for bit fields by concurrent
accesses.
For addressing the problem, this patch rearranges those flags to be
a standard bool instead of a bit field.
Reported-by: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/core/seq/seq_queue.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/sound/core/seq/seq_queue.h
+++ b/sound/core/seq/seq_queue.h
@@ -26,10 +26,10 @@ struct snd_seq_queue {
struct snd_seq_timer *timer; /* time keeper for this queue */
int owner; /* client that 'owns' the timer */
- unsigned int locked:1, /* timer is only accesibble by owner if set */
- klocked:1, /* kernel lock (after START) */
- check_again:1,
- check_blocked:1;
+ bool locked; /* timer is only accesibble by owner if set */
+ bool klocked; /* kernel lock (after START) */
+ bool check_again; /* concurrent access happened during check */
+ bool check_blocked; /* queue being checked */
unsigned int flags; /* status flags */
unsigned int info_flags; /* info for sync */
From: Randy Dunlap <[email protected]>
commit dc889b8d4a8122549feabe99eead04e6b23b6513 upstream.
Make the printk() [bfs "printf" macro] seem less severe by changing
"WARNING:" to "NOTE:".
<asm-generic/bug.h> warns us about using WARNING or BUG in a format string
other than in WARN() or BUG() family macros. bfs/inode.c is doing just
that in a normal printk() call, so change the "WARNING" string to be
"NOTE".
Link: https://lkml.kernel.org/r/[email protected]
Reported-by: [email protected]
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Tigran A. Aivazian" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/bfs/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -351,7 +351,7 @@ static int bfs_fill_super(struct super_b
info->si_lasti = (le32_to_cpu(bfs_sb->s_start) - BFS_BSIZE) / sizeof(struct bfs_inode) + BFS_ROOT_INO - 1;
if (info->si_lasti == BFS_MAX_LASTI)
- printf("WARNING: filesystem %s was created with 512 inodes, the real maximum is 511, mounting anyway\n", s->s_id);
+ printf("NOTE: filesystem %s was created with 512 inodes, the real maximum is 511, mounting anyway\n", s->s_id);
else if (info->si_lasti > BFS_MAX_LASTI) {
printf("Impossible last inode number %lu > %d on %s\n", info->si_lasti, BFS_MAX_LASTI, s->s_id);
goto out1;
From: Arnaldo Carvalho de Melo <[email protected]>
commit 7ddcdea5b54492f54700f427f58690cf1e187e5e upstream.
To pick up the changes in:
a85cbe6159ffc973 ("uapi: move constants from <linux/kernel.h> to <linux/const.h>")
That causes no changes in tooling, just addresses this perf build
warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/const.h' differs from latest version at 'include/uapi/linux/const.h'
diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Petr Vorel <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
tools/include/uapi/linux/const.h | 5 +++++
1 file changed, 5 insertions(+)
--- a/tools/include/uapi/linux/const.h
+++ b/tools/include/uapi/linux/const.h
@@ -28,4 +28,9 @@
#define _BITUL(x) (_UL(1) << (x))
#define _BITULL(x) (_ULL(1) << (x))
+#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
+#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
+
+#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
+
#endif /* _UAPI_LINUX_CONST_H */
From: Mauro Carvalho Chehab <[email protected]>
commit d0ac1a26ed5943127cb0156148735f5f52a07075 upstream.
As reported on:
https://lore.kernel.org/linux-media/[email protected]/
if gp8psk_usb_in_op() returns an error, the status var is not
initialized. Yet, this var is used later on, in order to
identify:
- if the device was already started;
- if firmware has loaded;
- if the LNBf was powered on.
Using status = 0 seems to ensure that everything will be
properly powered up.
So, instead of the proposed solution, let's just set
status = 0.
Reported-by: syzbot <[email protected]>
Reported-by: Willem de Bruijn <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/media/usb/dvb-usb/gp8psk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/usb/dvb-usb/gp8psk.c
+++ b/drivers/media/usb/dvb-usb/gp8psk.c
@@ -182,7 +182,7 @@ out_rel_fw:
static int gp8psk_power_ctrl(struct dvb_usb_device *d, int onoff)
{
- u8 status, buf;
+ u8 status = 0, buf;
int gp_product_id = le16_to_cpu(d->udev->descriptor.idProduct);
if (onoff) {
From: Takashi Iwai <[email protected]>
commit 88a06d6fd6b369d88cec46c62db3e2604a2f50d5 upstream.
The runtime->avail field may be accessed concurrently while some
places refer to it without taking the runtime->lock spinlock, as
detected by KCSAN. Usually this isn't a big problem, but for
consistency and safety, we should take the spinlock at each place
referencing this field.
Reported-by: [email protected]
Reported-by: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/core/rawmidi.c | 49 +++++++++++++++++++++++++++++++++++--------------
1 file changed, 35 insertions(+), 14 deletions(-)
--- a/sound/core/rawmidi.c
+++ b/sound/core/rawmidi.c
@@ -72,11 +72,21 @@ static inline unsigned short snd_rawmidi
}
}
-static inline int snd_rawmidi_ready(struct snd_rawmidi_substream *substream)
+static inline bool __snd_rawmidi_ready(struct snd_rawmidi_runtime *runtime)
+{
+ return runtime->avail >= runtime->avail_min;
+}
+
+static bool snd_rawmidi_ready(struct snd_rawmidi_substream *substream)
{
struct snd_rawmidi_runtime *runtime = substream->runtime;
+ unsigned long flags;
+ bool ready;
- return runtime->avail >= runtime->avail_min;
+ spin_lock_irqsave(&runtime->lock, flags);
+ ready = __snd_rawmidi_ready(runtime);
+ spin_unlock_irqrestore(&runtime->lock, flags);
+ return ready;
}
static inline int snd_rawmidi_ready_append(struct snd_rawmidi_substream *substream,
@@ -945,7 +955,7 @@ int snd_rawmidi_receive(struct snd_rawmi
if (result > 0) {
if (runtime->event)
schedule_work(&runtime->event_work);
- else if (snd_rawmidi_ready(substream))
+ else if (__snd_rawmidi_ready(runtime))
wake_up(&runtime->sleep);
}
spin_unlock_irqrestore(&runtime->lock, flags);
@@ -1024,7 +1034,7 @@ static ssize_t snd_rawmidi_read(struct f
result = 0;
while (count > 0) {
spin_lock_irq(&runtime->lock);
- while (!snd_rawmidi_ready(substream)) {
+ while (!__snd_rawmidi_ready(runtime)) {
wait_queue_entry_t wait;
if ((file->f_flags & O_NONBLOCK) != 0 || result > 0) {
@@ -1041,9 +1051,11 @@ static ssize_t snd_rawmidi_read(struct f
return -ENODEV;
if (signal_pending(current))
return result > 0 ? result : -ERESTARTSYS;
- if (!runtime->avail)
- return result > 0 ? result : -EIO;
spin_lock_irq(&runtime->lock);
+ if (!runtime->avail) {
+ spin_unlock_irq(&runtime->lock);
+ return result > 0 ? result : -EIO;
+ }
}
spin_unlock_irq(&runtime->lock);
count1 = snd_rawmidi_kernel_read1(substream,
@@ -1181,7 +1193,7 @@ int __snd_rawmidi_transmit_ack(struct sn
runtime->avail += count;
substream->bytes += count;
if (count > 0) {
- if (runtime->drain || snd_rawmidi_ready(substream))
+ if (runtime->drain || __snd_rawmidi_ready(runtime))
wake_up(&runtime->sleep);
}
return count;
@@ -1370,9 +1382,11 @@ static ssize_t snd_rawmidi_write(struct
return -ENODEV;
if (signal_pending(current))
return result > 0 ? result : -ERESTARTSYS;
- if (!runtime->avail && !timeout)
- return result > 0 ? result : -EIO;
spin_lock_irq(&runtime->lock);
+ if (!runtime->avail && !timeout) {
+ spin_unlock_irq(&runtime->lock);
+ return result > 0 ? result : -EIO;
+ }
}
spin_unlock_irq(&runtime->lock);
count1 = snd_rawmidi_kernel_write1(substream, buf, NULL, count);
@@ -1452,6 +1466,7 @@ static void snd_rawmidi_proc_info_read(s
struct snd_rawmidi *rmidi;
struct snd_rawmidi_substream *substream;
struct snd_rawmidi_runtime *runtime;
+ unsigned long buffer_size, avail, xruns;
rmidi = entry->private_data;
snd_iprintf(buffer, "%s\n\n", rmidi->name);
@@ -1470,13 +1485,16 @@ static void snd_rawmidi_proc_info_read(s
" Owner PID : %d\n",
pid_vnr(substream->pid));
runtime = substream->runtime;
+ spin_lock_irq(&runtime->lock);
+ buffer_size = runtime->buffer_size;
+ avail = runtime->avail;
+ spin_unlock_irq(&runtime->lock);
snd_iprintf(buffer,
" Mode : %s\n"
" Buffer size : %lu\n"
" Avail : %lu\n",
runtime->oss ? "OSS compatible" : "native",
- (unsigned long) runtime->buffer_size,
- (unsigned long) runtime->avail);
+ buffer_size, avail);
}
}
}
@@ -1494,13 +1512,16 @@ static void snd_rawmidi_proc_info_read(s
" Owner PID : %d\n",
pid_vnr(substream->pid));
runtime = substream->runtime;
+ spin_lock_irq(&runtime->lock);
+ buffer_size = runtime->buffer_size;
+ avail = runtime->avail;
+ xruns = runtime->xruns;
+ spin_unlock_irq(&runtime->lock);
snd_iprintf(buffer,
" Buffer size : %lu\n"
" Avail : %lu\n"
" Overruns : %lu\n",
- (unsigned long) runtime->buffer_size,
- (unsigned long) runtime->avail,
- (unsigned long) runtime->xruns);
+ buffer_size, avail, xruns);
}
}
}
From: Jamie Iles <[email protected]>
[ Upstream commit a61df3c413e49b0042f9caf774c58512d1cc71b7 ]
syzkaller found the following JFFS2 splat:
Unable to handle kernel paging request at virtual address dfffa00000000001
Mem abort info:
ESR = 0x96000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
[dfffa00000000001] address between user and kernel address ranges
Internal error: Oops: 96000004 [#1] SMP
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 12745 Comm: syz-executor.5 Tainted: G S 5.9.0-rc8+ #98
Hardware name: linux,dummy-virt (DT)
pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
pc : jffs2_parse_param+0x138/0x308 fs/jffs2/super.c:206
lr : jffs2_parse_param+0x108/0x308 fs/jffs2/super.c:205
sp : ffff000022a57910
x29: ffff000022a57910 x28: 0000000000000000
x27: ffff000057634008 x26: 000000000000d800
x25: 000000000000d800 x24: ffff0000271a9000
x23: ffffa0001adb5dc0 x22: ffff000023fdcf00
x21: 1fffe0000454af2c x20: ffff000024cc9400
x19: 0000000000000000 x18: 0000000000000000
x17: 0000000000000000 x16: ffffa000102dbdd0
x15: 0000000000000000 x14: ffffa000109e44bc
x13: ffffa00010a3a26c x12: ffff80000476e0b3
x11: 1fffe0000476e0b2 x10: ffff80000476e0b2
x9 : ffffa00010a3ad60 x8 : ffff000023b70593
x7 : 0000000000000003 x6 : 00000000f1f1f1f1
x5 : ffff000023fdcf00 x4 : 0000000000000002
x3 : ffffa00010000000 x2 : 0000000000000001
x1 : dfffa00000000000 x0 : 0000000000000008
Call trace:
jffs2_parse_param+0x138/0x308 fs/jffs2/super.c:206
vfs_parse_fs_param+0x234/0x4e8 fs/fs_context.c:117
vfs_parse_fs_string+0xe8/0x148 fs/fs_context.c:161
generic_parse_monolithic+0x17c/0x208 fs/fs_context.c:201
parse_monolithic_mount_data+0x7c/0xa8 fs/fs_context.c:649
do_new_mount fs/namespace.c:2871 [inline]
path_mount+0x548/0x1da8 fs/namespace.c:3192
do_mount+0x124/0x138 fs/namespace.c:3205
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount fs/namespace.c:3390 [inline]
__arm64_sys_mount+0x164/0x238 fs/namespace.c:3390
__invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
el0_svc_common.constprop.0+0x15c/0x598 arch/arm64/kernel/syscall.c:149
do_el0_svc+0x60/0x150 arch/arm64/kernel/syscall.c:195
el0_svc+0x34/0xb0 arch/arm64/kernel/entry-common.c:226
el0_sync_handler+0xc8/0x5b4 arch/arm64/kernel/entry-common.c:236
el0_sync+0x15c/0x180 arch/arm64/kernel/entry.S:663
Code: d2d40001 f2fbffe1 91002260 d343fc02 (38e16841)
---[ end trace 4edf690313deda44 ]---
This is because since ec10a24f10c8, the option parsing happens before
fill_super and so the MTD device isn't associated with the filesystem.
Defer the size check until there is a valid association.
Fixes: ec10a24f10c8 ("vfs: Convert jffs2 to use the new mount API")
Cc: <[email protected]>
Cc: David Howells <[email protected]>
Signed-off-by: Jamie Iles <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/jffs2/super.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
index 68ce77cbeed3b..6839a61e8ff1e 100644
--- a/fs/jffs2/super.c
+++ b/fs/jffs2/super.c
@@ -208,12 +208,8 @@ static int jffs2_parse_param(struct fs_context *fc, struct fs_parameter *param)
case Opt_rp_size:
if (result.uint_32 > UINT_MAX / 1024)
return invalf(fc, "jffs2: rp_size unrepresentable");
- opt = result.uint_32 * 1024;
- if (opt > c->mtd->size)
- return invalf(fc, "jffs2: Too large reserve pool specified, max is %llu KB",
- c->mtd->size / 1024);
+ c->mount_opts.rp_size = result.uint_32 * 1024;
c->mount_opts.set_rp_size = true;
- c->mount_opts.rp_size = opt;
break;
default:
return -EINVAL;
@@ -275,6 +271,10 @@ static int jffs2_fill_super(struct super_block *sb, struct fs_context *fc)
c->mtd = sb->s_mtd;
c->os_priv = sb;
+ if (c->mount_opts.rp_size > c->mtd->size)
+ return invalf(fc, "jffs2: Too large reserve pool specified, max is %llu KB",
+ c->mtd->size / 1024);
+
/* Initialize JFFS2 superblock locks, the further initialization will
* be done later */
mutex_init(&c->alloc_sem);
--
2.27.0
From: Anant Thazhemadam <[email protected]>
commit 31dcb6c30a26d32650ce134820f27de3c675a45a upstream.
A kernel-infoleak was reported by syzbot, which was caused because
dbells was left uninitialized.
Using kzalloc() instead of kmalloc() fixes this issue.
Reported-by: [email protected]
Tested-by: [email protected]
Signed-off-by: Anant Thazhemadam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/misc/vmw_vmci/vmci_context.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/misc/vmw_vmci/vmci_context.c
+++ b/drivers/misc/vmw_vmci/vmci_context.c
@@ -743,7 +743,7 @@ static int vmci_ctx_get_chkpt_doorbells(
return VMCI_ERROR_MORE_DATA;
}
- dbells = kmalloc(data_size, GFP_ATOMIC);
+ dbells = kzalloc(data_size, GFP_ATOMIC);
if (!dbells)
return VMCI_ERROR_NO_MEM;
From: Eric Biggers <[email protected]>
commit 75d18cd1868c2aee43553723872c35d7908f240f upstream.
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on ext4 by rejecting no-key dentries in ext4_add_entry().
Note that the duplicate check in ext4_find_dest_de() sometimes prevented
this bug. However in many cases it didn't, since ext4_find_dest_de()
doesn't examine every dentry.
Fixes: 4461471107b7 ("ext4 crypto: enable filename encryption")
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/ext4/namei.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2192,6 +2192,9 @@ static int ext4_add_entry(handle_t *hand
if (!dentry->d_name.len)
return -EINVAL;
+ if (fscrypt_is_nokey_name(dentry))
+ return -ENOKEY;
+
#ifdef CONFIG_UNICODE
if (ext4_has_strict_mode(sbi) && IS_CASEFOLDED(dir) &&
sbi->s_encoding && utf8_validate(sbi->s_encoding, &dentry->d_name))
From: Chao Yu <[email protected]>
commit e584bbe821229a3e7cc409eecd51df66f9268c21 upstream.
syzbot reported a bug which could cause shift-out-of-bounds issue,
fix it.
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
sanity_check_raw_super fs/f2fs/super.c:2812 [inline]
read_raw_super_block fs/f2fs/super.c:3267 [inline]
f2fs_fill_super.cold+0x16c9/0x16f6 fs/f2fs/super.c:3519
mount_bdev+0x34d/0x410 fs/super.c:1366
legacy_get_tree+0x105/0x220 fs/fs_context.c:592
vfs_get_tree+0x89/0x2f0 fs/super.c:1496
do_new_mount fs/namespace.c:2896 [inline]
path_mount+0x12ae/0x1e70 fs/namespace.c:3227
do_mount fs/namespace.c:3240 [inline]
__do_sys_mount fs/namespace.c:3448 [inline]
__se_sys_mount fs/namespace.c:3425 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3425
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported-by: [email protected]
Signed-off-by: Anant Thazhemadam <[email protected]>
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/f2fs/super.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2523,7 +2523,6 @@ static int sanity_check_raw_super(struct
block_t total_sections, blocks_per_seg;
struct f2fs_super_block *raw_super = (struct f2fs_super_block *)
(bh->b_data + F2FS_SUPER_OFFSET);
- unsigned int blocksize;
size_t crc_offset = 0;
__u32 crc = 0;
@@ -2557,10 +2556,10 @@ static int sanity_check_raw_super(struct
}
/* Currently, support only 4KB block size */
- blocksize = 1 << le32_to_cpu(raw_super->log_blocksize);
- if (blocksize != F2FS_BLKSIZE) {
- f2fs_info(sbi, "Invalid blocksize (%u), supports only 4KB",
- blocksize);
+ if (le32_to_cpu(raw_super->log_blocksize) != F2FS_BLKSIZE_BITS) {
+ f2fs_info(sbi, "Invalid log_blocksize (%u), supports only %u",
+ le32_to_cpu(raw_super->log_blocksize),
+ F2FS_BLKSIZE_BITS);
return -EFSCORRUPTED;
}
From: Eric Auger <[email protected]>
[ Upstream commit 16b8fe4caf499ae8e12d2ab1b1324497e36a7b83 ]
In case an error occurs in vfio_pci_enable() before the call to
vfio_pci_probe_mmaps(), vfio_pci_disable() will try to iterate
on an uninitialized list and cause a kernel panic.
Lets move to the initialization to vfio_pci_probe() to fix the
issue.
Signed-off-by: Eric Auger <[email protected]>
Fixes: 05f0c03fbac1 ("vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive")
CC: Stable <[email protected]> # v4.7+
Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/vfio/pci/vfio_pci.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 632653cd70e3b..2372e161cd5e8 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -114,8 +114,6 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_device *vdev)
int bar;
struct vfio_pci_dummy_resource *dummy_res;
- INIT_LIST_HEAD(&vdev->dummy_resources_list);
-
for (bar = PCI_STD_RESOURCES; bar <= PCI_STD_RESOURCE_END; bar++) {
res = vdev->pdev->resource + bar;
@@ -1606,6 +1604,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
mutex_init(&vdev->igate);
spin_lock_init(&vdev->irqlock);
mutex_init(&vdev->ioeventfds_lock);
+ INIT_LIST_HEAD(&vdev->dummy_resources_list);
INIT_LIST_HEAD(&vdev->ioeventfds_list);
mutex_init(&vdev->vma_lock);
INIT_LIST_HEAD(&vdev->vma_list);
--
2.27.0
From: Eric Biggers <[email protected]>
commit 159e1de201b6fca10bfec50405a3b53a561096a8 upstream.
It's possible to create a duplicate filename in an encrypted directory
by creating a file concurrently with adding the encryption key.
Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or
sys_symlink()) can lookup the target filename while the directory's
encryption key hasn't been added yet, resulting in a negative no-key
dentry. The VFS then calls ->create() (or ->mkdir(), ->mknod(), or
->symlink()) because the dentry is negative. Normally, ->create() would
return -ENOKEY due to the directory's key being unavailable. However,
if the key was added between the dentry lookup and ->create(), then the
filesystem will go ahead and try to create the file.
If the target filename happens to already exist as a normal name (not a
no-key name), a duplicate filename may be added to the directory.
In order to fix this, we need to fix the filesystems to prevent
->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names.
(->rename() and ->link() need it too, but those are already handled
correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().)
In preparation for this, add a helper function fscrypt_is_nokey_name()
that filesystems can use to do this check. Use this helper function for
the existing checks that fs/crypto/ does for rename and link.
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/crypto/hooks.c | 10 +++++-----
include/linux/fscrypt.h | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 39 insertions(+), 5 deletions(-)
--- a/fs/crypto/hooks.c
+++ b/fs/crypto/hooks.c
@@ -58,8 +58,8 @@ int __fscrypt_prepare_link(struct inode
if (err)
return err;
- /* ... in case we looked up ciphertext name before key was added */
- if (dentry->d_flags & DCACHE_ENCRYPTED_NAME)
+ /* ... in case we looked up no-key name before key was added */
+ if (fscrypt_is_nokey_name(dentry))
return -ENOKEY;
if (!fscrypt_has_permitted_context(dir, inode))
@@ -83,9 +83,9 @@ int __fscrypt_prepare_rename(struct inod
if (err)
return err;
- /* ... in case we looked up ciphertext name(s) before key was added */
- if ((old_dentry->d_flags | new_dentry->d_flags) &
- DCACHE_ENCRYPTED_NAME)
+ /* ... in case we looked up no-key name(s) before key was added */
+ if (fscrypt_is_nokey_name(old_dentry) ||
+ fscrypt_is_nokey_name(new_dentry))
return -ENOKEY;
if (old_dir != new_dir) {
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -100,6 +100,35 @@ static inline void fscrypt_handle_d_move
dentry->d_flags &= ~DCACHE_ENCRYPTED_NAME;
}
+/**
+ * fscrypt_is_nokey_name() - test whether a dentry is a no-key name
+ * @dentry: the dentry to check
+ *
+ * This returns true if the dentry is a no-key dentry. A no-key dentry is a
+ * dentry that was created in an encrypted directory that hasn't had its
+ * encryption key added yet. Such dentries may be either positive or negative.
+ *
+ * When a filesystem is asked to create a new filename in an encrypted directory
+ * and the new filename's dentry is a no-key dentry, it must fail the operation
+ * with ENOKEY. This includes ->create(), ->mkdir(), ->mknod(), ->symlink(),
+ * ->rename(), and ->link(). (However, ->rename() and ->link() are already
+ * handled by fscrypt_prepare_rename() and fscrypt_prepare_link().)
+ *
+ * This is necessary because creating a filename requires the directory's
+ * encryption key, but just checking for the key on the directory inode during
+ * the final filesystem operation doesn't guarantee that the key was available
+ * during the preceding dentry lookup. And the key must have already been
+ * available during the dentry lookup in order for it to have been checked
+ * whether the filename already exists in the directory and for the new file's
+ * dentry not to be invalidated due to it incorrectly having the no-key flag.
+ *
+ * Return: %true if the dentry is a no-key name
+ */
+static inline bool fscrypt_is_nokey_name(const struct dentry *dentry)
+{
+ return dentry->d_flags & DCACHE_ENCRYPTED_NAME;
+}
+
/* crypto.c */
extern void fscrypt_enqueue_decrypt_work(struct work_struct *);
extern struct fscrypt_ctx *fscrypt_get_ctx(gfp_t);
@@ -290,6 +319,11 @@ static inline void fscrypt_handle_d_move
{
}
+static inline bool fscrypt_is_nokey_name(const struct dentry *dentry)
+{
+ return false;
+}
+
/* crypto.c */
static inline void fscrypt_enqueue_decrypt_work(struct work_struct *work)
{
From: Zhuguangqing <[email protected]>
commit 236761f19a4f373354f1dcf399b57753f1f4b871 upstream.
If state has not changed successfully and we updated cpufreq_state,
next time when the new state is equal to cpufreq_state (not changed
successfully last time), we will return directly and miss a
freq_qos_update_request() that should have been.
Fixes: 5130802ddbb1 ("thermal: cpu_cooling: Switch to QoS requests for freq limits")
Cc: v5.4+ <[email protected]> # v5.4+
Signed-off-by: Zhuguangqing <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/thermal/cpu_cooling.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -320,6 +320,7 @@ static int cpufreq_set_cur_state(struct
unsigned long state)
{
struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+ int ret;
/* Request state should be less than max_level */
if (WARN_ON(state > cpufreq_cdev->max_level))
@@ -329,10 +330,12 @@ static int cpufreq_set_cur_state(struct
if (cpufreq_cdev->cpufreq_state == state)
return 0;
- cpufreq_cdev->cpufreq_state = state;
+ ret = freq_qos_update_request(&cpufreq_cdev->qos_req,
+ cpufreq_cdev->freq_table[state].frequency);
+ if (ret > 0)
+ cpufreq_cdev->cpufreq_state = state;
- return freq_qos_update_request(&cpufreq_cdev->qos_req,
- cpufreq_cdev->freq_table[state].frequency);
+ return ret;
}
/**
From: Filipe Manana <[email protected]>
[ Upstream commit 7f458a3873ae94efe1f37c8b96c97e7298769e98 ]
When defragmenting we skip ranges that have holes or inline extents, so that
we don't do unnecessary IO and waste space. We do this check when calling
should_defrag_range() at btrfs_defrag_file(). However we do it without
holding the inode's lock. The reason we do it like this is to avoid
blocking other tasks for too long, that possibly want to operate on other
file ranges, since after the call to should_defrag_range() and before
locking the inode, we trigger a synchronous page cache readahead. However
before we were able to lock the inode, some other task might have punched
a hole in our range, or we may now have an inline extent there, in which
case we should not set the range for defrag anymore since that would cause
unnecessary IO and make us waste space (i.e. allocating extents to contain
zeros for a hole).
So after we locked the inode and the range in the iotree, check again if
we have holes or an inline extent, and if we do, just skip the range.
I hit this while testing my next patch that fixes races when updating an
inode's number of bytes (subject "btrfs: update the number of bytes used
by an inode atomically"), and it depends on this change in order to work
correctly. Alternatively I could rework that other patch to detect holes
and flag their range with the 'new delalloc' bit, but this itself fixes
an efficiency problem due a race that from a functional point of view is
not harmful (it could be triggered with btrfs/062 from fstests).
CC: [email protected] # 5.4+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/btrfs/ioctl.c | 39 +++++++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f58e03d1775d8..8ed71b3b25466 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1256,6 +1256,7 @@ static int cluster_pages_for_defrag(struct inode *inode,
u64 page_end;
u64 page_cnt;
u64 start = (u64)start_index << PAGE_SHIFT;
+ u64 search_start;
int ret;
int i;
int i_done;
@@ -1352,6 +1353,40 @@ static int cluster_pages_for_defrag(struct inode *inode,
lock_extent_bits(&BTRFS_I(inode)->io_tree,
page_start, page_end - 1, &cached_state);
+
+ /*
+ * When defragmenting we skip ranges that have holes or inline extents,
+ * (check should_defrag_range()), to avoid unnecessary IO and wasting
+ * space. At btrfs_defrag_file(), we check if a range should be defragged
+ * before locking the inode and then, if it should, we trigger a sync
+ * page cache readahead - we lock the inode only after that to avoid
+ * blocking for too long other tasks that possibly want to operate on
+ * other file ranges. But before we were able to get the inode lock,
+ * some other task may have punched a hole in the range, or we may have
+ * now an inline extent, in which case we should not defrag. So check
+ * for that here, where we have the inode and the range locked, and bail
+ * out if that happened.
+ */
+ search_start = page_start;
+ while (search_start < page_end) {
+ struct extent_map *em;
+
+ em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, search_start,
+ page_end - search_start, 0);
+ if (IS_ERR(em)) {
+ ret = PTR_ERR(em);
+ goto out_unlock_range;
+ }
+ if (em->block_start >= EXTENT_MAP_LAST_BYTE) {
+ free_extent_map(em);
+ /* Ok, 0 means we did not defrag anything */
+ ret = 0;
+ goto out_unlock_range;
+ }
+ search_start = extent_map_end(em);
+ free_extent_map(em);
+ }
+
clear_extent_bit(&BTRFS_I(inode)->io_tree, page_start,
page_end - 1, EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING |
EXTENT_DEFRAG, 0, 0, &cached_state);
@@ -1382,6 +1417,10 @@ static int cluster_pages_for_defrag(struct inode *inode,
btrfs_delalloc_release_extents(BTRFS_I(inode), page_cnt << PAGE_SHIFT);
extent_changeset_free(data_reserved);
return i_done;
+
+out_unlock_range:
+ unlock_extent_cached(&BTRFS_I(inode)->io_tree,
+ page_start, page_end - 1, &cached_state);
out:
for (i = 0; i < i_done; i++) {
unlock_page(pages[i]);
--
2.27.0
From: Jan Kara <[email protected]>
[ Upstream commit b08070eca9e247f60ab39d79b2c25d274750441f ]
ext4_handle_error() with errors=continue mount option can accidentally
remount the filesystem read-only when the system is rebooting. Fix that.
Fixes: 1dc1097ff60e ("ext4: avoid panic during forced reboot")
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/ext4/super.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 920658ca8777d..06568467b0c27 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -455,19 +455,17 @@ static bool system_going_down(void)
static void ext4_handle_error(struct super_block *sb)
{
+ journal_t *journal = EXT4_SB(sb)->s_journal;
+
if (test_opt(sb, WARN_ON_ERROR))
WARN_ON_ONCE(1);
- if (sb_rdonly(sb))
+ if (sb_rdonly(sb) || test_opt(sb, ERRORS_CONT))
return;
- if (!test_opt(sb, ERRORS_CONT)) {
- journal_t *journal = EXT4_SB(sb)->s_journal;
-
- EXT4_SB(sb)->s_mount_flags |= EXT4_MF_FS_ABORTED;
- if (journal)
- jbd2_journal_abort(journal, -EIO);
- }
+ EXT4_SB(sb)->s_mount_flags |= EXT4_MF_FS_ABORTED;
+ if (journal)
+ jbd2_journal_abort(journal, -EIO);
/*
* We force ERRORS_RO behavior when system is rebooting. Otherwise we
* could panic during 'reboot -f' as the underlying device got already
--
2.27.0
From: Eric Biggers <[email protected]>
commit 76786a0f083473de31678bdb259a3d4167cf756d upstream.
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on ubifs by rejecting no-key dentries in ubifs_create(),
ubifs_mkdir(), ubifs_mknod(), and ubifs_symlink().
Note that ubifs doesn't actually report the duplicate filenames from
readdir, but rather it seems to replace the original dentry with a new
one (which is still wrong, just a different effect from ext4).
On ubifs, this fixes xfstest generic/595 as well as the new xfstest I
wrote specifically for this bug.
Fixes: f4f61d2cc6d8 ("ubifs: Implement encrypted filenames")
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/ubifs/dir.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -278,6 +278,15 @@ done:
return d_splice_alias(inode, dentry);
}
+static int ubifs_prepare_create(struct inode *dir, struct dentry *dentry,
+ struct fscrypt_name *nm)
+{
+ if (fscrypt_is_nokey_name(dentry))
+ return -ENOKEY;
+
+ return fscrypt_setup_filename(dir, &dentry->d_name, 0, nm);
+}
+
static int ubifs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
bool excl)
{
@@ -301,7 +310,7 @@ static int ubifs_create(struct inode *di
if (err)
return err;
- err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm);
+ err = ubifs_prepare_create(dir, dentry, &nm);
if (err)
goto out_budg;
@@ -961,7 +970,7 @@ static int ubifs_mkdir(struct inode *dir
if (err)
return err;
- err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm);
+ err = ubifs_prepare_create(dir, dentry, &nm);
if (err)
goto out_budg;
@@ -1046,7 +1055,7 @@ static int ubifs_mknod(struct inode *dir
return err;
}
- err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm);
+ err = ubifs_prepare_create(dir, dentry, &nm);
if (err) {
kfree(dev);
goto out_budg;
@@ -1130,7 +1139,7 @@ static int ubifs_symlink(struct inode *d
if (err)
return err;
- err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm);
+ err = ubifs_prepare_create(dir, dentry, &nm);
if (err)
goto out_budg;
From: Qinglang Miao <[email protected]>
[ Upstream commit ffa1797040c5da391859a9556be7b735acbe1242 ]
I noticed that iounmap() of msgr_block_addr before return from
mpic_msgr_probe() in the error handling case is missing. So use
devm_ioremap() instead of just ioremap() when remapping the message
register block, so the mapping will be automatically released on
probe failure.
Signed-off-by: Qinglang Miao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/powerpc/sysdev/mpic_msgr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/sysdev/mpic_msgr.c b/arch/powerpc/sysdev/mpic_msgr.c
index f6b253e2be409..36ec0bdd8b63c 100644
--- a/arch/powerpc/sysdev/mpic_msgr.c
+++ b/arch/powerpc/sysdev/mpic_msgr.c
@@ -191,7 +191,7 @@ static int mpic_msgr_probe(struct platform_device *dev)
/* IO map the message register block. */
of_address_to_resource(np, 0, &rsrc);
- msgr_block_addr = ioremap(rsrc.start, resource_size(&rsrc));
+ msgr_block_addr = devm_ioremap(&dev->dev, rsrc.start, resource_size(&rsrc));
if (!msgr_block_addr) {
dev_err(&dev->dev, "Failed to iomap MPIC message registers");
return -EFAULT;
--
2.27.0
From: Eric Biggers <[email protected]>
[ Upstream commit edf7ddbf1c5eb98b720b063b73e20e8a4a1ce673 ]
Missing calls to mntget() (or equivalently, too many calls to mntput())
are hard to detect because mntput() delays freeing mounts using
task_work_add(), then again using call_rcu(). As a result, mnt_count
can often be decremented to -1 without getting a KASAN use-after-free
report. Such cases are still bugs though, and they point to real
use-after-frees being possible.
For an example of this, see the bug fixed by commit 1b0b9cc8d379
("vfs: fsmount: add missing mntget()"), discussed at
https://lkml.kernel.org/linux-fsdevel/20190605135401.GB30925@xxxxxxxxxxxxxxxxxxxxxxxxx/T/#u.
This bug *should* have been trivial to find. But actually, it wasn't
found until syzkaller happened to use fchdir() to manipulate the
reference count just right for the bug to be noticeable.
Address this by making mntput_no_expire() issue a WARN if mnt_count has
become negative.
Suggested-by: Miklos Szeredi <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/namespace.c | 9 ++++++---
fs/pnode.h | 2 +-
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index 2adfe7b166a3e..76ea92994d26d 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -156,10 +156,10 @@ static inline void mnt_add_count(struct mount *mnt, int n)
/*
* vfsmount lock must be held for write
*/
-unsigned int mnt_get_count(struct mount *mnt)
+int mnt_get_count(struct mount *mnt)
{
#ifdef CONFIG_SMP
- unsigned int count = 0;
+ int count = 0;
int cpu;
for_each_possible_cpu(cpu) {
@@ -1123,6 +1123,7 @@ static DECLARE_DELAYED_WORK(delayed_mntput_work, delayed_mntput);
static void mntput_no_expire(struct mount *mnt)
{
LIST_HEAD(list);
+ int count;
rcu_read_lock();
if (likely(READ_ONCE(mnt->mnt_ns))) {
@@ -1146,7 +1147,9 @@ static void mntput_no_expire(struct mount *mnt)
*/
smp_mb();
mnt_add_count(mnt, -1);
- if (mnt_get_count(mnt)) {
+ count = mnt_get_count(mnt);
+ if (count != 0) {
+ WARN_ON(count < 0);
rcu_read_unlock();
unlock_mount_hash();
return;
diff --git a/fs/pnode.h b/fs/pnode.h
index 49a058c73e4c7..26f74e092bd98 100644
--- a/fs/pnode.h
+++ b/fs/pnode.h
@@ -44,7 +44,7 @@ int propagate_mount_busy(struct mount *, int);
void propagate_mount_unlock(struct mount *);
void mnt_release_group_id(struct mount *);
int get_dominating_id(struct mount *mnt, const struct path *root);
-unsigned int mnt_get_count(struct mount *mnt);
+int mnt_get_count(struct mount *mnt);
void mnt_set_mountpoint(struct mount *, struct mountpoint *,
struct mount *);
void mnt_change_mountpoint(struct mount *parent, struct mountpoint *mp,
--
2.27.0
From: Qinglang Miao <[email protected]>
[ Upstream commit 59165d16c699182b86b5c65181013f1fd88feb62 ]
Add the missing destroy_workqueue() before return from
i3c_master_register in the error handling case.
Signed-off-by: Qinglang Miao <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
Link: https://lore.kernel.org/linux-i3c/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/i3c/master.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/i3c/master.c b/drivers/i3c/master.c
index 6cc71c90f85ea..19337aed9f235 100644
--- a/drivers/i3c/master.c
+++ b/drivers/i3c/master.c
@@ -2492,7 +2492,7 @@ int i3c_master_register(struct i3c_master_controller *master,
ret = i3c_master_bus_init(master);
if (ret)
- goto err_put_dev;
+ goto err_destroy_wq;
ret = device_add(&master->dev);
if (ret)
@@ -2523,6 +2523,9 @@ int i3c_master_register(struct i3c_master_controller *master,
err_cleanup_bus:
i3c_master_bus_cleanup(master);
+err_destroy_wq:
+ destroy_workqueue(master->wq);
+
err_put_dev:
put_device(&master->dev);
--
2.27.0
From: Hyeongseok Kim <[email protected]>
[ Upstream commit 252bd1256396cebc6fc3526127fdb0b317601318 ]
If emergency system shutdown is called, like by thermal shutdown,
a dm device could be alive when the block device couldn't process
I/O requests anymore. In this state, the handling of I/O errors
by new dm I/O requests or by those already in-flight can lead to
a verity corruption state, which is a misjudgment.
So, skip verity work in response to I/O error when system is shutting
down.
Signed-off-by: Hyeongseok Kim <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/md/dm-verity-target.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 4fb33e7562c52..2aeb922e2365c 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -533,6 +533,15 @@ static int verity_verify_io(struct dm_verity_io *io)
return 0;
}
+/*
+ * Skip verity work in response to I/O error when system is shutting down.
+ */
+static inline bool verity_is_system_shutting_down(void)
+{
+ return system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF
+ || system_state == SYSTEM_RESTART;
+}
+
/*
* End one "io" structure with a given error.
*/
@@ -560,7 +569,8 @@ static void verity_end_io(struct bio *bio)
{
struct dm_verity_io *io = bio->bi_private;
- if (bio->bi_status && !verity_fec_is_enabled(io->v)) {
+ if (bio->bi_status &&
+ (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
verity_finish_io(io, bio->bi_status);
return;
}
--
2.27.0
From: Paolo Bonzini <[email protected]>
[ Upstream commit df7e8818926eb4712b67421442acf7d568fe2645 ]
Userspace that does not know about the AMD_IBRS bit might still
allow the guest to protect itself with MSR_IA32_SPEC_CTRL using
the Intel SPEC_CTRL bit. However, svm.c disallows this and will
cause a #GP in the guest when writing to the MSR. Fix this by
loosening the test and allowing the Intel CPUID bit, and in fact
allow the AMD_STIBP bit as well since it allows writing to
MSR_IA32_SPEC_CTRL too.
Reported-by: Zhiyi Guo <[email protected]>
Analyzed-by: Dr. David Alan Gilbert <[email protected]>
Analyzed-by: Laszlo Ersek <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/svm.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 72bf1d8175ac2..ca746006ac040 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4233,6 +4233,8 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
break;
case MSR_IA32_SPEC_CTRL:
if (!msr_info->host_initiated &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) &&
!guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) &&
!guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD))
return 1;
@@ -4318,6 +4320,8 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
break;
case MSR_IA32_SPEC_CTRL:
if (!msr->host_initiated &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) &&
!guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) &&
!guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD))
return 1;
--
2.27.0
From: Petr Vorel <[email protected]>
commit a85cbe6159ffc973e5702f70a3bd5185f8f3c38d upstream.
and include <linux/const.h> in UAPI headers instead of <linux/kernel.h>.
The reason is to avoid indirect <linux/sysinfo.h> include when using
some network headers: <linux/netlink.h> or others -> <linux/kernel.h>
-> <linux/sysinfo.h>.
This indirect include causes on MUSL redefinition of struct sysinfo when
included both <sys/sysinfo.h> and some of UAPI headers:
In file included from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/kernel.h:5,
from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/netlink.h:5,
from ../include/tst_netlink.h:14,
from tst_crypto.c:13:
x86_64-buildroot-linux-musl/sysroot/usr/include/linux/sysinfo.h:8:8: error: redefinition of `struct sysinfo'
struct sysinfo {
^~~~~~~
In file included from ../include/tst_safe_macros.h:15,
from ../include/tst_test.h:93,
from tst_crypto.c:11:
x86_64-buildroot-linux-musl/sysroot/usr/include/sys/sysinfo.h:10:8: note: originally defined here
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Petr Vorel <[email protected]>
Suggested-by: Rich Felker <[email protected]>
Acked-by: Rich Felker <[email protected]>
Cc: Peter Korsgaard <[email protected]>
Cc: Baruch Siach <[email protected]>
Cc: Florian Weimer <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/uapi/linux/const.h | 5 +++++
include/uapi/linux/ethtool.h | 2 +-
include/uapi/linux/kernel.h | 9 +--------
include/uapi/linux/lightnvm.h | 2 +-
include/uapi/linux/mroute6.h | 2 +-
include/uapi/linux/netfilter/x_tables.h | 2 +-
include/uapi/linux/netlink.h | 2 +-
include/uapi/linux/sysctl.h | 2 +-
8 files changed, 12 insertions(+), 14 deletions(-)
--- a/include/uapi/linux/const.h
+++ b/include/uapi/linux/const.h
@@ -28,4 +28,9 @@
#define _BITUL(x) (_UL(1) << (x))
#define _BITULL(x) (_ULL(1) << (x))
+#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
+#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
+
+#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
+
#endif /* _UAPI_LINUX_CONST_H */
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -14,7 +14,7 @@
#ifndef _UAPI_LINUX_ETHTOOL_H
#define _UAPI_LINUX_ETHTOOL_H
-#include <linux/kernel.h>
+#include <linux/const.h>
#include <linux/types.h>
#include <linux/if_ether.h>
--- a/include/uapi/linux/kernel.h
+++ b/include/uapi/linux/kernel.h
@@ -3,13 +3,6 @@
#define _UAPI_LINUX_KERNEL_H
#include <linux/sysinfo.h>
-
-/*
- * 'kernel.h' contains some often-used function prototypes etc
- */
-#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
-#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
-
-#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
+#include <linux/const.h>
#endif /* _UAPI_LINUX_KERNEL_H */
--- a/include/uapi/linux/lightnvm.h
+++ b/include/uapi/linux/lightnvm.h
@@ -21,7 +21,7 @@
#define _UAPI_LINUX_LIGHTNVM_H
#ifdef __KERNEL__
-#include <linux/kernel.h>
+#include <linux/const.h>
#include <linux/ioctl.h>
#else /* __KERNEL__ */
#include <stdio.h>
--- a/include/uapi/linux/mroute6.h
+++ b/include/uapi/linux/mroute6.h
@@ -2,7 +2,7 @@
#ifndef _UAPI__LINUX_MROUTE6_H
#define _UAPI__LINUX_MROUTE6_H
-#include <linux/kernel.h>
+#include <linux/const.h>
#include <linux/types.h>
#include <linux/sockios.h>
#include <linux/in6.h> /* For struct sockaddr_in6. */
--- a/include/uapi/linux/netfilter/x_tables.h
+++ b/include/uapi/linux/netfilter/x_tables.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef _UAPI_X_TABLES_H
#define _UAPI_X_TABLES_H
-#include <linux/kernel.h>
+#include <linux/const.h>
#include <linux/types.h>
#define XT_FUNCTION_MAXNAMELEN 30
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -2,7 +2,7 @@
#ifndef _UAPI__LINUX_NETLINK_H
#define _UAPI__LINUX_NETLINK_H
-#include <linux/kernel.h>
+#include <linux/const.h>
#include <linux/socket.h> /* for __kernel_sa_family_t */
#include <linux/types.h>
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -23,7 +23,7 @@
#ifndef _UAPI_LINUX_SYSCTL_H
#define _UAPI_LINUX_SYSCTL_H
-#include <linux/kernel.h>
+#include <linux/const.h>
#include <linux/types.h>
#include <linux/compiler.h>
From: Thomas Gleixner <[email protected]>
[ Upstream commit ba8ea8e7dd6e1662e34e730eadfc52aa6816f9dd ]
can_stop_idle_tick() checks whether the do_timer() duty has been taken over
by a CPU on boot. That's silly because the boot CPU always takes over with
the initial clockevent device.
But even if no CPU would have installed a clockevent and taken over the
duty then the question whether the tick on the current CPU can be stopped
or not is moot. In that case the current CPU would have no clockevent
either, so there would be nothing to keep ticking.
Remove it.
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/time/tick-sched.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 5c9fcc72460df..4419486d7413c 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -916,13 +916,6 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
*/
if (tick_do_timer_cpu == cpu)
return false;
- /*
- * Boot safety: make sure the timekeeping duty has been
- * assigned before entering dyntick-idle mode,
- * tick_do_timer_cpu is TICK_DO_TIMER_BOOT
- */
- if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_BOOT))
- return false;
/* Should not happen for nohz-full */
if (WARN_ON_ONCE(tick_do_timer_cpu == TICK_DO_TIMER_NONE))
--
2.27.0
From: Bart Van Assche <[email protected]>
commit fa4d0f1992a96f6d7c988ef423e3127e613f6ac9 upstream.
With the current implementation the following race can happen:
* blk_pre_runtime_suspend() calls blk_freeze_queue_start() and
blk_mq_unfreeze_queue().
* blk_queue_enter() calls blk_queue_pm_only() and that function returns
true.
* blk_queue_enter() calls blk_pm_request_resume() and that function does
not call pm_request_resume() because the queue runtime status is
RPM_ACTIVE.
* blk_pre_runtime_suspend() changes the queue status into RPM_SUSPENDING.
Fix this race by changing the queue runtime status into RPM_SUSPENDING
before switching q_usage_counter to atomic mode.
Link: https://lore.kernel.org/r/[email protected]
Fixes: 986d413b7c15 ("blk-mq: Enable support for runtime power management")
Cc: Ming Lei <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: stable <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Acked-by: Alan Stern <[email protected]>
Acked-by: Stanley Chu <[email protected]>
Co-developed-by: Can Guo <[email protected]>
Signed-off-by: Can Guo <[email protected]>
Signed-off-by: Bart Van Assche <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
block/blk-pm.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
--- a/block/blk-pm.c
+++ b/block/blk-pm.c
@@ -67,6 +67,10 @@ int blk_pre_runtime_suspend(struct reque
WARN_ON_ONCE(q->rpm_status != RPM_ACTIVE);
+ spin_lock_irq(&q->queue_lock);
+ q->rpm_status = RPM_SUSPENDING;
+ spin_unlock_irq(&q->queue_lock);
+
/*
* Increase the pm_only counter before checking whether any
* non-PM blk_queue_enter() calls are in progress to avoid that any
@@ -89,15 +93,14 @@ int blk_pre_runtime_suspend(struct reque
/* Switch q_usage_counter back to per-cpu mode. */
blk_mq_unfreeze_queue(q);
- spin_lock_irq(&q->queue_lock);
- if (ret < 0)
+ if (ret < 0) {
+ spin_lock_irq(&q->queue_lock);
+ q->rpm_status = RPM_ACTIVE;
pm_runtime_mark_last_busy(q->dev);
- else
- q->rpm_status = RPM_SUSPENDING;
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(&q->queue_lock);
- if (ret)
blk_clear_pm_only(q);
+ }
return ret;
}
From: Paolo Bonzini <[email protected]>
[ Upstream commit 39485ed95d6b83b62fa75c06c2c4d33992e0d971 ]
Until commit e7c587da1252 ("x86/speculation: Use synthetic bits for
IBRS/IBPB/STIBP"), KVM was testing both Intel and AMD CPUID bits before
allowing the guest to write MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD.
Testing only Intel bits on VMX processors, or only AMD bits on SVM
processors, fails if the guests are created with the "opposite" vendor
as the host.
While at it, also tweak the host CPU check to use the vendor-agnostic
feature bit X86_FEATURE_IBPB, since we only care about the availability
of the MSR on the host here and not about specific CPUID bits.
Fixes: e7c587da1252 ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP")
Cc: [email protected]
Reported-by: Denis V. Lunev <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/cpuid.h | 14 ++++++++++++++
arch/x86/kvm/svm.c | 14 ++++----------
arch/x86/kvm/vmx/vmx.c | 8 ++++----
3 files changed, 22 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index d78a61408243f..7dec43b2c4205 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -154,6 +154,20 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
return x86_stepping(best->eax);
}
+static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
+{
+ return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
+}
+
+static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
+{
+ return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB));
+}
+
static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
{
return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ca746006ac040..2b506904be024 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4233,10 +4233,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
break;
case MSR_IA32_SPEC_CTRL:
if (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
- !guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) &&
- !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) &&
- !guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD))
+ !guest_has_spec_ctrl_msr(vcpu))
return 1;
msr_info->data = svm->spec_ctrl;
@@ -4320,10 +4317,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
break;
case MSR_IA32_SPEC_CTRL:
if (!msr->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
- !guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) &&
- !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) &&
- !guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD))
+ !guest_has_spec_ctrl_msr(vcpu))
return 1;
if (data & ~kvm_spec_ctrl_valid_bits(vcpu))
@@ -4348,12 +4342,12 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
break;
case MSR_IA32_PRED_CMD:
if (!msr->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB))
+ !guest_has_pred_cmd_msr(vcpu))
return 1;
if (data & ~PRED_CMD_IBPB)
return 1;
- if (!boot_cpu_has(X86_FEATURE_AMD_IBPB))
+ if (!boot_cpu_has(X86_FEATURE_IBPB))
return 1;
if (!data)
break;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8450fce70bd96..e7fd2f00edc11 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1788,7 +1788,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
break;
case MSR_IA32_SPEC_CTRL:
if (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+ !guest_has_spec_ctrl_msr(vcpu))
return 1;
msr_info->data = to_vmx(vcpu)->spec_ctrl;
@@ -1971,7 +1971,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
break;
case MSR_IA32_SPEC_CTRL:
if (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+ !guest_has_spec_ctrl_msr(vcpu))
return 1;
if (data & ~kvm_spec_ctrl_valid_bits(vcpu))
@@ -1999,12 +1999,12 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
break;
case MSR_IA32_PRED_CMD:
if (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+ !guest_has_pred_cmd_msr(vcpu))
return 1;
if (data & ~PRED_CMD_IBPB)
return 1;
- if (!boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (!boot_cpu_has(X86_FEATURE_IBPB))
return 1;
if (!data)
break;
--
2.27.0
From: Qinglang Miao <[email protected]>
commit 2d18e54dd8662442ef5898c6bdadeaf90b3cebbc upstream.
A memory leak is found in cgroup1_parse_param() when multiple source
parameters overwrite fc->source in the fs_context struct without free.
unreferenced object 0xffff888100d930e0 (size 16):
comm "mount", pid 520, jiffies 4303326831 (age 152.783s)
hex dump (first 16 bytes):
74 65 73 74 6c 65 61 6b 00 00 00 00 00 00 00 00 testleak........
backtrace:
[<000000003e5023ec>] kmemdup_nul+0x2d/0xa0
[<00000000377dbdaa>] vfs_parse_fs_string+0xc0/0x150
[<00000000cb2b4882>] generic_parse_monolithic+0x15a/0x1d0
[<000000000f750198>] path_mount+0xee1/0x1820
[<0000000004756de2>] do_mount+0xea/0x100
[<0000000094cafb0a>] __x64_sys_mount+0x14b/0x1f0
Fix this bug by permitting a single source parameter and rejecting with
an error all subsequent ones.
Fixes: 8d2451f4994f ("cgroup1: switch to option-by-option parsing")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Qinglang Miao <[email protected]>
Reviewed-by: Zefan Li <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/cgroup/cgroup-v1.c | 2 ++
1 file changed, 2 insertions(+)
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -914,6 +914,8 @@ int cgroup1_parse_param(struct fs_contex
opt = fs_parse(fc, &cgroup1_fs_parameters, param, &result);
if (opt == -ENOPARAM) {
if (strcmp(param->key, "source") == 0) {
+ if (fc->source)
+ return invalf(fc, "Multiple sources not supported");
fc->source = param->string;
param->string = NULL;
return 0;
From: Jan Kara <[email protected]>
[ Upstream commit 10f04d40a9fa29785206c619f80d8beedb778837 ]
The on-disk quota format supports quota files with upto 2^32 blocks. Be
careful when computing quota file offsets in the quota files from block
numbers as they can overflow 32-bit types. Since quota files larger than
4GB would require ~26 millions of quota users, this is mostly a
theoretical concern now but better be careful, fuzzers would find the
problem sooner or later anyway...
Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/quota/quota_tree.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/quota/quota_tree.c b/fs/quota/quota_tree.c
index a6f856f341dc7..c5562c871c8be 100644
--- a/fs/quota/quota_tree.c
+++ b/fs/quota/quota_tree.c
@@ -62,7 +62,7 @@ static ssize_t read_blk(struct qtree_mem_dqinfo *info, uint blk, char *buf)
memset(buf, 0, info->dqi_usable_bs);
return sb->s_op->quota_read(sb, info->dqi_type, buf,
- info->dqi_usable_bs, blk << info->dqi_blocksize_bits);
+ info->dqi_usable_bs, (loff_t)blk << info->dqi_blocksize_bits);
}
static ssize_t write_blk(struct qtree_mem_dqinfo *info, uint blk, char *buf)
@@ -71,7 +71,7 @@ static ssize_t write_blk(struct qtree_mem_dqinfo *info, uint blk, char *buf)
ssize_t ret;
ret = sb->s_op->quota_write(sb, info->dqi_type, buf,
- info->dqi_usable_bs, blk << info->dqi_blocksize_bits);
+ info->dqi_usable_bs, (loff_t)blk << info->dqi_blocksize_bits);
if (ret != info->dqi_usable_bs) {
quota_error(sb, "dquota write failed");
if (ret >= 0)
@@ -284,7 +284,7 @@ static uint find_free_dqentry(struct qtree_mem_dqinfo *info,
blk);
goto out_buf;
}
- dquot->dq_off = (blk << info->dqi_blocksize_bits) +
+ dquot->dq_off = ((loff_t)blk << info->dqi_blocksize_bits) +
sizeof(struct qt_disk_dqdbheader) +
i * info->dqi_entry_size;
kfree(buf);
@@ -559,7 +559,7 @@ static loff_t find_block_dqentry(struct qtree_mem_dqinfo *info,
ret = -EIO;
goto out_buf;
} else {
- ret = (blk << info->dqi_blocksize_bits) + sizeof(struct
+ ret = ((loff_t)blk << info->dqi_blocksize_bits) + sizeof(struct
qt_disk_dqdbheader) + i * info->dqi_entry_size;
}
out_buf:
--
2.27.0
Hello!
On 1/4/21 9:56 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.4.87 release.
> There are 47 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 06 Jan 2021 15:56:52 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.87-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Results from Linaro’s test farm.
No regressions detected.
Tested-by: Linux Kernel Functional Testing <[email protected]>
Summary
------------------------------------------------------------------------
kernel: 5.4.87-rc1
git repo: ['https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git', 'https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc']
git branch: linux-5.4.y
git commit: 01678c93fa9e3da85a53deb1510c25fdcd2e5d6d
git describe: v5.4.86-48-g01678c93fa9e
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.4.y/build/v5.4.86-48-g01678c93fa9e
No regressions (compared to build v5.4.86)
No fixes (compared to build v5.4.86)
Ran 48522 total tests in the following environments and test suites.
Environments
--------------
- arc
- arm
- arm64
- dragonboard-410c
- hi6220-hikey
- i386
- juno-r2
- juno-r2-compat
- juno-r2-kasan
- mips
- parisc
- powerpc
- qemu-arm-clang
- qemu-arm64-clang
- qemu-arm64-kasan
- qemu-x86_64-clang
- qemu-x86_64-kasan
- qemu-x86_64-kcsan
- qemu_arm
- qemu_arm64
- qemu_arm64-compat
- qemu_i386
- qemu_x86_64
- qemu_x86_64-compat
- riscv
- s390
- sh
- sparc
- x15
- x86
- x86-kasan
Test Suites
-----------
* build
* fwts
* igt-gpu-tools
* install-android-platform-tools-r2600
* kselftest
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none
* kvm-unit-tests
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-controllers-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fs-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-open-posix-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-tracing-tests
* network-basic-tests
* perf
* rcutorture
* v4l2-compliance
Greetings!
Daniel Díaz
[email protected]
--
Linaro LKFT
https://lkft.linaro.org
On 1/4/21 8:56 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.4.87 release.
> There are 47 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 06 Jan 2021 15:56:52 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.87-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan <[email protected]>
thanks,
-- Shuah
On Mon, Jan 04, 2021 at 04:56:59PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.4.87 release.
> There are 47 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 06 Jan 2021 15:56:52 +0000.
> Anything received after that time might be too late.
>
Build results:
total: 157 pass: 157 fail: 0
Qemu test results:
total: 427 pass: 427 fail: 0
Tested-by: Guenter Roeck <[email protected]>
Guenter