linux.git/fs/ceph, branch v5.4.148

ceph: take snap_empty_lock atomically with snaprealm refcount change

2021-08-18T06:57:04+00:00

commit 8434ffe71c874b9c4e184b88d25de98c2bf5fe3f upstream.

There is a race in ceph_put_snap_realm. The change to the nref and the
spinlock acquisition are not done atomically, so you could decrement
nref, and before you take the spinlock, the nref is incremented again.
At that point, you end up putting it on the empty list when it
shouldn't be there. Eventually __cleanup_empty_realms runs and frees
it when it's still in-use.

Fix this by protecting the 1->0 transition with atomic_dec_and_lock,
and just drop the spinlock if we can get the rwsem.

Because these objects can also undergo a 0->1 refcount transition, we
must protect that change as well with the spinlock. Increment locklessly
unless the value is at 0, in which case we take the spinlock, increment
and then take it off the empty list if it did the 0->1 transition.
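
The two transitions described above can be sketched in userspace C11.
This is an illustration under assumptions, not the kernel code: struct
realm, empty_lock_acquire(), dec_and_lock(), inc_not_zero(),
realm_get() and realm_put() are hypothetical names, a test-and-set flag
stands in for the spinlock, and the snap_rwsem trylock path is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a snap realm: only what the sketch needs. */
struct realm {
    atomic_int nref;
    bool on_empty_list;              /* protected by empty_lock */
};

/* Minimal spinlock standing in for the empty-list spinlock. */
static atomic_flag empty_lock = ATOMIC_FLAG_INIT;

static void empty_lock_acquire(void)
{
    while (atomic_flag_test_and_set(&empty_lock))
        ;                            /* spin until the flag was clear */
}

static void empty_lock_release(void)
{
    atomic_flag_clear(&empty_lock);
}

/*
 * Analogue of atomic_dec_and_lock(): decrement locklessly while the
 * count stays above zero; a possible 1->0 transition is done under the
 * lock. Returns true with the lock held only if the count reached zero.
 */
static bool dec_and_lock(atomic_int *v)
{
    int old = atomic_load(v);

    while (old > 1) {
        /* On failure, old is reloaded and the condition is rechecked. */
        if (atomic_compare_exchange_weak(v, &old, old - 1))
            return false;
    }

    empty_lock_acquire();
    if (atomic_fetch_sub(v, 1) == 1)
        return true;                 /* caller must release the lock */
    empty_lock_release();
    return false;
}

/* Increment unless the count is zero; returns false if it was zero. */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

static void realm_get(struct realm *r)
{
    if (inc_not_zero(&r->nref))
        return;                      /* lockless fast path */

    /* Possible 0->1 transition: serialize against realm_put(). */
    empty_lock_acquire();
    if (atomic_fetch_add(&r->nref, 1) == 0)
        r->on_empty_list = false;
    empty_lock_release();
}

static void realm_put(struct realm *r)
{
    if (!dec_and_lock(&r->nref))
        return;

    /* The count hit zero and the lock is held, so no get can interleave. */
    assert(!r->on_empty_list);
    r->on_empty_list = true;
    empty_lock_release();
}
```

Because the decrement to zero and the list insertion happen under the
same lock that guards the 0->1 increment, a concurrent realm_get() can
no longer slip in between them.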

With these changes, I'm removing the dout() messages from these
functions, as well as in __put_snap_realm. They've always been racy, and
it's better to not print values that may be misleading.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/46419
Reported-by: Mark Nelson 
Signed-off-by: Jeff Layton 
Reviewed-by: Luis Henriques 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Greg Kroah-Hartman

ceph: clean up locking annotation for ceph_get_snap_realm and __lookup_snap_realm

2021-08-18T06:57:04+00:00

commit df2c0cb7f8e8c83e495260ad86df8c5da947f2a7 upstream.

They both say that the snap_rwsem must be held for write, but I don't
see any real reason for it, and it's not currently always called that
way.

The lookup is just walking the rbtree, so holding it for read should be
fine there. The "get" is bumping the refcount and (possibly) removing
it from the empty list. I see no need to hold the snap_rwsem for write
for that.

Signed-off-by: Jeff Layton 
Reviewed-by: Ilya Dryomov 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Greg Kroah-Hartman

ceph: add some lockdep assertions around snaprealm handling

2021-08-18T06:57:04+00:00

commit a6862e6708c15995bc10614b2ef34ca35b4b9078 upstream.

Turn some comments into lockdep asserts.
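
As a rough userspace illustration of the idea (the kernel uses
lockdep_assert_held() and related macros), a "caller must hold the
lock" comment can become a runtime check. Everything named here
(struct tracked_lock, ASSERT_HELD, struct registry, registry_add) is
hypothetical, and the sketch is single-threaded: it does not track
which task owns the lock the way lockdep does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock that remembers whether it is currently held. */
struct tracked_lock {
    bool held;
};

static void tracked_lock_acquire(struct tracked_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void tracked_lock_release(struct tracked_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Checked replacement for a "caller must hold the lock" comment. */
#define ASSERT_HELD(l) assert((l)->held)

struct registry {
    struct tracked_lock lock;
    int count;
};

/* Caller must hold reg->lock; the requirement is now enforced. */
static int registry_add(struct registry *reg)
{
    ASSERT_HELD(&reg->lock);
    return ++reg->count;
}
```

A caller that forgets to take the lock now fails loudly in debug
builds instead of silently violating a comment.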

Signed-off-by: Jeff Layton 
Reviewed-by: Ilya Dryomov 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Greg Kroah-Hartman

ceph: reduce contention in ceph_check_delayed_caps()

2021-08-18T06:56:57+00:00

commit bf2ba432213fade50dd39f2e348085b758c0726e upstream.

Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
workqueue and it can be kept looping for quite some time if caps keep
being added back to the mdsc->cap_delay_list.  This may result in the
watchdog tainting the kernel with the softlockup flag.

This patch breaks out of the loop when it reaches caps that were added
recently (i.e. during the loop execution).  Any new caps added to the
list will be handled in the next run.

Also, allow schedule_delayed() callers to explicitly set the delay value
instead of defaulting to 5s, so we can ensure that it runs soon
afterward if it looks like there is more work.
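
The loop bound described above can be sketched as follows. This is a
simplified model, not the kernel code: struct queue, queue_push(),
drain_once() and next_delay_secs() are hypothetical, a logical counter
replaces jiffies, and the 1s short delay is an assumed value chosen
only for illustration (the 5s default comes from the text above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* Hypothetical delayed item, stamped with the time it was queued. */
struct item {
    unsigned long added_at;
};

struct queue {
    struct item items[QUEUE_CAP];
    size_t head, len;
    unsigned long now;               /* logical clock, stands in for jiffies */
};

static bool queue_push(struct queue *q)
{
    if (q->len == QUEUE_CAP)
        return false;
    q->items[(q->head + q->len) % QUEUE_CAP].added_at = q->now++;
    q->len++;
    return true;
}

/*
 * Handle items queued before this call started. If requeue is set,
 * every handled item is put back, which models caps being re-added
 * while the loop runs. Returns true if work is left for the next run.
 */
static bool drain_once(struct queue *q, bool requeue, size_t *handled)
{
    unsigned long loop_start = q->now;

    *handled = 0;
    while (q->len > 0) {
        struct item *it = &q->items[q->head];

        /* Added during this run: stop here, leave it for the next run. */
        if (it->added_at >= loop_start)
            break;

        q->head = (q->head + 1) % QUEUE_CAP;
        q->len--;
        (*handled)++;
        if (requeue)
            queue_push(q);
    }
    return q->len > 0;
}

/* Assumed policy: run again soon if work remains, else use the default. */
static unsigned int next_delay_secs(bool more_work)
{
    return more_work ? 1 : 5;
}
```

Without the loop_start check, a run in which every handled item is
requeued would never terminate; with it, each run handles at most the
items that were already queued when it began.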

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/46284
Signed-off-by: Luis Henriques 
Reviewed-by: Jeff Layton 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Greg Kroah-Hartman

ceph: remove bogus checks and WARN_ONs from ceph_set_page_dirty

2021-07-20T14:10:48+00:00

[ Upstream commit 22d41cdcd3cfd467a4af074165357fcbea1c37f5 ]

The checks for page->mapping are odd, as set_page_dirty is an
address_space operation, and I don't see where it would be called on a
non-pagecache page.

The warning about the page lock also seems bogus.  The comment over
set_page_dirty() says that it can be called without the page lock in
some rare cases. I don't think we want to warn if that's the case.

Reported-by: Matthew Wilcox 
Signed-off-by: Jeff Layton 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Sasha Levin

ceph: fix fscache invalidation

2021-05-22T09:38:29+00:00

[ Upstream commit 10a7052c7868bc7bc72d947f5aac6f768928db87 ]

Ensure that we invalidate the fscache whenever we invalidate the
pagecache.

Signed-off-by: Jeff Layton 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Sasha Levin

ceph: fix inode leak on getattr error in __fh_to_dentry

2021-05-19T08:08:26+00:00

[ Upstream commit 1775c7ddacfcea29051c67409087578f8f4d751b ]

Fixes: 878dabb64117 ("ceph: don't return -ESTALE if there's still an open file")
Signed-off-by: Jeff Layton 
Reviewed-by: Xiubo Li 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Sasha Levin

ceph: fix race in concurrent __ceph_remove_cap invocations

2020-12-30T10:51:40+00:00

commit e5cafce3ad0f8652d6849314d951459c2bff7233 upstream.

A NULL pointer dereference may occur in __ceph_remove_cap with some of the
callbacks used in ceph_iterate_session_caps, namely trim_caps_cb and
remove_session_caps_cb. Those callers hold the session->s_mutex, so they
cannot run concurrently with each other, but ceph_evict_inode does not
hold that mutex.

Since the callers of this function hold the i_ceph_lock, the fix is simply
a matter of returning immediately if cap->ci is NULL.
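
A minimal sketch of that early return, assuming the caller already
holds the lock that protects the cap. The types and the remove_cap()
helper are hypothetical simplifications, not the kernel structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced versions of the inode and cap objects. */
struct inode_info {
    int ncaps;
};

struct cap {
    struct inode_info *ci;           /* NULL once the cap has been removed */
};

/*
 * Caller is assumed to hold the inode lock. Returns true if this call
 * did the removal, false if a racing caller already did it.
 */
static bool remove_cap(struct cap *cap)
{
    struct inode_info *ci = cap->ci;

    if (!ci)
        return false;                /* already removed: nothing to do */

    ci->ncaps--;
    cap->ci = NULL;                  /* mark removed for later callers */
    return true;
}
```

The second caller sees the NULL back-pointer under the same lock and
returns instead of dereferencing it.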

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/43272
Suggested-by: Jeff Layton 
Signed-off-by: Luis Henriques 
Reviewed-by: Jeff Layton 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Greg Kroah-Hartman

ceph: promote to unsigned long long before shifting

2020-11-05T10:43:34+00:00

commit c403c3a2fbe24d4ed33e10cabad048583ebd4edf upstream.

On 32-bit systems, this shift will overflow for files larger than 4GB.
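
The overflow can be reproduced in portable C by using uint32_t to model
a 32-bit unsigned long. The function names and the PAGE_SHIFT value of
12 are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                /* assumed 4 KiB pages */

/* Shift performed in 32 bits: the high bits are lost before widening. */
static unsigned long long offset_truncated(uint32_t pgoff)
{
    uint32_t off = pgoff << PAGE_SHIFT;

    return off;
}

/* Promote first, then shift: the full 64-bit offset is preserved. */
static unsigned long long offset_promoted(uint32_t pgoff)
{
    return (unsigned long long)pgoff << PAGE_SHIFT;
}
```

For page index 0x100000 (byte offset 4 GiB), the first form yields 0
and the second yields 0x100000000.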

Cc: stable@vger.kernel.org
Fixes: 61f68816211e ("ceph: check caps in filemap_fault and page_mkwrite")
Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Jeff Layton 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Greg Kroah-Hartman

ceph: fix potential race in ceph_check_caps

2020-10-01T11:18:08+00:00

[ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ]

Nothing ensures that session will still be valid by the time we
dereference the pointer. Take and put a reference.

In principle, we should always be able to get a reference here, but
throw a warning if that's ever not the case.
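
The take-and-put pattern can be sketched in userspace C11 as below.
struct session, session_get(), session_put() and use_session() are
hypothetical; the released flag replaces an actual free so the sketch
stays easy to check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical refcounted session object. */
struct session {
    atomic_int refs;
    int id;
    bool released;                   /* set when the last reference drops */
};

/* Take a reference unless the count already dropped to zero. */
static struct session *session_get(struct session *s)
{
    int old = atomic_load(&s->refs);

    while (old != 0) {
        if (atomic_compare_exchange_weak(&s->refs, &old, old + 1))
            return s;
    }
    return NULL;
}

static void session_put(struct session *s)
{
    if (atomic_fetch_sub(&s->refs, 1) == 1)
        s->released = true;          /* a real implementation would free here */
}

/* Pin the session while it is used; warn if it could not be pinned. */
static int use_session(struct session *s)
{
    struct session *ref = session_get(s);
    int id;

    if (!ref) {
        fprintf(stderr, "warning: session already released\n");
        return -1;
    }
    id = ref->id;                    /* safe: we hold a reference */
    session_put(ref);
    return id;
}
```

Holding a reference across the dereference keeps the object alive for
as long as it is used, and the failure branch mirrors the warning
mentioned above.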

Signed-off-by: Jeff Layton 
Signed-off-by: Ilya Dryomov 
Signed-off-by: Sasha Levin