linux.git/drivers/md/dm-cache-target.c, branch v3.16.2

dm cache: fix race affecting dirty block count

2014-08-01T16:25:22+00:00

nr_dirty is updated without locking, causing it to drift so that it is
non-zero (either a small positive integer, or a very large one when an
underflow occurs) even when there are no actual dirty blocks.  This was
due to a race between the workqueue and map function accessing nr_dirty
in parallel without proper protection.

People were seeing under runs due to a race on increment/decrement of
nr_dirty, see: https://lkml.org/lkml/2014/6/3/648

Fix this by using an atomic_t for nr_dirty.

Reported-by: roma1390@gmail.com
Signed-off-by: Anssi Hannula 
Signed-off-by: Joe Thornber 
Signed-off-by: Mike Snitzer 
Cc: stable@vger.kernel.org

dm cache: always split discards on cache block boundaries

2014-05-27T14:33:05+00:00

The DM cache target cannot cope with discards that span multiple cache
blocks, so each discard bio that spans more than one cache block must
get split by the DM core.

Signed-off-by: Heinz Mauelshagen 
Acked-by: Joe Thornber 
Signed-off-by: Mike Snitzer 
Cc: stable@vger.kernel.org # v3.9+

dm cache: fix writethrough mode quiescing in cache_map

2014-05-01T20:14:24+00:00

Commit 2ee57d58735 ("dm cache: add passthrough mode") inadvertently
removed the deferred set reference that was taken in cache_map()'s
writethrough mode support.  Restore taking this reference.

This issue was found with code inspection.

Signed-off-by: Mike Snitzer 
Acked-by: Joe Thornber 
Cc: stable@vger.kernel.org # 3.13+

dm cache: fix a lock-inversion

2014-04-04T18:53:05+00:00

When suspending a cache the policy is walked and the individual policy
hints written to the metadata via sync_metadata().  This led to this
lock order:

      policy->lock
        cache_metadata->root_lock

When loading the cache target the policy is populated while the metadata
lock is held:

      cache_metadata->root_lock
         policy->lock

Fix this potential lock-inversion (ABBA) deadlock in sync_metadata() by
ensuring the cache_metadata root_lock is held whilst all the hints are
written, rather than being repeatedly locked while policy->lock is held
(as was the case with each callout that policy_walk_mappings() made to
the old save_hint() method).

Found by turning on the CONFIG_PROVE_LOCKING ("Lock debugging: prove
locking correctness") build option.  However, it is not clear how the
LOCKDEP reported paths can lead to a deadlock since the two paths,
suspending a target and loading a target, never occur at the same time.
But that doesn't mean the same lock-inversion couldn't have occurred
elsewhere.

Reported-by: Marian Csontos 
Signed-off-by: Joe Thornber 
Signed-off-by: Mike Snitzer 
Cc: stable@vger.kernel.org

dm cache: remove remainder of distinct discard block size

2014-03-27T20:56:23+00:00

Discard block size not being equal to cache block size causes data
corruption by erroneously avoiding migrations in issue_copy() because
the discard state is being cleared for a group of cache blocks when it
should not.

Completely remove all code that enabled a distinction between the
cache block size and discard block size.

Signed-off-by: Heinz Mauelshagen 
Signed-off-by: Mike Snitzer

dm cache: prevent corruption caused by discard_block_size > cache_block_size

2014-03-27T20:56:23+00:00

If the discard block size is larger than the cache block size we will
not properly quiesce IO to a region that is about to be discarded.  This
results in a race between a cache migration where no copy is needed, and
a write to an adjacent cache block that's within the same large discard
block.

Workaround this by limiting the discard_block_size to cache_block_size.
Also limit the max_discard_sectors to cache_block_size.

A more comprehensive fix that introduces range locking support in the
bio_prison and proper quiescing of a discard range that spans multiple
cache blocks is already in development.

Reported-by: Morgan Mears 
Signed-off-by: Mike Snitzer 
Acked-by: Joe Thornber 
Acked-by: Heinz Mauelshagen 
Cc: stable@vger.kernel.org

dm cache: fix access beyond end of origin device

2014-03-12T17:52:00+00:00

In order to avoid wasting cache space a partial block at the end of the
origin device is not cached.  Unfortunately, the check for such a
partial block at the end of the origin device was flawed.

Fix accesses beyond the end of the origin device that occured due to
attempted promotion of an undetected partial block by:

- initializing the per bio data struct to allow cache_end_io to work properly
- recognizing access to the partial block at the end of the origin device
- avoiding out of bounds access to the discard bitset

Otherwise, users can experience errors like the following:

 attempt to access beyond end of device
 dm-5: rw=0, want=20971520, limit=20971456
 ...
 device-mapper: cache: promotion failed; couldn't copy block

Signed-off-by: Heinz Mauelshagen 
Acked-by: Joe Thornber 
Signed-off-by: Mike Snitzer 
Cc: stable@vger.kernel.org

dm cache: fix truncation bug when copying a block to/from >2TB fast device

2014-03-12T17:49:27+00:00

During demotion or promotion to a cache's >2TB fast device we must not
truncate the cache block's associated sector to 32bits.  The 32bit
temporary result of from_cblock() caused a 32bit multiplication when
calculating the sector of the fast device in issue_copy_real().

Use an intermediate 64bit type to store the 32bit from_cblock() to allow
for proper 64bit multiplication.

Here is an example of how this bug manifests on an ext4 filesystem:

 EXT4-fs error (device dm-0): ext4_mb_generate_buddy:756: group 17136, 32768 clusters in bitmap, 30688 in gd; block bitmap corrupt.
 JBD2: Spotted dirty metadata buffer (dev = dm-0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.

Signed-off-by: Heinz Mauelshagen 
Acked-by: Joe Thornber 
Signed-off-by: Mike Snitzer 
Cc: stable@vger.kernel.org

dm cache: fix truncation bug when mapping I/O to >2TB fast device

2014-02-28T14:23:02+00:00

When remapping a block to the cache's fast device that is larger than
2TB we must not truncate the destination sector to 32bits.  The 32bit
temporary result of from_cblock() was being overflowed in
remap_to_cache() due to the logical left shift.

Use an intermediate 64bit type to store the 32bit from_cblock() result
to fix the overflow.

Signed-off-by: Heinz Mauelshagen 
Signed-off-by: Mike Snitzer 
Cc: stable@vger.kernel.org

dm cache: do not add migration to completed list before unhooking bio

2014-02-17T16:00:05+00:00

When completing an overwrite bio, in overwrite_endio(), the associated
migration should not be added to the 'completed_migrations' until the
bio's fields are restored with dm_unhook_bio().

Otherwise, do_worker() can race to process 'completed_migrations' before
dm_unhook_bio() -- so the bio's bi_end_io is incorrect.  This is
unlikely to cause any problems given the current code but should be
fixed on the basis of correctness.

Also, the cache's spinlock only needs to be held when manipulating the
'completed_migrations' list -- other changes don't need protection.

Signed-off-by: Mike Snitzer 
Acked-by: Joe Thornber