// SPDX-License-Identifier: GPL-2.0-or-later
#include <linux/blkdev.h>
#include <linux/module.h>
#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/timer.h>
#include <linux/sched.h>
#include <linux/list.h>
#include <linux/file.h>
#include <linux/seq_file.h>
#include <trace/events/block.h>
#include "md.h"
#include "md-bitmap.h"
/*
* #### Background
*
* Redundant data is used to enhance data fault tolerance, and the way it is
* stored varies with the RAID level. It is important to keep the redundant
* data consistent.
*
* A bitmap records which data blocks have been synchronized and which ones
* need to be resynchronized or recovered. Each bit in the bitmap represents a
* segment of data in the array. When a bit is set, it indicates that the
* multiple redundant copies of that data segment may not be consistent. Data
* synchronization can be performed based on the bitmap after a power failure
* or after re-adding a disk. Without a bitmap, a full disk synchronization is
* required.
*
* #### Key Features
*
* - The IO fastpath is lockless: if the user issues many writes to the same
* bitmap bit in a short time, only the first write pays the additional
* overhead of updating the bitmap bit; the following writes have none;
* - Only written data is resynced or recovered, so when creating a new array
* or replacing a disk with a new one, there is no need for a full disk
* resync/recovery;
*
* #### Key Concept
*
* ##### State Machine
*
* Each bit is stored as one byte and can be in one of 6 different states, see
* llbitmap_state. There are 8 different actions in total, see llbitmap_action,
* that can change the state:
*
* llbitmap state machine: transitions between states
*
* |           | Startwrite | Startsync | Endsync | Abortsync |
* | --------- | ---------- | --------- | ------- | --------- |
* | Unwritten | Dirty      | x         | x       | x         |
* | Clean     | Dirty      | x         | x       | x         |
* | Dirty     | x          | x         | x       | x         |
* | NeedSync  | x          | Syncing   | x       | x         |
* | Syncing   | x          | Syncing   | Dirty   | NeedSync  |
*
* |           | Reload   | Daemon | Discard   | Stale     |
* | --------- | -------- | ------ | --------- | --------- |
* | Unwritten | x        | x      | x         | x         |
* | Clean     | x        | x      | Unwritten | NeedSync  |
* | Dirty     | NeedSync | Clean  | Unwritten | NeedSync  |
* | NeedSync  | x        | x      | Unwritten | x         |
* | Syncing   | NeedSync | x      | Unwritten | NeedSync  |
*
* Typical scenarios:
*
* 1) Create new array
* All bits are set to Unwritten by default; if --assume-clean is set, all
* bits are set to Clean instead.
*
* 2) write data: raid1/raid10 keep full copies of the data, while raid456 does
* not and relies on xor data
*
* 2.1) write new data to raid1/raid10:
* Unwritten --Startwrite--> Dirty
*
* 2.2) write new data to raid456:
* Unwritten --Startwrite--> NeedSync
*
* Because the initial recovery for raid456 is skipped, the xor data is not
* built yet. The bit must be set to NeedSync first, and after the lazy initial
* recovery is finished, the bit is finally set to Dirty (see 5.1 and 5.4);
*
* 2.3) overwrite
* Clean --Startwrite--> Dirty
*
* 3) daemon, if the array is not degraded:
* Dirty --Daemon--> Clean
*
* 4) discard
* {Clean, Dirty, NeedSync, Syncing} --Discard--> Unwritten
*
* 5) resync and recover
*
* 5.1) common process
* NeedSync --Startsync--> Syncing --Endsync--> Dirty --Daemon--> Clean
*
* 5.2) resync after power failure
* Dirty --Reload--> NeedSync
*
* 5.3) recover while replacing with a new disk
* By default, the old bitmap framework recovers all data; llbitmap implements
* this with a new helper, see llbitmap_skip_sync_blocks:
*
* skip recovery for bits other than Dirty or Clean;
*
* 5.4) lazy initial recover for raid5:
* By default, the old bitmap framework only allows a new recovery when there
* are spares (new disks). A new recovery flag, MD_RECOVERY_LAZY_RECOVER, is
* added to perform raid456 lazy recovery for set bits (from 2.2).
*
* 6) special handling for degraded array:
*
* - Dirty bits are never cleared and the daemon does nothing, so that if a
* disk is re-added, Clean bits can be skipped