| field | value | date |
|---|---|---|
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-11-01 12:30:07 -1000 |
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-11-01 12:30:07 -1000 |
| commit | 90d624af2e5a9945eedd5cafd6ae6d88f32cc977 (patch) | |
| tree | e936a0cc8f2b613f327ab08280dccbad664703cf /block | |
| parent | 4de520f1fcefd4ebb7dddcf28bde1b330c2f6b5d (diff) | |
| parent | 0c696bb38f4cc0f0f90a8e06ae1eda21a9630cd0 (diff) | |
| download | linux-90d624af2e5a9945eedd5cafd6ae6d88f32cc977.tar.gz linux-90d624af2e5a9945eedd5cafd6ae6d88f32cc977.tar.bz2 linux-90d624af2e5a9945eedd5cafd6ae6d88f32cc977.zip | |
Merge tag 'for-6.7/block-2023-10-30' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- Improvements to the queue_rqs() support, and adding null_blk support
for that as well (Chengming)
- Series improving badblocks support (Coly)
- Key store support for sed-opal (Greg)
- IBM partition string handling improvements (Jan)
- Make number of ublk devices supported configurable (Mike)
- Cancellation improvements for ublk (Ming)
- MD pull requests via Song:
- Handle timeout in md-cluster, by Denis Plotnikov
- Cleanup pers->prepare_suspend, by Yu Kuai
- Rewrite mddev_suspend(), by Yu Kuai
- Simplify md_seq_ops, by Yu Kuai
 - Reduce unnecessary locking in array_state_store(), by Mariusz Tkaczyk
- Make rdev add/remove independent from daemon thread, by Yu Kuai
- Refactor code around quiesce() and mddev_suspend(), by Yu Kuai
- NVMe pull request via Keith:
- nvme-auth updates (Mark)
- nvme-tcp tls (Hannes)
- nvme-fc annotations (Kees)
- Misc cleanups and improvements (Jiapeng, Joel)
* tag 'for-6.7/block-2023-10-30' of git://git.kernel.dk/linux: (95 commits)
block: ublk_drv: Remove unused function
md: cleanup pers->prepare_suspend()
nvme-auth: allow mixing of secret and hash lengths
nvme-auth: use transformed key size to create resp
nvme-auth: alloc nvme_dhchap_key as single buffer
nvmet-tcp: use 'spin_lock_bh' for state_lock()
powerpc/pseries: PLPKS SED Opal keystore support
block: sed-opal: keystore access for SED Opal keys
block:sed-opal: SED Opal keystore
ublk: simplify aborting request
ublk: replace monitor with cancelable uring_cmd
ublk: quiesce request queue when aborting queue
ublk: rename mm_lock as lock
ublk: move ublk_cancel_dev() out of ub->mutex
ublk: make sure io cmd handled in submitter task context
ublk: don't get ublk device reference in ublk_abort_queue()
ublk: Make ublks_max configurable
ublk: Limit dev_id/ub_number values
md-cluster: check for timeout while a new disk adding
nvme: rework NVME_AUTH Kconfig selection
...
Diffstat (limited to 'block')
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | block/Kconfig | 1 |
| -rw-r--r-- | block/badblocks.c | 1618 |
| -rw-r--r-- | block/blk-flush.c | 11 |
| -rw-r--r-- | block/blk-mq-debugfs.c | 1 |
| -rw-r--r-- | block/blk-mq.c | 45 |
| -rw-r--r-- | block/blk-mq.h | 57 |
| -rw-r--r-- | block/partitions/ibm.c | 98 |
| -rw-r--r-- | block/sed-opal.c | 18 |
8 files changed, 1465 insertions, 384 deletions
```diff
diff --git a/block/Kconfig b/block/Kconfig
index f1364d1c0d93..55ae2286a4de 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -186,6 +186,7 @@ config BLK_SED_OPAL
 	bool "Logic for interfacing with Opal enabled SEDs"
 	depends on KEYS
 	select PSERIES_PLPKS if PPC_PSERIES
+	select PSERIES_PLPKS_SED if PPC_PSERIES
 	help
 	 Builds Logic for interfacing with Opal enabled controllers.
 	 Enabling this option enables users to setup/unlock/lock
diff --git a/block/badblocks.c b/block/badblocks.c
index 3afb550c0f7b..fc92d4e18aa3 100644
--- a/block/badblocks.c
+++ b/block/badblocks.c
@@ -16,119 +16,830 @@
 #include <linux/types.h>
 #include <linux/slab.h>
 
-/**
- * badblocks_check() - check a given range for bad sectors
- * @bb: the badblocks structure that holds all badblock information
- * @s: sector (start) at which to check for badblocks
- * @sectors: number of sectors to check for badblocks
- * @first_bad: pointer to store location of the first badblock
- * @bad_sectors: pointer to store number of badblocks after @first_bad
+/*
+ * The purpose of badblocks set/clear is to manage bad blocks ranges which are
+ * identified by LBA addresses.
  *
- * We can record which blocks on each device are 'bad' and so just
- * fail those blocks, or that stripe, rather than the whole device.
- * Entries in the bad-block table are 64bits wide.  This comprises:
- * Length of bad-range, in sectors: 0-511 for lengths 1-512
- * Start of bad-range, sector offset, 54 bits (allows 8 exbibytes)
- * A 'shift' can be set so that larger blocks are tracked and
- * consequently larger devices can be covered.
- * 'Acknowledged' flag - 1 bit. - the most significant bit.
+ * When the caller of badblocks_set() wants to set a range of bad blocks, the
+ * setting range can be acked or unacked. The setting range may merge,
+ * overwrite, or skip the overlapped already set range, depending on how they
+ * overlap or are adjacent, and on the acknowledgment type of the ranges. It can
+ * be more complicated when the setting range covers multiple already set bad
+ * block ranges, with the restrictions of the maximum length of each bad range
+ * and the bad table space limitation.
  *
- * Locking of the bad-block table uses a seqlock so badblocks_check
- * might need to retry if it is very unlucky.
- * We will sometimes want to check for bad blocks in a bi_end_io function,
- * so we use the write_seqlock_irq variant.
+ * It is difficult and unnecessary to take care of all the possible situations.
+ * For setting a large range of bad blocks, we can handle it by dividing the
+ * large range into smaller ones when encountering overlap, max range length or
+ * bad table full conditions. Each time only a smaller piece of the bad range
+ * is handled, with a limited number of conditions for how it interacts with
+ * possibly overlapped or adjacent already set bad block ranges. Then the hard
+ * complicated problem can be handled in a much simpler way.
  *
- * When looking for a bad block we specify a range and want to
- * know if any block in the range is bad.  So we binary-search
- * to the last range that starts at-or-before the given endpoint,
- * (or "before the sector after the target range")
- * then see if it ends after the given start.
+ * When setting a range of bad blocks to the bad table, the simplified
+ * situations to be considered are, (The already set bad blocks ranges are
+ * named with prefix E, and the setting bad blocks range is named with
+ * prefix S)
  *
- * Return:
- *  0: there are no known bad blocks in the range
- *  1: there are known bad block which are all acknowledged
- * -1: there are bad blocks which have not yet been acknowledged in metadata.
- * plus the start/length of the first bad section we overlap.
+ * 1) A setting range is not overlapped or adjacent to any other already set
+ *    bad block range.
+ *                         +--------+
+ *                         |    S   |
+ *                         +--------+
+ *        +-------------+            +-------------+
+ *        |      E1     |            |      E2     |
+ *        +-------------+            +-------------+
+ *    For this situation, if the bad blocks table is not full, just allocate a
+ *    free slot from the bad blocks table to mark the setting range S. The
+ *    result is,
+ *        +-------------+ +--------+ +-------------+
+ *        |      E1     | |    S   | |      E2     |
+ *        +-------------+ +--------+ +-------------+
+ * 2) A setting range starts exactly at a start LBA of an already set bad
+ *    blocks range.
+ * 2.1) The setting range size < already set range size
+ *        +--------+
+ *        |    S   |
+ *        +--------+
+ *        +-------------+
+ *        |      E      |
+ *        +-------------+
+ * 2.1.1) If S and E are both acked or both unacked ranges, the setting range S
+ *    can be merged into the existing bad range E. The result is,
+ *        +-------------+
+ *        |      S      |
+ *        +-------------+
+ * 2.1.2) If S is an unacked setting and E is acked, the setting will be denied,
+ *    and the result is,
+ *        +-------------+
+ *        |      E      |
+ *        +-------------+
+ * 2.1.3) If S is an acked setting and E is unacked, range S can overwrite the
+ *    head of E. An extra slot from the bad blocks table will be allocated for
+ *    S, and the head of E will move to the end of the inserted range S. The
+ *    result is,
+ *        +--------+----+
+ *        |    S   |  E |
+ *        +--------+----+
+ * 2.2) The setting range size == already set range size
+ * 2.2.1) If S and E are both acked or both unacked ranges, the setting range S
+ *    can be merged into the existing bad range E. The result is,
+ *        +-------------+
+ *        |      S      |
+ *        +-------------+
+ * 2.2.2) If S is an unacked setting and E is acked, the setting will be denied,
+ *    and the result is,
+ *        +-------------+
+ *        |      E      |
+ *        +-------------+
+ * 2.2.3) If S is an acked setting and E is unacked, range S can overwrite all
+ *    of bad blocks range E. The result is,
+ *        +-------------+
+ *        |      S      |
+ *        +-------------+
+ * 2.3) The setting range size > already set range size
+ *        +-------------------+
+ *        |         S         |
+ *        +-------------------+
+ *        +-------------+
+ *        |      E      |
+ *        +-------------+
+ *    For such a situation, the setting range S can be treated as two parts:
+ *    the first part (S1) is the same size as the already set range E, the
+ *    second part (S2) is the rest of the setting range.
+ *        +-------------+-----+        +-------------+ +-----+
+ *        |      S1     |  S2 |        |      S1     | |  S2 |
+ *        +-------------+-----+  ===>  +-------------+ +-----+
+ *        +-------------+              +-------------+
+ *        |      E      |              |      E      |
+ *        +-------------+              +-------------+
+ *    Now we only focus on how to handle the setting range S1 and the already
+ *    set range E, which are already explained in 2.2); the rest S2 will be
+ *    handled later in the next loop.
+ * 3) A setting range starts before the start LBA of an already set bad blocks
+ *    range.
+ *        +-------------+
+ *        |      S      |
+ *        +-------------+
+ *             +-------------+
+ *             |      E      |
+ *             +-------------+
+ *    For this situation, the setting range S can be divided into two parts:
+ *    the first (S1) ends at the start LBA of the already set range E, the
+ *    second part (S2) starts exactly at the start LBA of the already set
+ *    range E.
+ *        +----+---------+          +----+ +---------+
+ *        | S1 |    S2   |          | S1 | |    S2   |
+ *        +----+---------+  ===>    +----+ +---------+
+ *             +-------------+             +-------------+
+ *             |      E      |             |      E      |
+ *             +-------------+             +-------------+
+ *    Now only the first part S1 should be handled in this loop, which is in a
+ *    similar condition to 1). The rest part S2 has exactly the same start LBA
+ *    address as the already set range E; it will be handled in the next loop
+ *    in one of the situations in 2).
+ * 4) A setting range starts after the start LBA of an already set bad blocks
+ *    range.
+ * 4.1) If the setting range S exactly matches the tail part of the already set
+ *    bad blocks range E, like the following chart shows,
+ *            +---------+
+ *            |    S    |
+ *            +---------+
+ *        +-------------+
+ *        |      E      |
+ *        +-------------+
+ * 4.1.1) If range S and E have the same acknowledge value (both acked or
+ *    unacked), they will be merged into one. The result is,
+ *        +-------------+
+ *        |      S      |
+ *        +-------------+
+ * 4.1.2) If range E is acked and the setting range S is unacked, the setting
+ *    request of S will be rejected. The result is,
+ *        +-------------+
+ *        |      E      |
+ *        +-------------+
+ * 4.1.3) If range E is unacked and the setting range S is acked, then S may
+ *    overwrite the overlapped range of E. The result is,
+ *        +---+---------+
+ *        | E |    S    |
+ *        +---+---------+
+ * 4.2) If the setting range S stays in the middle of an already set range E,
+ *    like the following chart shows,
+ *             +----+
+ *             |  S |
+ *             +----+
+ *        +--------------+
+ *        |       E      |
+ *        +--------------+
+ * 4.2.1) If range S and E have the same acknowledge value (both acked or
+ *    unacked), they will be merged into one. The result is,
+ *        +--------------+
+ *        |       S      |
+ *        +--------------+
+ * 4.2.2) If range E is acked and the setting range S is unacked, the setting
+ *    request of S will be rejected. The result is also,
+ *        +--------------+
+ *        |       E      |
+ *        +--------------+
+ * 4.2.3) If range E is unacked and the setting range S is acked, then S will
+ *    be inserted into the middle of E and split the previous range E into two
+ *    parts (E1 and E2). The result is,
+ *        +----+----+----+
+ *        | E1 |  S | E2 |
+ *        +----+----+----+
+ * 4.3) If the setting bad blocks range S is overlapped with an already set bad
+ *    blocks range E: the range S starts after the start LBA of range E, and
+ *    ends after the end LBA of range E, as the following chart shows,
+ *            +-------------------+
+ *            |         S         |
+ *            +-------------------+
+ *        +-------------+
+ *        |      E      |
+ *        +-------------+
+ *    For this situation the range S can be divided into two parts: the first
+ *    part (S1) ends at the end of range E, and the second part (S2) is the
+ *    rest of the original S.
+ *            +---------+---------+       +---------+ +---------+
+ *            |    S1   |    S2   |       |    S1   | |    S2   |
+ *            +---------+---------+  ===> +---------+ +---------+
+ *        +-------------+              +-------------+
+ *        |      E      |              |      E      |
+ *        +-------------+              +-------------+
+ *    Now in this loop the setting range S1 and the already set range E can be
+ *    handled as situation 4.1); the rest range S2 will be handled in the next
+ *    loop and ignored in this loop.
+ * 5) A setting bad blocks range S is adjacent to one or more already set bad
+ *    blocks range(s), and they are all acked or all unacked ranges.
+ * 5.1) Front merge: the already set bad blocks range E is before the setting
+ *    range S and they are adjacent,
+ *                +------+
+ *                |   S  |
+ *                +------+
+ *        +-------+
+ *        |   E   |
+ *        +-------+
+ * 5.1.1) When the total size of range S and E <= BB_MAX_LEN, and their
+ *    acknowledge values are the same, the setting range S can front merge into
+ *    range E. The result is,
+ *        +--------------+
+ *        |       S      |
+ *        +--------------+
+ * 5.1.2) Otherwise these two ranges cannot merge; just insert the setting
+ *    range S right after the already set range E into the bad blocks table.
+ *    The result is,
+ *        +--------+------+
+ *        |    E   |   S  |
+ *        +--------+------+
+ * 6) Special cases which the above conditions cannot handle
+ * 6.1) Multiple already set ranges may merge into fewer ones in a full bad
+ *    table
+ *        +-------------------------------------------------------+
+ *        |                           S                           |
+ *        +-------------------------------------------------------+
+ *        |<----- BB_MAX_LEN ----->|
+ *        +-----+     +-----+                              +-----+
+ *        | E1  |     | E2  |                              | E3  |
+ *        +-----+     +-----+                              +-----+
+ *    In the above example, when the bad blocks table is full, inserting the
+ *    first part of setting range S will fail because no more available slots
+ *    can be allocated from the bad blocks table. In this situation a proper
+ *    setting method should go through all the setting bad blocks ranges and
+ *    look for chances to merge already set ranges into fewer ones. When there
+ *    are available slots in the bad blocks table, retry to handle as many of
+ *    the remaining setting bad blocks ranges as possible.
+ *        +------------------------+
+ *        |           S3           |
+ *        +------------------------+
+ *        |<----- BB_MAX_LEN ----->|
+ *                                 +-----+-----+-----+---+-----+--+
+ *                                 |       S1        |     S2     |
+ *                                 +-----+-----+-----+---+-----+--+
+ *    The above chart shows that although the first part (S3) cannot be
+ *    inserted due to no space in the bad blocks table, the following E1, E2
+ *    and E3 ranges can be merged with the rest of S into the smaller set of
+ *    ranges S1 and S2. Now there is 1 free slot in the bad blocks table.
+ *        +------------------------+-----+-----+-----+---+-----+--+
+ *        |           S3           |       S1        |     S2     |
+ *        +------------------------+-----+-----+-----+---+-----+--+
+ *    Since the bad blocks table is not full anymore, retry the original
+ *    setting range S. Now the setting range S3 can be inserted into the bad
+ *    blocks table using the slot freed by the multiple-ranges merge.
+ * 6.2) Front merge after overwrite
+ *    In the following example, in the bad blocks table, E1 is an acked bad
+ *    blocks range and E2 is an unacked bad blocks range, therefore they are
+ *    not able to merge into a larger range. The setting bad blocks range S is
+ *    acked, therefore part of E2 can be overwritten by S.
+ *        +--------+
+ *        |    S   |                    acknowledged
+ *        +--------+                 S:       1
+ *        +-------+-------------+    E1:      1
+ *        |   E1  |      E2     |    E2:      0
+ *        +-------+-------------+
+ *    With the previous simplified routines, after overwriting part of E2 with
+ *    S, the bad blocks table should be (E3 is the remaining part of E2 which
+ *    is not overwritten by S),
+ *                                   acknowledged
+ *        +-------+--------+----+    S:       1
+ *        |   E1  |    S   | E3 |    E1:      1
+ *        +-------+--------+----+    E3:      0
+ *    The above result is correct but not perfect. Range E1 and S in the bad
+ *    blocks table are both acked; merging them into one larger range may
+ *    occupy less bad blocks table space and make badblocks_check() faster.
+ *    Therefore in such a situation, after overwriting range S, the previous
+ *    range E1 should be checked for a possible front combination. Then the
+ *    ideal result can be,
+ *                                   acknowledged
+ *        +----------------+----+    E1:      1
+ *        |        E1      | E3 |    E3:      0
+ *        +----------------+----+
+ * 6.3) Behind merge: the already set bad blocks range E is behind the setting
+ *    range S and they are adjacent. Normally we don't need to care about this,
+ *    because front merge handles it while going through range S from head to
+ *    tail, except for the tail part of range S. When the setting range S is
+ *    fully handled, none of the above simplified routines checks whether the
+ *    tail LBA of range S is adjacent to the next already set range, and they
+ *    are not merged even when it is possible.
+ *        +------+
+ *        |   S  |
+ *        +------+
+ *               +-------+
+ *               |   E   |
+ *               +-------+
+ *    For the above special situation, when the setting range S is fully
+ *    handled and the loop ends, an extra check is necessary for whether the
+ *    next already set range E is right after S and mergeable.
+ * 6.3.1) When the total size of range E and S <= BB_MAX_LEN, and their
+ *    acknowledge values are the same, the setting range S can behind merge
+ *    into range E. The result is,
+ *        +--------------+
+ *        |       S      |
+ *        +--------------+
+ * 6.3.2) Otherwise these two ranges cannot merge; just insert the setting
+ *    range S in front of the already set range E in the bad blocks table. The
+ *    result is,
+ *        +------+-------+
+ *        |   S  |   E   |
+ *        +------+-------+
+ *
+ * All the above 5 simplified situations and 3 special cases may cover 99%+ of
+ * the bad block range setting conditions. Maybe some rare corner case is not
+ * considered or optimized, but it won't hurt if badblocks_set() fails due to
+ * no space, or if some ranges are not merged to save bad blocks table space.
+ *
+ * Inside badblocks_set() each loop starts by jumping to the re_insert label;
+ * at the start of every loop prev_badblocks() is called to find an already set
+ * range which starts before or at the current setting range. Since the setting
+ * bad blocks range is handled from head to tail, in most cases it is
+ * unnecessary to do the binary search inside prev_badblocks(); it is possible
+ * to provide a hint to prev_badblocks() for a fast path, so the expensive
+ * binary search can be avoided. In my tests with the hint to prev_badblocks(),
+ * except for the first loop, all remaining calls to prev_badblocks() take the
+ * fast path and return the correct bad blocks table index immediately.
+ *
+ *
+ * Clearing a bad blocks range from the bad block table follows a similar idea
+ * to setting, but is much simpler. The only thing to notice is that when the
+ * clearing range hits the middle of a bad block range, the existing bad block
+ * range will split into two, and one more item should be added into the bad
+ * block table. The simplified situations to be considered are, (The already
+ * set bad blocks ranges in the bad block table are named with prefix E, and
+ * the clearing bad blocks range is named with prefix C)
+ *
+ * 1) A clearing range does not overlap any already set range in the bad block
+ *    table.
+ *        +-----+        |    +-----+         |    +-----+
+ *        |  C  |        |    |  C  |         |    |  C  |
+ *        +-----+        or   +-----+         or   +-----+
+ *            +---+      |  +----+  +----+    |  +---+
+ *            | E |      |  | E1 |  | E2 |    |  | E |
+ *            +---+      |  +----+  +----+    |  +---+
+ *    For the above situations, there is no bad block to be cleared and no
+ *    failure happens; simply return 0.
+ * 2) The clearing range hits the middle of an already set bad blocks range in
+ *    the bad block table.
+ *           +---+
+ *           | C |
+ *           +---+
+ *     +-----------------+
+ *     |         E       |
+ *     +-----------------+
+ *    In this situation, if the bad block table is not full, the range E will
+ *    be split into two ranges E1 and E2. The result is,
+ *     +------+   +------+
+ *     |  E1  |   |  E2  |
+ *     +------+   +------+
+ * 3) The clearing range starts exactly at the same LBA as an already set bad
+ *    block range from the bad block table.
+ * 3.1) Partially covered at the head part
+ *     +------------+
+ *     |      C     |
+ *     +------------+
+ *     +-----------------+
+ *     |         E       |
+ *     +-----------------+
+ *    For this situation, the overlapped already set range will update its
+ *    start LBA to the end of C and shrink the range to BB_LEN(E) - BB_LEN(C).
+ *    No item is deleted from the bad block table. The result is,
+ *                  +----+
+ *                  | E1 |
+ *                  +----+
+ * 3.2) Exactly fully covered
+ *     +-----------------+
+ *     |         C       |
+ *     +-----------------+
+ *     +-----------------+
+ *     |         E       |
+ *     +-----------------+
+ *    For this situation the whole bad blocks range E will be cleared and its
+ *    corresponding item is deleted from the bad block table.
+ * 4) The clearing range exactly ends at the same LBA as an already set bad
+ *    block range.
+ *               +-------+
+ *               |   C   |
+ *               +-------+
+ *     +-----------------+
+ *     |         E       |
+ *     +-----------------+
+ *    For the above situation, the already set range E is updated to shrink its
+ *    end to the start of C, reducing its length to BB_LEN(E) - BB_LEN(C). The
+ *    result is,
+ *     +---------+
+ *     |    E    |
+ *     +---------+
+ * 5) The clearing range is partially overlapped with an already set bad block
+ *    range from the bad block table.
+ * 5.1) The already set bad block range is front overlapped with the clearing
+ *    range.
+ *     +----------+
+ *     |     C    |
+ *     +----------+
+ *          +------------+
+ *          |      E     |
+ *          +------------+
+ *    For such a situation, the clearing range C can be treated as two parts:
+ *    the first part ends at the start LBA of range E, and the second part
+ *    starts at the same LBA as range E.
+ *     +----+-----+           +----+ +-----+
+ *     | C1 |  C2 |           | C1 | |  C2 |
+ *     +----+-----+   ===>    +----+ +-----+
+ *          +------------+          +------------+
+ *          |      E     |          |      E     |
+ *          +------------+          +------------+
+ *    Now the first part C1 can be handled as condition 1), and the second
+ *    part C2 can be handled as condition 3.1) in the next loop.
+ * 5.2) The already set bad block range is behind overlapped with the clearing
+ *    range.
+ *          +----------+
+ *          |     C    |
+ *          +----------+
+ *     +------------+
+ *     |      E     |
+ *     +------------+
+ *    For such a situation, the clearing range C can be treated as two parts:
+ *    the first part C1 ends at the same end LBA as range E, and the second
+ *    part starts at the end LBA of range E.
+ *          +----+-----+          +----+ +-----+
+ *          | C1 |  C2 |          | C1 | |  C2 |
+ *          +----+-----+  ===>    +----+ +-----+
+ *     +------------+        +------------+
+ *     |      E     |        |      E     |
+ *     +------------+        +------------+
+ *    Now the first part clearing range C1 can be handled as condition 4), and
+ *    the second part clearing range C2 can be handled as condition 1) in the
+ *    next loop.
+ *
+ * All bad blocks range clearing can be simplified into the above 5 situations
+ * by only handling the head part of the clearing range in each run of the
+ * while-loop. The idea is similar to bad blocks range setting but much
+ * simpler.
  */
-int badblocks_check(struct badblocks *bb, sector_t s, int sectors,
-		    sector_t *first_bad, int *bad_sectors)
+
+/*
+ * Find the range that starts at-or-before 's' from the bad table. The search
+ * starts from index 'hint' and stops at index 'hint_end' from the bad
+ * table.
+ */
+static int prev_by_hint(struct badblocks *bb, sector_t s, int hint)
 {
-	int hi;
-	int lo;
+	int hint_end = hint + 2;
 	u64 *p = bb->page;
-	int rv;
-	sector_t target = s + sectors;
-	unsigned seq;
+	int ret = -1;
 
-	if (bb->shift > 0) {
-		/* round the start down, and the end up */
-		s >>= bb->shift;
-		target += (1<<bb->shift) - 1;
-		target >>= bb->shift;
+	while ((hint < hint_end) && ((hint + 1) <= bb->count) &&
+	       (BB_OFFSET(p[hint]) <= s)) {
+		if ((hint + 1) == bb->count || BB_OFFSET(p[hint + 1]) > s) {
+			ret = hint;
+			break;
+		}
+		hint++;
+	}
+
+	return ret;
+}
+
+/*
+ * Find the range that starts at-or-before bad->start. If 'hint' is provided
+ * (hint >= 0) then search in the bad table from hint first. It is
+ * very probable that the wanted bad range can be found from the hint index,
+ * so the unnecessary while-loop iterations can be avoided.
+ */
+static int prev_badblocks(struct badblocks *bb, struct badblocks_context *bad,
+			  int hint)
+{
+	sector_t s = bad->start;
+	int ret = -1;
+	int lo, hi;
+	u64 *p;
+
+	if (!bb->count)
+		goto out;
+
+	if (hint >= 0) {
+		ret = prev_by_hint(bb, s, hint);
+		if (ret >= 0)
+			goto out;
 	}
-	/* 'target' is now the first block after the bad range */
 
-retry:
-	seq = read_seqbegin(&bb->lock);
 	lo = 0;
-	rv = 0;
 	hi = bb->count;
+	p = bb->page;
 
-	/* Binary search between lo and hi for 'target'
-	 * i.e. for the last range that starts before 'target'
-	 */
-	/* INVARIANT: ranges before 'lo' and at-or-after 'hi'
-	 * are known not to be the last range before target.
-	 * VARIANT: hi-lo is the number of possible
-	 * ranges, and decreases until it reaches 1
-	 */
+	/* The following bisect search might be unnecessary */
+	if (BB_OFFSET(p[lo]) > s)
+		return -1;
+	if (BB_OFFSET(p[hi - 1]) <= s)
+		return hi - 1;
+
+	/* Do bisect search in bad table */
 	while (hi - lo > 1) {
-		int mid = (lo + hi) / 2;
+		int mid = (lo + hi)/2;
 		sector_t a = BB_OFFSET(p[mid]);
 
-		if (a < target)
-			/* This could still be the one, earlier ranges
-			 * could not.
-			 */
+		if (a == s) {
+			ret = mid;
+			goto out;
+		}
+
+		if (a < s)
 			lo = mid;
 		else
-			/* This and later ranges are definitely out. */
 			hi = mid;
 	}
-	/* 'lo' might be the last that started before target, but 'hi' isn't */
-	if (hi > lo) {
-		/* need to check all range that end after 's' to see if
-		 * any are unacknowledged.
+
+	if (BB_OFFSET(p[lo]) <= s)
+		ret = lo;
+out:
+	return ret;
+}
+
+/*
+ * Return 'true' if the range indicated by 'bad' can be backward merged
+ * with the bad range (from the bad table) indexed by 'behind'.
+ */
+static bool can_merge_behind(struct badblocks *bb,
+			     struct badblocks_context *bad, int behind)
+{
+	sector_t sectors = bad->len;
+	sector_t s = bad->start;
+	u64 *p = bb->page;
+
+	if ((s < BB_OFFSET(p[behind])) &&
+	    ((s + sectors) >= BB_OFFSET(p[behind])) &&
+	    ((BB_END(p[behind]) - s) <= BB_MAX_LEN) &&
+	    BB_ACK(p[behind]) == bad->ack)
+		return true;
+	return false;
+}
+
+/*
+ * Do backward merge for the range indicated by 'bad' and the bad range
+ * (from the bad table) indexed by 'behind'. The return value is the number
+ * of merged sectors from bad->len.
+ */
+static int behind_merge(struct badblocks *bb, struct badblocks_context *bad,
+			int behind)
+{
+	sector_t sectors = bad->len;
+	sector_t s = bad->start;
+	u64 *p = bb->page;
+	int merged = 0;
+
+	WARN_ON(s >= BB_OFFSET(p[behind]));
+	WARN_ON((s + sectors) < BB_OFFSET(p[behind]));
+
+	if (s < BB_OFFSET(p[behind])) {
+		merged = BB_OFFSET(p[behind]) - s;
+		p[behind] = BB_MAKE(s, BB_LEN(p[behind]) + merged, bad->ack);
+
+		WARN_ON((BB_LEN(p[behind]) + merged) >= BB_MAX_LEN);
+	}
+
+	return merged;
+}
+
+/*
+ * Return 'true' if the range indicated by 'bad' can be forward
+ * merged with the bad range (from the bad table) indexed by 'prev'.
+ */
+static bool can_merge_front(struct badblocks *bb, int prev,
+			    struct badblocks_context *bad)
+{
+	sector_t s = bad->start;
+	u64 *p = bb->page;
+
+	if (BB_ACK(p[prev]) == bad->ack &&
+	    (s < BB_END(p[prev]) ||
+	     (s == BB_END(p[prev]) && (BB_LEN(p[prev]) < BB_MAX_LEN))))
+		return true;
+	return false;
+}
+
+/*
+ * Do forward merge for the range indicated by 'bad' and the bad range
+ * (from the bad table) indexed by 'prev'. The return value is the number
+ * of sectors merged from bad->len.
+ */
+static int front_merge(struct badblocks *bb, int prev, struct badblocks_context *bad)
+{
+	sector_t sectors = bad->len;
+	sector_t s = bad->start;
+	u64 *p = bb->page;
+	int merged = 0;
+
+	WARN_ON(s > BB_END(p[prev]));
+
+	if (s < BB_END(p[prev])) {
+		merged = min_t(sector_t, sectors, BB_END(p[prev]) - s);
+	} else {
+		merged = min_t(sector_t, sectors, BB_MAX_LEN - BB_LEN(p[prev]));
+		if ((prev + 1) < bb->count &&
+		    merged > (BB_OFFSET(p[prev + 1]) - BB_END(p[prev]))) {
+			merged = BB_OFFSET(p[prev + 1]) - BB_END(p[prev]);
+		}
+
+		p[prev] = BB_MAKE(BB_OFFSET(p[prev]),
+				  BB_LEN(p[prev]) + merged, bad->ack);
+	}
+
+	return merged;
+}
+
+/*
+ * 'Combine' is a special case which can_merge_front() is not able to
+ * handle: If a bad range (indexed by 'prev' from the bad table) exactly
+ * starts at bad->start, and the bad range ahead of 'prev' (indexed by
+ * 'prev - 1' from the bad table) exactly ends where 'prev' starts, and
+ * the sum of their lengths does not exceed the BB_MAX_LEN limitation,
+ * then these two bad ranges (from the bad table) can be combined.
+ *
+ * Return 'true' if the bad ranges indexed by 'prev' and 'prev - 1' from
+ * the bad table can be combined.
+ */
+static bool can_combine_front(struct badblocks *bb, int prev,
+			      struct badblocks_context *bad)
+{
+	u64 *p = bb->page;
+
+	if ((prev > 0) &&
+	    (BB_OFFSET(p[prev]) == bad->start) &&
+	    (BB_END(p[prev - 1]) == BB_OFFSET(p[prev])) &&
+	    (BB_LEN(p[prev - 1]) + BB_LEN(p[prev]) <= BB_MAX_LEN) &&
+	    (BB_ACK(p[prev - 1]) == BB_ACK(p[prev])))
+		return true;
+	return false;
+}
+
+/*
+ * Combine the bad ranges indexed by 'prev' and 'prev - 1' (from the bad
+ * table) into one larger bad range; the new range is indexed by
+ * 'prev - 1'.
+ * The caller of front_combine() will decrease bb->count, therefore
+ * it is unnecessary to clear p[prev] after the front merge.
+ */
+static void front_combine(struct badblocks *bb, int prev)
+{
+	u64 *p = bb->page;
+
+	p[prev - 1] = BB_MAKE(BB_OFFSET(p[prev - 1]),
+			      BB_LEN(p[prev - 1]) + BB_LEN(p[prev]),
+			      BB_ACK(p[prev]));
+	if ((prev + 1) < bb->count)
+		memmove(p + prev, p + prev + 1, (bb->count - prev - 1) * 8);
+}
+
+/*
+ * Return 'true' if the range indicated by 'bad' is exactly forward
+ * overlapped with the bad range (from the bad table) indexed by 'front'.
+ * Exactly forward overlap means the bad range (from the bad table) indexed
+ * by 'prev' does not cover the whole range indicated by 'bad'.
+ */
+static bool overlap_front(struct badblocks *bb, int front,
+			  struct badblocks_context *bad)
+{
+	u64 *p = bb->page;
+
+	if (bad->start >= BB_OFFSET(p[front]) &&
+	    bad->start < BB_END(p[front]))
+		return true;
+	return false;
+}
+
+/*
+ * Return 'true' if the range indicated by 'bad' is exactly backward
+ * overlapped with the bad range (from the bad table) indexed by 'behind'.
+ */
+static bool overlap_behind(struct badblocks *bb, struct badblocks_context *bad,
+			   int behind)
+{
+	u64 *p = bb->page;
+
+	if (bad->start < BB_OFFSET(p[behind]) &&
+	    (bad->start + bad->len) > BB_OFFSET(p[behind]))
+		return true;
+	return false;
+}
+
+/*
+ * Return 'true' if the range indicated by 'bad' can overwrite the bad
+ * range (from the bad table) indexed by 'prev'.
+ *
+ * The range indicated by 'bad' can overwrite the bad range indexed by
+ * 'prev' when,
+ * 1) The whole range indicated by 'bad' can cover part or all of the bad
+ *    range (from the bad table) indexed by 'prev'.
+ * 2) The ack value of 'bad' is larger than or equal to the ack value of
+ *    bad range 'prev'.
+ *
+ * If the overwriting doesn't cover the whole bad range (from the bad table)
+ * indexed by 'prev', new ranges might be split from the existing bad range,
+ * 1) The overwrite covers the head or tail part of the existing bad range:
+ *    1 extra bad range will be split and added into the bad table.
+ * 2) The overwrite covers the middle of the existing bad range: 2 extra bad
+ *    ranges will be split (ahead of and after the overwritten range) and
+ *    added into the bad table.
+ * The number of extra split ranges of the overwriting is stored in
+ * 'extra' and returned for the caller.
+ */
+static bool can_front_overwrite(struct badblocks *bb, int prev,
+				struct badblocks_context *bad, int *extra)
+{
+	u64 *p = bb->page;
+	int len;
+
+	WARN_ON(!overlap_front(bb, prev, bad));
+
+	if (BB_ACK(p[prev]) >= bad->ack)
+		return false;
+
+	if (BB_END(p[prev]) <= (bad->start + bad->len)) {
+		len = BB_END(p[prev]) - bad->start;
+		if (BB_OFFSET(p[prev]) == bad->start)
+			*extra = 0;
+		else
+			*extra = 1;
+
+		bad->len = len;
```
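For reference, the removed kernel-doc at the top of the excerpt spells out the packed entry format: 9 bits of length (0-511 encoding lengths 1-512), a 54-bit start offset, and the acknowledged flag in the most significant bit. Below is a minimal standalone sketch of that packing; the macro names mirror the kernel's BB_MAKE()/BB_OFFSET()/BB_LEN()/BB_ACK() helpers, but the self-test harness around them is illustrative only:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors the layout described above: 1 ack bit | 54-bit start | 9-bit (len-1). */
#define BB_LEN_MASK    (0x00000000000001FFULL)
#define BB_OFFSET_MASK (0x7FFFFFFFFFFFFE00ULL)
#define BB_ACK_MASK    (0x8000000000000000ULL)
#define BB_MAX_LEN     512
#define BB_OFFSET(x)   (((x) & BB_OFFSET_MASK) >> 9)
#define BB_LEN(x)      (((x) & BB_LEN_MASK) + 1)
#define BB_ACK(x)      (!!((x) & BB_ACK_MASK))
#define BB_END(x)      (BB_OFFSET(x) + BB_LEN(x))
#define BB_MAKE(a, l, ack) \
	(((uint64_t)(a) << 9) | ((l) - 1) | ((uint64_t)(!!(ack)) << 63))

int main(void)
{
	/* A 512-sector acked bad range starting at LBA 123456. */
	uint64_t e = BB_MAKE(123456ULL, BB_MAX_LEN, 1);

	assert(BB_OFFSET(e) == 123456ULL);
	assert(BB_LEN(e) == BB_MAX_LEN);
	assert(BB_ACK(e) == 1);
	printf("start=%llu len=%llu end=%llu ack=%d\n",
	       (unsigned long long)BB_OFFSET(e),
	       (unsigned long long)BB_LEN(e),
	       (unsigned long long)BB_END(e), BB_ACK(e));
	return 0;
}
```

The length field storing `len - 1` is why a single entry can never describe more than BB_MAX_LEN (512) sectors, which in turn forces the divide-and-iterate loop described in the comment above.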
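The long comment's setting-side strategy reduces to one loop shape: locate the entry at-or-before the head of S (ideally via the hint fast path), handle only the head piece of S against that single neighbour (insert, clip, or merge), then advance and repeat. The sketch below restates that control flow over a simplified userspace table of (start, len, ack) triples; all names are invented for illustration, and the ack-based overwrite rules (cases 2.x/4.x) are deliberately collapsed into a skip:

```c
#include <stdint.h>

#define MAX_LEN   512	/* cap per entry, like the kernel's BB_MAX_LEN */
#define TBL_SLOTS 64	/* illustrative fixed table size */

struct range { uint64_t start, len; int ack; };
struct table { struct range r[TBL_SLOTS]; int count; };

/* Last entry starting at-or-before 's', or -1. 'hint' seeds the scan the
 * way prev_by_hint() seeds prev_badblocks() in the excerpt above. */
static int prev_entry(const struct table *t, uint64_t s, int hint)
{
	int i = (hint >= 0 && hint < t->count) ? hint : 0;

	if (t->count == 0 || t->r[0].start > s)
		return -1;
	while (i + 1 < t->count && t->r[i + 1].start <= s)
		i++;
	while (i > 0 && t->r[i].start > s)
		i--;
	return i;
}

/* Divide-and-iterate skeleton of the setting loop described above. */
static int set_range(struct table *t, uint64_t s, uint64_t len, int ack)
{
	int hint = -1;

	while (len > 0) {
		int prev = prev_entry(t, s, hint);
		uint64_t piece = len < MAX_LEN ? len : MAX_LEN;

		if (prev >= 0 && s < t->r[prev].start + t->r[prev].len) {
			/* Head of S overlaps 'prev' (cases 2.x/4.x): the
			 * kernel merges or overwrites by ack; this sketch
			 * just skips the overlap and loops on the rest. */
			uint64_t skip = t->r[prev].start + t->r[prev].len - s;

			if (skip > len)
				skip = len;
			s += skip;
			len -= skip;
			hint = prev;
			continue;
		}
		/* Clip the head piece at the next existing entry (case 3). */
		if (prev + 1 < t->count && s + piece > t->r[prev + 1].start)
			piece = t->r[prev + 1].start - s;

		if (prev >= 0 && t->r[prev].ack == ack &&
		    s == t->r[prev].start + t->r[prev].len &&
		    t->r[prev].len + piece <= MAX_LEN) {
			/* Adjacent with same ack: front merge (case 5.1.1). */
			t->r[prev].len += piece;
		} else {
			/* Otherwise take a fresh slot after 'prev' (case 1). */
			if (t->count == TBL_SLOTS)
				return -1; /* full; kernel retries after merging */
			for (int i = t->count; i > prev + 1; i--)
				t->r[i] = t->r[i - 1];
			t->r[prev + 1] = (struct range){ s, piece, ack };
			t->count++;
		}
		hint = prev + 1;
		s += piece;
		len -= piece;
	}
	return 0;
}
```

Because each iteration finishes next to where the previous one ended, feeding the last index back in as `hint` lets `prev_entry()` land on the right slot without a full search; that is the fast path the comment measures for prev_badblocks().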
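On the clearing side, the only case that consumes an extra slot is case 2, where the cleared span lands strictly inside an existing range and forces a split into E1 and E2. A small illustrative sketch of just that split, using the same simplified table idea (restated here so the snippet is self-contained):

```c
#include <stdint.h>

#define TBL_SLOTS 64	/* illustrative fixed table size */

struct range { uint64_t start, len; int ack; };
struct table { struct range r[TBL_SLOTS]; int count; };

/* Clearing case 2: [s, s + len) lies strictly inside entry 'i', so E splits
 * into E1 | cleared gap | E2 and one extra slot is consumed. */
static int clear_middle(struct table *t, int i, uint64_t s, uint64_t len)
{
	uint64_t e_start = t->r[i].start;
	uint64_t e_end = e_start + t->r[i].len;

	if (!(e_start < s && s + len < e_end))
		return -1;		/* not the strictly-inside case */
	if (t->count == TBL_SLOTS)
		return -1;		/* no free slot available for E2 */

	for (int j = t->count; j > i + 1; j--)	/* open a slot for E2 */
		t->r[j] = t->r[j - 1];
	t->r[i + 1] = (struct range){ s + len, e_end - (s + len), t->r[i].ack };
	t->r[i].len = s - e_start;	/* shrink the original entry to E1 */
	t->count++;
	return 0;
}
```

Every other clearing case (head, tail, exact, or partial overlap) only shrinks or deletes an entry, which is why the comment calls clearing "much simpler" than setting.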
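Finally, the can_front_overwrite() rules at the end of the excerpt amount to counting the surviving pieces of the overwritten entry: zero extra slots when the whole entry is covered, one when a head or tail piece survives, two when the hit is in the middle. A compact standalone restatement (illustrative, and assuming the overwrite starts inside the entry, as overlap_front() guarantees in the kernel code):

```c
#include <stdint.h>

/* Extra table slots needed to overwrite [s, s + len) onto an existing entry
 * [e_start, e_end), assuming e_start <= s < e_end (front overlap). */
static int extra_split_slots(uint64_t s, uint64_t len,
			     uint64_t e_start, uint64_t e_end)
{
	uint64_t end = s + len;

	if (end > e_end)
		end = e_end;	/* only the overlapped part counts */

	if (s == e_start && end == e_end)
		return 0;	/* whole entry overwritten in place */
	if (s == e_start || end == e_end)
		return 1;	/* a head or a tail piece of E survives */
	return 2;		/* middle hit: both pieces survive */
}
```

This count is what the truncated can_front_overwrite() stores through its `*extra` out-parameter, so the caller can verify the table has room before committing to the overwrite.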
