linux.git/block, branch v3.14.31

genhd: check for int overflow in disk_expand_part_tbl()

2015-01-16T14:59:32+00:00

commit 5fabcb4c33fe11c7e3afdf805fde26c1a54d0953 upstream.

We can get here from blkdev_ioctl() -> blkpg_ioctl() -> add_partition()
with a user passed in partno value. If we pass in 0x7fffffff, the
new target in disk_expand_part_tbl() overflows the 'int' and we
access beyond the end of ptbl->part[] and even write to it when we
do the rcu_assign_pointer() to assign the new partition.

Reported-by: David Ramos 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

blk-mq: use 'nr_cpu_ids' as highest CPU ID count for hwq <-> cpu map

2015-01-16T14:59:31+00:00

commit a33c1ba2913802b6fb23e974bb2f6a4e73c8b7ce upstream.

We currently use num_possible_cpus(), but that breaks on sparc64 where
the CPU ID space is discontig. Use nr_cpu_ids as the highest CPU ID
instead, so we don't end up reading from invalid memory.

Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

scsi: Fix error handling in SCSI_IOCTL_SEND_COMMAND

2014-11-14T17:00:08+00:00

commit 84ce0f0e94ac97217398b3b69c21c7a62ebeed05 upstream.

When sg_scsi_ioctl() fails to prepare request to submit in
blk_rq_map_kern() we jump to a label where we just end up copying
(luckily zeroed-out) kernel buffer to userspace instead of reporting
error. Fix the problem by jumping to the right label.

CC: Jens Axboe 
CC: linux-scsi@vger.kernel.org
Coverity-id: 1226871
Signed-off-by: Jan Kara 
Signed-off-by: Greg Kroah-Hartman 

Fixed up the, now unused, out label.

Signed-off-by: Jens Axboe

block: fix alignment_offset math that assumes io_min is a power-of-2

2014-11-14T16:59:51+00:00

commit b8839b8c55f3fdd60dc36abcda7e0266aff7985c upstream.

The math in both blk_stack_limits() and queue_limit_alignment_offset()
assume that a block device's io_min (aka minimum_io_size) is always a
power-of-2.  Fix the math such that it works for non-power-of-2 io_min.

This issue (of alignment_offset != 0) became apparent when testing
dm-thinp with a thinp blocksize that matches a RAID6 stripesize of
1280K.  Commit fdfb4c8c1 ("dm thin: set minimum_io_size to pool's data
block size") unlocked the potential for alignment_offset != 0 due to
the dm-thin-pool's io_min possibly being a non-power-of-2.

Signed-off-by: Mike Snitzer 
Acked-by: Martin K. Petersen 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

partitions: aix.c: off by one bug

2014-10-05T21:52:24+00:00

commit d97a86c170b4e432f76db072a827fe30b4d6f659 upstream.

The lvip[] array has "state->limit" elements so the condition here
should be >= instead of >.

Fixes: 6ceea22bbbc8 ('partitions: add aix lvm partition support files')
Signed-off-by: Dan Carpenter 
Acked-by: Philippe De Muyter 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

genhd: fix leftover might_sleep() in blk_free_devt()

2014-10-05T21:52:20+00:00

commit 46f341ffcfb5d8530f7d1e60f3be06cce6661b62 upstream.

Commit 2da78092 changed the locking from a mutex to a spinlock,
so we now longer sleep in this context. But there was a leftover
might_sleep() in there, which now triggers since we do the final
free from an RCU callback. Get rid of it.

Reported-by: Pontus Fuchs 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

block: Fix dev_t minor allocation lifetime

2014-10-05T21:52:19+00:00

commit 2da78092dda13f1efd26edbbf99a567776913750 upstream.

Releases the dev_t minor when all references are closed to prevent
another device from acquiring the same major/minor.

Since the partition's release may be invoked from call_rcu's soft-irq
context, the ext_dev_idr's mutex had to be replaced with a spinlock so
as not so sleep.

Signed-off-by: Keith Busch 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

cfq-iosched: Fix wrong children_weight calculation

2014-10-05T21:52:11+00:00

commit e15693ef18e13e3e6bffe891fe140f18b8ff6d07 upstream.

cfq_group_service_tree_add() is applying new_weight at the beginning of
the function via cfq_update_group_weight().
This actually allows weight to change between adding it to and subtracting
it from children_weight, and triggers WARN_ON_ONCE() in
cfq_group_service_tree_del(), or even causes oops by divide error during
vfr calculation in cfq_group_service_tree_add().

The detailed scenario is as follows:
1. Create blkio cgroups X and Y as a child of X.
   Set X's weight to 500 and perform some I/O to apply new_weight.
   This X's I/O completes before starting Y's I/O.
2. Y starts I/O and cfq_group_service_tree_add() is called with Y.
3. cfq_group_service_tree_add() walks up the tree during children_weight
   calculation and adds parent X's weight (500) to children_weight of root.
   children_weight becomes 500.
4. Set X's weight to 1000.
5. X starts I/O and cfq_group_service_tree_add() is called with X.
6. cfq_group_service_tree_add() applies its new_weight (1000).
7. I/O of Y completes and cfq_group_service_tree_del() is called with Y.
8. I/O of X completes and cfq_group_service_tree_del() is called with X.
9. cfq_group_service_tree_del() subtracts X's weight (1000) from
   children_weight of root. children_weight becomes -500.
   This triggers WARN_ON_ONCE().
10. Set X's weight to 500.
11. X starts I/O and cfq_group_service_tree_add() is called with X.
12. cfq_group_service_tree_add() applies its new_weight (500) and adds it
    to children_weight of root. children_weight becomes 0. Calcularion of
    vfr triggers oops by divide error.

weight should be updated right before adding it to children_weight.

Reported-by: Ruki Sekiya 
Signed-off-by: Toshiaki Makita 
Acked-by: Tejun Heo 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

blkcg: don't call into policy draining if root_blkg is already gone

2014-07-31T19:52:55+00:00

commit 0b462c89e31f7eb6789713437eb551833ee16ff3 upstream.

While a queue is being destroyed, all the blkgs are destroyed and its
->root_blkg pointer is set to NULL.  If someone else starts to drain
while the queue is in this state, the following oops happens.

  NULL pointer dereference at 0000000000000028
  IP: [] blk_throtl_drain+0x84/0x230
  PGD e4a1067 PUD b773067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
  CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
  RIP: 0010:[]  [] blk_throtl_drain+0x84/0x230
  RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
  R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
  R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
  FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
  Stack:
   ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
   ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
   ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
  Call Trace:
   [] blkcg_drain_queue+0x1f/0x60
   [] __blk_drain_queue+0x71/0x180
   [] blk_queue_bypass_start+0x6e/0xb0
   [] blkcg_deactivate_policy+0x38/0x120
   [] blk_throtl_exit+0x34/0x50
   [] blkcg_exit_queue+0x35/0x40
   [] blk_release_queue+0x26/0xd0
   [] kobject_cleanup+0x38/0x70
   [] kobject_put+0x28/0x60
   [] blk_put_queue+0x15/0x20
   [] scsi_device_dev_release_usercontext+0x16b/0x1c0
   [] execute_in_process_context+0x89/0xa0
   [] scsi_device_dev_release+0x1c/0x20
   [] device_release+0x32/0xa0
   [] kobject_cleanup+0x38/0x70
   [] kobject_put+0x28/0x60
   [] put_device+0x17/0x20
   [] __scsi_remove_device+0xa9/0xe0
   [] scsi_remove_device+0x2b/0x40
   [] sdev_store_delete+0x27/0x30
   [] dev_attr_store+0x18/0x30
   [] sysfs_kf_write+0x3e/0x50
   [] kernfs_fop_write+0xe7/0x170
   [] vfs_write+0xaf/0x1d0
   [] SyS_write+0x4d/0xc0
   [] system_call_fastpath+0x16/0x1b

776687bce42b ("block, blk-mq: draining can't be skipped even if
bypass_depth was non-zero") made it easier to trigger this bug by
making blk_queue_bypass_start() drain even when it loses the first
bypass test to blk_cleanup_queue(); however, the bug has always been
there even before the commit as blk_queue_bypass_start() could race
against queue destruction, win the initial bypass test but perform the
actual draining after blk_cleanup_queue() already destroyed all blkgs.

Fix it by skippping calling into policy draining if all the blkgs are
already gone.

Signed-off-by: Tejun Heo 
Reported-by: Shirish Pargaonkar 
Reported-by: Sasha Levin 
Reported-by: Jet Chen 
Tested-by: Shirish Pargaonkar 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

block: don't assume last put of shared tags is for the host

2014-07-31T19:52:54+00:00

commit d45b3279a5a2252cafcd665bbf2db8c9b31ef783 upstream.

There is no inherent reason why the last put of a tag structure must be
the one for the Scsi_Host, as device model objects can be held for
arbitrary periods.  Merge blk_free_tags and __blk_free_tags into a single
funtion that just release a references and get rid of the BUG() when the
host reference wasn't the last.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman