linux.git/drivers/nvme, branch v4.14.124

nvme-loop: init nvmet_ctrl fatal_err_work when allocate

2019-05-08T05:20:47+00:00

[ Upstream commit d11de63f2b519f0a162b834013b6d3a46dbf3886 ]

After commit 4d43d395fe (workqueue: Try to catch flush_work() without
INIT_WORK()), deleting an nvme-loop device can trigger a warning, with a
trace like:

[   76.601272] Call Trace:
[   76.601646]  ? del_timer+0x72/0xa0
[   76.602156]  __cancel_work_timer+0x1ae/0x270
[   76.602791]  cancel_work_sync+0x14/0x20
[   76.603407]  nvmet_ctrl_free+0x1b7/0x2f0 [nvmet]
[   76.604091]  ? free_percpu+0x168/0x300
[   76.604652]  nvmet_sq_destroy+0x106/0x240 [nvmet]
[   76.605346]  nvme_loop_destroy_admin_queue+0x30/0x60 [nvme_loop]
[   76.606220]  nvme_loop_shutdown_ctrl+0xc3/0xf0 [nvme_loop]
[   76.607026]  nvme_loop_delete_ctrl_host+0x19/0x30 [nvme_loop]
[   76.607871]  nvme_do_delete_ctrl+0x75/0xb0
[   76.608477]  nvme_sysfs_delete+0x7d/0xc0
[   76.609057]  dev_attr_store+0x24/0x40
[   76.609603]  sysfs_kf_write+0x4c/0x60
[   76.610144]  kernfs_fop_write+0x19a/0x260
[   76.610742]  __vfs_write+0x1c/0x60
[   76.611246]  vfs_write+0xfa/0x280
[   76.611739]  ksys_write+0x6e/0x120
[   76.612238]  __x64_sys_write+0x1e/0x30
[   76.612787]  do_syscall_64+0xbf/0x3a0
[   76.613329]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

We fix it by moving the fatal_err_work init to nvmet_alloc_ctrl(), which
may be more reasonable.
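
A minimal userspace sketch of the idea, not the actual patch: every name
below (sketch_work, sketch_alloc_ctrl, and so on) is hypothetical and only
models the rule that a work item initialized at allocation time is always
safe to cancel at free time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-in for a work item; not the kernel's struct work_struct. */
struct sketch_work {
	bool initialized;
	bool pending;
};

struct sketch_ctrl {
	struct sketch_work fatal_err_work;
};

void sketch_init_work(struct sketch_work *w)
{
	w->initialized = true;
	w->pending = false;
}

/* Returns false where the kernel would warn: cancel on a never-initialized item. */
bool sketch_cancel_work_sync(struct sketch_work *w)
{
	if (!w->initialized)
		return false;
	w->pending = false;
	return true;
}

/* Initialize the work item at allocation so every teardown path sees it. */
struct sketch_ctrl *sketch_alloc_ctrl(void)
{
	struct sketch_ctrl *ctrl = calloc(1, sizeof(*ctrl));

	if (!ctrl)
		return NULL;
	sketch_init_work(&ctrl->fatal_err_work);
	return ctrl;
}

/* Returns true if teardown was clean (no cancel on uninitialized work). */
bool sketch_free_ctrl(struct sketch_ctrl *ctrl)
{
	bool ok;

	assert(ctrl != NULL);
	ok = sketch_cancel_work_sync(&ctrl->fatal_err_work);
	free(ctrl);
	return ok;
}
```

In this model a controller obtained from sketch_alloc_ctrl() always tears
down cleanly, while one that skipped the init step is reported as a bad
cancel.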

Signed-off-by: Yufen Yu 
Reviewed-by: Sagi Grimberg 
Reviewed-by: Bart Van Assche 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin

nvme-pci: use the same attributes when freeing host_mem_desc_bufs.

2019-02-20T09:20:50+00:00

[ Upstream commit cc667f6d5de023ee131e96bb88e5cddca23272bd ]

When using HMB the PCIe host driver allocates host_mem_desc_bufs using
dma_alloc_attrs() but frees them using dma_free_coherent(). Use the
correct dma_free_attrs() function to free the buffers.
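
The pairing rule can be illustrated with a small userspace model. This is
a sketch under assumptions, not driver code: the sketch_* helpers and the
SKETCH_ATTR_NO_MAPPING flag are hypothetical, and the "coherent" free is
modeled as a free with no attributes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_ATTR_NO_MAPPING 0x1UL	/* hypothetical attribute flag */

struct sketch_buf {
	void *mem;
	unsigned long attrs;	/* attributes the buffer was allocated with */
};

bool sketch_alloc_attrs(struct sketch_buf *b, size_t size, unsigned long attrs)
{
	b->mem = malloc(size);
	b->attrs = attrs;
	return b->mem != NULL;
}

/* Frees only when the caller passes the attributes used at allocation. */
bool sketch_free_attrs(struct sketch_buf *b, unsigned long attrs)
{
	assert(b != NULL);
	if (!b->mem || b->attrs != attrs)
		return false;
	free(b->mem);
	b->mem = NULL;
	return true;
}

/* Modeled as "free with no attributes", so it mismatches any non-zero attrs. */
bool sketch_free_coherent(struct sketch_buf *b)
{
	return sketch_free_attrs(b, 0);
}
```

In this model the mismatched free is rejected and the buffer is only
released by the call that repeats the allocation attributes.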

Signed-off-by: Liviu Dudau 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Sasha Levin

nvmet-rdma: fix null dereference under heavy load

2019-01-31T07:13:47+00:00

commit 5cbab6303b4791a3e6713dfe2c5fda6a867f9adc upstream.

Under heavy load, if we don't have any pre-allocated rsps left, we
dynamically allocate a rsp, but we are not actually allocating memory
for nvme_completion (rsp->req.rsp). In such a case, accessing pointer
fields (req->rsp->status) in nvmet_req_init() will result in a crash.

To fix this, allocate the memory for nvme_completion by calling
nvmet_rdma_alloc_rsp().
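
A userspace sketch of the pattern, assuming hypothetical sketch_* names:
the dynamically allocated response must go through the same nested
allocation step as the pre-allocated ones, so the completion pointer is
never left NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_completion {
	unsigned short status;
};

struct sketch_rsp {
	struct sketch_completion *cqe;	/* plays the role of rsp->req.rsp */
	bool allocated;			/* true for dynamically allocated rsps */
};

/* Allocates the nested completion; returns 0 on success, -1 on failure. */
int sketch_alloc_rsp(struct sketch_rsp *r)
{
	r->cqe = calloc(1, sizeof(*r->cqe));
	return r->cqe ? 0 : -1;
}

void sketch_free_rsp(struct sketch_rsp *r)
{
	free(r->cqe);
	r->cqe = NULL;
}

/* Dynamic path: allocate the container and the nested completion together. */
struct sketch_rsp *sketch_get_dynamic_rsp(void)
{
	struct sketch_rsp *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	if (sketch_alloc_rsp(r)) {
		free(r);
		return NULL;
	}
	r->allocated = true;
	return r;
}

void sketch_put_dynamic_rsp(struct sketch_rsp *r)
{
	assert(r != NULL && r->allocated);
	sketch_free_rsp(r);
	free(r);
}
```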

Fixes: 8407879c ("nvmet-rdma: fix possible bogus dereference under heavy load")

Reviewed-by: Max Gurtovoy 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Raju Rangoju 
Signed-off-by: Sagi Grimberg 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

nvmet-rdma: Add unlikely for response allocated check

2019-01-31T07:13:47+00:00

commit ad1f824948e4ed886529219cf7cd717d078c630d upstream.

Signed-off-by: Israel Rukshin 
Reviewed-by: Sagi Grimberg 
Reviewed-by: Max Gurtovoy 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Jens Axboe 
Cc: Raju Rangoju
Signed-off-by: Greg Kroah-Hartman

nvmet-rdma: fix response use after free

2018-12-21T13:13:18+00:00

[ Upstream commit d7dcdf9d4e15189ecfda24cc87339a3425448d5c ]

nvmet_rdma_release_rsp() may free the response before it is used in the
error flow.
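
A generic sketch of how this class of bug is usually avoided, assuming
hypothetical sketch_* names rather than the actual driver code: read
whatever the error path needs out of the response before releasing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_queue {
	int error_count;
};

struct sketch_rsp {
	struct sketch_queue *queue;
};

/* May free the response; the caller must not touch rsp afterwards. */
void sketch_release_rsp(struct sketch_rsp *rsp)
{
	free(rsp);
}

void sketch_send_done(struct sketch_rsp *rsp, bool failed)
{
	struct sketch_queue *queue;

	assert(rsp != NULL && rsp->queue != NULL);
	queue = rsp->queue;		/* read before the release */
	sketch_release_rsp(rsp);
	if (failed)
		queue->error_count++;	/* uses the saved pointer, not rsp->queue */
}
```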

Fixes: 8407879 ("nvmet-rdma: fix possible bogus dereference under heavy load")
Signed-off-by: Israel Rukshin 
Reviewed-by: Sagi Grimberg 
Reviewed-by: Max Gurtovoy 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Sasha Levin

nvme: flush namespace scanning work just before removing namespaces

2018-12-17T08:28:53+00:00

[ Upstream commit f6c8e432cb0479255322c5d0335b9f1699a0270c ]

nvme_stop_ctrl can also be called in the reset flow, where there is no
need to flush the scan_work because namespaces are not being removed.
Flushing there can cause a deadlock in the rdma, fc and loop drivers,
since nvme_stop_ctrl barriers before controller teardown (and
specifically I/O cancellation of the scan_work itself) takes place. The
scan_work will be blocked anyway, so there is no need to flush it.

Instead, move the scan_work flush to nvme_remove_namespaces(), where it
really needs to be flushed.

Reported-by: Ming Lei 
Signed-off-by: Sagi Grimberg 
Reviewed-by: Keith Busch 
Reviewed-by: James Smart 
Tested-by: Ewan D. Milne 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Sasha Levin

nvme-loop: fix kernel oops in case of unhandled command

2018-11-21T08:24:17+00:00

commit 11d9ea6f2ca69237d35d6c55755beba3e006b106 upstream.

When nvmet_req_init() fails, __nvmet_req_complete() is called
to handle the target request via .queue_response(), so
nvme_loop_queue_response() shouldn't be called again for
handling the failure.

This patch fixes this case in the following way:

- move blk_mq_start_request() before nvmet_req_init(), so
nvme_loop_queue_response() may work well to complete this
host request

- don't call nvme_cleanup_cmd() which is done in nvme_loop_complete_rq()

- don't call nvme_loop_queue_response() which is done via
.queue_response()
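
The resulting control flow can be sketched in userspace as follows. The
sketch_* names are hypothetical stand-ins; the point shown is that the
request is started before init, and that a failed init has already
completed the request exactly once.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_req {
	bool started;
	int completions;
};

/* Completion callback; a request must be started before it is completed. */
void sketch_queue_response(struct sketch_req *r)
{
	assert(r->started);
	r->completions++;
}

/* On failure, init completes the request itself through the callback. */
bool sketch_req_init(struct sketch_req *r, bool supported)
{
	if (!supported) {
		sketch_queue_response(r);
		return false;
	}
	return true;
}

int sketch_queue_rq(struct sketch_req *r, bool supported)
{
	r->started = true;	/* start first, so a failure completion is valid */
	if (!sketch_req_init(r, supported))
		return 0;	/* already completed by init; do not complete again */
	sketch_queue_response(r);	/* stand-in for executing the command */
	return 0;
}
```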

Signed-off-by: Ming Lei 
Reviewed-by: Christoph Hellwig 
[trimmed changelog]
Signed-off-by: Keith Busch 
Signed-off-by: Jens Axboe 
Signed-off-by: Sudip Mukherjee 
Signed-off-by: Greg Kroah-Hartman

nvme_fc: fix ctrl create failures racing with workq items

2018-10-13T07:27:28+00:00

commit cf25809bec2c7df4b45df5b2196845d9a4a3c89b upstream.

If there are errors during initial controller create, the transport
will tear down the partially initialized controller struct and free
the ctrl memory.  Trouble is - most of those errors can occur due
to asynchronous events such as I/O timeouts and subsystem
connectivity failures. Those failures invoke async workq items to
reset the controller and attempt reconnect.  Those may be in progress
as the main thread frees the ctrl memory, resulting in NULL ptr oops.

Prevent this from happening by having the main ctrl failure thread
change state to DELETING and then synchronously cancel any
pending queued work item. The change of state will prevent the
scheduling of resets or reconnect events.
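
A single-threaded userspace sketch of the ordering, using hypothetical
sketch_* names: the state change closes the door on new work, and the
cancel drains what was already queued, so the free that follows cannot
race with a work item.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_state {
	SKETCH_CONNECTING,
	SKETCH_LIVE,
	SKETCH_DELETING,
};

struct sketch_ctrl {
	enum sketch_state state;
	bool reset_pending;	/* stand-in for a queued reset/reconnect item */
};

/* Async error paths call this; it refuses once the ctrl is being deleted. */
bool sketch_schedule_reset(struct sketch_ctrl *c)
{
	if (c->state == SKETCH_DELETING)
		return false;
	c->reset_pending = true;
	return true;
}

void sketch_cancel_reset_sync(struct sketch_ctrl *c)
{
	c->reset_pending = false;
}

/* Create-failure path: block new work first, then drain queued work. */
void sketch_fail_create(struct sketch_ctrl *c)
{
	assert(c->state != SKETCH_DELETING);	/* teardown runs once */
	c->state = SKETCH_DELETING;
	sketch_cancel_reset_sync(c);
	/* only after this point would it be safe to free the ctrl */
}
```

Doing the two steps in the opposite order would leave a window in which
a new reset could be queued after the cancel.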

Signed-off-by: James Smart 
Signed-off-by: Keith Busch 
Signed-off-by: Jens Axboe 
Signed-off-by: Amit Pundir 
Signed-off-by: Greg Kroah-Hartman

nvmet-rdma: fix possible bogus dereference under heavy load

2018-10-10T06:54:24+00:00

[ Upstream commit 8407879c4e0d7731f6e7e905893cecf61a7762c7 ]

Currently we always repost the recv buffer before we send a response
capsule back to the host. Since ordering is not guaranteed for send
and recv completions, it is possible that we will receive a new request
from the host before we get a send completion for the response capsule.

Today, we pre-allocate twice as many rsps as the length of the queue,
but in reality, under heavy load nothing really prevents the gap from
expanding until we exhaust all our rsps.

To fix this, if we don't have any pre-allocated rsps left, we dynamically
allocate a rsp and make sure to free it when we are done. If under memory
pressure we fail to allocate a rsp, we silently drop the command and
wait for the host to retry.
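
A userspace sketch of the pool-with-fallback scheme, with hypothetical
sketch_* names and a deliberately tiny pool: pre-allocated entries are
preferred, a dynamically allocated entry is used when the pool is empty
and freed on release, and a NULL return is the "drop the command" case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_POOL_SIZE 2	/* tiny on purpose, to make exhaustion easy */

struct sketch_rsp {
	bool allocated;		/* true if it came from the fallback path */
	bool in_use;
};

struct sketch_queue {
	struct sketch_rsp pool[SKETCH_POOL_SIZE];
};

void sketch_queue_init(struct sketch_queue *q)
{
	for (size_t i = 0; i < SKETCH_POOL_SIZE; i++) {
		q->pool[i].allocated = false;
		q->pool[i].in_use = false;
	}
}

struct sketch_rsp *sketch_get_rsp(struct sketch_queue *q)
{
	struct sketch_rsp *rsp;

	for (size_t i = 0; i < SKETCH_POOL_SIZE; i++) {
		if (!q->pool[i].in_use) {
			q->pool[i].in_use = true;
			return &q->pool[i];
		}
	}
	rsp = calloc(1, sizeof(*rsp));	/* pool exhausted: fall back */
	if (!rsp)
		return NULL;		/* caller drops the command */
	rsp->allocated = true;
	rsp->in_use = true;
	return rsp;
}

void sketch_put_rsp(struct sketch_rsp *rsp)
{
	assert(rsp->in_use);
	if (rsp->allocated) {
		free(rsp);		/* dynamic entries are freed, not pooled */
		return;
	}
	rsp->in_use = false;
}
```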

Reported-by: Steve Wise 
Tested-by: Steve Wise 
Signed-off-by: Sagi Grimberg 
[hch: dropped a superfluous assignment]
Signed-off-by: Christoph Hellwig 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

nvme-fcloop: Fix dropped LS's to removed target port

2018-10-04T00:00:59+00:00

[ Upstream commit afd299ca996929f4f98ac20da0044c0cdc124879 ]

When a targetport is removed from the config, fcloop will avoid calling
the LS done() routine thinking the targetport is gone. This leaves the
initiator reset/reconnect hanging as it waits for a status on the
Create_Association LS for the reconnect.

Change the filter in the LS callback path. If tport is null (set when
validation failed before "sending to remote port"), be sure to call
done. This was the main bug. But keep the existing logic that skips
done only if tport was set but there is no remoteport (e.g. the case
where the remoteport has been removed, thus the host doesn't expect a
completion).
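
One reading of the filter, sketched in userspace with hypothetical
sketch_* names. It takes the parenthetical above to mean that done is
skipped only when tport is set and its remoteport is gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_tport {
	void *remoteport;	/* NULL once the remote port has been removed */
};

struct sketch_ls_req {
	struct sketch_tport *tport;	/* NULL if validation failed before sending */
	int done_calls;
};

/* done is called if there is no tport, or if the tport still has a remoteport. */
bool sketch_should_call_done(const struct sketch_ls_req *req)
{
	assert(req != NULL);
	return !req->tport || req->tport->remoteport;
}

void sketch_ls_done_work(struct sketch_ls_req *req)
{
	if (sketch_should_call_done(req))
		req->done_calls++;	/* stand-in for invoking the LS done() routine */
}
```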

Signed-off-by: James Smart 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman