Since targets are given a virtual target device, it is necessary to
translate all communication between targets and the backend device.
Implement the translation layer for get/set bad block table.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
For target-specific operations, pass nvm_tgt_dev instead of the generic
nvm device.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Target devices do not have access to the device driver operations.
Introduce a helper function that exposes the max. number of physical
sectors supported by the underlying device.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Avoid calling media manager and device-specific operations directly from
rrpc. Create helper functions on lightnvm's core instead.
Signed-off-by: Javier González <javier@cnexlabs.com>
Made it work with null_blk as well.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
In order to naturally support multi-target instances on an Open-Channel
SSD, targets should own the LUNs they get blocks from and manage
provisioning internally. This is done in several steps.
Since targets own the LUNs they are instantiated on top of and manage the
free block list internally, there is no need for a LUN abstraction in
the media manager. LUNs are intrinsically managed as in the physical
layout (ch:0,lun:0, ..., ch:0,lun:n, ch:1,lun:0, ch:1,lun:n, ...,
ch:m,lun:0, ch:m,lun:n) and given to the targets based on the target
creation ioctl. This simplifies LUN management and clears the path for a
partition manager to sit directly underneath LightNVM targets.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
In order to naturally support multi-target instances on an Open-Channel
SSD, targets should own the LUNs they get blocks from and manage
provisioning internally. This is done in several steps.
A part of this transformation is that targets manage their blocks
internally. This patch eliminates the nvm_block abstraction and moves
block management to the target logic. The rrpc target is transformed.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Since LUNs are managed internally on targets, the media manager has no
access to the free LUN lists. Thus, debug functions that show LUN
information on the device should not be implemented on the media
manager, but rather on the target itself.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Since LUNs are managed internally on the target, there is no need for
the media manager to implement a get_lun operation.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
In order to naturally support multi-target instances on an Open-Channel
SSD, targets should own the LUNs they get blocks from and manage
provisioning internally. This is done in several steps.
This patch moves the block provisioning inside of the target and removes
the get/put block interface from the media manager.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
LUNs are exclusively owned by targets implementing a block device FTL.
Doing this reservation requires at the moment a 2-way callback gennvm
<-> target. The reason behind this is that LUNs were not assumed to
always be exclusively owned by targets. However, this design decision
goes against I/O determinism QoS (two targets would mix I/O on the same
parallel unit in the device).
This patch makes LUN reservation part of target creation in the
media manager. This ensures that LUNs are always exclusively owned by the
target instantiated on top of them. LUN striping and/or sharing should
be implemented on the target itself or the layers on top.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The gen_lun abstraction in the generic media manager was conceived on
the assumption that a single target would be instantiated on top of it.
This has complicated target design to implement multi-instances. Remove
this abstraction and move its logic to nvm_lun, which manages physical
lun geometry and operations.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Targets are assumed to use the same generic ppa format, where the
address is partitioned on ch:lun:block:pg:pl:sec. Thus, make the
function in charge of transforming the ppa address from a linear format
to the generic one available to all targets.
This function will be needed by the media manager in order to do target
mapping translations when targets are divided on different physical
partitions.
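As a rough illustration of the linear-to-generic transformation, the
sketch below decodes a linear sector index into ch:lun:block:pg:pl:sec
fields. The struct names, field names and the division order (sector
varies fastest, channel slowest) are assumptions made for this example
and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry; the field names are illustrative only. */
struct sketch_geo {
	uint32_t nr_luns;   /* LUNs per channel */
	uint32_t nr_blks;   /* blocks per LUN */
	uint32_t nr_pgs;    /* pages per block */
	uint32_t nr_planes; /* planes per page */
	uint32_t nr_secs;   /* sectors per plane page */
};

/* Generic address split into ch:lun:block:pg:pl:sec. */
struct sketch_generic_ppa {
	uint32_t ch, lun, blk, pg, pl, sec;
};

/* Peel off one field at a time, starting with the fastest-varying one. */
static struct sketch_generic_ppa
sketch_linear_to_generic(const struct sketch_geo *g, uint64_t addr)
{
	struct sketch_generic_ppa p;

	assert(g && g->nr_secs && g->nr_planes && g->nr_pgs &&
	       g->nr_blks && g->nr_luns);

	p.sec = addr % g->nr_secs;   addr /= g->nr_secs;
	p.pl  = addr % g->nr_planes; addr /= g->nr_planes;
	p.pg  = addr % g->nr_pgs;    addr /= g->nr_pgs;
	p.blk = addr % g->nr_blks;   addr /= g->nr_blks;
	p.lun = addr % g->nr_luns;   addr /= g->nr_luns;
	p.ch  = (uint32_t)addr;      /* whatever remains is the channel */

	return p;
}
```

A media manager that partitions the device could apply a decode of this
kind before adding a per-target channel or LUN offset.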
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add ECC error codes to enable the appropriate handling in the target.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Bad blocks should be managed by block owners. This would be either
targets for data blocks or sysblk for system blocks.
In order to support this, export two functions: One to mark a block as
a specific type (e.g., bad block) and another to update the bad block
table on the device.
Move bad block management to rrpc.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Erases might be subject to host hints. An example is multi-plane
programming to erase blocks in parallel. Enable targets to specify this
hint.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Previously, LBA read and write were not supported in the lightnvm
specification. Now that it supports them, let's use the traditional
NVMe gendisk, and attach the lightnvm sysfs geometry export.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
For a host to access an Open-Channel SSD, it has to know its geometry,
so that it writes and reads at the appropriate device bounds.
Currently, the geometry information is kept within the kernel, and not
exported to user-space for consumption. This patch exposes the
configuration through sysfs and enables user-space libraries, such as
liblightnvm, to use the sysfs implementation to get the geometry of an
Open-Channel SSD.
The sysfs entries are stored within the device hierarchy, and can be
found using the "lightnvm" device type.
An example configuration looks like this:
/sys/class/nvme/
└── nvme0n1
    ├── capabilities: 3
    ├── device_mode: 1
    ├── erase_max: 1000000
    ├── erase_typ: 1000000
    ├── flash_media_type: 0
    ├── media_capabilities: 0x00000001
    ├── media_type: 0
    ├── multiplane: 0x00010101
    ├── num_blocks: 1022
    ├── num_channels: 1
    ├── num_luns: 4
    ├── num_pages: 64
    ├── num_planes: 1
    ├── page_size: 4096
    ├── prog_max: 100000
    ├── prog_typ: 100000
    ├── read_max: 10000
    ├── read_typ: 10000
    ├── sector_oob_size: 0
    ├── sector_size: 4096
    ├── media_manager: gennvm
    ├── ppa_format: 0x380830082808001010102008
    ├── vendor_opcode: 0
    ├── max_phys_secs: 64
    └── version: 1
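To show how a user-space consumer might use these attributes, the
sketch below computes a raw capacity from the geometry values. The
struct and function names are hypothetical, and the interpretation of
the counters (num_luns per channel, num_blocks per plane, page_size per
plane page) is an assumption, not something the sysfs export
guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for values read from the sysfs attributes. */
struct sketch_sysfs_geo {
	uint64_t num_channels;
	uint64_t num_luns;   /* assumed: LUNs per channel */
	uint64_t num_planes;
	uint64_t num_blocks; /* assumed: blocks per plane */
	uint64_t num_pages;  /* pages per block */
	uint64_t page_size;  /* bytes per page */
};

/* Raw capacity in bytes under the assumptions stated above. */
static uint64_t sketch_geo_capacity(const struct sketch_sysfs_geo *g)
{
	assert(g);
	return g->num_channels * g->num_luns * g->num_planes *
	       g->num_blocks * g->num_pages * g->page_size;
}
```

With the example values above (1 channel, 4 LUNs, 1 plane, 1022 blocks,
64 pages, 4096-byte pages) this yields 1071644672 bytes.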
Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
LightNVM compatible device drivers do not have a method to expose
LightNVM specific sysfs entries.
To enable LightNVM sysfs entries to be exposed, lightnvm device
drivers require a struct device to attach it to. To allow both the
actual device driver and lightnvm sysfs entries to coexist, the device
driver tracks the lifetime of the nvm_dev structure.
This patch refactors NVMe and null_blk to handle the lifetime of struct
nvm_dev, which eliminates the need for struct gendisk when a lightnvm
compatible device is provided.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The ppa list passed by reference to nvm_set_rqd_list() is updated when
multiple planes are available. In that case, each PPA plane is
incremented when the device side PPA list is created. This prevents the
caller from relying on the PPA list being unmodified after a call.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The [get/put]_blk API enables targets to get ownership of blocks at
runtime. This information is currently not recorded on disk, and the
information is therefore lost on power failure. To restore the
metadata, the [get/put]_blk must persist its metadata. In that case,
we need to control the outer lock, so that we can disable them while
updating the on-disk metadata. Fortunately, the _unlocked versions can
be removed, which allows us to move the lock into the [get/put]_blk
functions.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
To enable persistent block management to easily control creation and
removal of targets, we move target management into the media
manager. The LightNVM core continues to maintain which target types are
registered, while the media manager now keeps track of its initialized
targets.
Two new callbacks for the media manager are introduced: create_tgt and
remove_tgt. Note that remove_tgt returns 0 on successfully removing a
target, and returns 1 if the target was not found.
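The return convention can be sketched with a small user-space stand-in.
The table structure and the sketch_* names below are hypothetical and
only mirror the 0/1 convention described above. They are not the
media manager's actual data structures or callback signatures.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_TGTS 8
#define SKETCH_NAME_LEN 32

/* Hypothetical per-media-manager table of initialized targets. */
struct sketch_tgt_table {
	char names[SKETCH_MAX_TGTS][SKETCH_NAME_LEN];
	int used[SKETCH_MAX_TGTS];
};

/* Returns 0 on success, -1 if the name is taken, too long, or no slot is free. */
static int sketch_create_tgt(struct sketch_tgt_table *t, const char *name)
{
	int i, slot = -1;

	assert(t && name);
	if (strlen(name) >= SKETCH_NAME_LEN)
		return -1;

	for (i = 0; i < SKETCH_MAX_TGTS; i++) {
		if (t->used[i] && strcmp(t->names[i], name) == 0)
			return -1;
		if (!t->used[i] && slot < 0)
			slot = i;
	}
	if (slot < 0)
		return -1;

	strcpy(t->names[slot], name);
	t->used[slot] = 1;
	return 0;
}

/* Returns 0 when the target was removed and 1 when it was not found. */
static int sketch_remove_tgt(struct sketch_tgt_table *t, const char *name)
{
	int i;

	assert(t && name);
	for (i = 0; i < SKETCH_MAX_TGTS; i++) {
		if (t->used[i] && strcmp(t->names[i], name) == 0) {
			t->used[i] = 0;
			return 0;
		}
	}
	return 1;
}
```

A return value of 1 lets a caller that walks several media managers
continue the search on the next device instead of treating the miss as
an error.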
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The responsibility of the media manager is not to keep track of
open/closed blocks. This is better maintained within a target,
that already manages this information on writes.
Remove the statistics and merge the states NVM_BLK_ST_OPEN and
NVM_BLK_ST_CLOSED.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The ->reserved bit is not initialized when allocated on stack.
This may lead targets to misinterpret the PPA as cached.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Expose media manager mark_blk() to targets, as done for the rest of the
media manager callback functions.
Signed-off-by: Javier González <javier@cnexlabs.com>
Updated description
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The nvm_dev->max_pages_per_blk variable was removed in favor of the new
nvm->sec_per_blk variable. The ->max_pages_per_blk variable was still
used in rrpc_capacity, which caused the reserved capacity to be reported
as zero. Replace it with ->sec_per_blk to calculate the reserved area
again.
Signed-off-by: Javier González <javier@cnexlabs.com>
Updated patch description. Was "lightnvm: eliminate redundant variable"
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The number of ppas contained in a request is not necessarily the number
of pages that it maps to, either on the target or on the device side.
In order to avoid confusion, rename nr_pages to nr_ppas since it is what
the variable actually contains.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
A target requires a method to identify PPAs that are either cached in
memory or on disk. This can efficiently be maintained within the PPA.
The target host-side translation table can then look up a PPA and know
from the PPA if it is cached or on disk. In the case it is cached, it is
the responsibility of the target to maintain this cache.
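One way to picture this is a flag bit carried inside the 64-bit
address. In the sketch below the flag position (bit 63) and all helper
names are assumptions chosen for illustration. The actual bit used by
the kernel may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of the "cached" flag inside a 64-bit PPA. */
#define SKETCH_PPA_CACHED (UINT64_C(1) << 63)

/* Build an entry that points at a slot in the target's in-memory cache. */
static inline uint64_t sketch_ppa_mk_cached(uint64_t cache_slot)
{
	assert(!(cache_slot & SKETCH_PPA_CACHED));
	return cache_slot | SKETCH_PPA_CACHED;
}

/* Build an entry that points at an on-disk address. */
static inline uint64_t sketch_ppa_mk_disk(uint64_t dev_addr)
{
	assert(!(dev_addr & SKETCH_PPA_CACHED));
	return dev_addr;
}

/* Non-zero if the entry refers to the cache rather than the device. */
static inline int sketch_ppa_is_cached(uint64_t ppa)
{
	return (ppa & SKETCH_PPA_CACHED) != 0;
}

/* Cache slot or device address with the flag stripped. */
static inline uint64_t sketch_ppa_payload(uint64_t ppa)
{
	return ppa & ~SKETCH_PPA_CACHED;
}
```

Because the flag travels with the address, a translation-table lookup
needs no second structure to decide where the data lives.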
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Targets can update a block state when having a reference to an
in-memory virtual block. In the case that a target does not keep the
block metadata in memory, it does not have a way to update this
structure.
Therefore, expose gennvm_mark_blk() through the media manager's
->mark_blk() callback and let targets update the state structure through
this callback.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Targets associated with a device manager are not freed on device
removal. They have to be manually removed before shutdown. Make sure
any outstanding targets are freed upon shutdown.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Until now, the dma pool has been exclusively used to allocate the ppa
list being sent to the device. In pblk (upcoming), we use these pools to
allocate metadata too. Thus, we generalize the names of some variables
on the dma helper functions to make the code more readable.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Enable metadata buffer to be sent to the device through the metadata
field on the physical rw nvme command. The size of the metadata buffer
must follow dev->oob_size * # of PPAs.
Signed-off-by: Javier González <javier@cnexlabs.com>
Updated description.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The set_bb_tbl takes struct nvm_rq and only uses its ppa_list and
nr_pages internally. Instead, make these two variables explicit.
This allows a user to call it without initializing a struct nvm_rq
first.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
A virtual block enables a block to identify multiple physical blocks.
This is useful for metadata where the device media supports multiple
planes. In that case, a block with multiple planes can be managed
as a single vblk, reducing the metadata required to as little as one
fourth.
nvm_set_rqd_ppalist() takes care of expanding a ppa_list with vblks
automatically. However, for some use-cases, where only a single physical
block is required, the ppa_list should not be expanded.
Therefore, add a vblk parameter to nvm_set_rqd_ppalist(), and only
expand the ppa_list if vblk is set.
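The effect of the vblk parameter can be sketched as follows. This is a
simplified user-space stand-in, not nvm_set_rqd_ppalist() itself: the
struct, the function name and the output ordering (all planes of one
block before the next block) are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced address: only block and plane are modelled. */
struct sketch_ppa {
	unsigned blk;
	unsigned pl;
};

/*
 * Copy n addresses from src to dst. With vblk set, every source entry is
 * expanded into one entry per plane; otherwise it is copied unchanged.
 * Returns the number of entries written, or 0 if dst is too small.
 */
static size_t sketch_set_ppalist(struct sketch_ppa *dst, size_t dst_len,
				 const struct sketch_ppa *src, size_t n,
				 unsigned nr_planes, int vblk)
{
	size_t per_entry = vblk ? nr_planes : 1;
	size_t i, out = 0;
	unsigned pl;

	assert(dst && src && nr_planes > 0);
	if (n > dst_len / per_entry)
		return 0;

	for (i = 0; i < n; i++) {
		if (!vblk) {
			dst[out++] = src[i];
			continue;
		}
		for (pl = 0; pl < nr_planes; pl++) {
			dst[out] = src[i];
			dst[out].pl = pl;
			out++;
		}
	}
	return out;
}
```

Writing into a separate destination list also leaves the caller's
source list untouched.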
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The device ops->get_bb_tbl() takes a callback, that allows the caller
to use its own callback function to update its data structures in the
returning function.
This makes it difficult to send parameters to the callback, and is
usually circumvented by small private structures that carry both the
caller's state and any flags needed to fulfill the update.
Refactor ops->get_bb_tbl() to fill a data buffer with the status of the
blocks returned, and let the user call the callback function manually.
That will provide the necessary flags and data structures and simplify
the logic around ops->get_bb_tbl().
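The resulting calling pattern might look like the sketch below. The
driver function is a stand-in that fabricates a table, and the state
encoding, struct and function names are all hypothetical. Only the
shape of the interaction (fill a buffer, then let the caller walk it
with its own state) reflects the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLK_FREE 0x0 /* assumed state encoding */
#define SKETCH_BLK_BAD  0x1

/* Stand-in for a driver's get_bb_tbl: one state byte per block. */
static int sketch_get_bb_tbl(uint8_t *blks, size_t nr_blks)
{
	assert(blks);
	memset(blks, SKETCH_BLK_FREE, nr_blks);
	if (nr_blks > 2)
		blks[2] = SKETCH_BLK_BAD; /* pretend block 2 is bad */
	return 0;
}

/* Caller-private state; no wrapper struct has to be passed to the driver. */
struct sketch_bb_stats {
	size_t nr_bad;
};

/* The caller's own update function, invoked manually per block. */
static void sketch_update_blk(struct sketch_bb_stats *st, size_t blk_id,
			      uint8_t state)
{
	(void)blk_id;
	if (state != SKETCH_BLK_FREE)
		st->nr_bad++;
}

/* Fetch the table into buf, then walk it with the caller's state. */
static int sketch_scan_lun(struct sketch_bb_stats *st, uint8_t *buf,
			   size_t nr_blks)
{
	size_t i;
	int ret;

	assert(st && buf);
	ret = sketch_get_bb_tbl(buf, nr_blks);
	if (ret)
		return ret;

	for (i = 0; i < nr_blks; i++)
		sketch_update_blk(st, i, buf[i]);
	return 0;
}
```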
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Users that wish to iterate over all luns on a device must create a
struct ppa_addr and separate iterators for channels and luns. To set the
iterators, two loops are required, one to iterate over channels and
another to iterate over luns. This reduces readability.
Introduce nvm_for_each_lun_ppa, which implements the nested loop and
sets the ppa, channel, and lun variables for each loop iteration, eliminating
the boilerplate code.
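A macro of this kind could be sketched as follows. The structs are
reduced stand-ins and the macro is a hypothetical reimplementation for
illustration, not the definition of nvm_for_each_lun_ppa from the
kernel headers.

```c
#include <assert.h>

/* Reduced stand-ins for the device geometry and the address. */
struct sketch_dev {
	int nr_chnls;
	int luns_per_chnl;
};

struct sketch_ppa {
	int ch;
	int lun;
};

/* Nested loop over channels and luns that keeps ppa in step with both. */
#define sketch_for_each_lun_ppa(dev, ppa, chid, lunid)			\
	for ((chid) = 0; (chid) < (dev)->nr_chnls; (chid)++)		\
		for ((lunid) = 0, (ppa).ch = (chid), (ppa).lun = 0;	\
		     (lunid) < (dev)->luns_per_chnl;			\
		     (lunid)++, (ppa).lun = (lunid))

/* Example user: visit every lun once and count the visits. */
static int sketch_count_luns(const struct sketch_dev *dev)
{
	struct sketch_ppa ppa;
	int ch, lun, n = 0;

	assert(dev);
	sketch_for_each_lun_ppa(dev, ppa, ch, lun) {
		assert(ppa.ch == ch && ppa.lun == lun);
		n++;
	}
	return n;
}
```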
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
A target name must be unique. However, a per-device registration of
targets is maintained on a dev->online_targets list, with a per-device
search for targets upon registration.
This results in a name collision when two targets, with the same name,
are created on two different devices, where the per-device list is not
shared.
Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The functions nvm_register_target(), nvm_unregister_target() and
associated list refers to a target type that is being registered by a
target type module. Rename nvm_*_targets() to nvm_*_tgt_type(), so that
the intention is clear.
This enables target instances to use the _nvm_*_targets() naming.
Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The get block table command returns a list of blocks and planes
with their associated state. Users, such as gennvm and sysblk,
manage all planes as a single virtual block.
It was therefore natural to fold the bad block list before it is
returned. However, to allow users that manage blocks on a per-plane
level to also use the interface, the get_bb_tbl interface is
changed to not fold by default and instead let the caller fold if
necessary.
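A caller-side fold could look like the sketch below. It assumes that
the per-plane entries of one block are adjacent in the table and that
a state of 0 means "good". Both are assumptions for illustration, as is
the function name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fold a per-plane table of nr_blks entries in place so that each block is
 * represented by a single entry. A block takes the first non-zero state
 * found among its planes. Returns the folded entry count, or -1 if nr_blks
 * is not a multiple of nr_planes.
 */
static int sketch_bb_tbl_fold(uint8_t *blks, int nr_blks, int nr_planes)
{
	int blk, pl, folded;

	assert(blks && nr_planes > 0);
	if (nr_blks < 0 || nr_blks % nr_planes != 0)
		return -1;

	folded = nr_blks / nr_planes;
	for (blk = 0; blk < folded; blk++) {
		uint8_t state = 0;

		for (pl = 0; pl < nr_planes; pl++) {
			uint8_t s = blks[blk * nr_planes + pl];

			if (s != 0) {
				state = s;
				break;
			}
		}
		/* Safe in place: the write index never passes the read index. */
		blks[blk] = state;
	}
	return folded;
}
```

Users that track blocks per plane simply skip the fold and consume the
raw table.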
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The flash page size (fpg) and size across planes (pfpg) are convenient
to know when allocating buffer sizes. These have previously been
calculated in various places. Replace with the pre-calculated values.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The nvm_submit_ppa function assumes that users manage all plane
blocks as a single block. Extend the API with nvm_submit_ppa_list
to allow the user to send its own ppa list. If the user submits more
than a single PPA, the user must take care to allocate and free
the corresponding ppa list.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
PPAs sent to the device are separately acknowledged in a 64-bit status
variable. The status is stored in DW0 and DW1 of the completion queue
entry. Store this status inside the nvm_rq for further processing.
This can later be used to implement retry techniques for failed writes
and reads.
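As an illustration of how such a status could be consumed, the sketch
below combines the two dwords and tests a per-PPA bit. It assumes that
DW0 holds the low 32 bits, that bit i corresponds to the i-th PPA of
the request, and that a set bit signals a failure. The function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the two completion dwords; DW0 is assumed to be the low half. */
static uint64_t sketch_ppa_status(uint32_t dw0, uint32_t dw1)
{
	return ((uint64_t)dw1 << 32) | dw0;
}

/* Non-zero if the PPA at position idx is flagged (assumed: 1 = failed). */
static int sketch_ppa_failed(uint64_t status, unsigned idx)
{
	assert(idx < 64);
	return (int)((status >> idx) & 1);
}
```

A retry path could walk the bits of the stored status and resubmit only
the flagged PPAs.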
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add a bitmap of luns to indicate the status
of luns: in use/available. When creating targets,
do the necessary check to avoid allocating luns
that are already allocated.
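The check can be sketched with a single 64-bit word as the bitmap. The
function names, the 64-lun limit and the inclusive [begin, end] range
are assumptions for this example, and the kernel's bitmap helpers are
deliberately not used so that the sketch stays self-contained.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_LUNS 64

/*
 * Reserve luns begin..end (inclusive). Returns 0 on success, or -1 if the
 * range is invalid or any lun in it is already in use. On failure the
 * bitmap is left unchanged.
 */
static int sketch_reserve_luns(uint64_t *lun_map, unsigned begin, unsigned end)
{
	unsigned i;

	assert(lun_map);
	if (begin > end || end >= SKETCH_MAX_LUNS)
		return -1;

	for (i = begin; i <= end; i++)
		if (*lun_map & (UINT64_C(1) << i))
			return -1;

	for (i = begin; i <= end; i++)
		*lun_map |= UINT64_C(1) << i;
	return 0;
}

/* Mark luns begin..end (inclusive) as available again. */
static void sketch_release_luns(uint64_t *lun_map, unsigned begin, unsigned end)
{
	unsigned i;

	assert(lun_map && begin <= end && end < SKETCH_MAX_LUNS);
	for (i = begin; i <= end; i++)
		*lun_map &= ~(UINT64_C(1) << i);
}
```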
Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Freed dev->lun_map if nvm_core_init later failed in the init process.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
We can create more than one target on a lightnvm
device by specifying its begin lun and end lun.
But specifying only the physical address area is not
enough; we also need to get the corresponding
non-intersecting logical address area division from
the backend device's logical address space.
Otherwise, the targets on the device might use
the same logical addresses, causing incorrect
information in the device's l2p table.
Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Pull block driver updates from Jens Axboe:
"This is the block driver pull request for this merge window. It sits
on top of for-4.6/core, that was just sent out.
This contains:
- A set of fixes for lightnvm. One from Alan, fixing an overflow,
  and the rest from the usual suspects, Javier and Matias.
- A set of fixes for nbd from Markus and Dan, and a fixup from Arnd
  for correct usage of the signed 64-bit divider.
- A set of bug fixes for the Micron mtip32xx, from Asai.
- A fix for the brd discard handling from Bart.
- Update the maintainers entry for cciss, since that hardware has
  transferred ownership.
- Three bug fixes for bcache from Eric Wheeler.
- Set of fixes for xen-blk{back,front} from Jan and Konrad.
- Removal of the cpqarray driver. It has been disabled in Kconfig
  since 2013, and we were initially scheduled to remove it in 3.15.
- Various updates and fixes for NVMe, with the most important being:
    - Removal of the per-device NVMe thread, replacing that with a
      watchdog timer instead. From Christoph.
    - Exposing the namespace WWID through sysfs, from Keith.
    - Set of cleanups from Ming Lin.
    - Logging the controller device name instead of the underlying
      PCI device name, from Sagi.
    - And a bunch of fixes and optimizations from the usual suspects
      in this area"
* 'for-4.6/drivers' of git://git.kernel.dk/linux-block: (49 commits)
NVMe: Expose ns wwid through single sysfs entry
drivers:block: cpqarray clean up
brd: Fix discard request processing
cpqarray: remove it from the kernel
cciss: update MAINTAINERS
NVMe: Remove unused sq_head read in completion path
bcache: fix cache_set_flush() NULL pointer dereference on OOM
bcache: cleaned up error handling around register_cache()
bcache: fix race of writeback thread starting before complete initialization
NVMe: Create discard zero quirk white list
nbd: use correct div_s64 helper
mtip32xx: remove unneeded variable in mtip_cmd_timeout()
lightnvm: generalize rrpc ppa calculations
lightnvm: remove struct nvm_dev->total_blocks
lightnvm: rename ->nr_pages to ->nr_sects
lightnvm: update closed list outside of intr context
xen/blback: Fit the important information of the thread in 17 characters
lightnvm: fold get bb tbl when using dual/quad plane mode
lightnvm: fix up nonsensical configure overrun checking
xen-blkback: advertise indirect segment support earlier
...
The struct rrpc->nr_pages can easily be interpreted as the number of
flash pages allocated to rrpc, while it is actually the number of
sectors. Make sure that this is reflected in the variable name.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
When the media manager runs in dual or quad plane mode, lightnvm
abstracts away plane specific commands. This poses a problem for
get bad block table, as it reports bad blocks per plane, making the
table either two or four times bigger than expected. Fold the bad block
list before returning.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
System blocks allow the device to initialize with its configured media
manager. The system blocks are written to disk, and read again when the
media manager is determined. For this to work, the backend must store
the data. Device drivers, such as null_blk, do not have any backend
storage. This patch allows the media manager to be initialized without a
storage backend.
It also fixes the incorrect configuration of capabilities in null_blk,
as it does not support the get/set bad block interface.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
null_blk defines an empty version of this ops structure if CONFIG_NVM
isn't set, but it doesn't know the type. Move those bits out of the
protection of CONFIG_NVM in the main lightnvm include.
Signed-off-by: Jens Axboe <axboe@fb.com>
Now that a device can be managed using the system blocks, a method to
reset the device is necessary as well. This patch introduces logic to
reset the device easily to factory state and exposes it through an
ioctl.
The ioctl takes the following flags:
NVM_FACTORY_ERASE_ONLY_USER
By default, all blocks except host-reserved blocks are erased upon
factory reset. Instead of this, only erase host-reserved blocks.
NVM_FACTORY_RESET_HOST_BLKS
Mark host-reserved blocks to be erased and set their type to free.
NVM_FACTORY_RESET_GRWN_BBLKS
Mark "grown bad blocks" to be erased and set their type to free.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>