linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-08-26 19:00:25 +00:00

Author	SHA1	Message	Date
Javier González	766c8ceb16	lightnvm: pblk: guarantee that backpointer is respected on writer stall pblk's write buffer must guarantee that it respects the device's constrains for reads (i.e., mw_cunits). This is done by maintaining a backpointer that updates the L2P table as entries wrap up, making them point to the media instead of pointing to the write buffer. This mechanism can race in case that the write thread stalls, as the write pointer will protect the last written entry, thus disregarding the read constrains. This patch adds an extra check on wrap up, making sure that the threshold is respected at all times, preventing new entries to overwrite committed data, also in case of write thread stall. Reported-by: Heiner Litz <hlitz@ucsc.edu> Signed-off-by: Javier González <javier@cnexlabs.com> Reviewed-by: Heiner Litz <hlitz@ucsc.edu> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:08 -06:00
Javier González	9bd1f875c0	lightnvm: pblk: move ring buffer alloc/free rb init pblk's read/write buffer currently takes a buffer and its size and uses it to create the metadata around it to use it as a ring buffer. This puts the responsibility of allocating/freeing ring buffer memory on the ring buffer user. Instead, move it inside of the ring buffer helpers (pblk-rb.c). This simplifies creation/destruction routines. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:08 -06:00
Javier González	40b8657dcc	lightnvm: pblk: encapsulate rb pointer operations pblk's read/write buffer is always a power-of-2, thus wrapping up the buffer can be done with a bit mask. Since this is an implementation detail internal to the write buffer, make a helper that hides pointer increment + wrap, and allows to transparently relax this assumption in the future. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:08 -06:00
Javier González	dde4aac20b	lightnvm: pblk: remove unused function Removed unused function in pblk-rb.c Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:08 -06:00
Javier González	02a1520d56	lightnvm: pblk: add SPDX license tag Add GLP-2.0 SPDX license tag to all pblk files Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:08 -06:00
Javier González	253babc3f6	lightnvm: pblk: take write semaphore on metadata pblk guarantees write ordering at a chunk level through a per open chunk semaphore. At this point, since we only have an open I/O stream for both user and GC data, the semaphore is per parallel unit. For the metadata I/O that is synchronous, the semaphore is not needed as ordering is guaranteed. However, if the metadata scheme changes or multiple streams are open, this guarantee might not be preserved. This patch makes sure that all writes go through the semaphore, even for synchronous I/O. This is consistent with pblk's write I/O model. It also simplifies maintenance since changes in the metadata scheme could cause ordering issues. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:08 -06:00
Javier González	af3fac1664	lightnvm: pblk: refactor metadata paths pblk maintains two different metadata paths for smeta and emeta, which store metadata at the start of the line and at the end of the line, respectively. Until now, these path has been common for writing and retrieving metadata, however, as these paths diverge, the common code becomes less clear and unnecessary complicated. In preparation for further changes to the metadata write path, this patch separates the write and read paths for smeta and emeta and removes the synchronous emeta path as it not used anymore (emeta is scheduled asynchronously to prevent jittering due to internal I/Os). Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Javier González	45dcf29b98	lightnvm: pblk: encapsulate rqd dma allocations dma allocations for ppa_list and meta_list in rqd are replicated in several places across the pblk codebase. Make helpers to encapsulate creation and deletion to simplify the code. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Javier González	63dee3a6c3	lightnvm: pblk: calculate line pad distance in helper If a line is padded, calculate the pad distance directly on the helper being used for this purpose. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Javier González	7f985f9a69	lightnvm: move ppa transformations to core Continuing the effort of moving 1.2 and 2.0 specific code to core, move 64_to_32 and 32_to_64 ppa helpers from pblk to core. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Hans Holmberg	4209c31c0c	lightnvm: pblk: add tracing for chunk resets Trace state of chunk resets. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Hans Holmberg	4c44abf43d	lightnvm: pblk: add trace events for chunk states Introduce trace points for tracking chunk states in pblk - this is useful for inspection of the entire state of the drive, and real handy for both fw and pblk debugging. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Matias Bjørling	43241cfe47	lightnvm: pblk: remove debug from pblk_[down/up]_page Remove the debug only iteration within __pblk_down_page, which then allows us to reduce the number of arguments down to pblk and the parallel unit from the functions that calls it. Simplifying the callers logic considerably. Also, rename the functions pblk_[down/up]_page to pblk_[down/up]_chunk, to communicate that it manages the write pointer of the chunk. Note that it also protects the parallel unit such that at most one chunk is active per parallel unit. Signed-off-by: Matias Bjørling <mb@lightnvm.io> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Hans Holmberg	e99e802fc6	lightnvm: pblk: remove unused parameters in pblk_up_rq The parameters nr_ppas and ppa_list are not used, so remove them. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Hans Holmberg	53d82db693	lightnvm: pblk: allocate line map bitmaps using a mempool Line map bitmap allocations are fairly large and can fail. Allocation failures are fatal to pblk, stopping the write pipeline. To avoid this, allocate the bitmaps using a mempool instead. Mempool allocations never fail if called from a process context, and pblk should only allocate map bitmaps in process context, but keep the failure handling for robustness sake. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Hans Holmberg	d68a934404	lightnvm: introduce nvm_rq_to_ppa_list There is a number of places in the lightnvm subsystem where the user iterates over the ppa list. Before iterating, the user must know if it is a single or multiple LBAs due to vector commands using either the nvm_rq ->ppa_addr or ->ppa_list fields on command submission, which leads to open-coding the if/else statement. Instead of having multiple if/else's, move it into a function that can be called by its users. A nice side effect of this cleanup is that this patch fixes up a bunch of cases where we don't consider the single-ppa case in pblk. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:07 -06:00
Javier González	7a7d6f9b48	lightnvm: pblk: remove unused variable. Removed unused struct ppa_addr variable. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:06 -06:00
Javier González	cb21665c8d	lightnvm: pblk: improve line helpers The current helper to obtain a line from a ppa returns the line id, which requires its users to explicitly retrieve the pointer to the line with the id. Make 2 different helpers: one returning the line id and one returning the line directly. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:06 -06:00
Javier González	2cf99bbd10	lightnvm: pblk: add helpers for chunk addresses Implement helpers to go from ppas to a chunk within a line and an address within a chunk. These helpers will be used on the patches adding trace support in pblk, which will be sent in this window. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:06 -06:00
Matias Bjørling	ae14cc044b	lightnvm: pblk: refactor put line fn on read completion The read completion path uses the put_line variable to decide whether the reference on a line should be released. The function name used for that is pblk_read_put_rqd_kref, which could lead one to believe that it is the rqd that is releasing the reference, while it is the line reference that is put. Rename and also split the function in two to account for either rqd or single ppa callers and move it to core, such that it later can be used in the write path as well. Signed-off-by: Matias Bjørling <mb@lightnvm.io> Reviewed-by: Javier González <javier@cnexlabs.com> Reviewed-by: Heiner Litz <hlitz@ucsc.edu> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:06 -06:00
Matias Bjørling	afdc23c91e	lightnvm: pblk: unify vector max req constants Both NVM_MAX_VLBA and PBLK_MAX_REQ_ADDRS define how many LBAs that are available in a vector command. pblk uses them interchangeably in its implementation. Use NVM_MAX_VLBA as the main one and remove usages of PBLK_MAX_REQ_ADDRS. Also remove the power representation that only has one user, and instead calculate it at runtime. Signed-off-by: Matias Bjørling <mb@lightnvm.io> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:06 -06:00
Matias Bjørling	aff3fb18f9	lightnvm: move bad block and chunk state logic to core pblk implements two data paths for recovery line state. One for 1.2 and another for 2.0, instead of having pblk implement these, combine them in the core to reduce complexity and make available to other targets. The new interface will adhere to the 2.0 chunk definition, including managing open chunks with an active write pointer. To provide this interface, a 1.2 device recovers the state of the chunks by manually detecting if a chunk is either free/open/close/offline, and if open, scanning the flash pages sequentially to find the next writeable page. This process takes on average ~10 seconds on a device with 64 dies, 1024 blocks and 60us read access time. The process can be parallelized but is left out for maintenance simplicity, as the 1.2 specification is deprecated. For 2.0 devices, the logic is maintained internally in the drive and retrieved through the 2.0 interface. Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:06 -06:00
Matias Bjørling	d7b6801673	lightnvm: combine 1.2 and 2.0 command flags Add nvm_set_flags helper to enable core to appropriately set the command flags for read/write/erase depending on which version a drive supports. The flags arguments can be distilled into the access hint, scrambling, and program/erase suspend. Replace the access hint with a "is_seq" parameter. The rest of the flags are dependent on the command opcode, which is trivial to detect and set. Signed-off-by: Matias Bjørling <mb@lightnvm.io> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-09 08:25:05 -06:00
Heiner Litz	11f6ad699a	lightnvm: pblk: add asynchronous partial read In the read path, partial reads are currently performed synchronously which affects performance for workloads that generate many partial reads. This patch adds an asynchronous partial read path as well as the required partial read ctx. Signed-off-by: Heiner Litz <hlitz@ucsc.edu> Reviewed-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-07-13 08:14:47 -06:00
Matias Bjørling	4e495a46b1	lightnvm: pblk: expose generic disk name on pr_* msgs The error messages in pblk does not say which pblk instance that a message occurred from. Update each error message to reflect the instance it belongs to, and also prefix it with pblk, so we know the message comes from the pblk module. Signed-off-by: Matias Bjørling <mb@lightnvm.io> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-07-13 08:14:43 -06:00
Matias Bjørling	880eda5440	lightnvm: move NVM_DEBUG to pblk There is no users of CONFIG_NVM_DEBUG in the LightNVM subsystem. All users are in pblk. Rename NVM_DEBUG to NVM_PBLK_DEBUG and enable only for pblk. Also fix up the CONFIG_NVM_PBLK entry to follow the code style for Kconfig files. Signed-off-by: Matias Bjørling <mb@lightnvm.io> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-07-13 08:14:31 -06:00
Marcin Dziegielewski	ffc03fb7a5	lightnvm: pblk: handle case when mw_cunits equals to 0 Some devices can expose mw_cunits equal to 0, it can cause the creation of too small write buffer and cause performance to drop on write workloads. Additionally, write buffer size must cover write data requirements, such as WS_MIN and MW_CUNITS - it must be greater than or equal to the larger one multiplied by the number of PUs. However, for performance reasons, use the WS_OPT value to calculation instead of WS_MIN. Because the place where buffer size is calculated was changed, this patch also removes pgs_in_buffer filed in pblk structure. Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-07-13 08:14:29 -06:00
Hans Holmberg	cc9c9a00b1	lightnvm: pblk: kick writer on new flush points Unless we kick the writer directly when setting a new flush point, the user risks having to wait for up to one second (the default timeout for the write thread to be kicked) for the IO to complete. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-06-01 09:02:53 -06:00
Hans Holmberg	48b8d20895	lightnvm: pblk: garbage collect lines with failed writes Write failures should not happen under normal circumstances, so in order to bring the chunk back into a known state as soon as possible, evacuate all the valid data out of the line and let the fw judge if the block can be written to in the next reset cycle. Do this by introducing a new gc list for lines with failed writes, and ensure that the rate limiter allocates a small portion of the write bandwidth to get the job done. The lba list is saved in memory for use during gc as we cannot gurantee that the emeta data is readable if a write error occurred. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-06-01 09:02:53 -06:00
Hans Holmberg	6a3abf5bee	lightnvm: pblk: rework write error recovery path The write error recovery path is incomplete, so rework the write error recovery handling to do resubmits directly from the write buffer. When a write error occurs, the remaining sectors in the chunk are mapped out and invalidated and the request inserted in a resubmit list. The writer thread checks if there are any requests to resubmit, scans and invalidates any lbas that have been overwritten by later writes and resubmits the failed entries. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-06-01 09:02:53 -06:00
Javier González	72b6cdbb11	lightnvm: pblk: remove dead function Remove dead function for manual sync. I/O Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-06-01 07:43:53 -06:00
Javier González	a7c9e9109c	lightnvm: pass flag on graceful teardown to targets If the namespace is unregistered before the LightNVM target is removed (e.g., on hot unplug) it is too late for the target to store any metadata on the device - any attempt to write to the device will fail. In this case, pass on a "gracefull teardown" flag to the target to let it know when this happens. In the case of pblk, we pad the open line (close all open chunks) to improve data retention. In the event of an ungraceful shutdown, avoid this part and just clean up. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-06-01 07:43:53 -06:00
Javier González	8e55c07b2b	lightnvm: pblk: remove unnecessary argument Remove unnecessary argument on pblk_line_free() Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-06-01 07:43:53 -06:00
Kent Overstreet	b906bbb699	lightnvm: convert to bioset_init()/mempool_init() Convert lightnvm to embedded bio sets. Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-05-30 15:33:32 -06:00
Javier González	3b2a3ad119	lightnvm: pblk: implement 2.0 support Implement 2.0 support in pblk. This includes the address formatting and mapping paths, as well as the sysfs entries for them. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Javier González	32ef9412c1	lightnvm: pblk: implement get log report chunk In preparation of pblk supporting 2.0, implement the get log report chunk in pblk. Also, define the chunk states as given in the 2.0 spec. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Javier González	bb845ae45c	lightnvm: pblk: rename ppaf* to addrf* In preparation for 2.0 support in pblk, rename variables referring to the address format to addrf and reserve ppaf for the 1.2 path. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Javier González	6947151374	lightnvm: add support for 2.0 address format Add support for 2.0 address format. Also, align address bits for 1.2 and 2.0 to be able to operate on channel and luns without requiring a format conversion. Use a generic address format for this purpose. Also, convert the generic operations to the generic format in pblk. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Javier González	a40afad90b	lightnvm: normalize geometry nomenclature Normalize nomenclature for naming channels, luns, chunks, planes and sectors as well as derivations in order to improve readability. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Javier González	e46f4e4822	lightnvm: simplify geometry structure Currently, the device geometry is stored redundantly in the nvm_id and nvm_geo structures at a device level. Moreover, when instantiating targets on a specific number of LUNs, these structures are replicated and manually modified to fit the instance channel and LUN partitioning. Instead, create a generic geometry around nvm_geo, which can be used by (i) the underlying device to describe the geometry of the whole device, and (ii) instances to describe their geometry independently. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Javier González	e411b33117	lightnvm: pblk: refactor bad block identification In preparation for the OCSSD 2.0 spec. bad block identification, refactor the current code to generalize bad block get/set functions and structures. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Hans Holmberg	5d149bfabe	lightnvm: pblk: add padding distribution sysfs attribute When pblk receives a sync, all data up to that point in the write buffer must be comitted to persistent storage, and as flash memory comes with a minimal write size there is a significant cost involved both in terms of time for completing the sync and in terms of write amplification padded sectors for filling up to the minimal write size. In order to get a better understanding of the costs involved for syncs, Add a sysfs attribute to pblk: padded_dist, showing a normalized distribution of sectors padded. In order to facilitate measurements of specific workloads during the lifetime of the pblk instance, the distribution can be reset by writing 0 to the attribute. Do this by introducing counters for each possible padding: {0..(minimal write size - 1)} and calculate the normalized distribution when showing the attribute. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Javier González <javier@cnexlabs.com> Rearranged total_buckets statement in pblk_sysfs_get_padding_dist Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Hans Holmberg	76758390f8	lightnvm: pblk: export write amplification counters to sysfs In a SSD, write amplification, WA, is defined as the average number of page writes per user page write. Write amplification negatively affects write performance and decreases the lifetime of the disk, so it's a useful metric to add to sysfs. In plkb's case, the number of writes per user sector is the sum of: (1) number of user writes (2) number of sectors written by the garbage collector (3) number of sectors padded (i.e. due to syncs) This patch adds persistent counters for 1-3 and two sysfs attributes to export these along with WA calculated with five decimals: write_amp_mileage: the accumulated write amplification stats for the lifetime of the pblk instance write_amp_trip: resetable stats to facilitate delta measurements, values reset at creation and if 0 is written to the attribute. 64-bit counters are used as a 32 bit counter would wrap around already after about 17 TB worth of user data. It will take a long long time before the 64 bit sector counters wrap around. The counters are stored after the bad block bitmap in the first emeta sector of each written line. There is plenty of space in the first emeta sector, so we don't need to bump the major version of the line data format. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Hans Holmberg	d0ab0b1ab9	lightnvm: pblk: check data lines version on recovery As a preparation for future bumps of data line persistent storage versions, we need to start checking the emeta line version during recovery. Also slit up the current emeta/smeta version into two bytes (major,minor). Recovering lines with the same major number as the current pblk data line version must succeed. This means that any changes in the persistent format must be: (1) Backward compatible: if we switch back to and older kernel, recovery of lines stored with major == current_major and minor > current_minor must succeed. (2) Forward compatible: switching to a newer kernel, recovery of lines stored with major=current_major and minor < minor must handle the data format differences gracefully(i.e. initialize new data structures to default values). If we detect lines that have a different major number than the current we must abort recovery. The user must manually migrate the data in this case. Previously the version stored in the emeta header was copied from smeta, which has version 1, so we need to set the minor version to 1. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-03-29 17:29:09 -06:00
Matias Bjørling	8b7bc84988	lightnvm: pblk: refactor pblk_ppa_comp function Shorten function to simply return the value of the if statement. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Javier González	998ba62973	lightnvm: pblk: add iostat support Since pblk registers its own block device, the iostat accounting is not automatically done for us. Therefore, add the necessary accounting logic to satisfy the iostat interface. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Javier González	8f554597e0	lightnvm: pblk: do not log recovery read errors On scan recovery, reads can fail. This happens because the first page for each line is read in order to determined if the line has been used (and thus needs to be recovered), or not. This can lead to "empty page" read errors. Since these errors are normal, do not log them, as they are confusing when reviewing the logs. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Javier González	e53927393b	lightnvm: set target over-provision on create ioctl Allow to set the over-provision percentage on target creation. In case that the value is not provided, fall back to the default value set by the target. In pblk, set the default OP to 11% of the total size of the device Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Javier González	a7689938ef	lightnvm: pblk: use exact free block counter in RL Until now, pblk's rate-limiter has used a heuristic to reserve space for GC I/O given that the over-provision area was fixed. In preparation for allowing to define the over-provision area on target creation, define a dedicated free_block counter in the rate-limiter to track the number of blocks being used for user data. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Hans Holmberg	8154d296d9	lightnvm: pblk: rename sync_point to flush_point Sync point is a really confusing name for keeping track of the last entry that needs to be flushed so change the name to to flush_point instead. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00

1 2

96 commits