linux-stable/include/video/udlfb.h
Mikulas Patocka 9d0aa601e4 udlfb: fix semaphore value leak
I observed that the performance of the udl fb driver degrades over time.
On a freshly booted machine, it takes 6 seconds to do "ls -la /usr/bin";
after some time of use, the same operation takes 14 seconds.

The reason is that the value of "limit_sem" decays over time.

The udl driver uses a semaphore "limit_sem" to track how many free urbs
are on dlfb->urbs.list. If the count is zero, the "down" operation
will sleep until some urbs are added to the freelist.

In order to avoid some hypothetical deadlock, the driver will not call
"up" immediately, but it will offload it to a workqueue. The problem is
that if we call "schedule_delayed_work" on the same work item multiple
times, the work item may only be executed once.

This is happening:
* some urb completes
* dlfb_urb_completion adds it to the free list
* dlfb_urb_completion calls schedule_delayed_work to schedule the function
  dlfb_release_urb_work to increase the semaphore count
* as the urb is on the free list, some other task grabs it and submits it
* the submitted urb completes, dlfb_urb_completion is called again
* dlfb_urb_completion calls schedule_delayed_work, but the work is already
  scheduled, so it does nothing
* finally, dlfb_release_urb_work is called, it increases the semaphore
  count by 1, although it should increase it by 2

So, the semaphore count is decreasing over time, and this causes gradual
performance degradation.
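
The sequence above can be modelled in userspace. The following is only a
sketch, not driver code: every "sim_" name is hypothetical, the single
"work_pending" flag stands in for the one delayed work item, and the initial
count of 4 is assumed from WRITES_IN_FLIGHT. It takes two urbs, completes
both, and compares the deferred "up" with a direct "up".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in udlfb. */
struct sim_urbs {
	int sem_count;     /* stands in for urbs.limit_sem */
	bool work_pending; /* stands in for the single delayed work item */
};

/* Take one urb. A real down() would sleep at zero; the model asserts. */
static void sim_down(struct sim_urbs *s)
{
	assert(s->sem_count > 0);
	s->sem_count--;
}

/* Old behaviour: the completion only schedules the work item.
 * Scheduling work that is already pending changes nothing. */
static void sim_complete_deferred(struct sim_urbs *s)
{
	s->work_pending = true;
}

/* The work function raises the count once per run, not once per completion. */
static void sim_run_work(struct sim_urbs *s)
{
	if (s->work_pending) {
		s->work_pending = false;
		s->sem_count++;
	}
}

/* New behaviour: the completion raises the count itself. */
static void sim_complete_direct(struct sim_urbs *s)
{
	s->sem_count++;
}

/* Take two urbs, complete both, and return the resulting count. */
int sim_count_after_two_completions(bool deferred)
{
	struct sim_urbs s = { .sem_count = 4, .work_pending = false };

	sim_down(&s);
	sim_down(&s);
	if (deferred) {
		sim_complete_deferred(&s);
		sim_complete_deferred(&s); /* coalesced with the first */
		sim_run_work(&s);          /* runs once */
	} else {
		sim_complete_direct(&s);
		sim_complete_direct(&s);
	}
	return s.sem_count;
}
```

In this model the deferred variant ends one below the starting count, while
the direct variant returns to it. Each repetition of the coalesced case loses
one more unit, which matches the gradual decay described above.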

Note that in the current kernel, the "up" function may be called from
interrupt context and may safely race with the "down" function called by
another thread, so we don't have to offload the call of "up" to a
workqueue at all. This patch removes the workqueue code. The patch also
changes "down_interruptible" to "down" in dlfb_free_urb_list, so that we
will clean up the driver properly even if a signal arrives.

With this patch, the performance of udlfb no longer degrades.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
[b.zolnierkie: fix immediatelly -> immediately typo]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
2018-07-25 15:41:54 +02:00


/* SPDX-License-Identifier: GPL-2.0 */
#ifndef UDLFB_H
#define UDLFB_H
/*
 * TODO: Propose standard fb.h ioctl for reporting damage,
 * using _IOWR() and one of the existing area structs from fb.h
 * Consider these ioctls deprecated, but they're still used by the
 * DisplayLink X server as yet - need both to be modified in tandem
 * when new ioctl(s) are ready.
 */
#define DLFB_IOCTL_RETURN_EDID 0xAD
#define DLFB_IOCTL_REPORT_DAMAGE 0xAA
struct dloarea {
	int x, y;
	int w, h;
	int x2, y2;
};
struct urb_node {
	struct list_head entry;
	struct dlfb_data *dlfb;
	struct urb *urb;
};
struct urb_list {
	struct list_head list;
	spinlock_t lock;
	struct semaphore limit_sem;
	int available;
	int count;
	size_t size;
};
struct dlfb_data {
	struct usb_device *udev;
	struct fb_info *info;
	struct urb_list urbs;
	struct kref kref;
	char *backing_buffer;
	int fb_count;
	bool virtualized; /* true when physical usb device not present */
	struct delayed_work init_framebuffer_work;
	struct delayed_work free_framebuffer_work;
	atomic_t usb_active; /* 0 = update virtual buffer, but no usb traffic */
	atomic_t lost_pixels; /* 1 = a render op failed. Need screen refresh */
	char *edid; /* null until we read edid from hw or get from sysfs */
	size_t edid_size;
	int sku_pixel_limit;
	int base16;
	int base8;
	u32 pseudo_palette[256];
	int blank_mode; /* one of FB_BLANK_ */
	/* blit-only rendering path metrics, exposed through sysfs */
	atomic_t bytes_rendered; /* raw pixel-bytes driver asked to render */
	atomic_t bytes_identical; /* saved effort with backbuffer comparison */
	atomic_t bytes_sent; /* to usb, after compression including overhead */
	atomic_t cpu_kcycles_used; /* transpired during pixel processing */
};
#define NR_USB_REQUEST_I2C_SUB_IO 0x02
#define NR_USB_REQUEST_CHANNEL 0x12
/* -BULK_SIZE as per usb-skeleton. Can we get full page and avoid overhead? */
#define BULK_SIZE 512
#define MAX_TRANSFER (PAGE_SIZE*16 - BULK_SIZE)
#define WRITES_IN_FLIGHT (4)
#define MAX_VENDOR_DESCRIPTOR_SIZE 256
#define GET_URB_TIMEOUT HZ
#define FREE_URB_TIMEOUT (HZ*2)
#define BPP 2
#define MAX_CMD_PIXELS 255
#define RLX_HEADER_BYTES 7
#define MIN_RLX_PIX_BYTES 4
#define MIN_RLX_CMD_BYTES (RLX_HEADER_BYTES + MIN_RLX_PIX_BYTES)
#define RLE_HEADER_BYTES 6
#define MIN_RLE_PIX_BYTES 3
#define MIN_RLE_CMD_BYTES (RLE_HEADER_BYTES + MIN_RLE_PIX_BYTES)
#define RAW_HEADER_BYTES 6
#define MIN_RAW_PIX_BYTES 2
#define MIN_RAW_CMD_BYTES (RAW_HEADER_BYTES + MIN_RAW_PIX_BYTES)
#define DL_DEFIO_WRITE_DELAY 5 /* fb_deferred_io.delay in jiffies */
#define DL_DEFIO_WRITE_DISABLE (HZ*60) /* "disable" with long delay */
/* remove these once align.h patch is taken into kernel */
#define DL_ALIGN_UP(x, a) ALIGN(x, a)
#define DL_ALIGN_DOWN(x, a) ALIGN_DOWN(x, a)
#endif