linux-stable

Commit Graph

Author	SHA1	Message	Date
Chris Wilson	94e39e282e	drm/i915: Capture batchbuffer state upon GPU hang The bbstate contains useful bits of debugging information such as whether the batch is being read from GTT or PPGTT, or whether it is allowed to execute privileged instructions. v2: Only record BB_STATE for gen4+ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-30 10:37:58 +01:00
Imre Deak	b4ed448447	drm/i915: remove device field from struct power_well The only real need for this field was in i915_{request,release}_power_well, but there we can get at it by a container_of magic. Also since in the future we'll have multiple power wells each with its own power_well struct it makes sense to remove the field from there where it'd be just redundancy. Suggested-by: Paulo Zanoni <paulo.zanoni@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-27 20:29:57 +01:00
Imre Deak	baa707073b	drm/i915: use power get/put instead of set for power on after init Currently we make sure that all power domains are enabled during driver init and turn off unneded ones only after the first modeset. Similarly during suspend we enable all power domains, which will remain on through the following resume until the first modeset. This logic is supported by intel_set_power_well() in the power domain framework. It would be nice to simplify the API, so that we only have get/put functions and make it more explicit on the higher level how this "power well on during init" logic works. This will make it also easier if in the future we want to shorten the time the power wells are on. For this add a new device private flag tracking whether we have the power wells on because of init/suspend and use only intel_display_power_get()/put(). As nothing else uses intel_set_power_well() we can remove it. This also fixes commit `6efdf354dd` Author: Imre Deak <imre.deak@intel.com> Date: Wed Oct 16 17:25:52 2013 +0300 drm/i915: enable only the needed power domains during modeset where removing intel_set_power_well() resulted in not releasing the reference on the power well that was taken during init and thus leaving the power well on all the time. Regression reported by Paulo. v2: - move the init_power_on flag to the power_domains struct (Daniel) v3: - add note about this being a regression fix too (Paulo) Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-27 17:38:13 +01:00
Imre Deak	83c00f5530	drm/i915: prepare for multiple power wells In the future we'll need to support multiple power wells, so prepare for that here. Create a new power domains struct which contains all power domain/well specific fields. Since we'll have one lock protecting all power wells, move power_well->lock to the new struct too. No functional change. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Paulo Zanoni <paulo.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-27 17:37:42 +01:00
Damien Lespiau	d538bbdfde	drm/i915: Use a spin lock to protect the pipe crc struct Daniel pointed out that it was hard to get anything lockless to work correctly, so don't even try for this non critical piece of code and just use a spin lock. v2: Make intel_pipe_crc->opened a bool v3: Use assert_spin_locked() instead of a comment (Daniel Vetter) v4: Use spin_lock_irq() in the debugfs functions (they can only be called from process context), Use spin_lock() in the pipe_crc_update() function that can only be called from an interrupt handler, Use wait_event_interruptible_lock_irq() when waiting for data in the cicular buffer to ensure proper locking around the condition we are waiting for. (Daniel Vetter) Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-22 00:27:49 +02:00
Daniel Vetter	c459787294	drm/i915: Move the pipe CRC stuff to other pipe data Adding stuff to the bottom of struct drm_i915_driver_private is nowadays considered uncool. Cc: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-22 00:27:38 +02:00
Imre Deak	959cbc1b8a	drm/i915: change power_well->lock to be mutex There is no hard need for this to be a spin lock, as we don't take these locks in irq context from anywhere. An upcoming patch will add calls to punit read/write functions from within regions protected by this lock and those functions need a mutex in turn. As a solution for that convert the spin lock to be a mutex. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-21 20:57:01 +02:00
Imre Deak	bddc76452d	drm/i915: factor out is_always_on_domain It is just cleaner this way and makes it easier to add support for other HW generations with always-on power wells powering a different set of domains. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-21 20:56:13 +02:00
Imre Deak	f52e353e19	drm/i915: make the intel_display_power_domain enum compact Upcoming patches will add tracking for a set of power domains via a bitmask; to make things simple there remove the current gap in the enum values. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-21 20:55:37 +02:00
Daniel Vetter	3d099a05b1	drm/i915: Add new CRC sources On pre-gen5 and vlv we can't use the pipe source when TV-out or a DP port is connected to the pipe. Hence we need to expose new CRC sources. Also simplify the existing pipe source platform code a bit by rejecting all unhandled sources by default. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-21 18:33:42 +02:00
Ben Widawsky	dc39fff722	drm/i915: Print RC6 info less often Since we use intel_enable_rc6() now for more than just when we're enabling RC6, we'll see this message many times, and it is just confusing. As an example, calc_residency calls this function whenever poked via sysfs. This leaves the impression in dmesg that we're constantly re-enabling RC6. While at it, move the defines and description from drv.h to intel_pm.c, since these are only ever used in that code. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-21 10:03:39 +02:00
Daniel Vetter	5b3a856bcf	drm/i915: wire up CRC interrupt for ilk/snb We enable the interrupt unconditionally and only control it through the enable bit in the CRC control register. v2: Extract per-platform helpers to compute the register values. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-18 15:05:32 +02:00
Jani Nikula	34427052eb	drm/i915: pass mode to ELD write vfuncs This will be needed for setting the HDMI pixel clock for audio config. No functional changes. v2: Now with a commit message. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-18 15:05:29 +02:00
Daniel Vetter	aa5f802181	drm/i915: Use unsigned long for obj->user_pin_count At least on linux sizeof(long) == sizeof(void*) and the thinking is that you can grab about as many references as there's memory. Doesn't really matter, just a bit of OCD since the fixed size data type in a pure in-kernel datastructure look off. v2: Ville asked for an overflow check since no one prevents userspace from incrementing the pin count forever. v3: s/INT/LONG/, noticed by Chris. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 22:06:39 +02:00
Daniel Vetter	80075d492f	drm/i915: prevent tiling changes on framebuffer backing storage Assuming that all framebuffer related metadata is invariant simplifies our userspace input data checking. And current userspace always first updates the tiling of an object before creating a framebuffer with it. This allows us to upconvert a check in pin_and_fence to a WARN. In the future it should also be helpful to know which buffer objects are potential scanout targets for e.g. frontbuffer rendering tracking and similar things. Note that SNA shipped for one prerelease with code which will be broken through this patch. But users shouldn't notice since it's purely an optimization and will transparently fall back to allocating a new fb. i-g-t also had offending code (now fixed), but we don't really care about breaking the test-suite. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Grumpily-reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 22:04:52 +02:00
Chris Wilson	45c5f2022c	drm/i915: Disable all GEM timers and work on unload We have two once very similar functions, i915_gpu_idle() and i915_gem_idle(). The former is used as the lower level operation to flush work on the GPU, whereas the latter is the high level interface to flush the GEM bookkeeping in addition to flushing the GPU. As such i915_gem_idle() also clears out the request and activity lists and cancels the delayed work. This is what we need for unloading the driver, unfortunately we called i915_gpu_idle() instead. In the process, make sure that when cancelling the delayed work and timer, which is synchronous, that we do not hold any locks to prevent a deadlock if the work item is already waiting upon the mutex. This requires us to push the mutex down from the caller to i915_gem_idle(). v2: s/i915_gem_idle/i915_gem_suspend/ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70334 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: xunx.fang@intel.com [danvet: Only set ums.suspended for !kms as discussed earlier. Chris noticed that this slipped through.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 19:42:14 +02:00
Daniel Vetter	f8c168fa45	drm/i915: static inline for dummy crc functions Also use #ifdef to keep consistent with all other such cases. Cc: Damien Lespiau <damien.lespiau@intel.com> Acked-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 13:32:17 +02:00
Damien Lespiau	be5c7a9075	drm/i915: Only one open() allowed on pipe CRC result files It doesn't really make sense to have two processes dequeueing the CRC values at the same time. Forbid that usage. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 13:32:16 +02:00
Damien Lespiau	071444280b	drm/i915: Implement blocking read for pipe CRC files seq_file is not quite the right interface for these ones. We have a circular buffer with a new entry per vblank on one side and a process wanting to dequeue the CRC with a read(). It's quite racy to wait for vblank in user land and then try to read a pipe_crc file, sometimes the CRC interrupt hasn't been fired and we end up with an EOF. So, let's have the read on the pipe_crc file block until the interrupt gives us a new entry. At that point we can wake the reading process. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 13:32:16 +02:00
Damien Lespiau	e5f75aca19	drm/i915: Dynamically allocate the CRC circular buffer So we don't eat that memory when not needed. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 13:32:12 +02:00
Damien Lespiau	ac2300d4d5	drm/i915: Sample the frame counter instead of a timestamp for CRCs Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 13:32:10 +02:00
Damien Lespiau	b2c88f5b1d	drm/i915: Keep the CRC values into a circular buffer There are a few good properties to a circular buffer, for instance it has a number of entries (before we were always dumping the full buffer). Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 13:32:10 +02:00
Daniel Vetter	926321d503	drm/i915: Add a control file for pipe CRCs Note the "return -ENODEV;" in pipe_crc_set_source(). The ctl file is disabled until the end of the series to be able to do incremental improvements. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 13:32:04 +02:00
Shuang He	8bf1e9f1d2	drm/i915: Expose latest 200 CRC value for pipe through debugfs There are several points in the display pipeline where CRCs can be computed on the bits flowing there. For instance, it's usually possible to compute the CRCs of the primary plane, the sprite plane or the CRCs of the bits after the panel fitter (collectively called pipe CRCs). v2: Quite a bit of rework here and there (Damien) Signed-off-by: Shuang He <shuang.he@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Fix intermediate compile file reported by Wu Fengguang's kernel builder.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 13:31:42 +02:00
Ben Widawsky	73ae478cdf	drm/i915: Replace has_bsd/blt/vebox with a mask I've sent this patch several times for various reasons. It essentially cleans up a lot of code where we need to do something per ring, and want to query whether or not the ring exists on that hardware. It has various uses coming up, but for now it shouldn't be too offensive. v2: Big conflict resolution on Damien's DEV_INFO_FOR_EACH stuff v3: Resolved vebox addition v4: Rebased after months of disuse. Also made failed ringbuffer init cleaner. v5: Remove the init cleaner from v4. There is a better way to do it. (Chris) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 11:08:39 +02:00
Ville Syrjälä	609cedef6a	drm/i915: Store current watermark state in dev_priv->wm To make it easier to check what watermark updates are actually necessary, keep copies of the relevant bits that match the current hardware state. Also add DDB partitioning into hsw_wm_values as that's another piece of state we want to track. We don't read out the hardware state on init yet, so we can't really start using this yet, but it will be used later. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> [danvet: Paulo asked for a comment around the memcmp to say that we depend upon zero-initializing the entire structures due to padding. But a later patch in this series removes the memcmp again. So this is ok as-is.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-15 18:58:58 +02:00
Daniel Vetter	4520f53a15	drm/i915: Kconfig option to disable the legacy fbdev support Boots Just Fine (tm)! The only glitch seems to be that at least on Fedora the boot splash gets confused and doesn't display much at all. And since there's no ugly console flickering anymore in between, the flicker while switching between X servers (VT support is still enabled) is even more jarring. Also, I'm unsure whether we don't need to somehow kick out vgacon, now that nothing else gets in the way. But stuff seems to work, so I don't care. Also everything still works as well with VGA_CONSOLE=n Also the #ifdef mess needs a bit of a cleanup, follow-up patches will do just that. To keep the Kconfig tidy, extract all the i915 options into its own file. v2: - Rebase on top of the preliminary hw support option and the intel_drv.h cleanup. - Shut up warnings in i915_debugfs.c v3: Use the right CONFIG variable, spotted by Chon Ming. Cc: Lee, Chon Ming <chon.ming.lee@intel.com> Cc: David Herrmann <dh.herrmann@gmail.com> Reviewed-by: Chon Ming Lee <chon.ming.lee@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-11 23:37:23 +02:00
Chris Wilson	c0951f0c97	drm/i915: Avoid tweaking RPS before it is enabled As we delay the initial RPS enabling (upon boot and after resume), there is a chance that we may start to render and trigger RPS boosts before we set up the punit. Any changes we make could result in inconsistent hardware state, with a danger of causing undefined behaviour. However, as the boosting is a optional tweak to RPS, we can simply ignore it whilst RPS is not yet enabled. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 23:12:05 +02:00
Ben Widawsky	ab484f8fd6	drm/i915: Remove gen specific checks in MMIO Now that MMIO has been split up into gen specific functions it is obvious when HAS_FPGA_DBG_UNCLAIMED, HAS_FORCE_WAKE are needed. As such, we can remove this extraneous condition. As a result of this, as well as previously existing function pointers for forcewake, we no longer need the has_force_wake member in the device specific data structure. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:47:10 +02:00
Ben Widawsky	0b27448141	drm/i915: Create MMIO virtual functions In preparation for having per GEN MMIO functions, create, and start using MMIO functions in our uncore data structure. This simply makes the transition easier by allowing us to just plug in the per GEN stuff later. For simplicity, I moved the intel_uncore_init() function down since those rely on static functions defined lower in the file. This is most of the churn in this patch. I made one unrelated change here by using off_t datatype for the offset of the register to write. I like the clarity that this brings to the code. If I did it as a separate patch, I am pretty certain it would get bikeshedded to oblivion. Requested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:47:08 +02:00
Daniel Vetter	967ad7f148	Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next The conflict in intel_drv.h tripped me up a bit since a patch in dinq moves all the functions around, but another one in drm-next removes a single function. So I'ev figured backing this into a backmerge would be good. i915_dma.c is just adjacent lines changed, nothing nefarious there. Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/intel_drv.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:44:43 +02:00
Ville Syrjälä	ffbab09bf9	drm: Remove pci_vendor and pci_device from struct drm_device We can get the PCI vendor and device IDs via dev->pdev. So we can drop the duplicated information. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-10-09 15:55:33 +10:00
David Herrmann	16eb5f4379	drm: kill ->gem_init_object() and friends All drivers embed gem-objects into their own buffer objects. There is no reason to keep drm_gem_object_alloc(), gem->driver_private and ->gem_init_object() anymore. New drivers are highly encouraged to do the same. There is no benefit in allocating gem-objects separately. Cc: Dave Airlie <airlied@gmail.com> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Inki Dae <inki.dae@samsung.com> Cc: Ben Skeggs <skeggsb@gmail.com> Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-10-09 14:38:02 +10:00
Rodrigo Vivi	a031d709bb	drm/i915: Simplify PSR debugfs for igt test case. v2: remove trailing spaces and fix conflicts Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: - make it comipile - s/IS_HASWELL/HAS_PSR/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-03 21:20:09 +02:00
Chris Wilson	dd75fdc8c6	drm/i915: Tweak RPS thresholds to more aggressively downclock After applying wait-boost we often find ourselves stuck at higher clocks than required. The current threshold value requires the GPU to be continuously and completely idle for 313ms before it is dropped by one bin. Conversely, we require the GPU to be busy for an average of 90% over a 84ms period before we upclock. So the current thresholds almost never downclock the GPU, and respond very slowly to sudden demands for more power. It is easy to observe that we currently lock into the wrong bin and both underperform in benchmarks and consume more power than optimal (just by repeating the task and measuring the different results). An alternative approach, as discussed in the bspec, is to use a continuous threshold for upclocking, and an average value for downclocking. This is good for quickly detecting and reacting to state changes within a frame, however it fails with the common throttling method of waiting upon the outstanding frame - at least it is difficult to choose a threshold that works well at 15,000fps and at 60fps. So continue to use average busy/idle loads to determine frequency change. v2: Use 3 power zones to keep frequencies low in steady-state mostly idle (e.g. scrolling, interactive 2D drawing), and frequencies high for demanding games. In between those end-states, we use a fast-reclocking algorithm to converge more quickly on the desired bin. v3: Bug fixes - make sure we reset adj after switching power zones. v4: Tune - drop the continuous busy thresholds as it prevents us from choosing the right frequency for glxgears style swap benchmarks. Instead the goal is to be able to find the right clocks irrespective of the wait-boost. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Stéphane Marchesin <stephane.marchesin@gmail.com> Cc: Owen Taylor <otaylor@redhat.com> Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com> Cc: "Zhuang, Lena" <lena.zhuang@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-03 20:01:31 +02:00
Chris Wilson	b29c19b645	drm/i915: Boost RPS frequency for CPU stalls If we encounter a situation where the CPU blocks waiting for results from the GPU, give the GPU a kick to boost its the frequency. This should work to reduce user interface stalls and to quickly promote mesa to high frequencies - but the cost is that our requested frequency stalls high (as we do not idle for long enough before rc6 to start reducing frequencies, nor are we aggressive at down clocking an underused GPU). However, this should be mitigated by rc6 itself powering off the GPU when idle, and that energy use is dependent upon the workload of the GPU in addition to its frequency (e.g. the math or sampler functions only consume power when used). Still, this is likely to adversely affect light workloads. In particular, this nearly eliminates the highly noticeable wake-up lag in animations from idle. For example, expose or workspace transitions. (However, given the situation where we fail to downclock, our requested frequency is almost always the maximum, except for Baytrail where we manually downclock upon idling. This often masks the latency of upclocking after being idle, so animations are typically smooth - at the cost of increased power consumption.) Stéphane raised the concern that this will punish good applications and reward bad applications - but due to the nature of how mesa performs its client throttling, I believe all mesa applications will be roughly equally affected. To address this concern, and to prevent applications like compositors from permanently boosting the RPS state, we ratelimit the frequency of the wait-boosts each client recieves. Unfortunately, this techinique is ineffective with Ironlake - which also has dynamic render power states and suffers just as dramatically. For Ironlake, the thermal/power headroom is shared with the CPU through Intelligent Power Sharing and the intel-ips module. This leaves us with no GPU boost frequencies available when coming out of idle, and due to hardware limitations we cannot change the arbitration between the CPU and GPU quickly enough to be effective. v2: Limit each client to receiving a single boost for each active period. Tested by QA to only marginally increase power, and to demonstrably increase throughput in games. No latency measurements yet. v3: Cater for front-buffer rendering with manual throttling. v4: Tidy up. v5: Sadly the compositor needs frequent boosts as it may never idle, but due to its picking mechanism (using ReadPixels) may require frequent waits. Those waits, along with the waits for the vrefresh swap, conspire to keep the GPU at low frequencies despite the interactive latency. To overcome this we ditch the one-boost-per-active-period and just ratelimit the number of wait-boosts each client can receive. Reported-and-tested-by: Paul Neumann <paul104x@yahoo.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68716 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Stéphane Marchesin <stephane.marchesin@gmail.com> Cc: Owen Taylor <otaylor@redhat.com> Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com> Cc: "Zhuang, Lena" <lena.zhuang@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: No extern for function prototypes in headers.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-03 20:01:31 +02:00
Chris Wilson	094f9a54e3	drm/i915: Fix __wait_seqno to use true infinite timeouts When we switched to always using a timeout in conjunction with wait_seqno, we lost the ability to detect missed interrupts. Since, we have had issues with interrupts on a number of generations, and they are required to be delivered in a timely fashion for a smooth UX, it is important that we do log errors found in the wild and prevent the display stalling for upwards of 1s every time the seqno interrupt is missed. Rather than continue to fix up the timeouts to work around the interface impedence in wait_event_*(), open code the combination of wait_event[_interruptible][_timeout], and use the exposed timer to poll for seqno should we detect a lost interrupt. v2: In order to satisfy the debug requirement of logging missed interrupts with the real world requirments of making machines work even if interrupts are hosed, we revert to polling after detecting a missed interrupt. v3: Throw in a debugfs interface to simulate broken hw not reporting interrupts. v4: s/EGAIN/EAGAIN/ (Imre) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Don't use the struct typedef in new code.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-03 20:01:30 +02:00
Chris Wilson	f56383cb9f	drm/i915: Show WT caching in debugfs Add the missing cache-level to the describe_obj() function for debug and error reporting. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:23 +02:00
Ben Widawsky	e2d05a8b1e	drm/i915: Convert active API to VMA Even though we track object activity and not VMA, because we have the active_list be based on the VM, it makes the most sense to use VMAs in the APIs. NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for now, leave them be. v2: Remove leftover hunk from the previous patch which didn't keep i915_gem_object_move_to_active. That patch had to rely on the ring to get the dev instead of the obj. (Chris) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:21 +02:00
Ben Widawsky	5c2abbeab7	drm/i915: Provide a cheap ggtt vma lookup "We do fairly often lookup the ggtt vma for an obj." - Chris Wilson. As such, provide a function to offer slightly cheaper access to the vma. Not performance tested. By my quick estimation it saves at least 3 pointer dereferences from the existing mechanism. This patch mostly matches code from Chris in <20130911221430.GB7825@nuc-i3427.alporthouse.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:21 +02:00
Chris Wilson	aec347ab19	drm/i915: Delay the release of the forcewake by a jiffie Obtaining the forcwake requires expensive and time consuming serialisation. And we often try to obtain the forcewake multiple times in very quick succession. We can reduce the overhead of these sequences by delaying the forcewake release, and so not hammer the hw quite so hard. I was hoping this would help with the spurious [drm:__gen6_gt_force_wake_mt_get] ERROR Timed out waiting for forcewake old ack to clear. found on Haswell. Alas not. v2: Fix teardown ordering - unmap the regs after turning off forcewake, and make sure we do turn off forcewake - both found by Ville. v3: As we introduce intel_uncore_fini(), use it to make sure everything is disabled before we hand back to the BIOS. Note: I have no claims for improved performance, stablity or power comsumption for this patch. We should not be hitting the registers often enough for this to improve benchmarks, but given the nature of our hw it is likely to improve long term stability. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:13 +02:00
Ben Widawsky	18b5992c37	drm/i915: Calculate PSR register offsets from base + gen Future generations will be changing these registers (thanks to design for giving us an early heads up). To help abstract, create the definition of the base of the register block, and define all registers relative to that. Design has promised to not change the offsets relative to the base. v2: Also change IS_HASWELL checks to HAS_PSR CC: Rodrigo Vivi <rodrigo.vivi@gmail.com> CC: Intel GFX <intel-gfx@lists.freedesktop.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:12 +02:00
Daniel Vetter	6fca55b114	drm/i915: Rip out SUPPORTS_EDP It only controls the setting of the vbt.edp_support variable, which in turn only controls one debug output plus can also force-disable the lvds output. Since the value only restricted this logic to mobile ilk there's the slight risk that this will break lvds on desktop ilk or on snb/ivb platforms. But with the vbt it's better when we know what's going on here, so let's rip it out and see what happens. Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:07 +02:00
Paulo Zanoni	311a20949f	drm/i915: don't init DP or HDMI when not supported by DDI port There's no reason to init a DP connector if the encoder just supports HDMI: we'll just waste hundreds and hundreds of cycles trying to do DP AUX transactions to detect if there's something there. Same goes for a DP connector that doesn't support HDMI, but I'm not sure these actually exist. v2: - Use bit fields - Remove useless identation level - Replace DRM_ERROR with DRM_DEBUG_KMS Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:06 +02:00
Paulo Zanoni	6acab15a7b	drm/i915: use the HDMI DDI buffer translations from VBT We currently use the recommended values from BSpec, but the VBT specifies the correct value to use for the hardware we have, so use it. We also fall back to the recommended value in case we can't find the VBT. In addition, this code also provides some infrastructure to parse more information about the DDI ports. There's a lot more information we could extract and use in the future. v2: - Move some code to init_vbt_defaults. v3: - Rebase - Clarify the "DVO Port" matching code v4: - Use I915_MAX_PORTS - Change the HAS_DDI checks - Replace DRM_ERROR with DRM_DEBUG_KMS Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:04 +02:00
Paulo Zanoni	768f69c9fe	drm/i915: VBT's child_device_config changes over time We currently treat the child_device_config as a simple struct, but this is not correct: new BDB versions change the meaning of some offsets, so the struct needs to be adjusted for each version. Since there are too many changes (today we're in version 170!), making a big versioned union would be too complicated, so child_device_config is now a union of 3 things: (i) a "raw" byte array that's safe to use anywhere; (ii) an "old" structure that's the one we've been using and should be safe to keep in the SDVO and TV code; and (iii) a "common" structure that should contain only fields that are common for all the known VBT versions. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:04 +02:00
Ville Syrjälä	cdf8dd7f88	drm/i915: Add POWER_DOMAIN_VGA VGA registers/memory live inside the the display power well. Add a power domain for VGA. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-20 23:48:45 +02:00
Ben Widawsky	040d2baa62	drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF We'd only ever used this define to denote whether or not we have the dynamic parity feature (DPF) and never to determine whether or not L3 exists. Baytrail is a good example of where L3 exists, and not DPF. This patch provides clarify in the code for future use cases which might want to actually query whether or not L3 exists. v2: Add /* DPF == dynamic parity feature */ Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:41:00 +02:00
Ben Widawsky	3ccfd19dea	drm/i915: Do remaps for all contexts On both Ivybridge and Haswell, row remapping information is saved and restored with context. This means, we never actually properly supported the l3 remapping because our sysfs interface is asynchronous (and not tied to any context), and the known faulty HW would be reused by the next context to run. Not that due to the asynchronous nature of the sysfs entry, there is no point modifying the registers for the existing context. Instead we set a flag for all contexts to load the correct remapping information on the next run. Interested clients can use debugfs to determine whether or not the row has been remapped. One could propose at this point that we just do the remapping in the kernel. I guess since we have to maintain the sysfs interface anyway, I'm not sure how useful it is, and I do like keeping the policy in userspace; (it wasn't my original decision to make the interface the way it is, so I'm not attached). v2: Force a context switch when we have a remap on the next switch. (Ville) Don't let userspace use the interface with disabled contexts. v3: Don't force a context switch, just let it nop Improper context slice remap initialization, 1<<1 instead of 1<<i, but I rewrote it to avoid a second round of confusion. Error print moved to error path (All Ville) Added a comment on why the slice remap initialization happens. CC: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:39:56 +02:00
Ben Widawsky	a33afea5ff	drm/i915: Keep a list of all contexts I have implemented this patch before without creating a separate list (I'm having trouble finding the links, but the messages ids are: <1364942743-6041-2-git-send-email-ben@bwidawsk.net> <1365118914-15753-9-git-send-email-ben@bwidawsk.net>) However, the code is much simpler to just use a list and it makes the code from the next patch a lot more pretty. As you'll see in the next patch, the reason for this is to be able to specify when a context needs to get L3 remapping. More details there. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:39:43 +02:00

1 2 3 4 5 ...

881 Commits