Commit graph

1445 commits

Author SHA1 Message Date
Daniel Vetter
82d5b58f13 drm/i915: Update DRIVER_DATE to 20150522
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-22 19:45:27 +02:00
Chris Wilson
d0bc54f2f0 drm/i915: Introduce DRM_I915_THROTTLE_JIFFIES
As Daniel commented on

commit b7ffe1362c5f468b853223acc9268804aa92afc8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Apr 27 13:41:24 2015 +0100

    drm/i915: Free RPS boosts for all laggards

it is better to be explicit when sharing hardcoded values such as
throttle/boost timeouts. Make it so!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-22 08:59:52 +02:00
Damien Lespiau
5d96d8afcf drm/i915/skl: Deinit/init the display at suspend/resume
We need to re-init the display hardware when going out of suspend. This
includes:

  - Hooking the PCH to the reset logic
  - Restoring CDCDLK
  - Enabling the DDB power

Among those, only the CDCDLK one is a bit tricky. There's some
complexity in that:

  - DPLL0 (which is the source for CDCLK) has two VCOs, each with a set
    of supported frequencies. As eDP also uses DPLL0 for its link rate,
    once DPLL0 is on, we restrict the possible eDP link rates the chosen
    VCO.
  - CDCLK also limits the bandwidth available to push pixels.

So, as a first step, this commit restore what the BIOS set, until I can
do more testing.

In case that's of interest for the reviewer, I've unit tested the
function that derives the decimal frequency field:

  #include <stdio.h>
  #include <stdint.h>
  #include <assert.h>

  #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))

  static const struct dpll_freq {
          unsigned int freq;
          unsigned int decimal;
  } freqs[] = {
          { .freq = 308570, .decimal = 0b01001100111},
          { .freq = 337500, .decimal = 0b01010100001},
          { .freq = 432000, .decimal = 0b01101011110},
          { .freq = 450000, .decimal = 0b01110000010},
          { .freq = 540000, .decimal = 0b10000110110},
          { .freq = 617140, .decimal = 0b10011010000},
          { .freq = 675000, .decimal = 0b10101000100},
  };

  static void intbits(unsigned int v)
  {
          int i;

          for(i = 10; i >= 0; i--)
                  putchar('0' + ((v >> i) & 1));
  }

  static unsigned int freq_decimal(unsigned int freq /* in kHz */)
  {
          return (freq - 1000) / 500;
  }

  static void test_freq(const struct dpll_freq *entry)
  {
          unsigned int decimal = freq_decimal(entry->freq);

          printf("freq: %d, expected: ", entry->freq);
          intbits(entry->decimal);
          printf(", got: ");
          intbits(decimal);
          putchar('\n');

          assert(decimal == entry->decimal);
  }

  int main(int argc, char **argv)
  {
          int i;

          for (i = 0; i < ARRAY_SIZE(freqs); i++)
                  test_freq(&freqs[i]);

          return 0;
  }

v2:
  - Rebase on top of -nightly
  - Use (freq - 1000) / 500 for the decimal frequency (Ville)
  - Fix setting the enable bit of HSW_NDE_RSTWRN_OPT (Ville)
  - Rename skl_display_{resume,suspend} to skl_{init,uninit}_cdclk to
    be consistent with the BXT code (Ville)
  - Store boot CDCLK in ddi_pll_init (Ville)
  - Merge dev_priv's skl_boot_cdclk into cdclk_freq
  - Use LCPLL_PLL_LOCK instead of (1 << 30) (Ville)
  - Replace various '0' by SKL_DPLL0 to be a bit more explicit that
    we're programming DPLL0
  - Busy poll the PCU before doing the frequency change. It takes about
    3/4 cycles, each separated by 10us, to get the ACK from the CPU
    (Ville)

v3:
  - Restore dev_priv->skl_boot_cdclk, leaving unification with
    dev_priv->cdclk_freq for a later patch (Daniel, Ville)

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-21 22:50:15 +02:00
Chris Wilson
2e1b873072 drm/i915: Convert RPS tracking to a intel_rps_client struct
Now that we have internal clients, rather than faking a whole
drm_i915_file_private just for tracking RPS boosts, create a new struct
intel_rps_client and pass it along when waiting.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/rq/req/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-21 15:11:44 +02:00
Chris Wilson
bcafc4e38b drm/i915: Limit mmio flip RPS boosts
Since we will often pageflip to an active surface, we will often have to
wait for the surface to be written before issuing the flip. Also we are
likely to wait on that surface in plenty of time before the vblank.
Since we have a mechanism for boosting when a flip misses the expected
vblank, curtain the number of times we RPS boost when simply waiting for
mmioflip.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/rq/req/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-21 15:11:44 +02:00
Chris Wilson
a6f766f397 drm/i915: Limit ring synchronisation (sw sempahores) RPS boosts
Ring switches can occur many times per frame, and are often out of
control, causing frequent RPS boosting for no practical benefit. Treat
the sw semaphore synchronisation as a separate client and only allow it
to boost once per busy/idle cycle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/rq/req/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-21 15:11:43 +02:00
Chris Wilson
b47161858b drm/i915: Implement inter-engine read-read optimisations
Currently, we only track the last request globally across all engines.
This prevents us from issuing concurrent read requests on e.g. the RCS
and BCS engines (or more likely the render and media engines). Without
semaphores, we incur costly stalls as we synchronise between rings -
greatly impacting the current performance of Broadwell versus Haswell in
certain workloads (like video decode). With the introduction of
reference counted requests, it is much easier to track the last request
per ring, as well as the last global write request so that we can
optimise inter-engine read read requests (as well as better optimise
certain CPU waits).

v2: Fix inverted readonly condition for nonblocking waits.
v3: Handle non-continguous engine array after waits
v4: Rebase, tidy, rewrite ring list debugging
v5: Use obj->active as a bitfield, it looks cool
v6: Micro-optimise, mostly involving moving code around
v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf)
v8: Rebase
v9: Refactor i915_gem_object_sync() to allow the compiler to better
optimise it.

Benchmark: igt/gem_read_read_speed
hsw:gt3e (with semaphores):
Before: Time to read-read 1024k:		275.794µs
After:  Time to read-read 1024k:		123.260µs

hsw:gt3e (w/o semaphores):
Before: Time to read-read 1024k:		230.433µs
After:  Time to read-read 1024k:		124.593µs

bdw-u (w/o semaphores):             Before          After
Time to read-read 1x1:            26.274µs       10.350µs
Time to read-read 128x128:        40.097µs       21.366µs
Time to read-read 256x256:        77.087µs       42.608µs
Time to read-read 512x512:       281.999µs      181.155µs
Time to read-read 1024x1024:    1196.141µs     1118.223µs
Time to read-read 2048x2048:    5639.072µs     5225.837µs
Time to read-read 4096x4096:   22401.662µs    21137.067µs
Time to read-read 8192x8192:   89617.735µs    85637.681µs

Testcase: igt/gem_concurrent_blit (read-read and friends)
Cc: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v8]
[danvet: s/\<rq\>/req/g]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-21 15:11:42 +02:00
Imre Deak
b88baa2a46 drm/i915/skl: add F0 stepping ID
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-20 11:26:11 +02:00
Vandana Kannan
b6dc71f38a drm/i915/bxt: Port PLL programming BUN
BUN 1: prop_coeff, int_coeff, tdctargetcnt programming updated and tied to
VCO frequencies. Program i_lockthresh in PORT_PLL_9.

VCO calculated based on the formula:
Desired Output = Port bit rate in MHz (DisplayPort HBR2 is 5400 MHz)
Fast Clock = Desired Output / 2
VCO = Fast Clock * P1 * P2

Prop_coeff, int_coeff, and tdctargetcnt modified according to above
calculation.

BUN 2: Port PLLs require additional programming at certain frequencies -
DCO amplitude in PORT_PLL_10

Review comments from Siva which were addressed in the initial version of the
patch.
	- Change PORT_PLL_LOCK_THRESHOLD to PORT_PLL_LOCK_THRESHOLD_MASK
	- Calculate for HDMI
	- Correct values for vco = 5.4
	- return in case of invalid vco range

v2: Imre's review comments addressed
	- change dcoampovr_en to dcoampovr_en_h
	- change PORT_PLL_DCO_AMP_OVR_EN to PORT_PLL_DCO_AMP_OVR_EN_H
	- Correct lane stagger value for 324MHz
	- Make coef common for HDMI and DP
	- remove superfluous comments

v3: Imre's comments addressed
	- Remove Prop_coeff, int_coeff, tdctargetcnt, dcoampovr_en, gain_ctl,
	dcoampovr_en_h from bxt_clk_div and make them local variables.

Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> [v1]
Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-20 11:26:08 +02:00
Jani Nikula
0c9b371550 drm/i915: add HAS_DP_MST feature test macro
Be in line with other features that we have.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-20 11:26:07 +02:00
Chris Wilson
b2cfe0ab63 drm/i915: Fix race on unreferencing the wrong mmio-flip-request
As we perform the mmio-flip without any locking and then try to acquire
the struct_mutex prior to dereferencing the request, it is possible for
userspace to queue a new pageflip before the worker can finish clearing
the old state - and then it will clear the new flip request. The result
is that the new flip could be completed before the GPU has finished
rendering.

The bugs stems from removing the seqno checking in
commit 536f5b5e86
Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Date:   Thu Nov 6 11:03:40 2014 +0200

    drm/i915: Make mmio flip wait for seqno in the work function

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-20 11:25:46 +02:00
Chris Wilson
2e2f351dbf drm/i915: Remove domain flubbing from i915_gem_object_finish_gpu()
We no longer interpolate domains in the same manner, and even if we did,
we should trust setting either of the other write domains would trigger
an invalidation rather than force it. Remove the tweaking of the
read_domains since it serves no purpose and use
i915_gem_object_wait_rendering() directly.

Note that this goes back to

commit a8198eea15
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Apr 13 22:04:09 2011 +0100

    drm/i915: Introduce i915_gem_object_finish_gpu()

and gpu domain tracking died in

commit cc889e0f6c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Jun 13 20:45:19 2012 +0200

    drm/i915: disable flushing_list/gpu_write_list

which is more than 1 year older.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add notes with information dug out of git history.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-20 11:25:45 +02:00
Chandra Konduru
2cd601c620 drm/i915: Adding dbuf support for skl nv12 format.
Skylake nv12 format requires dbuf (aka. ddb) calculations
and programming for each of y and uv sub-planes. Made minor
changes to reuse current dbuf calculations and programming
for uv plane. i.e., with this change, existing computation
is used for either packed format or uv portion of nv12
depending on incoming format. Added new code for dbuf
computation and programming for y plane.

This patch is a pre-requisite for adding NV12 format support.
Actual nv12 support is coming in later patches.

Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-20 11:25:39 +02:00
Daniel Vetter
214a2b7fab drm/i915: Update DRIVER_DATE to 20150508
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-08 17:38:19 +02:00
Ville Syrjälä
7072246887 drm/i915: Work around DISPLAY_PHY_CONTROL register corruption on CHV
Sometimes (exactly when is a bit unclear) DISPLAY_PHY_CONTROL appears to
get corrupted. The values I've managed to read from it seem to have some
pattern but vary quite a lot. The corruption doesn't seem to just happen
when the register is accessed, but can also happen spontaneosly during
modeset. When this happens during a modeset things go south and the
display doesn't light up.

I've managed to hit the problemn when toggling HDMI on port D on and
off. When things get corrupted the display doesn't light up, but as soon
as I manually write the correct value to the register the display comes
up.

First I was suspicious that we ourselves accidentally overwrite it with
garbage, but didn't catch anything with the reg_rw tracepoint. Also I
sprinkled check all over the modeset path to see exactly when the
corruption happens, and eg. the read back value was fine just before
intel_dp_set_m(), and corrupted immediately after it. I also made my
check function repair the register value whenever it was wrong, and with
this approach the corruption repeated several times during the modeset
operation, always seeming to trigger in the same exact calls to the
check function, while other calls to the function never caught anything.

So far I've not seen this problem occurring when carefully avoiding all
read accesses to DISPLAY_PHY_CONTROL. Not sure if that's just pure luck
or an actual workaround, but we can hope it works. So let's avoid reading
the register and instead track the desired value of the register in dev_priv.

v2: Read out the power well state to determine initial register value
v3: Use DPIO_CHx names instead of raw numbers

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by:  Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-08 15:56:30 +02:00
Maarten Lankhorst
27321ae88c drm/i915: Use the disable callback for disabling planes.
This allows disabling all planes affecting a crtc without caring what type it is.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-08 13:03:52 +02:00
Sonika Jindal
9e45803465 drm/i915/skl: Add module parameter to select edp vswing table
This provides an option to override the value set by VBT
for selecting edp Vswing Pre-emph setting table.

v2: Adding comment about this being a temporary workaround and
making the parameter read-only (Jani)
v3: Changing mode to 0400 instead of 0 (Jani)

https://bugs.freedesktop.org/show_bug.cgi?id=89554
Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-08 13:03:41 +02:00
Damien Lespiau
71cd8423cd drm/i915/skl: Fix the CTRL typo in the DPLL_CRTL1 defines
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-08 13:03:32 +02:00
Suketu Shah
00776511da drm/i915/skl: Enable runtime PM
Enable runtime PM for Skylake platform

v2: After adding dmc ver 1.0 support rebased on top of nightly. (Animesh)

Issue: VIZ-2819
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Suketu Shah <suketu.j.shah@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-08 13:03:17 +02:00
Suketu Shah
dc17430054 drm/i915/skl: Add DC5 Trigger Sequence
Add triggers as per expectations mentioned in gen9_enable_dc5
and gen9_disable_dc5 patch.

Also call POSTING_READ for every write to a register to ensure that
its written immediately.

v1: Remove POSTING_READ calls as they've already been added in previous patches.

v2: Rebase to move all runtime pm specific changes to intel_runtime_pm.c file.

Modified as per review comments from Imre:
1] Change variable name 'dc5_allowed' to 'dc5_enabled' to correspond to relevant
   functions.
2] Move the check dc5_enabled in skl_set_power_well() to disable DC5 into
   gen9_disable_DC5 which is a more appropriate place.
3] Convert checks for 'pm.dc5_enabled' and 'pm.suspended' in skl_set_power_well()
   to warnings. However, removing them for now as they'll be included in a future patch
   asserting DC-state entry/exit criteria.
4] Enable DC5, only when CSR firmware is verified to be loaded. Create new structure
   to track 'enabled' and 'deferred' status of DC5.
5] Ensure runtime PM reference is obtained, if CSR is not loaded, to avoid entering
   runtime-suspend and release it when it's loaded.
6] Protect necessary CSR-related code with locks.
7] Move CSR-loading call to runtime PM initialization, as power domains needed to be
   accessed during deferred DC5-enabling, are not initialized earlier.

v3: Rebase to latest.

Modified as per review comments from Imre:
1] Use blocking wait for CSR-loading to finish to enable DC5  for simplicity, instead of
   deferring enabling DC5 until CSR is loaded.
2] Obtain runtime PM reference during CSR-loading initialization itself as deferred DC5-
   enabling is removed and release it at the end of CSR-loading functionality.
3] Revert calling CSR-loading functionality to the beginning of i915 driver-load
   functionality to avoid any delay in loading.
4] Define another variable to track whether CSR-loading failed and use it to avoid enabling
   DC5 if it's true.
5] Define CSR-load-status accessor functions for use later.

v4:
1] Disable DC5 before enabling PG2 instead of after it.
2] DC5 was being mistaken enabled even when CSR-loading timed-out. Fix that.
3] Enable DC5-related functionality using a macro.
4] Remove dc5_enabled tracking variable and its use as it's not needed now.

v5:
1] Mark CSR failed to load where necessary in finish_csr_load function.
2] Use mutex-protected accessor function to check if CSR loaded instead of directly
   accessing the variable.
3] Prefix csr_load_status_get/set function names with intel_.

v6: rebase to latest.
v7: Rebase on top of nightly (Damien)
v8: Squashed the patch from Imre - added csr helper pointers to simplify the code. (Imre)
v9: After adding dmc ver 1.0 support rebased on top of nightly. (Animesh)
v10: Added a enum for different csr states, suggested by Imre. (Animesh)

v11: Based on review comments from Imre, Damien and Daniel following changes done
- enum name chnaged to csr_state (singular form).
- FW_UNINITIALIZED used as zeroth element in enum csr_state.
- Prototype changed for helper function(set/get csr status), using enum csr_state instead of bool.

v12: Based on review comment from Imre, introduced bool fw_loaded local to finish_csr_load() which helps
calling once to set the csr status. The same flag used to fail RPM if find any issue during
firmware loading.

Issue: VIZ-2819
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Suketu Shah <suketu.j.shah@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-08 13:03:11 +02:00
Daniel Vetter
eb805623d8 drm/i915/skl: Add support to load SKL CSR firmware.
Display Context Save and Restore support is needed for
various SKL Display C states like DC5, DC6.

This implementation is added based on first version of DMC CSR program
that we received from h/w team.

Here we are using request_firmware based design.
Finally this firmware should end up in linux-firmware tree.

For SKL platform its mandatory to ensure that we load this
csr program before enabling DC states like DC5/DC6.

As CSR program gets reset on various conditions, we should ensure
to load it during boot and in future change to be added to load
this system resume sequence too.

v1: Initial relese as RFC patch

v2: Design change as per Daniel, Damien and Shobit's review comments
request firmware method followed.

v3: Some optimization and functional changes.
Pulled register defines into drivers/gpu/drm/i915/i915_reg.h
Used kmemdup to allocate and duplicate firmware content.
Ensured to free allocated buffer.

v4: Modified as per review comments from Satheesh and Daniel
Removed temporary buffer.
Optimized number of writes by replacing I915_WRITE with I915_WRITE64.

v5:
Modified as per review comemnts from Damien.
- Changed name for functions and firmware.
- Introduced HAS_CSR.
- Reverted back previous change and used csr_buf with u8 size.
- Using cpu_to_be64 for endianness change.

Modified as per review comments from Imre.
- Modified registers and macro names to be a bit closer to bspec terminology
and the existing register naming in the driver.
- Early return for non SKL platforms in intel_load_csr_program function.
- Added locking around CSR program load function as it may be called
concurrently during system/runtime resume.
- Releasing the fw before loading the program for consistency
- Handled error path during f/w load.

v6: Modified as per review comments from Imre.
- Corrected out_freecsr sequence.

v7: Modified as per review comments from Imre.
Fail loading fw if fw->size%8!=0.

v8: Rebase to latest.

v9: Rebase on top of -nightly (Damien)

v10: Enabled support for dmc firmware ver 1.0.
According to ver 1.0 in a single binary package all the firmware's that are
required for different stepping's of the product will be stored. The package
contains the css header, followed by the package header and the actual dmc
firmwares. Package header contains the firmware/stepping mapping table and
the corresponding firmware offsets to the individual binaries, within the
package. Each individual program binary contains the header and the payload
sections whose size is specified in the header section. This changes are done
to extract the specific firmaware from the package. (Animesh)

v11: Modified as per review comemnts from Imre.
- Added code comment from bpec for header structure elements.
- Added __packed to avoid structure padding.
- Added helper functions for stepping and substepping info.
- Added code comment for CSR_MAX_FW_SIZE.
- Disabled BXT firmware loading, will be enabled with dmc 1.0 support.
- Changed skl_stepping_info based on bspec, earlier used from config DB.
- Removed duplicate call of cpu_to_be* from intel_csr_load_program function.
- Used cpu_to_be32 instead of cpu_to_be64 as firmware binary in dword aligned.
- Added sanity check for header length.
- Added sanity check for mmio address got from firmware binary.
- kmalloc done separately for dmc header and dmc firmware. (Animesh)

v12: Modified as per review comemnts from Imre.
- Corrected the typo error in skl stepping info structure.
- Added out-of-bound access for skl_stepping_info.
- Sanity check for mmio address modified.
- Sanity check added for stepping and substeppig.
- Modified the intel_dmc_info structure, cache only the required header info. (Animesh)

v13: clarify firmware load error message.
The reason for a firmware loading failure can be obscure if the driver
is built-in. Provide an explanation to the user about the likely reason for
the failure and how to resolve it. (Imre)

v14: Suggested by Jani.
- fix s/I915/CONFIG_DRM_I915/ typo
- add fw_path to the firmware object instead of using a static ptr (Jani)

v15:
1) Changed the firmware name as dmc_gen9.bin, everytime for a new firmware version a symbolic link
with same name will help not to build kernel again.
2) Changes done as per review comments from Imre.
- Error check removed for intel_csr_ucode_init.
- Moved csr-specific data structure to intel_csr.h and optimization done on structure definition.
- fw->data used directly for parsing the header info & memory allocation
only done separately for payload. (Animesh)

v16:
- No need for out_regs label in i915_driver_load(), so removed it.
- Changed the firmware name as skl_dmc_ver1.bin, followed naming convention <platform>_dmc_<api-version>.bin (Animesh)

Issue: VIZ-2569
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-08 13:03:10 +02:00
Dave Airlie
e1dee1973c Merge tag 'drm-intel-next-2015-04-23-fixed' of git://anongit.freedesktop.org/drm-intel into drm-next
drm-intel-next-2015-04-23:
- dither support for ns2501 dvo (Thomas Richter)
- some polish for the gtt code and fixes to finally enable the cmd parser on hsw
- first pile of bxt stage 1 enabling (too many different people to list ...)
- more psr fixes from Rodrigo
- skl rotation support from Chandra
- more atomic work from Ander and Matt
- pile of cleanups and micro-ops for execlist from Chris
drm-intel-next-2015-04-10:
- cdclk handling cleanup and fixes from Ville
- more prep patches for olr removal from John Harrison
- gmbus pin naming rework from Jani (prep for bxt)
- remove ->new_config from Ander (more atomic conversion work)
- rps (boost) tuning and unification with byt/bsw from Chris
- cmd parser batch bool tuning from Chris
- gen8 dynamic pte allocation (Michel Thierry, based on work from Ben Widawsky)
- execlist tuning (not yet all of it) from Chris
- add drm_plane_from_index (Chandra)
- various small things all over

* tag 'drm-intel-next-2015-04-23-fixed' of git://anongit.freedesktop.org/drm-intel: (204 commits)
  drm/i915/gtt: Allocate va range only if vma is not bound
  drm/i915: Enable cmd parser to do secure batch promotion for aliasing ppgtt
  drm/i915: fix intel_prepare_ddi
  drm/i915: factor out ddi_get_encoder_port
  drm/i915/hdmi: check port in ibx_infoframe_enabled
  drm/i915/hdmi: fix vlv infoframe port check
  drm/i915: Silence compiler warning in dvo
  drm/i915: Update DRIVER_DATE to 20150423
  drm/i915: Enable dithering on NatSemi DVO2501 for Fujitsu S6010
  rm/i915: Move i915_get_ggtt_vma_pages into ggtt_bind_vma
  drm/i915: Don't try to outsmart gcc in i915_gem_gtt.c
  drm/i915: Unduplicate i915_ggtt_unbind/bind_vma
  drm/i915: Move ppgtt_bind/unbind around
  drm/i915: move i915_gem_restore_gtt_mappings around
  drm/i915: Fix up the vma aliasing ppgtt binding
  drm/i915: Remove misleading comment around bind_to_vm
  drm/i915: Don't use atomics for pg_dirty_rings
  drm/i915: Don't look at pg_dirty_rings for aliasing ppgtt
  drm/i915/skl: Support Y tiling in MMIO flips
  drm/i915: Fixup kerneldoc for struct intel_context
  ...

Conflicts:
	drivers/gpu/drm/i915/i915_drv.c
2015-05-08 20:51:06 +10:00
Imre Deak
faa0cdbec1 drm/i915: fix intel_prepare_ddi
At the moment intel_prepare_ddi buffer will iterate through both MST and
CRT encoders, which is incorrect. Neither of these encoder types have an
embedding intel_digital_port object, so for these encoder types we will
use random data when dereferencing the corresponding
intel_digital_port->port field.

Introduced in
commit b403745c84
Author: Damien Lespiau <damien.lespiau@intel.com>
Date:   Mon Aug 4 22:01:33 2014 +0100

    drm/i915: Iterate through the initialized DDIs to prepare their buffers

v2:
- fix getting at the port for MST encoders too
- make sure that intel_prepare_ddi_buffers() gets called for port E too
  (Paulo)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90067
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-04-30 12:33:09 +03:00
Daniel Vetter
de4de566f8 drm/i915: Update DRIVER_DATE to 20150423
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-23 22:02:54 +02:00
Daniel Vetter
0875546c53 drm/i915: Fix up the vma aliasing ppgtt binding
Currently we have the problem that the decision whether ptes need to
be (re)written is splattered all over the codebase. Move all that into
i915_vma_bind. This needs a few changes:
- Just reuse the PIN_* flags for i915_vma_bind and do the conversion
  to vma->bound in there to avoid duplicating the conversion code all
  over.
- We need to make binding for EXECBUF (i.e. pick aliasing ppgtt if
  around) explicit, add PIN_USER for that.
- Two callers want to update ptes, give them a PIN_UPDATE for that.

Of course we still want to avoid double-binding, but that should be
taken care of:
- A ppgtt vma will only ever see PIN_USER, so no issue with
  double-binding.
- A ggtt vma with aliasing ppgtt needs both types of binding, and we
  track that properly now.
- A ggtt vma without aliasing ppgtt could be bound twice. In the
  lower-level ->bind_vma functions hence unconditionally set
  GLOBAL_BIND when writing the ggtt ptes.

There's still a bit room for cleanup, but that's for follow-up
patches.

v2: Fixup fumbles.

v3: s/PIN_EXECBUF/PIN_USER/ for clearer meaning, suggested by Chris.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-04-23 21:06:39 +02:00
Tvrtko Ursulin
7df113e473 drm/i915: Fixup kerneldoc for struct intel_context
commit ae6c480692
   Author: Daniel Vetter <daniel.vetter@ffwll.ch>
   Date:   Wed Aug 6 15:04:53 2014 +0200

       drm/i915: Only track real ppgtt for a context

Changed the code but didn't update kerneldoc.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: "Thierry, Michel" <michel.thierry@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-20 09:07:39 -07:00
Tvrtko Ursulin
8a0c39b162 drm/i915: Simplify and fix object to display tracking
Purpose of this tracking is to know when to flush the cache between
the CPU and the non-coherent display engine. Prior to:

   commit 121920faf2
   Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
   Date:   Mon Mar 23 11:10:37 2015 +0000

       drm/i915/skl: Query display address through a wrapper

This worked by a mix of direct flag manipulation and checking for
existence of a pinned GGTT VMA.

With the introduction of rotated display mappings this approach is
no longer correct.

New simpler approach is to just keep this count over calls which pin
and unpin objects to and from display, at the slight cost of extra
space in every bo.

(Inspired and extracted code from a larger rework by Chris Wilson.)

v2: Remove the limit since it is not well defined. (Chris Wilson, Ville Syrjälä)
v3: Commit message corrections. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-20 08:51:45 -07:00
Dave Airlie
2c33ce009c Merge Linus master into drm-next
The merge is clean, but the arm build fails afterwards,
due to API changes in the regulator tree.

I've included the patch into the merge to fix the build.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-20 13:05:20 +10:00
Damien Lespiau
b403745c84 drm/i915: Iterate through the initialized DDIs to prepare their buffers
Not every DDIs is necessarily connected can be strapped off and, in the
future, we'll have platforms with a different number of default DDI
ports. So, let's only call intel_prepare_ddi_buffers() on DDI ports that
are actually detected.

We also use the opportunity to give a struct intel_digital_port to
intel_prepare_ddi_buffers() as we'll need it in a following patch to
query if the port supports HMDI or not.

On my HSW machine this removes the initialization of a couple of
(unused) DDIs.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-16 11:42:38 +02:00
Satheeshakrishna M
dfb8240847 drm/i915/bxt: Define bxt DDI PLLs and implement enable/disable sequence
Plug bxt PLL code into existing shared DPLL framework.

v2: (imre)
- squash in Satheeshakrishna's "Define BXT clock registers" and
  "Add state variables for bxt clock registers" patches
- squash in Vandanas's "Change grp access to lane access for PLL"
- fix group vs. lane access in bxt_ddi_pll_get_hw_state
- add code comment why we read from lane registers while writing to
  group registers
- clean up register macros
- use BXT_PORT_PLL_* macros instead of open-coding the same
- check if BXT_PORT_PCS_DW12_LN01 matches BXT_PORT_PCS_DW12_LN23
  during hardware state readout
- add missing LANESTAGGER_STRAP_OVRD masking
- add note about missing step according to the latest BUN for
  PORT_PLL_9/lockthresh

Signed-off-by: Satheeshakrishna M <satheeshakrishna.m@intel.com> (v1)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-16 11:29:06 +02:00
Vandana Kannan
164dfd2877 drm/i915: Rename vlv_cdclk_freq to cdclk_freq
Rename vlv_cdclk_freq to cdclk_freq so that it can be used for all
platforms as required. Needed by the next patch.

Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-16 09:20:13 +02:00
Rodrigo Vivi
89251b177b drm/i915: PSR: deprecate link_standby support for core platforms.
On Haswell and Broadwell with link in standby when exit event happens
between vblank and VSC packet, PSR exit on panel but DPA transmitter
still sends black pixel. When this condition hits, panel will intermittently
display black frame.

The known W/A for this case involve the of single_frame update
that isn't supported on Haswell and to be supported on Broadwell
3 other workarounds would be required. So it is better and safe to
just deprecate link_standby for now.

Also, link fully off saves more power than link_standby and afwk
no OEM is requesting link standby on VBT. There is no reason for that.

For Skylake let's just consider it behaves like Broadwell until
we prove otherwise.

v2: Fix commit message (Durga).

v3: Fix conflict with PSR2.

Reference: HSD: bdwgfx/1912559
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-14 19:15:17 +02:00
Daniel Vetter
c5fe557dde Merge branch 'topic/bxt-stage1' into drm-intel-next-queued
Separate topic branch for bxt didn't work out since we needed to
refactor the gmbus code a bit to make it look decent. So backmerge.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-04-14 14:00:56 +02:00
Chris Wilson
30154650b8 drm/i915: Remove obj->pin_mappable
The obj->pin_mappable flag only exists for debug purposes and is a
hindrance that is mistreated with rotated GGTT views. For debug
purposes, it suffices to mark objects with pin_display as being of note.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-13 14:25:36 +02:00
Jani Nikula
249e87de5f drm/i915: fix build for DEBUG_FS=n
Fix DEBUG_FS=n build broken by

commit aa7471d228
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Wed Apr 1 11:15:21 2015 +0300

    drm/i915: add i915 specific connector debugfs file for DPCD

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 16:19:42 +02:00
Chris Wilson
d7b9ca2f7a drm/i915: Remove request->uniq
We already assign a unique identifier to every request: seqno. That
someone felt like adding a second one without even mentioning why and
tweaking ABI smells very fishy.

Fixes regression from
commit b3a38998f0
Author: Nick Hoath <nicholas.hoath@intel.com>
Date:   Thu Feb 19 16:30:47 2015 +0000

    drm/i915: Fix a use after free, and unbalanced refcounting

v2: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Nick Hoath <nicholas.hoath@intel.com>
Cc: Thomas Daniel <thomas.daniel@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
[danvet: Fixup because different merge order.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 10:41:11 +02:00
Chris Wilson
a6111f7b66 drm/i915: Reduce locking in execlist command submission
This eliminates six needless spin lock/unlock pairs when writing out
ELSP.

v2: Respin with my preferred colour.
v3: Mostly back to the original colour

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v1]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 10:31:44 +02:00
Chris Wilson
e20d2ab741 drm/i915: Use a separate slab for vmas
vma are more frequently allocated than objects and so should equally
benefit from having a dedicated slab.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 10:17:13 +02:00
Chris Wilson
efab6d8dd1 drm/i915: Use a separate slab for requests
requests are even more frequently allocated than objects and equally
benefit from having a dedicated slab.

v2: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 10:17:07 +02:00
Daniel Vetter
3e1ab4b705 drm/i915: Update DRIVER_DATE to 20150410
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 09:31:40 +02:00
Chris Wilson
94f8cf109e drm/i915: Record ring->start address in error state
This is mostly useful for execlists where the rings switch between
contexts (and so checking that the ring's start register matches the
context is important).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 08:56:07 +02:00
Chris Wilson
8d9d5744c6 drm/i915: Split batch pool into size buckets
Now with the trimmed memcpy before the command parser, we try to
allocate many different sizes of batches, predominantly one or two
pages. We can therefore speed up searching for a good sized batch by
keeping the objects of buckets of roughly the same size.

v2: Add a comment about bucket sizes

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 08:56:05 +02:00
Chris Wilson
06fbca713e drm/i915: Split the batch pool by engine
I woke up one morning and found 50k objects sitting in the batch pool
and every search seemed to iterate the entire list... Painting the
screen in oils would provide a more fluid display.

One issue with the current design is that we only check for retirements
on the current ring when preparing to submit a new batch. This means
that we can have thousands of "active" batches on another ring that we
have to walk over. The simplest way to avoid that is to split the pools
per ring and then our LRU execution ordering will also ensure that the
inactive buffers remain at the front.

v2: execlists still requires duplicate code.
v3: execlists requires more duplicate code

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by:  Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 08:56:04 +02:00
Chris Wilson
ed9ddd25b2 drm/i915: Split i915_gem_batch_pool into its own header
In the next patch, I want to use the structure elsewhere and so require
it defined earlier. Rather than move the definition to an earlier location
where it feels very odd, place it in its own header file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 08:56:03 +02:00
Chris Wilson
1854d5ca0d drm/i915: Deminish contribution of wait-boosting from clients
With boosting for missed pageflips, we have a much stronger indication
of when we need to (temporarily) boost GPU frequency to ensure smooth
delivery of frames. So now only allow each client to perform one RPS boost
in each period of GPU activity due to stalling on results.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 08:56:02 +02:00
Chris Wilson
8fb55197e6 drm/i915: Agressive downclocking on Baytrail
Reuse the same reclocking strategy for Baytail as on its bigger brethren,
Sandybridge and Ivybridge. In particular, this makes the device quicker
to reclock (both up and down) though the tendency now is to downclock
more aggressively to compensate for the RPS boosts.

v2: Rebase
v3: Exclude Cherrytrail as Deepak was concerned that the increased
number of register writes would wake the common powerwell too often.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 08:56:01 +02:00
Chris Wilson
ee286370d6 drm/i915: Cache last obj->pages location for i915_gem_object_get_page()
The biggest user of i915_gem_object_get_page() is the relocation
processing during execbuffer. Typically userspace passes in a set of
relocations in sorted order. Sadly, we alternate between relocations
increasing from the start of the buffers, and relocations decreasing
from the end. However the majority of consecutive lookups will still be
in the same page. We could cache the start of the last sg chain, however
for most callers, the entire sgl is inside a single chain and so we see
no improve from the extra layer of caching.

v2: Avoid the double increment inside unlikely()

References: https://bugs.freedesktop.org/show_bug.cgi?id=88308
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 08:56:00 +02:00
Maarten Lankhorst
b833bb61fd drm/i915: use kref_put_mutex in i915_gem_request_unreference__unlocked
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 08:55:58 +02:00
Nick Hoath
6c74c87f7c drm/i915/bxt: Add Broxton steppings
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-09 15:57:52 +02:00
Damien Lespiau
8232edb5f7 drm/i915/bxt: Broxton raises the maximum number of planes to 4
Pipe A and b have 4 planes.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-09 15:57:50 +02:00