metal : concurrently dispatch commands (#2358)
* metal: concurrently dispatch commands

  Function `ggml_metal_graph_find_concurrency` will run and write commands that can be issued concurrently to metal context `concur_list` array, when `ggml_metal_graph_compute` is called for the first time.

* metal: don't call find_concurrency automatically.

* metal : code style changes

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
parent 9a08eaf3c4
commit 1aa18ef994
3 changed files with 138 additions and 19 deletions
@@ -61,6 +61,13 @@ void ggml_metal_set_tensor(struct ggml_metal_context * ctx, struct ggml_tensor *
 // get data from the device into host memory
 void ggml_metal_get_tensor(struct ggml_metal_context * ctx, struct ggml_tensor * t);
 
+// try to find operations that can be run concurrently in the graph
+// you should run it again if the topology of your graph changes
+void ggml_metal_graph_find_concurrency(struct ggml_metal_context * ctx, struct ggml_cgraph * gf);
+
+// if the graph has been optimized for concurrently dispatch
+bool ggml_metal_if_optimized(struct ggml_metal_context * ctx);
+
 // same as ggml_graph_compute but uses Metal
 // creates gf->n_threads command buffers in parallel
 void ggml_metal_graph_compute(struct ggml_metal_context * ctx, struct ggml_cgraph * gf);
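The header comments only say that `ggml_metal_graph_find_concurrency` looks for operations that can run concurrently and that the result lives in the context's `concur_list`. The implementation is not part of this hunk, so the following is only a toy sketch of one way such a pass could work, not the actual code. It assumes nodes are stored in topological order (each input index is smaller than the node's own index) and that a `-1` entry marks a barrier between batches. The `toy_*` names, the two-input node layout, and the barrier encoding are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_NODES 16

// Hypothetical graph node: up to two inputs, given as indices of earlier
// nodes in the graph; -1 means "no input" (the node reads only leaf data).
struct toy_node {
    int src0;
    int src1;
};

// Hypothetical stand-in for the Metal context. Only the schedule is modeled.
struct toy_ctx {
    int concur_list[2 * TOY_MAX_NODES]; // node indices; -1 = barrier (assumed encoding)
    int concur_list_len;                // 0 means no schedule has been computed
};

// Mirrors the idea of ggml_metal_if_optimized: has a schedule been computed?
static bool toy_if_optimized(const struct toy_ctx * ctx) {
    return ctx->concur_list_len > 0;
}

// Level scheduling: a node's level is one more than the deepest of its inputs.
// Nodes on the same level do not depend on each other, so they form one batch
// that could be dispatched concurrently, followed by a barrier.
static void toy_find_concurrency(struct toy_ctx * ctx, const struct toy_node * nodes, int n) {
    assert(n >= 0 && n <= TOY_MAX_NODES);

    int level[TOY_MAX_NODES];
    int max_level = -1;

    for (int i = 0; i < n; i++) {
        const int s0 = nodes[i].src0;
        const int s1 = nodes[i].src1;
        assert(s0 < i && s1 < i); // topological order is assumed

        int l = 0;
        if (s0 >= 0 && level[s0] + 1 > l) l = level[s0] + 1;
        if (s1 >= 0 && level[s1] + 1 > l) l = level[s1] + 1;

        level[i] = l;
        if (l > max_level) max_level = l;
    }

    // Emit one batch per level, each terminated by a barrier marker.
    ctx->concur_list_len = 0;
    for (int l = 0; l <= max_level; l++) {
        for (int i = 0; i < n; i++) {
            if (level[i] == l) {
                ctx->concur_list[ctx->concur_list_len++] = i;
            }
        }
        ctx->concur_list[ctx->concur_list_len++] = -1;
    }
}
```

For a five-node graph where nodes 0 and 1 have no inputs, node 2 reads 0 and 1, node 3 reads 0, and node 4 reads 2 and 3, this sketch produces the list `0 1 -1 2 3 -1 4 -1`. Since the commit message notes that the pass is no longer run automatically, a caller following the same pattern would check `toy_if_optimized(&ctx)` and call `toy_find_concurrency` itself before computing, and again whenever the graph topology changes.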