Using Dask for single-machine parallel computing

This example shows the simplest usage of the Dask backend on your local machine.

This is useful for prototyping a solution that will later run on a truly distributed Dask cluster, since the only change needed is the cluster class.

Another realistic scenario is combining Dask code with joblib code, for instance using Dask to preprocess data and scikit-learn for machine learning. In such a setting, it can be useful to use dask.distributed as the backend scheduler for both Dask and joblib, so that a single scheduler orchestrates the whole computation.
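As a minimal sketch of that mixed scenario, the snippet below runs a Dask step and a joblib step on the same local cluster. The functions clean and score are hypothetical stand-ins for a preprocessing step and a model-fitting step; they are not part of the original example. The cluster is created with processes=False so that the workers are threads of the current process, which is an assumption made to keep the sketch light.

```python
import dask
import joblib
from dask.distributed import Client, LocalCluster


def clean(x):
    # Hypothetical preprocessing step, scheduled through dask.delayed.
    return x * 2


def score(x):
    # Hypothetical stand-in for a model-fitting step, scheduled through joblib.
    return x + 1


# processes=False keeps the workers as threads inside the current process.
cluster = LocalCluster(n_workers=2, threads_per_worker=1, processes=False)
# Creating the client also makes it the default scheduler for Dask computations.
client = Client(cluster)

# Dask part: build lazy tasks, then compute them on the cluster.
cleaned = dask.compute(*[dask.delayed(clean)(x) for x in range(5)])

# joblib part: the same cluster executes the joblib tasks.
with joblib.parallel_config(backend="dask"):
    scores = joblib.Parallel()(joblib.delayed(score)(x) for x in cleaned)

print(scores)

client.close()
cluster.close()
```

Both steps go through one scheduler, so the Dask dashboard shows the preprocessing tasks and the joblib tasks side by side.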

Set up the distributed client

from dask.distributed import Client, LocalCluster

# replace with whichever cluster class you're using
# https://docs.dask.org/en/stable/deploying.html#distributed-computing
cluster = LocalCluster()
# connect client to your cluster
client = Client(cluster)

# Monitor your computation with the Dask dashboard
print(client.dashboard_link)
http://127.0.0.1:8787/status
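To illustrate that the cluster class is the only part that changes, the setup can be wrapped in a small helper that takes the class as a parameter. The helper name start_client and its arguments are assumptions made for this sketch, not part of joblib or Dask. Here it is called with LocalCluster; on a real deployment one would pass a different cluster class and its own keyword arguments.

```python
from dask.distributed import Client, LocalCluster


def start_client(cluster_class=LocalCluster, **cluster_kwargs):
    """Hypothetical helper: start a cluster of the given class and connect to it."""
    cluster = cluster_class(**cluster_kwargs)
    client = Client(cluster)
    return cluster, client


# Local prototype: two single-threaded workers living in the current process.
cluster, client = start_client(
    LocalCluster, n_workers=2, threads_per_worker=1, processes=False
)

# A trivial task to check that the client can reach the workers.
total = client.submit(sum, [1, 2, 3]).result()
print(total)

client.close()
cluster.close()
```

The rest of the code, including the joblib calls, stays the same regardless of which cluster class is passed in.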

Run parallel computation using dask.distributed

import time
import joblib


def long_running_function(i):
    time.sleep(0.1)
    return i

The verbose messages below show that the backend is indeed the dask.distributed one.

with joblib.parallel_config(backend="dask"):
    joblib.Parallel(verbose=100)(
        joblib.delayed(long_running_function)(i) for i in range(10)
    )
[Parallel(n_jobs=-1)]: Using backend DaskDistributedBackend with 2 concurrent workers.
[Parallel(n_jobs=-1)]: Done   1 tasks      | elapsed:    0.3s
[Parallel(n_jobs=-1)]: Done   2 tasks      | elapsed:    0.4s
[Parallel(n_jobs=-1)]: Done   3 tasks      | elapsed:    0.4s
[Parallel(n_jobs=-1)]: Done   4 tasks      | elapsed:    0.4s
[Parallel(n_jobs=-1)]: Done   5 tasks      | elapsed:    0.5s
[Parallel(n_jobs=-1)]: Done   6 tasks      | elapsed:    0.6s
[Parallel(n_jobs=-1)]: Done   7 tasks      | elapsed:    0.6s
[Parallel(n_jobs=-1)]: Done   8 out of  10 | elapsed:    0.7s remaining:    0.2s
[Parallel(n_jobs=-1)]: Done  10 out of  10 | elapsed:    0.8s finished

