HyperLoom
Stanislav Böhm, Vojtěch Cima
https://hyperloom.eu
IT4Innovations / ADAS / DEF
10:00-11:30 • Introduction to HyperLoom • Hands-on: 'Hello world'
12:30-14:00 • Hands-on: Building a real pipeline • Best practices
14:30-16:00 • Bring your own problems
ExCAPE
Chemogenomics
Pairs of a target gene (GSK3B) and a compound (identified by its InChIKey, e.g. ZZRDAWZXJBVQSO-UYBDAZJANA-N) are labeled as active or inactive:
GSK3B + ZZRDAWZXJBVQSO = inactive
GSK3B + ZZRDINJRCNXHIQ = inactive
...
GSK3B + ZZRDSHKJUCIYQL = active
GSK3B + ZZRDKBQKAVQCMK = inactive
→ labeled dataset
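To make the slide concrete, here is a minimal plain-Python sketch of assembling such a labeled dataset from (target, compound, label) records. The record values are taken from the slide; the variable names and dict layout are illustrative, not part of HyperLoom.

```python
# Build a labeled dataset: target gene -> list of (compound, is_active).
# Compound identifiers are (truncated) InChIKeys from the slide.
records = [
    ("GSK3B", "ZZRDAWZXJBVQSO", "inactive"),
    ("GSK3B", "ZZRDINJRCNXHIQ", "inactive"),
    ("GSK3B", "ZZRDSHKJUCIYQL", "active"),
    ("GSK3B", "ZZRDKBQKAVQCMK", "inactive"),
]

dataset = {}
for target, compound, label in records:
    dataset.setdefault(target, []).append((compound, label == "active"))

print(dataset["GSK3B"][2])  # ('ZZRDSHKJUCIYQL', True)
```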
Modelling
M (machine learning): training data (~80%) → model
V (validation): validation data (~20%) → model accuracy
Cross-validation: the data is split into folds; each fold is used once as validation data (~20%) while the remaining folds (~80%) are used for training (M), followed by validation (V). Averaging the per-fold accuracies gives the cross-validated accuracy.
Modelling parameterization: for a given parametrization P, the whole cross-validation is run to obtain the cross-validated accuracy of that parametrization.
Several parametrizations (P1, P2, P3, ...) are evaluated; the one with the best cross-validated accuracy is picked, yielding the best model (but only for a single target!).
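The scheme on the preceding slides can be sketched in plain Python. This is a conceptual illustration only; the real pipeline runs the folds as HyperLoom tasks, and `make_trainer` / `validate` stand in for the actual ML code.

```python
# k-fold cross-validation: each fold is the validation set once,
# the rest is the training set; accuracies are averaged.
def k_fold_cv(samples, k, train, validate):
    folds = [samples[i::k] for i in range(k)]
    accuracies = []
    for i in range(k):
        validation = folds[i]
        training = [s for j, f in enumerate(folds) if j != i for s in f]
        model = train(training)
        accuracies.append(validate(model, validation))
    return sum(accuracies) / k

# Evaluate several parametrizations and pick the best one.
def pick_best(samples, parametrizations, make_trainer, validate, k=5):
    scored = [(k_fold_cv(samples, k, make_trainer(p), validate), p)
              for p in parametrizations]
    return max(scored)  # (best cross-validated accuracy, best parameters)

# Toy demo: the "parametrization" is a threshold, the "model" predicts
# x >= threshold; the true label is x >= 5.
samples = [(i, i >= 5) for i in range(10)]

def make_trainer(p):
    return lambda training: p  # the "model" is just the threshold p

def validate(model, data):
    return sum((x >= model) == y for x, y in data) / len(data)

best_acc, best_p = pick_best(samples, [0, 5, 9], make_trainer, validate, k=5)
print(best_p, best_acc)  # 5 1.0
```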
Related systems (differing in typical duration of tasks):
• HPC schedulers: PBS, SLURM
• MapReduce: Spark, Hadoop
• Workflow systems: Luigi, Airflow, DAGMan
• Dask/Distributed, Ray (Berkeley), Rain, PyCOMPSs
• Task-based programming: OmpSs, StarPU
HyperLoom https://code.it4i.cz/ADAS/loom
Architecture
• The client submits a pipeline to the server.
• The server schedules tasks for execution on workers.
• Workers process tasks.
• The server returns results to the client.
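The control flow above can be sketched as a toy, single-process "server" that executes a task graph in dependency order. Real HyperLoom distributes the tasks over workers; this sketch only illustrates the scheduling idea, and the pipeline mirrors Example 1.

```python
# Toy in-process execution of a task pipeline in dependency order.
# pipeline: {task name: (function, [names of dependency tasks])}
def execute(pipeline, task):
    cache = {}  # results of already-executed tasks

    def run(name):
        if name not in cache:
            fn, deps = pipeline[name]
            cache[name] = fn(*[run(d) for d in deps])  # run deps first
        return cache[name]

    return run(task)

pipeline = {
    "task1": (lambda: "Hello ", []),
    "task2": (lambda: "world!", []),
    "task3": (lambda a, b: a + b, ["task1", "task2"]),
}
print(execute(pipeline, "task3"))  # Hello world!
```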
Deployment
The client submits a pipeline; the server and all workers run inside a single PBS job; results are returned to the client.
HyperLoom Checklist
• Tasks with inputs and outputs
• Intra-node tasks
• Python code (.py) & external programs (./binary)
• Many tasks (100k+)
• Interdependent tasks
• Heterogeneous tasks
Example 1

import loom.client as lc

# Pipeline definition
task1 = lc.tasks.const("Hello ")
task2 = lc.tasks.const("world!")
task3 = lc.tasks.merge((task1, task2))

# Create a client object
client = lc.Client("localhost", 9010)

# Submit tasks/pipeline
future = client.submit_one(task3)

# Gather result
result = future.gather()
Example 2

import loom.client as lc

# Pipeline definition
task1 = lc.tasks.run("/bin/hostname")

# Create a client object
client = lc.Client("localhost", 9010)

# Submit tasks/pipeline
future = client.submit_one(task1)

# Gather result
print(future.gather())
Example 3A

import loom.client as lc

# Pipeline definition
tasks = [lc.tasks.run("/bin/hostname") for i in range(100)]
task1 = lc.tasks.merge(tasks)

# Create a client object
client = lc.Client("localhost", 9010)

# Submit tasks/pipeline
future = client.submit_one(task1)

# Gather result
print(future.gather())

Output:
r37u12n989
r36u11n321
r36u11n320
r37u12n989
r37u12n989
Example 3B

import loom.client as lc

# Pipeline definition
tasks = [lc.tasks.run("/bin/hostname > output",
                      shell=True,
                      outputs=["output"])
         for i in range(100)]
task1 = lc.tasks.merge(tasks)

# Create a client object
client = lc.Client("localhost", 9010)

# Submit tasks/pipeline
future = client.submit_one(task1)

# Gather result
print(future.gather())
Mapping inputs/outputs

task_load_data = ...
tasks = lc.tasks.run(
    "./my-train --train-data=train.dat --output=model.dat",
    inputs=[(task_load_data, "train.dat")],
    outputs=["model.dat"])
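Conceptually, for such a `run` task the worker materializes each input's data under the requested filename in a scratch directory, runs the command there, and collects the named output files. The plain-Python sketch below illustrates that mechanism only; it does not use HyperLoom, and the `cp` command stands in for a real training binary.

```python
import os
import shutil
import subprocess
import tempfile

def run_task(command, inputs, outputs):
    """Sketch of a worker executing a shell task.

    inputs: list of (bytes, filename) pairs to materialize,
    outputs: list of filenames to collect after the command finishes.
    """
    workdir = tempfile.mkdtemp()
    try:
        for data, name in inputs:
            with open(os.path.join(workdir, name), "wb") as f:
                f.write(data)
        subprocess.run(command, cwd=workdir, shell=True, check=True)
        return [open(os.path.join(workdir, name), "rb").read()
                for name in outputs]
    finally:
        shutil.rmtree(workdir)

# "cp" stands in for ./my-train: it reads train.dat and writes model.dat.
results = run_task("cp train.dat model.dat",
                   inputs=[(b"1,2,3\n", "train.dat")],
                   outputs=["model.dat"])
print(results[0])  # b'1,2,3\n'
```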
Example 4

import loom.client as lc

@lc.tasks.py_task()
def tensorflow_task(param):
    import tensorflow as tf
    p = param.read()
    hello = tf.constant("{} from TensorFlow!".format(p))
    sess = tf.Session()
    return sess.run(hello)

param = lc.tasks.const("Hello")
task = tensorflow_task(param)
task.resource_request = lc.tasks.cpus(24)

c = lc.Client("localhost", 9010)
future = c.submit_one(task)
result = future.gather()
Example 5

task = ...
task.resource_request = lc.tasks.cpus(24)
task.checkpoint_path = "/scratch/myproject/task-1"
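The idea behind checkpointing can be sketched in plain Python: if a checkpoint file already exists, load the stored result instead of recomputing. This is a conceptual sketch only; HyperLoom handles this transparently via `checkpoint_path`, and the names below are illustrative.

```python
import os
import tempfile

def with_checkpoint(path, compute):
    """Return the checkpointed result if present, else compute and store it."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    result = compute()
    with open(path, "wb") as f:
        f.write(result)
    return result

# Demo: the second call loads the checkpoint, so compute() runs only once.
calls = []
def compute():
    calls.append(1)
    return b"model-bytes"

path = os.path.join(tempfile.mkdtemp(), "task-1")
first = with_checkpoint(path, compute)
second = with_checkpoint(path, compute)
print(len(calls))  # 1
```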
Lore (HyperLoom progress reporting)
• Scheduling info
• Node utilization
Hands-on

user@loginX.salomon$ cp -r /scratch/temp/hyperloom-tutorial/examples/ex1 .
user@loginX.salomon$ cd ex1
user@loginX.salomon$ qsub ex1.pbs
user@loginX.salomon$ ml load loom
user@loginX.salomon$ python3 -m pip install bokeh==0.12.6 --user
user@loginX.salomon$ python3 -m loom.lore lore_logs
user@loginX.salomon$ firefox output.html  # Or download output.html and open it locally
Snailwatch