ExpectationMax - a simple scheduler for running jobs on GPUs

Community contribution, 2022-10-24

simple_gpu_scheduler

A simple scheduler to run your commands on individual GPUs. Following the KISS principle, this script simply accepts commands via stdin and executes them on a specific GPU by setting the CUDA_VISIBLE_DEVICES variable.

The commands read are executed using the login shell, so redirections (>), pipes (|), and all other kinds of shell magic can be used.
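The core loop described above (read commands from stdin, hand each one to a free GPU by setting CUDA_VISIBLE_DEVICES) can be sketched in a few lines of Python. This is an illustrative sketch, not the package's actual internals; the function names are made up for this example:

```python
import os
import queue
import subprocess
import sys
import threading


def run_on_gpu(gpu_id, jobs):
    """Worker: pull commands off the shared queue and run each pinned to one GPU."""
    while True:
        command = jobs.get()
        if command is None:  # sentinel: no more work for this worker
            break
        # Restrict the child process to a single GPU.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
        print(f"Processing command `{command}` on gpu {gpu_id}")
        # shell=True so redirections and pipes inside the command keep working.
        subprocess.run(command, shell=True, env=env)


def schedule(commands, gpu_ids):
    """Distribute commands over the given GPUs, one running job per GPU."""
    jobs = queue.Queue()
    workers = [threading.Thread(target=run_on_gpu, args=(g, jobs))
               for g in gpu_ids]
    for w in workers:
        w.start()
    for command in commands:
        jobs.put(command)
    for _ in workers:  # one sentinel per worker to shut them down
        jobs.put(None)
    for w in workers:
        w.join()
```

Feeding it stdin, e.g. `schedule((l.strip() for l in sys.stdin if l.strip()), gpu_ids=[0, 1, 2])`, reproduces the behaviour of the real script: each GPU runs one command at a time and picks up the next one as soon as it finishes.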

Installation

The package can simply be installed from PyPI:

$ pip3 install simple-gpu-scheduler

Simple Example

Suppose you have a file gpu_commands.txt with commands that you would like to execute on the GPUs 0, 1 and 2 in parallel:

$ cat gpu_commands.txt
python train_model.py --lr 0.001 --output run_1
python train_model.py --lr 0.0005 --output run_2
python train_model.py --lr 0.0001 --output run_3

Then you can do so by simply piping the commands into the simple_gpu_scheduler script:

$ simple_gpu_scheduler --gpus 0 1 2 < gpu_commands.txt
Processing command `python train_model.py --lr 0.001 --output run_1` on gpu 2
Processing command `python train_model.py --lr 0.0005 --output run_2` on gpu 1
Processing command `python train_model.py --lr 0.0001 --output run_3` on gpu 0

For further details see simple_gpu_scheduler -h.

Hyperparameter search

In order to allow user friendly utilization of the scheduler in the common scenario of hyperparameter search, a convenience script simple_hypersearch is included in the package. The output can directly be piped into simple_gpu_scheduler or appended to the "queue file" (see Simple scheduler for jobs).

Grid of all possible parameter configurations in random order:

simple_hypersearch "python3 train_dnn.py --lr {lr} --batch_size {bs}" -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 | simple_gpu_scheduler --gpus 0,1,2

5 uniformly sampled parameter configurations:

simple_hypersearch "python3 train_dnn.py --lr {lr} --batch_size {bs}" --n-samples 5 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 | simple_gpu_scheduler --gpus 0,1,2
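What simple_hypersearch does can be sketched roughly as follows: build the cartesian product of all parameter values, shuffle it (so an interrupted run still covers the space reasonably uniformly), optionally truncate to a sample, and fill the values into the command template. The function below is a hypothetical sketch for illustration, not the tool's actual implementation:

```python
import itertools
import random


def hypersearch(template, params, n_samples=None, seed=None):
    """Expand a command template over the grid of all parameter combinations.

    params maps parameter names (the {placeholders} in template) to value lists.
    """
    names = list(params)
    grid = [dict(zip(names, values))
            for values in itertools.product(*(params[n] for n in names))]
    random.Random(seed).shuffle(grid)  # random order, reproducible via seed
    if n_samples is not None:
        grid = grid[:n_samples]  # uniform sample without replacement
    return [template.format(**config) for config in grid]


commands = hypersearch(
    "python3 train_dnn.py --lr {lr} --batch_size {bs}",
    {"lr": [0.001, 0.0005, 0.0001], "bs": [32, 64, 128]},
    n_samples=5,
)
```

Printing the returned commands, one per line, yields exactly the kind of stream that can be piped into simple_gpu_scheduler or appended to the queue file.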

For further information see simple_hypersearch -h.

Simple scheduler for jobs

Combined with some basic command line tools, one can set up a very basic scheduler which waits for new jobs to be "submitted" and executes them in order of submission.

Setup and start scheduler in background or in a separate permanent session (using for example tmux):

touch gpu.queue
tail -f -n 0 gpu.queue | simple_gpu_scheduler --gpus 0,1,2

The command tail -f -n 0 follows the end of the gpu.queue file. Thus, anything written into gpu.queue prior to the execution of the command will not be passed to simple_gpu_scheduler.

Then submitting commands boils down to appending text to the gpu.queue file:

echo "my_command_with | and stuff > logfile" >> gpu.queue
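If jobs are submitted from scripts rather than the shell, the echo line above can be wrapped in a tiny helper. submit below is a hypothetical convenience function, equivalent to appending a line to the queue file:

```python
def submit(command, queue_file="gpu.queue"):
    """Append one command (one line) to the queue file the scheduler is tailing."""
    # Append mode means concurrent submitters each add whole lines;
    # short appends are atomic on POSIX filesystems.
    with open(queue_file, "a") as f:
        f.write(command.rstrip("\n") + "\n")
```

For example, `submit("my_command_with | and stuff > logfile")` has the same effect as the echo invocation above.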

TODO

- Multi-line jobs (possibly we would then need a submission script after all)
- Stop, but let running commands finish, when receiving a defined signal
- Tests would be nice; so far the project is still very small, but if it grows, tests should be added
