LBG Command-Line Job Submission: Common Q&A

Hui_Zhou

Recommended image: Basic Image:bohrium-notebook:2023-03-26
Recommended machine type: c2_m4_cpu
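- How do I download the results of batch-submitted jobs to a specified folder?
Answer: There are two ways to collect the results of batch jobs. The first is to pass the -r parameter when submitting, which automatically collects each job's results into the specified folder (recommended). The second is to write your own script and download the results manually.
- Five DeePMD-kit training jobs were prepared in advance. Their submission scripts are identical; only the folder names differ.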

The job submission script (job.json) contains:
{
"job_name": "DeePMD-kit example",
"command": " bash job.sh",
"log_file": "se_e2_a/tmp_log",
"backward_files": [],
"project_id": *****,
"platform": "ali",
"machine_type": "c4_m15_1 * NVIDIA T4",
"job_type": "container",
"image_address": "registry.dp.tech/dptech/deepmd-kit:2.1.5-cuda11.6"
}
- Submit the jobs in batch with the -r parameter, collecting the results under /data/result. The submit command is:
for dir in `ls`
do
cd $dir
lbg job submit -i job.json -p . -r /data/result/$dir
cd ..
done
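A slightly more defensive variant of the loop above can be sketched as follows. It uses only the lbg flags that appear in this article (-i, -p, -r); the function name submit_all and the DRY_RUN switch are hypothetical helpers introduced here for illustration, not part of the lbg CLI. It skips entries that are not directories or have no job.json, and runs each submission in a subshell so a failed cd cannot affect the next iteration.

```shell
#!/usr/bin/env bash
# Sketch: submit every task directory under the current directory and collect
# each job's results into <result_root>/<directory name>.
# submit_all and DRY_RUN are illustrative names, not lbg features.
submit_all() {
    local result_root="${1:-/data/result}"
    local dir name
    for dir in */; do
        [ -d "$dir" ] || continue            # glob matched nothing: skip
        name="${dir%/}"                      # strip the trailing slash
        if [ ! -f "$name/job.json" ]; then
            echo "skip $name: no job.json" >&2
            continue
        fi
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only print what would be run
            echo "(cd $name && lbg job submit -i job.json -p . -r $result_root/$name)"
        else
            # Subshell: the cd does not leak into the next iteration
            (cd "$name" && lbg job submit -i job.json -p . -r "$result_root/$name")
        fi
    done
}
```

Running `DRY_RUN=1 submit_all /data/result` from the parent directory prints the commands without contacting the platform, which is a cheap way to check the directory list before the real submission.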
- Once the jobs have finished, the results can be viewed under the result directory /data/result, as shown in the figure:

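If the -r parameter was not specified at submission time, the batch results have to be collected with this second method.
- As in the first method, 5 folders were prepared and submitted in batch, but without the -r parameter. The submit command was: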
for dir in `ls`
do
cd $dir
lbg job submit -i job.json -p .
cd ..
done
- Locate the jobid of each batch job
- The n most recently submitted jobs
lbg jobgroup ls -n 5
- Search by a specific name
lbg jobgroup ls -k "DeePMD-kit example"
- Restrict to a time range
lbg jobgroup ls -s 2023-12-08 -e 2023-12-08
- Combine several filters
lbg jobgroup ls -s 2023-12-08 -e 2023-12-08 -k "DeePMD-kit example" -n 5
You can also combine commands such as grep, sed, and awk to narrow the listing down to the job ids.
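As a small illustration of that last point, the filter below keeps only the rows of a listing whose first whitespace-separated field is a number and prints those numbers in ascending order. It is a sketch under an assumption: that the id is the first column of the lbg jobgroup ls output, which is what the awk command later in this article relies on. If the real listing draws table borders or puts the id elsewhere, the field index would need adjusting. The function name extract_job_ids is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read a listing on stdin and print the numeric ids, sorted ascending.
# ASSUMPTION: the id is the first whitespace-separated column; header and
# separator rows are dropped because their first field is not a number.
extract_job_ids() {
    awk '$1 ~ /^[0-9]+$/ { print $1 }' | sort -n
}
```

It would be used as the tail of a pipeline, for example `lbg jobgroup ls -n 5 | extract_job_ids`.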
- After locating the target jobs, use awk to extract the jobid and collect the results automatically. A reference command:
jobids=( `lbg jobgroup ls -s 2023-12-08 -e 2023-12-08 -k "DeePMD-kit example" -n 5 | awk '{print $1}' | tail -n 5 | sort -n -k 1,1` )
i=0
for dir in `ls`
do
cd $dir
lbg jobgroup download ${jobids[$i]}
let i++
cd ..
done
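The loop above pairs directories and ids purely by position, so a mismatch in count would silently download results into the wrong folders. A sketch that checks the counts first is shown below. It makes the same assumption as the original: directories sorted by name correspond to ids sorted in ascending order. The names download_all and DRY_RUN are hypothetical helpers, and only the lbg jobgroup download command from this article is used.

```shell
#!/usr/bin/env bash
# Sketch: read ids (one per line) on stdin, pair them with the task directories
# in name order, and download each result into its directory.
# ASSUMPTION: ascending ids match the name order of the directories.
download_all() {
    local ids=() dirs=() id dir i
    while IFS= read -r id; do
        if [ -n "$id" ]; then ids+=("$id"); fi
    done
    for dir in */; do
        if [ -d "$dir" ]; then dirs+=("${dir%/}"); fi
    done
    # Refuse to continue when ids and directories cannot be paired one-to-one
    if [ "${#ids[@]}" -ne "${#dirs[@]}" ]; then
        echo "error: ${#ids[@]} ids but ${#dirs[@]} directories" >&2
        return 1
    fi
    for i in "${!dirs[@]}"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only print what would be run
            echo "(cd ${dirs[$i]} && lbg jobgroup download ${ids[$i]})"
        else
            (cd "${dirs[$i]}" && lbg jobgroup download "${ids[$i]}")
        fi
    done
}
```

With the ids available one per line, `DRY_RUN=1 download_all` prints the directory-to-id pairing so it can be checked by eye before anything is downloaded.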
The downloaded results are as follows:
