Job Scheduling in a Distributed System Using Backfilling with Inaccurate Runtime Computations


Two different scheduling techniques are considered and a simulation model
is used to evaluate system performance.

In this paper we examine a grid system where both parallel and sequential jobs require service. Backfilling is used, but an error margin is added to a job’s runtime prediction.

Queueing network model:


There are three arrival streams in the system: one at the GS (grid jobs) and one inside each of the two sites (local jobs).

A gang can have 2 to 13 tasks, uniformly distributed. Gang size = 2, 4, 8, 16.

A job can start execution prior to a gang waiting in the queue if the following condition is met:

ServiceTime<=ElapsedTime+T

To implement backfilling, we need to know the following parameters:

1) The service time of a job.

2) The exact time that all needed resources will be free for the gang to start execution.
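The condition and the error margin described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the error margin is modelled as a relative inflation of the true runtime (the notes do not say how the margin is applied), and the check uses the inequality exactly as written, without interpreting what T stands for.

```python
def predicted_runtime(actual_runtime, error_margin):
    """Return a runtime estimate with an error margin added.

    Assumption: the margin is relative, e.g. 0.2 means the prediction
    overestimates the actual runtime by 20%.
    """
    return actual_runtime * (1.0 + error_margin)


def can_backfill(service_time, elapsed_time, t):
    """Literal form of the condition ServiceTime <= ElapsedTime + T.

    service_time is the (predicted) service time of the candidate job;
    elapsed_time and t are taken as given by the scheduler.
    """
    return service_time <= elapsed_time + t


# A job whose predicted runtime still fits may start ahead of the waiting gang.
estimate = predicted_runtime(actual_runtime=4.0, error_margin=0.25)  # 5.0
print(can_backfill(estimate, elapsed_time=2.0, t=3.0))
```

With an inaccurate estimate, a job that would actually fit can be rejected (overestimation), which is the effect the simulation is meant to measure.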


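SLD: average slowdown

The slowdown of job j is s_j = r_j / e_j, where r_j is its response time and e_j is its service time.

Average response time: RT = sum(r_j) / m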
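As a small worked example, the two metrics can be computed from a list of (response time, service time) pairs. The sketch assumes that m is the number of jobs in the list and that SLD is the plain arithmetic mean of the per-job slowdowns; the notes only call it the "average slowdown", so the exact averaging used in the paper may differ. The function names are hypothetical.

```python
def slowdown(response_time, service_time):
    """Slowdown of one job: s_j = r_j / e_j."""
    return response_time / service_time


def average_metrics(jobs):
    """Return (RT, SLD) for a list of (r_j, e_j) pairs.

    RT  = sum(r_j) / m
    SLD = sum(s_j) / m   (assumed arithmetic mean of slowdowns)
    """
    m = len(jobs)
    rt = sum(r for r, _ in jobs) / m
    sld = sum(slowdown(r, e) for r, e in jobs) / m
    return rt, sld


# Two jobs: one waited as long as it ran, one waited twice as long as it ran.
print(average_metrics([(10.0, 5.0), (6.0, 2.0)]))
```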

Wikipedia definition: gang scheduling

Gang scheduling is used so that if two or more threads or processes communicate with each other, they will all be ready to communicate at the same time. If they were not gang-scheduled, then one could wait to send or receive a message to another while it is sleeping, and vice versa. When processors are over-subscribed and gang scheduling is not used within a group of processes or threads which communicate with each other, it can lead to situations where each communication event suffers the overhead of a context switch.

Gang scheduling is based on a data structure called the Ousterhout matrix. In this matrix each row represents a time slice, and each column a processor. The threads or processes of each job are packed into a single row of the matrix.[1] During execution, coordinated context switching is performed across all nodes to switch from the processes in one row to those in the next row.
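The matrix layout can be illustrated with a short Python sketch. It assumes a simple first-fit packing policy (the text above does not say which policy is used) and a hypothetical function name; each row is a time slice, each column a processor, and all threads of a job land in the same row.

```python
def pack_jobs(job_sizes, num_processors):
    """Pack jobs into an Ousterhout-style matrix using first fit.

    job_sizes maps a job id to its number of threads.
    Returns a list of rows; each row has num_processors slots holding
    a job id or None for an idle processor.
    """
    rows = []
    for job_id, size in job_sizes.items():
        if size > num_processors:
            raise ValueError(f"job {job_id} needs more processors than exist")
        for row in rows:
            free = [i for i, slot in enumerate(row) if slot is None]
            if len(free) >= size:
                # The whole gang fits in this time slice.
                for i in free[:size]:
                    row[i] = job_id
                break
        else:
            # No existing time slice has room: open a new row.
            row = [None] * num_processors
            for i in range(size):
                row[i] = job_id
            rows.append(row)
    return rows


matrix = pack_jobs({"A": 3, "B": 2, "C": 1}, num_processors=4)
for time_slice in matrix:
    print(time_slice)
```

Running the rows one after another, with every processor switching at the same moment, corresponds to the coordinated context switch described above.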

Gang scheduling is stricter than coscheduling.[2] It requires all threads of the same process to run concurrently, while coscheduling allows for fragments, which are sets of threads that do not run concurrently with the rest of the gang.

Gang scheduling was implemented and used in production mode on several parallel machines, most notably the Connection Machine CM-5.

Reposted from: https://my.oschina.net/lfxu/blog/1508014
