Multi user feeder
The current feeder (sched/feeder.cpp) has various options, e.g. to divide computing equally among apps, or to accelerate batch completion by enumerating high-priority jobs first. But it has no provision for dividing computing among multiple users (job submitters). The new feeder (sched/feeder_user.php) provides this function.
Each user has a 'resource share' (arbitrary units). The user's job stream has a 'usage'. When we send a job from a stream, we increment its usage by 1/share. We always pick a job from the user with the lowest usage.
Say e.g. users A and B have shares 4 and 10. Initially usages are zero.
Pick a job for user A. A.usage is .25
Pick a job for user B. B.usage is .1
Pick a job for user B. B.usage is .2
Pick a job for user B. B.usage is .3
Pick a job for user A. A.usage is .5
... and so on; A and B will leapfrog each other in the right ratio.
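The leapfrogging can be checked with a small Python sketch. This is an illustration only, not the feeder's code: `UserStream`, `pick_user` and `simulate` are hypothetical names, ties are assumed to go to the first user listed, and `Fraction` is used so that the usages compare exactly.

```python
from fractions import Fraction


class UserStream:
    """Hypothetical per-user record: a resource share and a running usage."""

    def __init__(self, name, share):
        self.name = name
        self.share = share
        self.usage = Fraction(0)


def pick_user(streams):
    """Pick the stream with the lowest usage and charge it 1/share."""
    # min() returns the first minimal element, so ties go to the earlier stream
    # (an assumption; the text does not say how ties are broken).
    s = min(streams, key=lambda st: st.usage)
    s.usage += Fraction(1, s.share)
    return s.name


def simulate(shares, n):
    """Return the names of the users chosen for n consecutive jobs."""
    streams = [UserStream(name, share) for name, share in shares.items()]
    return [pick_user(streams) for _ in range(n)]


if __name__ == "__main__":
    print(simulate({"A": 4, "B": 10}, 14))
```

With shares 4 and 10, the first five picks are A, B, B, B, A as in the example above, and over 14 picks A gets 4 jobs and B gets 10.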
This works fine for users with lots of jobs. But what if there's a user C that initially has no jobs? If we've been running for a while, A and B will have high usages. If user C then submits lots of jobs, we'd pick ONLY jobs from C for a long time.
Solution: let N = ~1000. N is the maximum number of jobs a newly active user will get before entering round-robin with the other users.
Let X be the largest share, and let Y be N/X. When a usage exceeds Y by an amount Z, subtract Z from all usages that are greater than Z.
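One possible reading of this rule, sketched in Python. The function name `rebase_usages` and the dictionary representation are assumptions made for illustration. The sketch takes Z from the largest usage and leaves usages at or below Z untouched, following the rule literally.

```python
def rebase_usages(usages, shares, n=1000):
    """Cap the largest usage at Y = n / (largest share).

    usages and shares are dicts keyed by user name (hypothetical layout).
    Returns a new dict; the inputs are not modified.
    """
    y = n / max(shares.values())      # Y = N / X
    z = max(usages.values()) - y      # Z = amount by which the top usage exceeds Y
    if z <= 0:
        return dict(usages)           # nothing exceeds Y, so nothing to do
    # Subtract Z only from usages greater than Z, as the rule states.
    return {user: (u - z if u > z else u) for user, u in usages.items()}


if __name__ == "__main__":
    shares = {"A": 4, "B": 10, "C": 10}
    usages = {"A": 150.0, "B": 149.0, "C": 0.0}
    print(rebase_usages(usages, shares))
```

Here Y = 1000/10 = 100 and Z = 50, so A drops to 100, B to 99, and the idle user C stays at 0. With share 10, C would then receive roughly 990 jobs before its usage catches up with B, which is consistent with N being about 1000.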
- There are lots of jobs to send. When a job is needed, we get one from the enumerator. Some enumerated jobs may already be in shmem, and we skip them. Jobs may be boosted in priority, which moves them to the start of the next enum. What we want to avoid: doing frequent enums whose results are mostly or all already in shmem.
- We run out of jobs to send. This happens when we do an enum and everything is already in shmem. In this case we don't want to enumerate again until the stream has no jobs in shmem.
- A period of no jobs. Enums will return nothing, so they're cheap. When a new job arrives we want to send it out fairly quickly: within about 10 seconds, or at most a minute.
JOB_STREAM fields
bool paused
    last enum was empty or entirely in shmem;
    don't start a new one until time t
int n_inmem
    number of jobs currently in shmem
num_rows
    number of rows in the result set
num_left
    how many rows are left in the result set
feeder_loop()
    while 1:
        if !scan_work_array()
            sleep(1 or so)
bool scan_work_array():
    action = false
    for each slot s in shmem
        if s is empty
            if fill(s):
                action = true
            else
                break    // no stream has a job to offer
    return action
fill(slot):
    while 1:
        S = the highest-prio stream that is not paused
        if !S:
            break
        if S.get_job()
            return true    // job was inserted
        // S is now paused; try the next stream
    return false
bool S::scan_result_set()
    while num_left>0
        mysql_fetch_row()
        num_left--
        if job is usable and not in shmem
            insert in shmem
            return true
    return false
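bool S::get_job()
    if S.scan_result_set()
        return true    // the current result set still had a usable job
    mysql_query()      // start a new enumeration
    num_left = mysql_num_rows
    if S.scan_result_set()
        return true
    paused = true      // enum was empty or entirely in shmem
    return false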
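The pseudocode above can be exercised with a small, self-contained Python simulation. This is a sketch under several assumptions, not the feeder itself: a plain list stands in for the MySQL job table, shmem is a list of slots, every job is treated as usable, "highest-prio" is taken to mean the lowest usage, and usage is charged when a job is placed in shmem rather than when it is sent. The time-based unpausing is left out.

```python
class JobStream:
    """One user's job stream; db_jobs is a stand-in for the job table."""

    def __init__(self, name, share, jobs):
        self.name = name
        self.share = share
        self.usage = 0.0
        self.db_jobs = list(jobs)
        self.paused = False
        self.rows = []       # current result set
        self.num_left = 0    # rows not yet scanned

    def scan_result_set(self, shmem, slot):
        while self.num_left > 0:
            job = self.rows[len(self.rows) - self.num_left]  # "fetch row"
            self.num_left -= 1
            if job not in shmem:              # skip jobs already in shmem
                shmem[slot] = job
                self.usage += 1 / self.share  # assumption: charge on insert
                return True
        return False

    def get_job(self, shmem, slot):
        if self.scan_result_set(shmem, slot):
            return True
        self.rows = list(self.db_jobs)        # stand-in for mysql_query()
        self.num_left = len(self.rows)
        if self.scan_result_set(shmem, slot):
            return True
        self.paused = True                    # enum empty or all in shmem
        return False


def fill(streams, shmem, slot):
    while True:
        active = [s for s in streams if not s.paused]
        if not active:
            return False
        s = min(active, key=lambda st: st.usage)  # assumed priority rule
        if s.get_job(shmem, slot):
            return True
        # s is now paused; loop to try the next stream


def scan_work_array(streams, shmem):
    action = False
    for slot, job in enumerate(shmem):
        if job is None:
            if fill(streams, shmem, slot):
                action = True
            else:
                break
    return action


if __name__ == "__main__":
    streams = [JobStream("A", 4, ["a1", "a2", "a3"]),
               JobStream("B", 10, ["b1", "b2", "b3", "b4", "b5"])]
    shmem = [None] * 4
    scan_work_array(streams, shmem)
    print(shmem)
```

With four empty slots and shares 4 and 10, the slots are filled with one job from A followed by three from B. A stream whose only job is already in shmem ends up paused, which is the "run out of jobs" case described earlier.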