
Multi user feeder

David Anderson edited this page Feb 15, 2026 · 2 revisions

The current feeder (sched/feeder.cpp) has various options, e.g. to divide computing equally among apps, or to accelerate batch completion by enumerating high-priority jobs first. But it has no provision for dividing computing among multiple users (job submitters). The new feeder (sched/feeder_user.php) provides this function.

Allocation

Each user has a 'resource share' (arbitrary units). Each user's job stream has a 'usage'. When we send a job from a stream, we increment its usage by 1/share. We always pick the next job from the user with the lowest usage.

Say, for example, users A and B have shares 4 and 10. Initially usages are zero.

    Pick a job for user A. A.usage is .25
    Pick a job for user B. B.usage is .1
    Pick a job for user B. B.usage is .2
    Pick a job for user B. B.usage is .3
    Pick a job for user A. A.usage is .5

... and so on; A and B will leapfrog each other in the right ratio (4:10).
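
The leapfrogging in this example can be reproduced with a short simulation. This is only a sketch of the allocation rule as described here, not the feeder's actual code: the function name `pick_order` and the tie-breaking rule (the first user listed wins a tie) are assumptions made for illustration. Exact fractions are used so that ties are detected reliably.

```python
from fractions import Fraction

def pick_order(shares, n_jobs):
    """Simulate n_jobs sends; return the pick order and final usages."""
    usage = {user: Fraction(0) for user in shares}
    order = []
    for _ in range(n_jobs):
        # Lowest usage wins; on a tie, the user listed first in `shares`
        # is chosen (an assumption, the text does not specify this).
        user = min(usage, key=usage.get)
        usage[user] += Fraction(1, shares[user])
        order.append(user)
    return order, usage

order, usage = pick_order({"A": 4, "B": 10}, 14)
print("".join(order))                       # first five picks: A B B B A
print(order.count("A"), order.count("B"))   # 4 10
```

After 14 sends both usages equal 1, and A and B have received jobs in the ratio of their shares.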

This works fine for users with lots of jobs. But what if there's a user C who initially has no jobs? If we've been going for a while, A and B will have high usages, while C's usage is still zero. If user C then submits lots of jobs, we'd pick ONLY jobs from C for a long time.

Solution: let N = ~1000. N is the max # of jobs a newly active user will get before entering round-robin w/ other users.

Let X be the largest share, and let Y = N/X. When a usage exceeds Y by an amount Z, subtract Z from all usages that are > Z.
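
A minimal sketch of this capping step follows. The function name `normalize` and the dict-based representation are hypothetical. Leaving usages that are not greater than Z unchanged (rather than, say, clamping them to zero) follows the wording above literally and is otherwise an assumption.

```python
def normalize(usage, shares, n=1000):
    """Keep usages from growing past Y = n / (largest share)."""
    y = n / max(shares.values())
    z = max(usage.values()) - y   # amount by which the top usage exceeds Y
    if z <= 0:
        return dict(usage)
    # Only usages greater than Z are reduced, as stated in the text.
    return {u: (v - z if v > z else v) for u, v in usage.items()}

shares = {"A": 4, "B": 10, "C": 5}
usage = {"A": 100.25, "B": 100.0, "C": 0.0}   # C has been idle so far
print(normalize(usage, shares))
```

Here Y = 1000/10 = 100 and Z = 0.25, so A drops to 100.0 and B to 99.75, while C stays at 0. If C now becomes active, it needs roughly Y * share = 500 jobs to catch up, which is at most N for any share.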

Implementation notes

  • There are lots of jobs to send. When a job is needed, we get one from the enumerator. Some enumerated jobs may already be in shmem, and we skip them. Jobs may be boosted in priority, which moves them to the start of the next enum. What we want to avoid: doing frequent enums whose results are mostly or all already in shmem.

  • We run out of jobs to send. This happens when we do an enum and everything is already in shmem. In this case we don't want to enumerate again until the stream has no jobs in shmem.

  • A period of no jobs. Enums will return nothing, so they're cheap. When a new job arrives we want to send it out fairly quickly: within about 10 sec, or at most a minute.
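
One way to read the three cases above is as a rule for when a paused stream may enumerate again. The sketch below is an interpretation, not the feeder's actual logic. `StreamState`, `may_enumerate`, `last_enum_rows`, `last_enum_time` and the 10-second retry constant are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

EMPTY_RETRY_SECS = 10  # assumed value; the text suggests 10 sec to a minute

@dataclass
class StreamState:
    paused: bool = False
    n_inmem: int = 0             # jobs from this stream currently in shmem
    last_enum_rows: int = 0      # rows returned by the most recent enum
    last_enum_time: float = 0.0  # when the most recent enum ran

def may_enumerate(s, now):
    """Decide whether starting a new enumeration for stream s is worthwhile."""
    if not s.paused:
        return True              # case 1: jobs are flowing normally
    if s.last_enum_rows == 0:
        # Case 3: no jobs at all. Enums are cheap, so poll on a short timer.
        return now - s.last_enum_time >= EMPTY_RETRY_SECS
    # Case 2: everything was already in shmem. Wait until it drains.
    return s.n_inmem == 0
```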

Logic

JOB_STREAM fields
    bool paused
        last enum was empty or entirely in shmem
        don't start a new one until time t
    int n_inmem
        number of jobs currently in shmem
    int num_rows
        # rows in result set
    int num_left
        how many rows left in result set

feeder_loop()
    while 1:
        if !scan_work_array()
            sleep(1 or so)

bool scan_work_array():
    action = false
    for each slot s in shmem
        if s is empty
            if fill(s):
                action = true
            else
                break
    return action

fill(slot):
    while 1:
        S = the highest-prio stream that is not paused
        if !S:
            break
        if S.get_job()
            return true  // job was inserted
        (S is now paused)
    return false

bool S::scan_result_set()
    while num_left>0
        mysql_fetch_row()
        num_left--
        if job is usable and not in shmem
            insert in shmem
            return true
    return false

bool S::get_job()
    if S.scan_result_set()
        return true
    mysql_query()
    num_left = mysql_num_rows
    if S.scan_result_set()
        return true
    paused = true
    return false
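
The pseudocode above can be exercised with a small in-memory model. This is a sketch under several assumptions: a Python list stands in for the database table and for the MySQL result set, shmem is a list whose empty slots are `None`, every job is treated as usable, and "highest-prio" is modeled by a hypothetical numeric `prio` field. Nothing here removes jobs from shmem or the database, which the real scheduler would do.

```python
class JobStream:
    """In-memory stand-in for one user's job stream (no real database)."""

    def __init__(self, name, prio, db_jobs):
        self.name = name
        self.prio = prio        # higher value is served first (assumed)
        self.db_jobs = db_jobs  # job ids standing in for the DB table
        self.rows = []          # remaining rows of the current result set
        self.paused = False

    def scan_result_set(self, shmem, slot):
        while self.rows:
            job = self.rows.pop(0)   # stands in for mysql_fetch_row()
            if job not in shmem:
                shmem[slot] = job
                return True
        return False

    def get_job(self, shmem, slot):
        if self.scan_result_set(shmem, slot):
            return True
        self.rows = list(self.db_jobs)  # stands in for mysql_query()
        if self.scan_result_set(shmem, slot):
            return True
        self.paused = True  # enum was empty or entirely in shmem
        return False

def fill(slot, shmem, streams):
    while True:
        active = [s for s in streams if not s.paused]
        if not active:
            return False
        s = max(active, key=lambda st: st.prio)
        if s.get_job(shmem, slot):
            return True  # job was inserted
        # s is now paused; try the next stream

def scan_work_array(shmem, streams):
    action = False
    for slot, job in enumerate(shmem):
        if job is None:
            if fill(slot, shmem, streams):
                action = True
            else:
                break
    return action

shmem = [None] * 4
streams = [JobStream("A", 2, ["a1", "a2"]), JobStream("B", 1, ["b1", "b2", "b3"])]
acted = scan_work_array(shmem, streams)
print(acted, shmem)
```

In this run stream A supplies two jobs, its second enum finds both already in shmem, so A is paused and stream B fills the remaining slots.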
