Software and data architecture

Since its first version in 2002, BOINC has evolved continuously and fairly smoothly. It's now a feature-rich system, but its code size is fairly small, and the code is malleable (easy to change and extend).

This is due in large part to BOINC's architecture. This has three aspects: software, components, and data.

Software

This is discussed here.

Components

We divide functionality into "components" that community view network APIs (typically XML RPCs over TCP connections). For example:

The client and the Manager.
The various server components (feeder, scheduler, transitioner, validators etc.).

Note: when I started BOINC, JSON didn't exist. Otherwise I would have used it instead of XML.

This lets the components run on different hosts or containers, and disentangles their software.

Data architecture

BOINC stores information in various forms. Guidelines:

MySQL database: The DB server is a potential performance bottleneck, and there is a code overhead for DB tables (you need to write interface functions in C++ and/or PHP, and you need to write web pages or scripts for editing data). So use the DB only when it's necessary (e.g. because you need indices for large data) and avoid creating new tables. Don't use referential constraints; instead, check for lookups returning null.
XML fields in DB tables: Simplify DB design by storing information in XML in DB fields (for example, computing preferences and job file lists). This can eliminate the need for separate tables.
XML files: BOINC uses XML format both for internal state (e.g. client_state.xml) and configuration files (cc_config.xml on the client, project.xml on the server). Sometimes we use enclosing tags for multiple items (e.g. <tasks> in project.xml); sometimes we don't. Either is fine - imitate what's already there.
PHP and/or C++ code: If data is used only in PHP, put it in a PHP data structure. You can assume that project admins are able to edit these (e.g. html/project/project.inc).

Avoid over-engineering

If you make a big change to solve a problem that never actually arises, that's over-engineering. Avoid it.

There are various things (HTTP headers, message boards, XML parsing) where we've had to choose between using someone else's library or doing it ourselves. There are always tradeoffs. If we have something that works and is stable (e.g. XML parsing), leave it.

In general, big changes should be undertaken only if it's certain that there's going to be a big reward. Big changes introduce bugs, and end up costing more than you think. One exception: occasionally I'll do 'code cleanups' whose only effect is to shorten or simplify the code.

Home

Software and data architecture

Software

Components

Data architecture

Avoid over-engineering

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!