README.md (+64)
@@ -269,3 +269,67 @@ Thanks also to fellow colleagues at Answer.AI team for their support, testing he
Join our community in the `#gpu-cpp` channel on the [AnswerDotAI Discord with this invite link](https://discord.gg/zmJVhXsC7f). Feel free to get in touch via X [@austinvhuang](https://twitter.com/austinvhuang) as well.

Feedback, issues and pull requests are welcome.

## Style and Design Guidelines

For contributors, here are general rules of thumb regarding the design and
style of the gpu.cpp library:

Aesthetics - Maximize Leverage and Account for Various Sources of Friction:

- In addition to performance, time-to-grok the codebase, compilation time, and
  the number of failure modes for builds are all worth optimizing for.
- Increase the implementation surface area only when there's a clear goal
  behind doing so. This maximizes leverage per unit effort, increases
  optionality in how the library can be used, and keeps compile times low.

Overloads and Templates:

- Particularly for core implementation code, prefer value types over templates
  where possible. It's generally easy to add a more typesafe templated wrapper
  around a value type core implementation, whereas reversing a core
  implementation that's templated often leads to a more significant refactor.
- For compile-time polymorphism, prefer trivial function overloads over
  templates. Besides compile time benefits, this makes it trivial to reason
  about which version of a function is being called.
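
As a rough illustration of the two points above, here is a minimal sketch. All names in it (`Dims`, `StaticDims`, `toDims`, `numel`, `byteSize`) are hypothetical and chosen for this example; they are not taken from the gpu.cpp API. The core logic is written once against a value type, a thin templated wrapper adds compile-time rank checking on top, and a pair of plain overloads stands in for a template.

```cpp
#include <array>
#include <cstddef>

// Hypothetical value-type core: the rank is runtime data, not a template
// parameter, so this struct and its functions compile exactly once.
struct Dims {
  static constexpr std::size_t kMaxRank = 8;
  std::array<std::size_t, kMaxRank> data{};
  std::size_t rank = 0;
};

// Core implementation, written once against the value type.
inline std::size_t numel(const Dims &dims) {
  std::size_t n = 1;
  for (std::size_t i = 0; i < dims.rank; ++i) {
    n *= dims.data[i];
  }
  return n;
}

// Optional typesafe wrapper layered on top of the core (an assumption of this
// sketch): the rank becomes part of the type and is checked at compile time.
template <std::size_t Rank> struct StaticDims {
  static_assert(Rank <= Dims::kMaxRank, "rank exceeds Dims::kMaxRank");
  std::array<std::size_t, Rank> data{};
};

// The wrapper converts to the value type and reuses the core implementation.
template <std::size_t Rank> inline Dims toDims(const StaticDims<Rank> &s) {
  Dims d;
  d.rank = Rank;
  for (std::size_t i = 0; i < Rank; ++i) {
    d.data[i] = s.data[i];
  }
  return d;
}

template <std::size_t Rank>
inline std::size_t numel(const StaticDims<Rank> &s) {
  return numel(toDims(s));
}

// Compile-time polymorphism via trivial overloads instead of a template:
// it is obvious from the argument type which function is called.
inline std::size_t byteSize(float) { return sizeof(float); }
inline std::size_t byteSize(double) { return sizeof(double); }
```

Going the other way, from a `StaticDims`-only core to a runtime-rank `Dims`, would mean touching every function that takes the template parameter, which is the asymmetry the first bullet describes.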

Avoid Encapsulation and Methods:

- To build systems effectively, we need to construct them out of subsystems
  whose behavior is known, and which are therefore composable and predictable.
- Prefer transparency over encapsulation. Don't use abstract classes as
  interface specifications; the library and its function signatures are the
  interface.
- Use struct as a default over class unless there's a clear reason otherwise.
- Instead of methods, pass the "owning object" as a reference to a function.
  In general this convention can perform any operation that a method can, but
  with more flexibility and less coupling. Using mutating functions
  generalizes more cleanly to operations that have side effects on more than
  one parameter, whereas methods privilege the owning class, treating the
  single-variable case as a special case and making it harder to generalize to
  multiple parameters.
- Methods are usually only used for the privileged cases: constructors,
  destructors, and operators.
- For operations requesting GPU resources and more complex initialization, use
  factory functions following the `create[X]` convention - createTensor,
  createKernel, createContext etc.
- Use (as-trivial-as-possible) constructors for simple supporting types (mostly
  providing metadata for a dispatch) such as Shape, KernelCode, etc.
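
The conventions above can be sketched roughly as follows. The types and functions here (`Buffer`, `Stats`, `createBuffer`, `copyInto`) are hypothetical stand-ins invented for this example and do not correspond to actual gpu.cpp declarations. The sketch shows a plain struct with public data, a `create[X]`-style factory function, and a free mutating function whose side effects span two parameters.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical plain structs with transparent, public data members.
struct Buffer {
  std::vector<float> data;
};

struct Stats {
  std::size_t copies = 0;
  std::size_t elementsCopied = 0;
};

// Factory function in the create[X] style: initialization lives in a free
// function rather than in a constructor.
inline Buffer createBuffer(std::size_t n, float fill) {
  Buffer b;
  b.data.assign(n, fill);
  return b;
}

// Free mutating function: both `dst` and `stats` are modified, and neither is
// privileged as an implicit `this`. The signature states exactly what is
// read (const reference) and what is written (non-const references).
inline void copyInto(Buffer &dst, const Buffer &src, Stats &stats) {
  dst.data = src.data;
  stats.copies += 1;
  stats.elementsCopied += src.data.size();
}
```

Written as a method, `copyInto` would have to belong to either `Buffer` or `Stats`, even though it mutates both; as a free function the question of which one owns it does not arise.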

Ownership:

- Prefer stack allocation for ownership, use unique_ptr for ownership when the
  heap is needed. Use raw pointers only for non-owning views. Avoid shared_ptr
  unless there's a clear rationale for shared ownership.
- Use pools as a single point of control to manage sets of resources. Consider
  incorporating a pool in Context if the resource is universal enough to the
  overall API.
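
One way these ownership rules could fit together is sketched below. The names (`Resource`, `ResourcePool`, `AppContext`, `createResource`) are hypothetical and exist only for this example; the real gpu.cpp `Context` and its pools may be organized differently. The pool owns each resource through `std::unique_ptr`, the pool itself is a plain value member of a context struct, and callers only ever receive non-owning raw pointers.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical resource type standing in for something heap-allocated.
struct Resource {
  int id = 0;
};

// The pool is the single point of control: it owns every resource it creates.
struct ResourcePool {
  std::vector<std::unique_ptr<Resource>> items;
};

// The context holds the pool by value, so destroying the context releases
// every pooled resource without any manual cleanup.
struct AppContext {
  ResourcePool pool;
};

// Factory function: allocates on the heap, hands ownership to the pool, and
// returns a raw pointer that is a non-owning view. The pointer remains valid
// when the vector grows, because only the unique_ptr handles are moved.
inline Resource *createResource(ResourcePool &pool, int id) {
  pool.items.push_back(std::make_unique<Resource>());
  Resource *view = pool.items.back().get();
  view->id = id;
  return view;
}
```

Because callers never hold an owning handle, there is no need for `shared_ptr` here: the lifetime of every resource is tied to the context that holds the pool.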

Separating Resource Acquisition from Hot Paths:

- In general, resource acquisition should be done ahead of time, outside the
  hot paths of the application. This ensures that the hot paths are as fast as
  possible and don't have to deal with resource allocation or data movement.
- Operations in the API should be implemented with a use in mind - typically
  either ahead-of-time resource preparation/acquisition, hot-paths, or