
Commit 579e991

Rendering (#2)
* WIP
* `RenderSync`
* `RenderTarget`, `Swapchain` updates
* Dynamic Rendering
* Add deploy script
* Update guide
* Cleanup
1 parent 5a220f3 commit 579e991

16 files changed

+671
-8
lines changed

.github/workflows/deploy.yml

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
name: Deploy
2+
on:
3+
push:
4+
branches:
5+
- main
6+
jobs:
7+
deploy:
8+
runs-on: ubuntu-latest
9+
permissions:
10+
contents: write # To push a branch
11+
pages: write # To push to a GitHub Pages site
12+
id-token: write # To update the deployment status
13+
steps:
14+
- uses: actions/checkout@v4
15+
with:
16+
fetch-depth: 0
17+
- name: init
18+
run: |
19+
url="https://github.com/rust-lang/mdBook/releases/download/v0.4.47/mdbook-v0.4.47-x86_64-unknown-linux-gnu.tar.gz"
20+
mkdir mdbook
21+
curl -sSL $url | tar -xz --directory=./mdbook
22+
echo `pwd`/mdbook >> $GITHUB_PATH
23+
- name: build book
24+
run: |
25+
cd guide
26+
mdbook build
27+
- name: setup pages
28+
uses: actions/configure-pages@v4
29+
- name: upload artifact
30+
uses: actions/upload-pages-artifact@v3
31+
with:
32+
path: 'book'
33+
- name: Deploy to GitHub Pages
34+
id: deployment
35+
uses: actions/deploy-pages@v4

guide/src/SUMMARY.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,3 +16,8 @@
1616
- [Vulkan Device](initialization/device.md)
1717
- [Scoped Waiter](initialization/scoped_waiter.md)
1818
- [Swapchain](initialization/swapchain.md)
19+
- [Rendering](rendering/README.md)
20+
- [Swapchain Loop](rendering/swapchain_loop.md)
21+
- [Render Sync](rendering/render_sync.md)
22+
- [Swapchain Update](rendering/swapchain_update.md)
23+
- [Dynamic Rendering](rendering/dynamic_rendering.md)

guide/src/rendering/README.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
# Rendering
2+
3+
This section implements Render Sync and the Swapchain loop, performs Swapchain image layout transitions, and introduces Dynamic Rendering.

guide/src/rendering/dynamic_rendering.md

Lines changed: 172 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,172 @@
1+
# Dynamic Rendering
2+
3+
Dynamic Rendering enables us to avoid using Render Passes, which are quite a bit more verbose (but also generally more performant on tiled GPUs). Here we tie together the Swapchain, Render Sync, and rendering.
4+
5+
In the main loop, attempt to acquire a Swapchain image / Render Target:
6+
7+
```cpp
8+
auto const framebuffer_size = glfw::framebuffer_size(m_window.get());
9+
// minimized? skip loop.
10+
if (framebuffer_size.x <= 0 || framebuffer_size.y <= 0) { continue; }
11+
// an eErrorOutOfDateKHR result is not guaranteed if the
12+
// framebuffer size does not match the Swapchain image size, check it
13+
// explicitly.
14+
auto fb_size_changed = framebuffer_size != m_swapchain->get_size();
15+
auto& render_sync = m_render_sync.at(m_frame_index);
16+
auto render_target = m_swapchain->acquire_next_image(*render_sync.draw);
17+
if (fb_size_changed || !render_target) {
18+
m_swapchain->recreate(framebuffer_size);
19+
continue;
20+
}
21+
```
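
The three possible outcomes of the checks above (skip, recreate, render) can be isolated into a pure function, which makes the order of the checks easy to test. This is only a sketch: `Size`, `FrameAction`, and `decide` are hypothetical names, and `image_acquired` stands in for `acquire_next_image()` returning a Render Target.

```cpp
#include <cassert>

// Hypothetical stand-in for the framebuffer / Swapchain extent types.
struct Size {
    int x{};
    int y{};
};

constexpr bool operator==(Size a, Size b) { return a.x == b.x && a.y == b.y; }
constexpr bool operator!=(Size a, Size b) { return !(a == b); }

enum class FrameAction { Skip, Recreate, Render };

// Mirrors the checks in the snippet above, in the same order.
constexpr FrameAction decide(Size framebuffer, Size swapchain,
                             bool image_acquired) {
    // minimized window: nothing to render to.
    if (framebuffer.x <= 0 || framebuffer.y <= 0) { return FrameAction::Skip; }
    // size mismatch or failed acquire: recreate and retry next iteration.
    if (framebuffer != swapchain || !image_acquired) {
        return FrameAction::Recreate;
    }
    return FrameAction::Render;
}
```

Note that a size mismatch leads to recreation even when the acquire succeeded, matching the explicit `fb_size_changed` check.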
22+
23+
Wait for the associated fence and reset ('un'signal) it:
24+
25+
```cpp
26+
static constexpr auto fence_timeout_v =
27+
static_cast<std::uint64_t>(std::chrono::nanoseconds{3s}.count());
28+
auto result = m_device->waitForFences(*render_sync.drawn, vk::True,
29+
fence_timeout_v);
30+
if (result != vk::Result::eSuccess) {
31+
throw std::runtime_error{"Failed to wait for Render Fence"};
32+
}
33+
// reset fence _after_ acquisition of image: if it fails, the
34+
// fence remains signaled.
35+
m_device->resetFences(*render_sync.drawn);
36+
```
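
To see why the fence is reset only once a submission is certain to follow, the wait / reset / submit sequence can be modelled on the CPU. This is a rough sketch, not Vulkan code: `MockFence` and `run_frame` are hypothetical, and a real wait would block for up to the timeout rather than return immediately.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

using namespace std::chrono_literals;

// Same conversion as fence_timeout_v above: 3 seconds in nanoseconds.
inline constexpr auto fence_timeout_v =
    static_cast<std::uint64_t>(std::chrono::nanoseconds{3s}.count());
static_assert(fence_timeout_v == 3'000'000'000ull);

// Hypothetical CPU-side stand-in for a vk::Fence; not a Vulkan type.
struct MockFence {
    bool signaled{true}; // created pre-signaled, like eSignaled.

    // Models waitForFences(): succeeds only if the fence is signaled.
    bool wait() const { return signaled; }
    // Models resetFences().
    void reset() { signaled = false; }
    // Models a queue submission completing with this fence attached.
    void signal_from_submit() { signaled = true; }
};

// One iteration of the wait / reset / submit sequence for a virtual frame.
// Returns false where the real code would throw after the timeout.
inline bool run_frame(MockFence& fence, bool submit) {
    if (!fence.wait()) { return false; }
    fence.reset();
    if (submit) { fence.signal_from_submit(); }
    return true;
}
```

In this model, a frame that resets the fence but skips the submission makes the next `run_frame()` call on the same fence fail.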
37+
38+
Since the fence has been reset, a queue submission must be made that signals it before continuing, otherwise the app will deadlock on the next wait (and eventually throw after 3s). We can now begin command buffer recording:
39+
40+
```cpp
41+
auto command_buffer_bi = vk::CommandBufferBeginInfo{};
42+
// this flag means recorded commands will not be reused.
43+
command_buffer_bi.setFlags(
44+
vk::CommandBufferUsageFlagBits::eOneTimeSubmit);
45+
render_sync.command_buffer.begin(command_buffer_bi);
46+
```
47+
48+
We are not ready to actually render anything yet, but we can clear the image to a particular color. First we need to transition the image for rendering, i.e. to the Attachment Optimal layout. Set up the image barrier and record it:
49+
50+
```cpp
51+
auto dependency_info = vk::DependencyInfo{};
52+
auto barrier = m_swapchain->base_barrier();
53+
// Undefined => AttachmentOptimal
54+
// we don't need to block any operations before the barrier, since we
55+
// rely on the image acquired semaphore to block rendering.
56+
// any color attachment operations must happen after the barrier.
57+
barrier.setOldLayout(vk::ImageLayout::eUndefined)
58+
.setNewLayout(vk::ImageLayout::eAttachmentOptimal)
59+
.setSrcAccessMask(vk::AccessFlagBits2::eNone)
60+
.setSrcStageMask(vk::PipelineStageFlagBits2::eTopOfPipe)
61+
.setDstAccessMask(vk::AccessFlagBits2::eColorAttachmentWrite)
62+
.setDstStageMask(
63+
vk::PipelineStageFlagBits2::eColorAttachmentOutput);
64+
dependency_info.setImageMemoryBarriers(barrier);
65+
render_sync.command_buffer.pipelineBarrier2(dependency_info);
66+
```
67+
68+
Create a Rendering Attachment Info using the acquired image as the color target. We use a red clear color, make sure the Load Op clears the image, and have the Store Op store the results (currently just the cleared image):
69+
70+
```cpp
71+
auto attachment_info = vk::RenderingAttachmentInfo{};
72+
attachment_info.setImageView(render_target->image_view)
73+
.setImageLayout(vk::ImageLayout::eAttachmentOptimal)
74+
.setLoadOp(vk::AttachmentLoadOp::eClear)
75+
.setStoreOp(vk::AttachmentStoreOp::eStore)
76+
.setClearValue(vk::ClearColorValue{1.0f, 0.0f, 0.0f, 1.0f});
77+
```
78+
79+
Set up a Rendering Info object with the color attachment and the entire image as the render area:
80+
81+
```cpp
82+
auto rendering_info = vk::RenderingInfo{};
83+
auto const render_area =
84+
vk::Rect2D{vk::Offset2D{}, render_target->extent};
85+
rendering_info.setRenderArea(render_area)
86+
.setColorAttachments(attachment_info)
87+
.setLayerCount(1);
88+
```
89+
90+
Finally, execute a render:
91+
92+
```cpp
93+
render_sync.command_buffer.beginRendering(rendering_info);
94+
// draw stuff here.
95+
render_sync.command_buffer.endRendering();
96+
```
97+
98+
Transition the image for presentation:
99+
100+
```cpp
101+
// AttachmentOptimal => PresentSrc
102+
// the barrier must wait for color attachment operations to complete.
103+
// we don't need any post-synchronization as the present Semaphore takes
104+
// care of that.
105+
barrier.setOldLayout(vk::ImageLayout::eAttachmentOptimal)
106+
.setNewLayout(vk::ImageLayout::ePresentSrcKHR)
107+
.setSrcAccessMask(vk::AccessFlagBits2::eColorAttachmentWrite)
108+
.setSrcStageMask(vk::PipelineStageFlagBits2::eColorAttachmentOutput)
109+
.setDstAccessMask(vk::AccessFlagBits2::eNone)
110+
.setDstStageMask(vk::PipelineStageFlagBits2::eBottomOfPipe);
111+
dependency_info.setImageMemoryBarriers(barrier);
112+
render_sync.command_buffer.pipelineBarrier2(dependency_info);
113+
```
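
The two barriers recorded in this chapter move the image through three layouts. A small state tracker can make the rule explicit: the barrier's old layout must match the image's current layout, or be Undefined (which discards the contents). This is a sketch with hypothetical names (`Layout`, `ImageState`); the real values are `vk::ImageLayout` enumerators and Vulkan itself does not track this for you outside of validation layers.

```cpp
#include <cassert>

// Hypothetical mirror of the three layouts used in this chapter.
enum class Layout { Undefined, AttachmentOptimal, PresentSrc };

// Tracks the layout an image would be in after each recorded barrier.
struct ImageState {
    Layout layout{Layout::Undefined};

    // Models an image barrier: old_layout must match the current layout,
    // except that Undefined is always allowed (contents are discarded).
    bool transition(Layout old_layout, Layout new_layout) {
        if (old_layout != Layout::Undefined && old_layout != layout) {
            return false;
        }
        layout = new_layout;
        return true;
    }

    bool can_render() const { return layout == Layout::AttachmentOptimal; }
    bool can_present() const { return layout == Layout::PresentSrc; }
};
```

Using Undefined as the old layout for the first barrier is what allows the same code to run every frame, regardless of the layout the image was left in.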
114+
115+
End the command buffer and submit it:
116+
117+
```cpp
118+
render_sync.command_buffer.end();
119+
120+
auto submit_info = vk::SubmitInfo2{};
121+
auto const command_buffer_info =
122+
vk::CommandBufferSubmitInfo{render_sync.command_buffer};
123+
auto wait_semaphore_info = vk::SemaphoreSubmitInfo{};
124+
wait_semaphore_info.setSemaphore(*render_sync.draw)
125+
.setStageMask(vk::PipelineStageFlagBits2::eTopOfPipe);
126+
auto signal_semaphore_info = vk::SemaphoreSubmitInfo{};
127+
signal_semaphore_info.setSemaphore(*render_sync.present)
128+
.setStageMask(vk::PipelineStageFlagBits2::eColorAttachmentOutput);
129+
submit_info.setCommandBufferInfos(command_buffer_info)
130+
.setWaitSemaphoreInfos(wait_semaphore_info)
131+
.setSignalSemaphoreInfos(signal_semaphore_info);
132+
m_queue.submit2(submit_info, *render_sync.drawn);
133+
```
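
The chain of dependencies set up by this submission can be pictured with a small CPU-side model. It is a sketch only: `SyncModel` and its member functions are hypothetical, the bools stand in for the signaled state of the `draw` / `present` Semaphores and the `drawn` Fence, and real GPU work is asynchronous rather than a sequence of function calls.

```cpp
#include <cassert>

// Hypothetical CPU-side model of one RenderSync.
struct SyncModel {
    bool draw{false};
    bool present{false};
    bool drawn{false};

    // The Swapchain signals `draw` once the acquired image is ready.
    void image_ready() { draw = true; }

    // Queue submission: waits on (and consumes) `draw`, then signals
    // `present` and `drawn` on completion. Returns false if it cannot start.
    bool execute_submit() {
        if (!draw) { return false; }
        draw = false; // a binary semaphore wait unsignals it.
        present = true;
        drawn = true;
        return true;
    }

    // Present: waits on (and consumes) `present`.
    bool execute_present() {
        if (!present) { return false; }
        present = false;
        return true;
    }
};
```

The Semaphores order work on the GPU, while the Fence is the only object in this model that the CPU observes.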
134+
135+
The `draw` Semaphore will be signaled by the Swapchain when the image is ready, which will trigger this command buffer's execution. It will signal the `present` Semaphore and `drawn` Fence on completion, with the latter being waited on the next time this virtual frame is processed. Finally, we increment the frame index and pass the `present` Semaphore for the subsequent present operation to wait on:
136+
137+
```cpp
138+
m_frame_index = (m_frame_index + 1) % m_render_sync.size();
139+
140+
if (!m_swapchain->present(m_queue, *render_sync.present)) {
141+
m_swapchain->recreate(framebuffer_size);
142+
continue;
143+
}
144+
```
145+
146+
> Wayland users: congratulations, you can finally see and interact with the window!
147+
148+
![Cleared Image](./dynamic_rendering_red_clear.png)
149+
150+
## RenderDoc on Wayland
151+
152+
At the time of writing, RenderDoc doesn't support inspecting Wayland applications. Temporarily force X11 (XWayland) by calling `glfwInitHint()` before `glfwInit()`:
153+
154+
```cpp
155+
glfwInitHint(GLFW_PLATFORM, GLFW_PLATFORM_X11);
156+
```
157+
158+
Setting up a command line option to conditionally call this is a simple and flexible approach: just set that argument in RenderDoc itself and/or pass it whenever an X11 backend is desired:
159+
160+
```cpp
161+
// main.cpp
162+
// skip the first argument.
163+
auto args = std::span{argv, static_cast<std::size_t>(argc)}.subspan(1);
164+
while (!args.empty()) {
165+
auto const arg = std::string_view{args.front()};
166+
if (arg == "-x" || arg == "--force-x11") {
167+
glfwInitHint(GLFW_PLATFORM, GLFW_PLATFORM_X11);
168+
}
169+
args = args.subspan(1);
170+
}
171+
lvk::App{}.run();
172+
```
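
If the flag check needs to be tested without initializing GLFW, the parsing can be separated from the `glfwInitHint()` call. The sketch below assumes a hypothetical helper named `wants_x11`; the flag names are the ones used above.

```cpp
#include <cassert>
#include <string_view>

// Returns true if any argument after the program name requests X11.
// The function name is hypothetical; it is not part of the guide's code.
inline bool wants_x11(int argc, char const* const* argv) {
    // skip the first argument (program name).
    for (int i = 1; i < argc; ++i) {
        auto const arg = std::string_view{argv[i]};
        if (arg == "-x" || arg == "--force-x11") { return true; }
    }
    return false;
}
```

`main()` would then call `glfwInitHint(GLFW_PLATFORM, GLFW_PLATFORM_X11)` only when `wants_x11(argc, argv)` returns true.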

guide/src/rendering/render_sync.md

Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
# Render Sync
2+
3+
Create a new header `resource_buffering.hpp`:
4+
5+
```cpp
6+
// Number of virtual frames.
7+
inline constexpr std::size_t buffering_v{2};
8+
9+
// Alias for N-buffered resources.
10+
template <typename Type>
11+
using Buffered = std::array<Type, buffering_v>;
12+
```
13+
14+
Add a private `struct RenderSync` to `App`:
15+
16+
```cpp
17+
struct RenderSync {
18+
// signaled when Swapchain image has been acquired.
19+
vk::UniqueSemaphore draw{};
20+
// signaled when image is ready to be presented.
21+
vk::UniqueSemaphore present{};
22+
// signaled with present Semaphore, waited on before next render.
23+
vk::UniqueFence drawn{};
24+
// used to record rendering commands.
25+
vk::CommandBuffer command_buffer{};
26+
};
27+
```
28+
29+
Add the new members associated with the Swapchain loop:
30+
31+
```cpp
32+
// command pool for all render Command Buffers.
33+
vk::UniqueCommandPool m_render_cmd_pool{};
34+
// Sync and Command Buffer for virtual frames.
35+
Buffered<RenderSync> m_render_sync{};
36+
// Current virtual frame index.
37+
std::size_t m_frame_index{};
38+
```
39+
40+
Add, implement, and call the create function:
41+
42+
```cpp
43+
void App::create_render_sync() {
44+
// Command Buffers are 'allocated' from a Command Pool (which is 'created'
45+
// like all other Vulkan objects so far). We can allocate all the buffers
46+
// from a single pool here.
47+
auto command_pool_ci = vk::CommandPoolCreateInfo{};
48+
// this flag enables resetting the command buffer for re-recording (unlike a
49+
// single-time submit scenario).
50+
command_pool_ci.setFlags(vk::CommandPoolCreateFlagBits::eResetCommandBuffer)
51+
.setQueueFamilyIndex(m_gpu.queue_family);
52+
m_render_cmd_pool = m_device->createCommandPoolUnique(command_pool_ci);
53+
54+
auto command_buffer_ai = vk::CommandBufferAllocateInfo{};
55+
command_buffer_ai.setCommandPool(*m_render_cmd_pool)
56+
.setCommandBufferCount(static_cast<std::uint32_t>(buffering_v))
57+
.setLevel(vk::CommandBufferLevel::ePrimary);
58+
auto const command_buffers =
59+
m_device->allocateCommandBuffers(command_buffer_ai);
60+
assert(command_buffers.size() == m_render_sync.size());
61+
62+
// we create Render Fences as pre-signaled so that on the first render for
63+
// each virtual frame we don't wait on their fences (since there's nothing
64+
// to wait for yet).
65+
static constexpr auto fence_create_info_v =
66+
vk::FenceCreateInfo{vk::FenceCreateFlagBits::eSignaled};
67+
for (auto [sync, command_buffer] :
68+
std::views::zip(m_render_sync, command_buffers)) {
69+
sync.command_buffer = command_buffer;
70+
sync.draw = m_device->createSemaphoreUnique({});
71+
sync.present = m_device->createSemaphoreUnique({});
72+
sync.drawn = m_device->createFenceUnique(fence_create_info_v);
73+
}
74+
}
75+
```
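
`std::views::zip` requires C++23. On an older toolchain, an index-based loop does the same pairing. The sketch below uses hypothetical stand-ins (`MockSync`, and an `int` in place of `vk::CommandBuffer`) so that it compiles without Vulkan; `buffering_v` and `Buffered` are repeated from `resource_buffering.hpp`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

inline constexpr std::size_t buffering_v{2};

template <typename Type>
using Buffered = std::array<Type, buffering_v>;

// Hypothetical stand-in: an int plays the role of vk::CommandBuffer.
struct MockSync {
    int command_buffer{};
};

// Index-based equivalent of the std::views::zip loop.
inline void assign_command_buffers(Buffered<MockSync>& syncs,
                                   std::vector<int> const& command_buffers) {
    assert(command_buffers.size() == syncs.size());
    for (std::size_t i = 0; i < syncs.size(); ++i) {
        syncs[i].command_buffer = command_buffers[i];
    }
}
```

The same loop body can also create the Semaphores and Fence for each element.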

guide/src/rendering/swapchain_loop.md

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
# Swapchain Loop
2+
3+
One part of rendering in the main loop is the Swapchain loop, which at a high level comprises these steps:
4+
5+
1. Acquire a Swapchain Image (and its view)
6+
1. Render to the acquired Image
7+
1. Present the Image (this releases the image back to the Swapchain)
8+
9+
![WSI Engine](./wsi_engine.png)
10+
11+
There are a few nuances to deal with, for instance:
12+
13+
1. Acquiring (and/or presenting) will sometimes fail (e.g. because the Swapchain is out of date), in which case the remaining steps need to be skipped
14+
1. The acquire command can return before the image is actually ready for use, so rendering needs to be synchronized to start only after the image is ready
15+
1. The images need appropriate Layout Transitions at each stage
16+
17+
Additionally, the number of swapchain images can vary, whereas the engine should use a fixed number of _virtual frames_: 2 for double buffering, 3 for triple (more is usually overkill). It's also possible for the main loop to acquire the same image before a previous render command has finished (or even started), if the Swapchain is using Mailbox Present Mode. While FIFO will block until the oldest submitted image is available (also known as vsync), we should still synchronize and wait until the acquired image has finished rendering.
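
To illustrate why virtual frames and Swapchain images are tracked separately, the sketch below pairs the two indices over a few loop iterations. It assumes images are handed out round-robin, which is a simplification: a real Swapchain may return them in any order. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns (virtual frame index, swapchain image index) for each iteration.
// ASSUMPTION: images are acquired round-robin; real acquisition order is
// up to the implementation and present mode.
inline std::vector<std::pair<std::size_t, std::size_t>>
pair_frames_with_images(std::size_t virtual_frames, std::size_t image_count,
                        std::size_t iterations) {
    assert(virtual_frames > 0 && image_count > 0);
    auto ret = std::vector<std::pair<std::size_t, std::size_t>>{};
    ret.reserve(iterations);
    for (std::size_t i = 0; i < iterations; ++i) {
        ret.emplace_back(i % virtual_frames, i % image_count);
    }
    return ret;
}
```

With 2 virtual frames and 3 images, virtual frame 0 is paired with image 0 on the first iteration and image 2 on the third, so per-frame resources cannot be indexed by image (or vice versa).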
18+
19+
## Virtual Frames
20+
21+
All the dynamic resources used during the rendering of a frame comprise a virtual frame. The application has a fixed number of virtual frames which it cycles through, advancing by one each time a frame is rendered. Each virtual frame will be associated with a `vk::Fence` which will be waited on before rendering to it again. It will also have a pair of `vk::Semaphore`s to synchronize the acquire, render, and present calls on the GPU (we don't need to wait for them in application code). Lastly, there will be a Command Buffer per virtual frame, where all rendering commands for that frame (including layout transitions) will be recorded.
22+
23+
## Image Layouts
24+
25+
Vulkan Images have a property known as Image Layout. Most operations on images require them to be in certain specific layouts, requiring transitions before (and after). A layout transition conveniently also functions as a Pipeline Barrier (think memory barrier on the GPU), enabling us to synchronize operations before and after the transition.
26+
27+
Vulkan Synchronization is arguably the most complicated aspect of the API, so a good amount of research is recommended. Here is an [article explaining barriers](https://gpuopen.com/learn/vulkan-barriers-explained/).
