Vulkan API in 30 minutes
This article is written for readers who are already familiar with the concepts of D3D11 and GL and want to know how they map onto Vulkan, so it ends up being a conceptual tour of the API.
It does not try to be easy to follow; to understand the more obscure parts you should read the Vulkan specification or a more in-depth tutorial.
This is the first time I have sat down and written about Vulkan.
- baldurk
General
The appendix at the end of the article gives concise pseudocode illustrating the general steps needed to display a triangle.
Here are a few simple points that don't fit anywhere else:
- Vulkan is a C-style graphics API, similar to GL.
- The API is strongly typed; unlike GL, which passes everything as a GLenum, each Vulkan enumeration has its own type.
- Most parameters are passed via structures, which are often nested.
- Many vkCreate* functions take a VkAllocationCallbacks* parameter that lets you hook CPU-side memory allocation; you can simply pass NULL.
Note: I ignore error handling throughout, and I don't discuss querying implementation limits.
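As a pseudocode sketch of that call pattern (using vkCreateBuffer purely as an illustration; it assumes a VkDevice `dev` from the initialisation section below and is not runnable on its own):

```c
// Typical shape of a Vulkan call: a typed CreateInfo structure with
// sType/pNext, typed enums/flags, and a NULL VkAllocationCallbacks pointer.
VkBufferCreateInfo info = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = NULL,
    .size = 65536,
    .usage = VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, // typed VkBufferUsageFlags, not a GLenum
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
};

VkBuffer buf;
vkCreateBuffer(dev, &info, NULL, &buf); // NULL = let the driver allocate CPU-side memory
```

Designated initialisers zero out the fields that are omitted; the third argument is the VkAllocationCallbacks pointer mentioned above.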
Initialization steps
Initialising the Vulkan API starts with creating an instance (VkInstance). Vulkan instances are independent of each other, so you can set different properties on each (for example, which validation layers and extensions are enabled).
Through the VkInstance you can query which GPU devices are available. Vulkan does not necessarily run on a GPU, but let's simplify. Each GPU exposes a handle - a VkPhysicalDevice - which you can use to query the vendor, properties, capabilities, and so on.
From a VkPhysicalDevice you can create a VkDevice. The VkDevice is the main handle you make calls through; it logically corresponds to the GPU and is roughly equivalent to a GL context or D3D11 device.
A VkInstance can see more than one VkPhysicalDevice, and a VkPhysicalDevice can spawn more than one VkDevice. In Vulkan 1.0, cross-GPU work is not yet available (it may come in the future).
The rough initialisation sequence is: vkCreateInstance(), vkEnumeratePhysicalDevices(), vkCreateDevice(). For a rudimentary Hello Triangle program you can simply take the first physical device as the main VkDevice, enabling the appropriate features on it (error reporting, API call validation) for development.
Images and Buffers
Now that we have VkDevice, we can start creating any type of resource, such as VkImage and VkBuffer.
In contrast to GL, with Vulkan you have to declare how an image will be used up front, when you create it. You specify the usage as a bitmask - colour attachment, sampled image, image load/store, and so on.
You also specify the tiling of the image - LINEAR or OPTIMAL. This sets the layout of the image data in memory; linear tiling has a known layout that can be read and written directly, while optimal tiling is opaque.
Buffers are similar and simpler: you specify a size and a usage.
Images are not used directly; instead you create a VkImageView, similar to D3D11. Unlike GL texture views, Vulkan's image views are mandatory, but the idea is the same: the view describes which array slices or mip levels are exposed, and optionally a different (but compatible) format (such as viewing a UNORM texture as UINT).
Buffers are usually used directly, since they are just a block of memory, but if you want to use one as a texture buffer in a shader you need to create a VkBufferView.
GPU memory allocation
Newly created buffers and images are not immediately usable, because no memory has been allocated for them yet.
We can query the memory available to the application with vkGetPhysicalDeviceMemoryProperties(). It reports one or more memory heaps of given sizes, and one or more memory types with given properties. Each memory type comes from one heap - so a typical example is a discrete graphics card in a PC, which has two heaps, one for system memory and one for GPU memory, each with several memory types.
Memory types have different properties: some are CPU-visible or not, coherent between GPU and CPU access, cached or uncached, and so on.
Memory can be allocated by calling vkAllocateMemory(), but it requires a VkDevice handle and a description structure.
Host-visible memory can be updated by mapping it (vkMapMemory()/vkUnmapMemory()).
GL users will be familiar with this concept, but to explain it for D3D11 users: the pointer returned by vkMapMemory() can be held and written to by the CPU even while the GPU is using the memory. These persistent mappings are perfectly valid as long as you follow the rules and make sure memory access is synchronised.
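The mapping flow can be sketched in pseudocode (assuming `dev` and a VkDeviceMemory `mem` allocated from a host-visible memory type, plus some hypothetical `myData`/`myDataSize` to upload):

```c
// Sketch: updating host-visible memory through a mapping.
void *ptr = NULL;
vkMapMemory(dev, mem, 0, VK_WHOLE_SIZE, 0, &ptr);

memcpy(ptr, myData, myDataSize); // CPU writes; you must ensure the GPU isn't using this range

// for non-coherent memory types, make the writes visible to the GPU explicitly
VkMappedMemoryRange range = {
    .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
    .memory = mem,
    .offset = 0,
    .size = VK_WHOLE_SIZE,
};
vkFlushMappedMemoryRanges(dev, 1, &range);

// the mapping can persist across frames; unmapping is optional until the memory is freed
vkUnmapMemory(dev, mem);
```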
This is a little outside the scope of this guide but I'm going to mention it any chance I get - for the purposes of debugging, persistent maps of non-coherent memory with explicit region flushes will be much more efficient/fast than coherent memory. The reason being that for coherent memory the debugger must jump through hoops to detect and track changes, but the explicit flushes of non-coherent memory provide nice markup of modifications.
In RenderDoc to help out with this, if you flush a memory region then the tool assumes you will flush for every write, and turns off the expensive hoop-jumping to track coherent memory. That way even if the only memory available is coherent, then you can get efficient debugging.
Memory binding
By calling vkGetBufferMemoryRequirements()/vkGetImageMemoryRequirements() you can find the memory requirements of a VkBuffer/VkImage.
The reported size accounts for padding between mips, hidden metadata, and anything else the resource needs. The requirements also include a bitmask of the memory types compatible with the resource. The obvious case here is that an OPTIMAL-tiled colour attachment image may only allow device-local memory types, and trying to bind host-visible memory to it would be invalid.
If you have images or buffers of the same type, their memory type requirements will usually be the same. For example, if you know that an optimal tiled image can use memory type 3, you can allocate all such images from the same place; you only need to check the size and alignment requirements per image. To be completely sure of the guarantees, read the spec.
Note the memory allocation is by no means 1:1. You can allocate a large amount of memory and as long as you obey the above restrictions you can place several images or buffers in it at different offsets. The requirements include an alignment if you are placing the resource at a non-zero offset. In fact you will definitely want to do this in any real application, as there are limits on the total number of allocations allowed.
There is an additional alignment requirement bufferImageGranularity - a minimum separation required between memory used for a VkImage and memory used for a VkBuffer in the same VkDeviceMemory. Read the spec for more details, but this mostly boils down to an effective page size, and requirement that each page is only used for one type of resource.
Once you have the right memory type and size and alignment, you can bind it with vkBindBufferMemory or vkBindImageMemory. This binding is immutable, and must happen before you start using the buffer or image.
Command buffers and submission
Work is explicitly recorded to and submitted from a VkCommandBuffer.
A VkCommandBuffer isn't created directly, it is allocated from a VkCommandPool. This allows for better threading behaviour since command buffers and command pools must be externally synchronised (see later). You can have a pool per thread and vkAllocateCommandBuffers()/vkFreeCommandBuffers() command buffers from it without heavy locking.
Once you have a VkCommandBuffer you begin recording, issue all your GPU commands into it (hand waving goes here) and end recording.
Command buffers are submitted to a VkQueue. The notion of queues are how work becomes serialised to be passed to the GPU. A VkPhysicalDevice (remember way back? The GPU handle) can report a number of queue families with different capabilities. e.g. a graphics queue family and a compute-only queue family. When you create your device you ask for a certain number of queues from each family, and then you can enumerate them from the device after creation with vkGetDeviceQueue().
I'm going to focus on having just a single do-everything VkQueue as the simple case, since multiple queues must be synchronised against each other as they can run out of order or in parallel to each other. Be aware that some implementations might require you to use a separate queue for swapchain presentation - I think chances are that most won't, but you have to account for this. Again, read the spec for details!
You can vkQueueSubmit() several command buffers at once to the queue and they will be executed in turn. Nominally this defines the order of execution but remember that Vulkan has very specific ordering guarantees - mostly about what work can overlap rather than wholesale rearrangement - so take care to read the spec to make sure you synchronise everything correctly.
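The record/submit flow can be sketched in pseudocode (assuming `dev`, a VkCommandPool `pool` and a VkQueue `queue` already exist; the commands themselves and all synchronisation are hand-waved):

```c
// Allocate a command buffer from the pool (one pool per thread, remember).
VkCommandBufferAllocateInfo allocInfo = {
    .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO,
    .commandPool = pool,
    .level = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
    .commandBufferCount = 1,
};
VkCommandBuffer cmd;
vkAllocateCommandBuffers(dev, &allocInfo, &cmd);

// Record: begin, issue vkCmd* calls, end.
VkCommandBufferBeginInfo beginInfo = {
    .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
};
vkBeginCommandBuffer(cmd, &beginInfo);
// ... vkCmd* calls go here ...
vkEndCommandBuffer(cmd);

// Submit to the queue for execution.
VkSubmitInfo submit = {
    .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
    .commandBufferCount = 1,
    .pCommandBuffers = &cmd,
};
vkQueueSubmit(queue, 1, &submit, VK_NULL_HANDLE); // no fence, for brevity
```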
Shaders and PSO
The reasoning behind moving to monolithic PSOs is well trodden by now so I won't go over it.
A Vulkan VkPipeline bakes in a lot of state, but allows specific parts of the fixed function pipeline to be set dynamically: Things like viewport, stencil masks and refs, blend constants, etc. A full list as ever is in the spec. When you call vkCreateGraphicsPipelines(), you choose which states will be dynamic, and the others are taken from values specified in the PSO creation info.
You can optionally specify a VkPipelineCache at creation time. This allows you to compile a whole bunch of pipelines and then call vkGetPipelineCacheData() to save the blob of data to disk. Next time you can prepopulate the cache to save on PSO creation time. The expected caveats apply - there is versioning to be aware of so you can't load out of date or incorrect caches.
Shaders are specified as SPIR-V. This has already been discussed much better elsewhere, so I will just say that you create a VkShaderModule from a SPIR-V module, which could contain several entry points, and at pipeline creation time you chose one particular entry point.
The easiest way to get some SPIR-V for testing is with the reference compiler glslang, but other front-ends are available, as well as LLVM → SPIR-V support.
Binding model
To establish a point of reference, let's roughly outline D3D11's binding model. GL's is quite similar.
Each shader stage has its own namespace, so pixel shader texture binding 0 is not vertex shader texture binding 0.
Each resource type is namespaced apart, so constant buffer binding 0 is definitely not the same as texture binding 0.
Resources are individually bound and unbound to slots (or at best in contiguous batches).
In Vulkan, the base binding unit is a descriptor. A descriptor is an opaque representation that stores 'one bind'. This could be an image, a sampler, a uniform/constant buffer, etc. It could also be arrayed - so you can have an array of images that can be different sizes etc, as long as they are all 2D floating point images.
Descriptors aren't bound individually, they are bound in blocks in a VkDescriptorSet which each have a particular VkDescriptorSetLayout. The VkDescriptorSetLayout describes the types of the individual bindings in each VkDescriptorSet.
The easiest way I find to think about this is consider VkDescriptorSetLayout as being like a C struct type - it describes some members, each member having an opaque type (constant buffer, load/store image, etc). The VkDescriptorSet is a specific instance of that type - and each member in the VkDescriptorSet is a binding you can update with whichever resource you want it to contain.
This is roughly how you create the objects too. You pass a list of the types, array sizes and bindings to Vulkan to create a VkDescriptorSetLayout, then you can allocate VkDescriptorSets with that layout from a VkDescriptorPool. The pool acts the same way as VkCommandPool, to let you allocate descriptors on different threads more efficiently by having a pool per thread.
VkDescriptorSetLayoutBinding bindings[] = {
// binding 0 is a UBO, array size 1, visible to all stages
{ 0, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, 1, VK_SHADER_STAGE_ALL_GRAPHICS, NULL },
// binding 1 is a sampler, array size 1, visible to all stages
{ 1, VK_DESCRIPTOR_TYPE_SAMPLER, 1, VK_SHADER_STAGE_ALL_GRAPHICS, NULL },
// binding 5 is an image, array size 10, visible only to fragment shader
{ 5, VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, 10, VK_SHADER_STAGE_FRAGMENT_BIT, NULL },
};
Example C++ outlining creation of a descriptor set layout
Once you have a descriptor set, you can update it directly to put specific values in the bindings, and also copy between different descriptor sets.
When creating a pipeline, you specify N VkDescriptorSetLayouts for use in a VkPipelineLayout. Then when binding, you have to bind matching VkDescriptorSets of those layouts. The sets can update and be bound at different frequencies, which allows grouping all resources by frequency of update.
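As a pseudocode sketch (assuming `dev`, the VkDescriptorSetLayout from the C++ sample above as `setLayout`, a recording VkCommandBuffer `cmd`, and an allocated VkDescriptorSet `descSet`):

```c
// A pipeline layout made of one descriptor set layout.
VkPipelineLayoutCreateInfo layoutInfo = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
    .setLayoutCount = 1,
    .pSetLayouts = &setLayout,
};
VkPipelineLayout pipeLayout;
vkCreatePipelineLayout(dev, &layoutInfo, NULL, &pipeLayout);

// Later, while recording: bind a descriptor set matching that layout.
vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS,
                        pipeLayout, 0 /* first set */, 1, &descSet,
                        0, NULL /* no dynamic offsets */);
```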
To extend the above analogy, this defines the pipeline as something like a function, and it can take some number of structs as arguments. When creating the pipeline you declare the types (VkDescriptorSetLayouts) of each argument, and when binding the pipeline you pass specific instances of those types (VkDescriptorSets).
The other side of the equation is fairly simple - instead of having shader or type namespaced bindings in your shader code, each resource in the shader simply says which descriptor set and binding it pulls from. This matches the descriptor set layout you created.
#version 430
layout(set = 0, binding = 0) uniform MyUniformBufferType {
// ...
} MyUniformBufferInstance;
// note in the C++ sample above, this is just a sampler - not a combined image+sampler
// as is typical in GL.
layout(set = 0, binding = 1) uniform sampler MySampler;
layout(set = 0, binding = 5) uniform texture2D MyImages[10];
Example GLSL showing bindings
Synchronization
I'm going to hand wave a lot in this section because the specific things you need to synchronise get complicated and long-winded fast, and I'm just going to focus on what synchronisation is available and leave the details of what you need to synchronise to reading of specs or more in-depth documents.
This is probably the hardest part of Vulkan to get right, especially since missing synchronisation might not necessarily break anything when you run it!
Several types of objects must be 'externally synchronised'. In fact I've used that phrase before in this post. The meaning is basically that if you try to use the same VkQueue on two different threads, there's no internal locking so it will crash - it's up to you to 'externally synchronise' access to that VkQueue.
For the exact requirements of which objects must be externally synchronised and when, you should check the spec, but as a rule you can use the VkDevice freely in creation functions - it is internally locked for allocation's sake - while things like recording and submitting commands must be externally synchronised.
N.B. There is no explicit or implicit ref counting of any object - you can't destroy anything until you are sure it is never going to be used again by either the CPU or the GPU.
Vulkan has VkEvent, VkSemaphore and VkFence which can be used for efficient CPU-GPU and GPU-GPU synchronisation. They work as you expect so you can look up the precise use etc yourself, but there are no surprises here. Be careful that you do use synchronisation though, as there are few ordering guarantees in the spec itself.
Pipeline barriers are a new concept, used in general terms for ensuring ordering of GPU-side operations where necessary - for example ensuring that results from one operation are complete before another operation starts, or that all work of one type finishes on a resource before it's used for work of another type.
There are three types of barrier - VkMemoryBarrier, VkBufferMemoryBarrier and VkImageMemoryBarrier. A VkMemoryBarrier applies to memory globally, and the other two apply to specific resources (and subsections of those resources).
The barrier takes a bit field of different memory access types to specify what operations on each side of the barrier should be synchronised against the other. A simple example of this would be "this VkImageMemoryBarrier has srcAccessMask = ACCESS_COLOR_ATTACHMENT_WRITE and dstAccessMask = ACCESS_SHADER_READ", which indicates that all color writes should finish before any shader reads begin - without this barrier in place, you could read stale data.
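That example barrier can be sketched in pseudocode (assuming a recording VkCommandBuffer `cmd` and a VkImage `image`; the layout fields are explained in the next section):

```c
// All colour writes to 'image' must finish before any shader reads begin.
VkImageMemoryBarrier barrier = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
    .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
    .oldLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
    .newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
    .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .image = image,
    // colour aspect, mip 0 (count 1), array layer 0 (count 1)
    .subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 },
};
vkCmdPipelineBarrier(cmd,
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, // after colour writes
    VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,         // before fragment shader reads
    0, 0, NULL, 0, NULL, 1, &barrier);
```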
Image layouts
Image barriers have one additional property - images exist in states called image layouts. VkImageMemoryBarrier can specify a transition from one layout to another. The layout must match how the image is used at any time. There is a GENERAL layout which is legal to use for anything but might not be optimal, and there are optimal layouts for color attachment, depth attachment, shader sampling, etc.
Images begin in either the UNDEFINED or PREINITIALIZED state (you can choose). The latter is useful for populating an image with data before use, as the UNDEFINED layout has undefined contents - a transition from UNDEFINED to GENERAL may lose the contents, but PREINITIALIZED to GENERAL won't. Neither initial layout is valid for use by the GPU, so at minimum after creation an image needs to be transitioned into some appropriate state.
Usually you have to specify the previous and new layouts accurately, but it is always valid to transition from UNDEFINED to another layout. This basically means 'I don't care what the image was like before, throw it away and use it like this'.
RenderPass
A VkRenderpass is Vulkan's way of more explicitly denoting how your rendering happens, rather than letting you render into then sample images at will. More information about how the frame is structured will aid everyone, but primarily this is to aid tile based renderers so that they have a direct notion of where rendering on a given target happens and what dependencies there are between passes, to avoid leaving tile memory as much as possible.
N.B. Because I primarily work on desktops (and for brevity & simplicity) I'm not mentioning a couple of optional things you can do that aren't commonly suited to desktop GPUs like input and transient attachments. As always, read the spec :).
The first building block is a VkFramebuffer, which is a set of VkImageViews. This is not necessarily the same as the classic idea of a framebuffer as the particular images you are rendering to at any given point, as it can contain potentially more images than you ever render to at once.
A VkRenderPass consists of a series of subpasses. In your simple triangle case and possibly in many other cases, this will just be one subpass. For now, let's just consider that case. The subpass selects some of the framebuffer attachments as color attachments and maybe one as a depth-stencil attachment. If you have multiple subpasses, this is where you might have different subsets used in each subpass - sometimes as output and sometimes as input.
Drawing commands can only happen inside a VkRenderPass, and some commands such as copies and clears can only happen outside one. Other commands, such as state binding, can happen inside or outside at will. Consult the spec to see which commands are which.
Subpasses do not inherit state at all, so each time you start a VkRenderPass or move to a new subpass you have to bind/set all of the state. Subpasses also specify an action both for loading and storing each attachment. This allows you to say 'the depth should be cleared to 1.0, but the color can be initialised to garbage for all I care - I'm going to fully overwrite the screen in this pass'. Again, this can provide useful optimisation information that the driver no longer has to guess.
The last consideration is compatibility between these different objects. When you create a VkRenderPass (and all of its subpasses) you don't reference anything else, but you do specify both the format and use of all attachments. Then when you create a VkFramebuffer you must choose a VkRenderPass that it will be used with. This doesn't have to be the exact instance that you will later use, but it does have to be compatible - the same number and format of attachments. Similarly when creating a VkPipeline you have to specify the VkRenderPass and subpass that it will be used with, again not having to be identical but required to be compatible.
There are more complexities to consider if you have multiple subpasses within your render pass, as you have to declare barriers and dependencies between them, and annotate which attachments must be used for what. Again, if you're looking into that read the spec.
Backbuffers and display
I'm only going to talk about this fairly briefly because not only is it platform-specific but it's fairly straightforward.
Note that Vulkan exposes native window system integration via extensions, so you will have to request them explicitly when you create your VkInstance and VkDevice.
To start with, you create a VkSurfaceKHR from whatever native windowing information is needed.
Once you have a surface you can create a VkSwapchainKHR for that surface. You'll need to query for things like what formats are supported on that surface, how many backbuffers you can have in the chain, etc.
You can then obtain the actual images in the VkSwapchainKHR via vkGetSwapchainImagesKHR(). These are normal VkImage handles, but you don't control their creation or memory binding - that's all done for you. You will have to create an VkImageView each though.
When you want to render to one of the images in the swapchain, you can call vkAcquireNextImageKHR() that will return to you the index of the next image in the chain. You can render to it and then call vkQueuePresentKHR() with the same index to have it presented to the display.
There are many more subtleties and details if you want to get really optimal use out of the swapchain, but for the dead-simple hello world case, the above suffices.
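The per-frame flow can be sketched in pseudocode (assuming `dev`, `swap`, a VkQueue `queue` and a VkSemaphore `sem`; rendering and the semaphore wait/signal wiring are hand-waved):

```c
// Acquire the next backbuffer, render to it, then present it.
uint32_t idx = 0;
vkAcquireNextImageKHR(dev, swap, UINT64_MAX, sem, VK_NULL_HANDLE, &idx);

// ... record and submit command buffers rendering to swapchain image 'idx',
//     waiting on 'sem' so rendering starts only once the image is ready ...

VkPresentInfoKHR present = {
    .sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR,
    .swapchainCount = 1,
    .pSwapchains = &swap,
    .pImageIndices = &idx,
};
vkQueuePresentKHR(queue, &present);
```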
Appendix (Pseudo Code Example)
#include <vulkan/vulkan.h>
// This is roughly what the pseudocode of a Vulkan application looks like. Most of the creation structures and all synchronisation and error checking are omitted.
// This is not a copy-and-paste tutorial!
void DoVulkanRendering()
{
const char *extensionNames[] = { "VK_KHR_surface", "VK_KHR_win32_surface" };
// The contents of later structures are not spelled out in detail; this is just for illustration.
// Application info is optional (you can specify application/engine name and version)
// Note we activate the WSI instance extensions, provided by the ICD to
// allow us to create a surface (win32 is an example, there's also xcb/xlib/etc)
VkInstanceCreateInfo instanceCreateInfo = {
VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO, // VkStructureType sType;
NULL, // const void* pNext;
0, // VkInstanceCreateFlags flags;
NULL, // const VkApplicationInfo* pApplicationInfo;
0, // uint32_t enabledLayerNameCount;
NULL, // const char* const* ppEnabledLayerNames;
2, // uint32_t enabledExtensionNameCount;
extensionNames, // const char* const* ppEnabledExtensionNames;
};
VkInstance inst;
vkCreateInstance(&instanceCreateInfo, NULL, &inst);
// Normally you'd call the enumeration with the last parameter NULL to get the
// count, then call again to fetch the device handles.
VkPhysicalDevice phys[4]; uint32_t physCount = 4;
vkEnumeratePhysicalDevices(inst, &physCount, phys);
VkDeviceCreateInfo deviceCreateInfo = {
// Skip over
};
VkDevice dev;
vkCreateDevice(phys[0], &deviceCreateInfo, NULL, &dev);
// Obtain the vkCreateWin32SurfaceKHR extension function pointer via vkGetInstanceProcAddr
VkWin32SurfaceCreateInfoKHR surfaceCreateInfo = {
// HINSTANCE, HWND, etc
};
VkSurfaceKHR surf;
vkCreateWin32SurfaceKHR(inst, &surfaceCreateInfo, NULL, &surf);
VkSwapchainCreateInfoKHR swapCreateInfo = {
// surf goes in here
};
VkSwapchainKHR swap;
vkCreateSwapchainKHR(dev, &swapCreateInfo, NULL, &swap);
// Again this should be properly enumerated
VkImage images[4]; uint32_t swapCount = 4;
vkGetSwapchainImagesKHR(dev, swap, &swapCount, images);
// Synchronization is needed here
uint32_t currentSwapImage;
vkAcquireNextImageKHR(dev, swap, UINT64_MAX, presentCompleteSemaphore, NULL, &currentSwapImage);
// Create an image view for the acquired backbuffer image
VkImageView backbufferView;
vkCreateImageView(dev, &backbufferViewCreateInfo, NULL, &backbufferView);
VkQueue queue;
vkGetDeviceQueue(dev, 0, 0, &queue);
VkRenderPassCreateInfo renderpassCreateInfo = {
// here you will specify the total list of attachments
// (which in this case is just one, that's e.g. R8G8B8A8_UNORM)
// as well as describe a single subpass, using that attachment
// for color and with no depth-stencil attachment
};
VkRenderPass renderpass;
vkCreateRenderPass(dev, &renderpassCreateInfo, NULL, &renderpass);
VkFramebufferCreateInfo framebufferCreateInfo = {
// include backbufferView here to render to, and renderpass to be
// compatible with.
};
VkFramebuffer framebuffer;
vkCreateFramebuffer(dev, &framebufferCreateInfo, NULL, &framebuffer);
VkDescriptorSetLayoutCreateInfo descSetLayoutCreateInfo = {
// whatever we want to match our shader. e.g. Binding 0 = UBO for a simple
// case with just a vertex shader UBO with transform data.
};
VkDescriptorSetLayout descSetLayout;
vkCreateDescriptorSetLayout(dev, &descSetLayoutCreateInfo, NULL, &descSetLayout);
VkPipelineLayoutCreateInfo pipeLayoutCreateInfo = {
// the pipeline layout contains the descSetLayout created above
};
VkPipelineLayout pipeLayout;
vkCreatePipelineLayout(dev, &pipeLayoutCreateInfo, NULL, &pipeLayout);
// Upload SPIR-V shaders
VkShaderModule vertModule, fragModule;
vkCreateShaderModule(dev, &vertModuleInfoWithSPIRV, NULL, &vertModule);
vkCreateShaderModule(dev, &fragModuleInfoWithSPIRV, NULL, &fragModule);
VkGraphicsPipelineCreateInfo pipeCreateInfo = {
// There are many structures that need to be fully filled.
// It will point to shaders, pipeline layouts, and RenderPass.
};
VkPipeline pipeline;
vkCreateGraphicsPipelines(dev, NULL, 1, &pipeCreateInfo, NULL, &pipeline);
VkDescriptorPoolCreateInfo descPoolCreateInfo = {
// the creation info states how many descriptor sets are in this pool
};
VkDescriptorPool descPool;
vkCreateDescriptorPool(dev, &descPoolCreateInfo, NULL, &descPool);
VkDescriptorSetAllocateInfo descAllocInfo = {
// from pool descPool, with layout descSetLayout
};
VkDescriptorSet descSet;
vkAllocateDescriptorSets(dev, &descAllocInfo, &descSet);
VkBufferCreateInfo bufferCreateInfo = {
// buffer for uniform usage, of appropriate size
};
VkMemoryAllocateInfo memAllocInfo = {
// skipping querying for memory requirements. Let's assume the buffer
// can be placed in host visible memory.
};
VkBuffer buffer;
VkDeviceMemory memory;
vkCreateBuffer(dev, &bufferCreateInfo, NULL, &buffer);
vkAllocateMemory(dev, &memAllocInfo, NULL, &memory);
vkBindBufferMemory(dev, buffer, memory, 0);
void *data = NULL;
vkMapMemory(dev, memory, 0, VK_WHOLE_SIZE, 0, &data);
// fill data pointer with lovely transform goodness
vkUnmapMemory(dev, memory);
VkWriteDescriptorSet descriptorWrite = {
// write the details of our UBO buffer into binding 0
};
vkUpdateDescriptorSets(dev, 1, &descriptorWrite, 0, NULL);
// Finally, we can render!
// ... or not quite: we still need a command buffer to record into.
VkCommandPoolCreateInfo commandPoolCreateInfo = {
// nothing interesting here
};
VkCommandPool commandPool;
vkCreateCommandPool(dev, &commandPoolCreateInfo, NULL, &commandPool);
VkCommandBufferAllocateInfo commandAllocInfo = {
// allocated from commandPool
};
VkCommandBuffer cmd;
vkAllocateCommandBuffers(dev, &commandAllocInfo, &cmd);
// Now we can record the rendering commands.
vkBeginCommandBuffer(cmd, &cmdBeginInfo);
vkCmdBeginRenderPass(cmd, &renderpassBeginInfo, VK_SUBPASS_CONTENTS_INLINE);
// Binding pipeline
vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
// Binding descriptor
vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS,
pipeLayout, 0, 1, &descSet, 0, NULL);
// Set the rendering viewport
vkCmdSetViewport(cmd, 0, 1, &viewport);
// Draw a triangle
vkCmdDraw(cmd, 3, 1, 0, 0);
vkCmdEndRenderPass(cmd);
vkEndCommandBuffer(cmd);
VkSubmitInfo submitInfo = {
// this contains a reference to the above cmd to submit
};
vkQueueSubmit(queue, 1, &submitInfo, NULL);
// Now we can present
VkPresentInfoKHR presentInfo = {
// swap and currentSwapImage are used here
};
vkQueuePresentKHR(queue, &presentInfo);
// Wait for all work to complete, then destroy the objects
}