Cheat Sheet

• Concurrency is when we have multiple computations being executed at once – Generally a program is synchronous, but it may run asynchronously with other programs at the same time, depending on the Operating System

A Process is an instance of a program being executed – This presentation is running in a process

Multitasking is the term given when an Operating System (OS) is able to run multiple Processes at once – With time-sharing, each core generally runs multiple tasks

Each process can also spawn separate Threads that may run on other cores

In computing the term Thread is short for Thread of Execution – A Process runs a Thread when it begins

Processes are able to spawn multiple threads

A Thread is simply a function that runs asynchronously to other threads

Threads can read and write to the same memory locations

When the result of two threads accessing the same data depends on the order in which they happen to run, and that order is not the one we expected, this is called a Race Condition

One way to prevent Race Conditions is through the use of a Mutex – Short for Mutual Exclusion

A Critical Section is the region of code where a race condition can occur

A Mutex is an object that can be locked and unlocked, but can only be locked by one thread at a time – If a thread tries to lock an already locked Mutex then it will pause and wait until the mutex is unlocked
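A minimal sketch of that idea with std::mutex (the counter and thread counts are illustrative): each increment is wrapped in a lock, so only one thread can be inside the critical section at a time.

```cpp
#include <mutex>
#include <thread>

// A shared counter guarded by a mutex. Without the lock, the two
// threads below would race on 'counter' (a classic race condition).
std::mutex counterMutex;
long counter = 0;

void add_many(int n) {
    for (int i = 0; i < n; ++i) {
        // Locks here; unlocks automatically when 'lock' is destroyed.
        std::lock_guard<std::mutex> lock(counterMutex);
        ++counter;
    }
}
```

With the mutex in place, two threads each calling `add_many(50000)` always leave the counter at exactly 100000; without it, increments can be lost.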

One problem that can occur with threads is Deadlock – A Deadlock occurs when threads hold locks while waiting on each other in a circular chain of dependencies, so none of them can ever proceed
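A sketch (C++17, illustrative names) of avoiding that circular wait: if one thread locks A then B while another locks B then A, each can end up holding one mutex and waiting forever for the other. std::scoped_lock acquires multiple mutexes as one atomic step, so the cycle cannot form.

```cpp
#include <mutex>
#include <thread>

std::mutex mutexA, mutexB;
int shared1 = 0, shared2 = 0;

// Both functions need both mutexes. Locking them one at a time in
// different orders risks deadlock; std::scoped_lock uses a
// deadlock-avoidance algorithm to take them together.
void transfer_ab() {
    std::scoped_lock lock(mutexA, mutexB);
    ++shared1;
    ++shared2;
}

void transfer_ba() {
    std::scoped_lock lock(mutexB, mutexA); // listing order doesn't matter
    ++shared1;
    ++shared2;
}
```

Running both functions from separate threads completes without deadlock, whereas naive back-to-back `lock()` calls in opposite orders can hang.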

Pipeline/System Threads – Will run in ‘lockstep’ if information needs to be transferred

Dedicated Task Threads
• A thread which runs tasks which:
– Only run occasionally
– Are usually expensive and would slow the main thread down
• May have more than one task assigned to it

Pattern 3 – Parallel Tasks
– Break larger systems down into smaller independent tasks – These can then be run in parallel on their own threads
– A more complex system may include dependencies – Tasks can only be started once certain tasks have been completed
– Tasks can be prioritised to allow more important tasks to complete first

– Only one thread should access memory during a write operation

There are three main ways to control memory access during a thread write operation – Critical Sections (Mutex) – Atomics (std::atomic) – Non-Blocking Algorithms

Mutex Advantages • Quite easy to implement • Programmer has direct control • Allows multiple operations to be done in a thread-safe manner. Mutex Disadvantages • Comparatively slow, since waiting threads are blocked • A single thread can continue to block other threads if it re-acquires the lock too quickly

An atomic variable can be used like a normal variable, but each operation on it executes indivisibly, typically as a single hardware instruction
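A sketch of a shared counter using std::atomic instead of a mutex (names are illustrative): `fetch_add` is a single indivisible read-modify-write, so no lock is needed.

```cpp
#include <atomic>
#include <thread>

// No mutex required: each fetch_add is one atomic read-modify-write,
// so concurrent increments can never be lost.
std::atomic<long> hits{0};

void count_hits(int n) {
    for (int i = 0; i < n; ++i)
        hits.fetch_add(1);
}
```

Two threads each calling `count_hits(50000)` always leave `hits` at 100000, just like the mutex version, but without blocking.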

Atomics and std::atomic Advantages • Can be easier to make a single variable thread-safe than other solutions • Impossible to cause deadlocks – Unlike with a mutex, which may leave a thread waiting for a lock indefinitely • Once initialised, code looks quite similar to the non-thread-safe version. Atomics and std::atomic Disadvantages • Only works at the single-operation level – Cannot make whole blocks of code thread-safe • More computationally expensive than plain data types when thread-safety isn't required • Only works for certain types
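A common lock-free pattern built on atomics is the compare-and-swap (CAS) retry loop. This sketch (illustrative names) publishes a new high score: if another thread changed the value in the meantime, `compare_exchange_weak` reloads the observed value and we simply decide again.

```cpp
#include <atomic>

std::atomic<int> highScore{0};

// Lock-free maximum update. compare_exchange_weak only stores 'score'
// if highScore still equals 'observed'; on failure it refreshes
// 'observed' and the loop retries. No thread ever blocks.
void submitScore(int score) {
    int observed = highScore.load();
    while (score > observed &&
           !highScore.compare_exchange_weak(observed, score)) {
        // 'observed' now holds the latest value; the condition re-checks it
    }
}
```

This is lock-free rather than wait-free: an individual thread may retry repeatedly under contention, but some thread always makes progress.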

Non-Blocking Algorithms
• Defined as: an algorithm is non-blocking if failure or suspension of any thread cannot cause failure or suspension of another thread
• Two types of Non-Blocking Algorithms: Wait-Free and Lock-Free

Wait-Free Algorithms
• Every thread is guaranteed to complete its operation in a bounded number of steps, regardless of what other threads do

Lock-Free Algorithms
• Lock-Free Algorithms are simply algorithms where at least one thread will always be executing
– Individual threads can block, but never all of them at the same time
– The program itself should progress even if some threads are blocked
• From "The Art of Multiprocessor Programming": "In an infinite execution, infinitely often some method call finishes."
• Some of the threading patterns in the previous presentations can be handled with lock-free implementations
• Easier to code than Wait-Free, but still allows some threads to be blocked some of the time

Non-Blocking Advantages
• A correct implementation can ensure that threads aren't prevented from doing work by any other thread
Non-Blocking Disadvantages
• Hard to get right
• Can require higher memory overhead
• Very strict usage requirements – Particularly for wait-free algorithms

Networking
• TCP/IP and UDP are the most commonly used protocols in games
• HTTP is also used when we need to access a web server
• Every computer connected to the network has its own unique IP address – For IPv4 this is a 32-bit value, normally written as a set of 4 single-byte values separated by dots
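As a sketch of how those 4 dotted bytes map onto one 32-bit value (helper names are illustrative):

```cpp
#include <cstdint>
#include <string>

// "a.b.c.d" packed big-end-first into a single 32-bit value.
uint32_t packIPv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
    return (uint32_t(a) << 24) | (uint32_t(b) << 16) |
           (uint32_t(c) << 8)  |  uint32_t(d);
}

// The reverse: pull each byte back out and join with dots.
std::string unpackIPv4(uint32_t ip) {
    return std::to_string((ip >> 24) & 0xFF) + "." +
           std::to_string((ip >> 16) & 0xFF) + "." +
           std::to_string((ip >> 8)  & 0xFF) + "." +
           std::to_string(ip & 0xFF);
}
```

For example, 192.168.0.1 packs to 0xC0A80001.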
• Each computer also has 65,536 port numbers (0–65535) that an application can communicate through, in addition to its IP address
• A computer does not need to be connected directly to the computer it is communicating with – Special hardware (Routers, Switches) can forward a message to the proper computer
• A chunk of data sent over a network is typically called a packet
– The packet (along with the destination IP, port, error-checking bits and a few other pieces of data) is sent from the source
– All the destination application receives is a series of bytes
– The first byte of every packet you send can simply be a code for what the rest of the data in the packet is
• There are a number of different Network APIs that you can use to implement networking in your projects:
– Winsock (Windows only, very low level)
– Socket (Linux/Unix only, very low level)
– Raknet (Cross platform, high level, Free)
– POCO (Cross platform, high level, Free)
– ICE (Cross platform, high level, Free for non-commercial)
• As the previous example displayed, information doesn't have to be sent every single frame
– Packets can also be batched into a single packet holding multiple message data streams
– Some information should have higher priority than others, and packets can usually be sent with a priority

TCP/IP
• After a packet is sent, the sender waits for an acknowledgement packet to be received – If it doesn't receive one within a given time frame, it resends the packet
• This allows packets to be guaranteed to be delivered, and to be received in order
• But it requires the sender to wait for acknowledgement before sending more data – slow on bad connections
• Great for data that –must– arrive at the receiver, both in order and completely

UDP
• Will just send each packet once – Does not wait for acknowledgement
• Can send packets as soon as they are built, rather than waiting for a specific order of packets to be sent
• Great for information that is sent often enough that it doesn't matter if it gets lost (or arrives a little late)

• The time taken for packets to travel is referred to as Latency
• Network errors occur, and sometimes packets never make it to their final destination – This is commonly referred to as Packet Loss
• TCP packets will be resent if they are lost – But remember that this slows down the transmission of additional packets
• UDP packets will not be resent, and the lost data will simply never arrive
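Earlier it was noted that the first byte of every packet can be a code for what the rest of the data is. A minimal sketch of building such a packet (the message codes and layout are hypothetical, not from any particular library):

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical message codes: the first byte tells the receiver how
// to interpret the remaining bytes.
enum MsgType : uint8_t { MSG_POSITION = 1, MSG_CHAT = 2 };

// Builds [type byte][x as 4 raw bytes][y as 4 raw bytes].
std::vector<uint8_t> buildPositionPacket(float x, float y) {
    std::vector<uint8_t> packet;
    packet.push_back(MSG_POSITION);
    uint8_t raw[sizeof(float)];
    std::memcpy(raw, &x, sizeof(float)); // copy raw float bytes as payload
    packet.insert(packet.end(), raw, raw + sizeof(float));
    std::memcpy(raw, &y, sizeof(float));
    packet.insert(packet.end(), raw, raw + sizeof(float));
    return packet;
}
```

A real protocol would also agree on byte order between machines; this sketch just shows the type-byte framing idea.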

Peer to Peer
• In a peer-to-peer setup each player connects directly to every other player
• When one player's game world is updated, it sends a message saying what changed to all the other players
• Each of the other players receives the update message and updates their game state accordingly
• Advantages:
– No single point of failure
– Reduced latency, since messages are not relayed through a server
• Disadvantages:
– Large amounts of traffic • Each client must send and receive n-1 messages
– No central storage • Mods and updates harder to maintain and distribute
– Easier to cheat
– No single world state • If two players disagree about where something should be, how do we decide who is right?

Server / Clients
• We can instead have a central server
• Advantages:
– Each client sends to one location
– Easier cheat proofing
– Central point for updates and mods
• Disadvantages:
– Central point of failure

Server Clusters – MMO server architecture
• MMOs typically need more than just one server to share the load of simulating their entire world
• So they don't just need to figure out how users connect to them, but which user connects to which server, and also how servers communicate with each other
• Three main types:
– Servers are shards • Each shard replicates the world • Each world is independent • Players meet in specific shards
– Servers are mirrors • Each mirror replicates the world • All worlds are synchronised • Players see everyone • Mirrors need to be consistent
– Servers are regions • The world is split across servers • Each server hosts a region • Servers need to be consistent

Dead Reckoning
• First used by sailors to calculate their position based on direction and distance/time travelled
• Not perfect, but better than having no idea where you are!
• Can be used in network games when a client is waiting for information from the server…
• Dead Reckoning occasionally gets it wrong.
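The dead-reckoning idea above can be sketched in a few lines (names are illustrative): extrapolate from the last known position and velocity by the time elapsed since that update.

```cpp
struct Vec2 { float x, y; };

// Guess where a remote player probably is now, given the last state
// the server sent and the seconds elapsed since that packet arrived.
Vec2 deadReckon(Vec2 lastPos, Vec2 lastVel, float secondsSinceUpdate) {
    return { lastPos.x + lastVel.x * secondsSinceUpdate,
             lastPos.y + lastVel.y * secondsSinceUpdate };
}
```

When the next real update arrives, the guess is corrected, which is where snapping and interpolation (below) come in.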
• If we simply placed the player at the correct location when the packet is received then we may get an artefact called Snapping
• One way to overcome snapping is to use interpolation – In some games the player does not necessarily need to be standing in the exact location on both screens
• We can also interpolate the client back to the correct position slowly as the agent is moving – When done properly, this can hide any 'fixing' of a unit's position from the player completely
• Only do this when exact player positioning is not required
• Due to the unpredictability of networks, sometimes your packets will come through corrupted or incorrect – One method of detecting this is to provide a checksum for all data that is sent across the network

Sound
• Sound is the oscillation of pressure transmitted through a medium
• Amplitude: the volume of a sound
• Wavelength: how long before the wave repeats again
• Hertz (Hz): the number of cycles per second that the wave repeats – Often referred to as frequency
• Storing a sound digitally isn't as easy as storing it on an analogue medium
• Audio is stored as a series of discrete numbers, known as samples
• PCM encoding is simply sampling that wave a certain number of times per second, and using a specific number of bits per sample:
– ms = samples * 1000 / samplerate
– samples = ms * samplerate / 1000
– samplerate = samples * 1000 / ms
– bytes = samples * bits * channels / 8
– samples = bytes * 8 / bits / channels
• Samples:
– Good for small sounds that need to be played more than once at a time (e.g. sound effects)
– Generally use little or no CPU to play back, and can be hardware accelerated
• Streams:
– Good for sounds too large to fit into memory, which need to be streamed from disk into a small ring buffer
– Take a small amount of CPU and disk bandwidth depending on the file format – For example, MP3 takes more CPU power to decode in real time than an uncompressed PCM wav file does
– A streaming sound can only be played once at a time, since it has only one file handle per stream and one ring buffer to decode into

Build Systems
• Build Systems are tools set up to automatically build your project when changes are made, or at set times
• Continuous Integration (CI) is when automated builds happen regularly
• Commit hooks or polling – The build server can pull the latest check-ins from version control before rebuilding
• Build script or CI plugin – Triggers a build / compilation of the project
• Other tasks:
– Process assets: compile shaders, convert assets to binary, etc.
– Packaging: compress assets into resource files (faster loading)
– Unit tests
– Check for asset errors: missing textures, shader errors, etc.
• Report results
– Artists notified of missing assets
– Programmers notified of compile errors / assert locations
– QA notified of failed unit-test play-throughs so that they can focus their human testing

Neural Networks
• A Neural Network is a collection of nodes, called Neurons, arranged in a graph-like structure
– Neurons can be one of three types: Input Neurons, Output Neurons, Hidden Neurons
– Neurons are connected via Synapses • A Synapse passes a charge from one Neuron to another, called an Action Potential (AP)
• Input Neurons are connected to the Hidden Neurons, which are then connected to Output Neurons
– Input Neurons can be thought of as receiving input such as a smell, or colour, and Output Neurons are what make us think "smells like oranges"

Neurons and Synapses
• Input Neurons are triggered in some way – For example, Smell and Colour
• Triggered neurons send an Action Potential to the Hidden Neurons they are connected to via Synapses – The Synapse applies a weighted modifier to the Action Potential as it passes it along
• Hidden Neurons sum the APs they receive, and if the sum passes some threshold value for that Hidden Neuron then it sends along its own AP – Hidden Neurons can be connected to other Hidden Neurons
• Eventually the APs make their way to Output Neurons – If the AP for an Output Neuron passes its threshold then that output is "triggered"
• Typically Neurons need to be trained to adjust their trigger threshold, and new Synapse links need to be formed when new data is presented
• Neural Networks need a lot of initial setup and training – Neural Networks are a form of Supervised Learning
– When we first see an Orange fruit, a new Synapse associates the colour orange with the fruit

Advantages
• Can arrive at an Output via many different Inputs
– The colour orange, and it's dinner time, therefore Carrots!
– The colour orange, and I'm at an orchard, therefore Oranges!
• Can be trained for many situations

Disadvantages
• Requires training – Can be trained to do anything, but it has no way to train for it unless you create the algorithms to do so
• Not easily controlled by a game designer

Genetic Algorithms
• Another form of learning algorithm is a Genetic Algorithm (GA), which is a type of Evolutionary Algorithm
• GAs attempt to mimic the process of Natural Selection, i.e. Survival of the Fittest
– Uses a Population of potential solutions
– Evaluates the Fitness or "survivability" of each solution
– Combines the fittest solutions to create new solutions
– Repeats the process until a potential solution converges on the actual solution
• For a GA to work we need to be able to encode a solution into a genetic representation
– One common example is attempting to find a path out of a maze • The problem is "get Agent from point A to B" • We could encode the moves that the Agent needs to perform to exit the maze into a genetic representation
– The genetic representation is commonly called a Chromosome
• A collection of Chromosomes is called a Genome – Agents in a GA contain a Genome, which contains the Chromosomes that the GA uses to determine Fitness

Chromosomes
• In the earlier example of an Agent finding a way out of a maze:
– A move represents a Chromosome • Move left, move right, etc.
– The Genome is the collection of moves that the Agent will take • "Move left, then up, then right, then up, etc."

The GA Loop
• Evaluate the fitness of every member of the population
• Evolve the population into a new generation:
– Use Selection to select members of the population to "breed"
– Each pair of "parents" typically creates 2 "siblings"
– Crossover the parents' genomes in some way, i.e. combine their genes
– Apply potential Mutation to the siblings
– The original population is replaced by the new generation
• Repeat the process until a solution to the problem has been found in the population of potential solutions

Fitness and Selection
• Evaluating a population's fitness depends on the problem you are trying to solve – Each member of the population is assigned a fitness score
• Selection consists of selecting two members of the population to combine via Crossover, so the new generation inherits their traits
– Our goal is to create 2 new siblings based on the genomes of the parents
• Selecting members typically makes use of their Fitness score – Otherwise we have no way of converging towards a "fit" solution
• But, as in life, the "fittest" isn't always combined with another "fittest"
• Roulette-Wheel Selection starts by finding the Total Fitness of the population:
– Simply add up the fitness scores
– A member with a higher score makes up a larger part of the total – You can visualise this as a pie chart
– We then randomly select a value within the Total Fitness score

Crossover
• Once we have selected 2 parents to combine, we perform Crossover
• Two-Point Crossover is one method: we select a span within the chromosomes of parent 1 and swap it with the corresponding chromosomes in parent 2 to create sibling 1, and do the opposite for sibling 2

Mutation
• One problem that can occur is that a population might head in the wrong direction towards a solution
– In the case of the maze solver, it may get closer to the exit but actually head down a dead-end, with the exit on the other side of the wall
– The majority of the population may head down the dead-end because it gives them the best fitness score, and they can't evolve out of that solution
• This problem is called Local Minima
• One way to escape Local Minima is to add Mutation
• When we create siblings, we simply give each of their Chromosomes a tiny chance of mutating – This is the Mutation Rate – If it is too high then the GA falls apart

Advantages
• Can arrive at unknown solutions
• Can optimise a solution by continuous evolution, even after a solution has solved the problem – Fitness scores can take into account not just whether it achieved its goal, but also the cost of achieving it
Disadvantages
• Can take quite a bit of time to arrive at a solution – Not guaranteed to converge on the correct solution
• Can be difficult to map Agent behaviours to chromosomes – And can be difficult to create a suitable Fitness function

Fuzzy Logic
• The problem with the previous definition is that our definition of a heap of sand is "fuzzy"
• We do not have an exact definition of when the heap ceases to be a heap and becomes something else
• Fuzzy Logic is a branch of mathematics which attempts to address these types of problems
• Yes/no, black-and-white values are called Crisp Sets – Crisp Sets have clearly defined boundaries – Real numbers (1, 2, 3, 4) can be called crisp data
• Our goal is to convert the crisp data into fuzzy data – Fuzzy data doesn't have a clear boundary; data can belong to multiple sets
• Converting from crisp to fuzzy data is known as fuzzification – Fuzzification involves finding the degrees of membership (DOM) of the crisp values and placing them into predefined fuzzy sets
• These sets can also have Boolean operators applied to them (AND, NOT, OR):
– AND returns the minimum DOM • Example: the DOM of an IQ of 115 in the set of people who are Average AND Clever results in an output value of 0.25
– OR returns the maximum DOM • Example: the DOM of an IQ of 115 in the set of people who are Average OR Clever results in 0.75
– NOT gives the opposite of a DOM, i.e. NOT M is 1 − M • Example: the DOM of an IQ of 115 in the set of people who are NOT Clever is 1 − 0.75 = 0.25

Defuzzification
• This is the process of converting our fuzzy membership values into a single value we can use to make a decision with
• The method we will use is called Average of Maxima:
– Crisp Value = Σ(maxMembership * confidence) / Σ(confidence)

Markov Chains
• What is a Markov chain? A process that uses probability to control the transition between states in a state space
• At any instant in discrete time (t=1, t=2, t=3, etc.) the probability of going to the next state can only depend on the state we are in at that instant

Newton's Laws
• Newton's First Law: "Every object persists in its state of rest or uniform motion in a straight line unless it is compelled to change that state by forces impressed on it"
• Newton's Second Law: "Force is equal to the change in momentum per change in time. For a constant mass, force equals mass times acceleration"
• Newton's Third Law: "For every action there is an equal and opposite reaction"

Timesteps
• A variable timestep is not appropriate for physics simulations – To ensure replicability, we need to use a fixed timestep for physics
• A fixed physics timestep may cause a problem known as temporal aliasing – Temporal aliasing can be solved by interpolating between the past and present state of a physics object

Kinematics and Dynamics
• Kinematics is the study of movement over time – It doesn't concern itself with what causes the object to start moving in the first place; it just deals with the actual movement itself
• Dynamics is the study of the forces and masses that cause the kinematic qualities to change over time
• The "rigid body" part refers to constraints we place on the objects we're simulating – A rigid body's shape does not change during the simulation

Properties of Particles
• Position – A location in space, in either 2 or 3 dimensions – Stored as a vector – We say that position is the integral of velocity
• Velocity – A vector describing how the position changes over time – We say that velocity is the derivative of position, and the integral of acceleration
• Acceleration – A vector describing how the velocity changes over time – We say that acceleration is the derivative of velocity
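These integral/derivative relationships are what a physics engine evaluates numerically each frame. A minimal Euler step (1D for brevity, illustrative names) accumulates acceleration into velocity, then velocity into position:

```cpp
struct Particle { float pos, vel; };

// One Euler integration step over a fixed timestep dt:
// velocity is the integral of acceleration, position of velocity.
void eulerStep(Particle& p, float accel, float dt) {
    p.vel += accel * dt;
    p.pos += p.vel * dt;
}
```

Calling this once per fixed physics tick is exactly the Euler Integration scheme discussed below, with the stated drawback that changing dt changes the trajectory.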
Properties of Particles
• Mass – The amount of matter in the object – It is the measure of an object's resistance to acceleration – Not the same as weight (although often determined by measuring the object's weight) • An object on the moon would weigh less than on Earth, but would still have the same mass
• Momentum – The mass of the particle multiplied by its velocity: p = m * v (p = momentum, m = mass, v = velocity)
• Force – Measured as the change in momentum over time: F = m * a, or a = F / m (F = force, m = mass, a = acceleration)

Point Masses
• We treat rigid bodies as single point masses (particles)
• For a given force, we find the acceleration of the point by dividing the force by the mass
• This gives us the acceleration to use in our integration, and we can calculate the object's movement

Kinematic Formulas
• A set of formulas that relate five kinematic variables:
– Δx: Displacement • A change in an object's position
– t: Time interval • The interval of time taken for the object to be displaced • Sometimes written as Δt to more clearly indicate a time interval
– v0: Initial velocity
– v: Final velocity
– a: Constant acceleration
• If we know three of these variables for an object under constant acceleration, we can use a kinematic formula to solve for one of the unknown variables:
– 1st: Calculate the final velocity given the starting velocity, acceleration and time interval: v = v0 + a*t
– 2nd: Calculate how far an object moved using the initial and final velocity, and the time interval: Δx = ((v0 + v) / 2) * t
– 3rd: Find how far an object moved when the final velocity is unknown (but acceleration is known): Δx = v0*t + (1/2)*a*t²
– 4th: Find the final velocity given initial velocity, acceleration and distance travelled: v² = v0² + 2*a*Δx

Adding Gravity
• Δx = v0*t + (1/2)*a*t²
• Final x = x0 + v0x*t
• Final y = y0 + v0y*t + (1/2)*g*t²
– x: final x position, x0: initial x position, y: final y position, y0: initial y position, v0x: initial x velocity, v0y: initial y velocity, g: gravity, t: time

Numerical Integration
• In games, we use numerical integration to move our physics objects a little each frame
• The numerical integration method we used is called Euler Integration, and it has some disadvantages:
– Varying the timestep has a significant effect on the results
– To work out where our object is after N frames we have to advance through all those frames in sequence
– There is no way to calculate values such as the launch angle for a specific distance, etc., which are useful for AI

Collision Primitives
• We could use the object's mesh for collision, but this is often too computationally intensive
– For example, consider a soccer game where the artist has accurately modelled the ball, complete with stitched segments and irregularities – Whilst this is great for rendering, it's very expensive for collision detection
– The solution is to use Collision Primitives

Sphere–Plane Collision
• Now that we have planes as well as spheres, we need to handle collision between them, which is calculated as:
– d1 = (C · n) − d − r
• Where:
– d1 is the distance of the sphere surface to the plane surface
– C is the centre of the sphere
– n is the normal to the plane
– d is the distance of the plane from the origin
– r is the radius of the sphere
• If d1 is negative it means the sphere has collided with the plane

Collision Normals
• Relative velocity is given by subtracting the velocity of object B from that of object A
• The collision normal may be supplied by the collision detection algorithm
• Choosing a normal can be tricky – In this case, we can use the normalized difference in position (posA − posB)
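The sphere–plane test described above, d1 = (C · n) − d − r, translates directly into code (illustrative names):

```cpp
struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Signed distance from the sphere's surface to the plane's surface.
// A negative result means the sphere has collided with the plane.
float spherePlaneDistance(Vec3 centre, float radius,
                          Vec3 planeNormal, float planeD) {
    return dot(centre, planeNormal) - planeD - radius;
}
```

For a ground plane with normal (0,1,0) and d = 0, a unit sphere at height 2 gives d1 = 1 (no contact), while at height 0.5 it gives d1 = −0.5 (collision).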
Resolving a Collision: Coefficient of Restitution
• The coefficient of restitution models the complicated compression and restitution of impacting bodies with a single scalar – i.e., it tells us how 'bouncy' something is – Denoted using an e
• It modifies the relative velocity to account for energy absorbed by the objects during collision:
– v'rel · n = −e * (vrel · n)
– vrel = relative velocity before collision, v'rel = relative velocity after collision, e = coefficient of restitution, n = collision normal
• The new velocity of a body is then found by applying the impulse j along the collision normal:
– v'A = vA + (j / mA) * n
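A sketch of this response in one dimension along the collision normal, using the standard impulse magnitude for two point masses, j = −(1 + e) * vrel / (1/mA + 1/mB) (names are illustrative):

```cpp
struct Pair { float vA, vB; };

// Resolves a head-on collision between two point masses.
// e = 1 is perfectly elastic, e = 0 is perfectly inelastic.
Pair resolve1D(float vA, float vB, float mA, float mB, float e) {
    float vrel = vA - vB; // relative velocity before impact
    float j = -(1.0f + e) * vrel / (1.0f / mA + 1.0f / mB);
    return { vA + j / mA,   // impulse applied to A
             vB - j / mB }; // equal and opposite impulse on B
}
```

For equal masses and e = 1, the bodies exchange velocities, the familiar elastic-collision result; with e = 0 they move together after impact.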

Angular Acceleration
• When we calculate angular acceleration we use this formula: α = τ / I
– α is the angular acceleration
– τ is the torque exerted on the body
– I is the moment of inertia

Rigid Body Types
• In physics engines we have three types of Rigid Bodies: Static, Dynamic, Kinematic

Types of Joints
• Ball-and-socket joint • Hinge joint • Slider joint (prismatic joint) • Universal joint • Conical joint

Ball-and-Socket Joints
• Constrains two objects via a shared point – The shared point maintains a fixed relative position between the two objects
• Think of one object having a ball, and the other having a socket – The constraint attempts to force the two to be coincident

Hinge Joint
• A ball-and-socket joint with an additional constraint that two axes (one connected to the ball, the other to the socket) must be collinear
• The two objects can rotate around the shared axis like a hinge – Additional limits may be set on the hinge rotation angle

Slider Joint
• Two axes are fixed relative to the body space of the two objects

Universal Joint
• Similar to a ball-and-socket joint
• Allows for limits to be specified on the angles of rotation – Angular limits are Euler angles – This results in a rectangular range of motion
• Limits on the twisting ("bank") can also be set

Conical Joint
• Similar to a universal joint
• The rotation limits are conical rather than rectangular

Springs
• A spring-damper system can be modelled as follows: F = −k*x − b*v
– F = force exerted by the spring
– k = spring constant (stiffness of the spring)
– x = displacement of the end of the spring from its rest position
– b = coefficient of damping • Higher values will cause the spring to come to rest more quickly
– v = relative velocity of the two points connected by the spring

Dynamic Rigid Bodies
• Dynamic Rigid Bodies are objects that are fully affected by simulated forces, such as gravity, and that react to collisions with other objects
– Hitting a wall would stop the body, or cause a bounce, and the wall, being static, would not move
– Collisions between dynamic bodies cause them both to simulate the collision response
– This is great for simulating real-world physics, but can lead to frustration for the player if they try to use a dynamic rigid body to control their avatar

Kinematic Rigid Bodies
• Kinematic Rigid Bodies have similarities to both Static and Dynamic bodies
– Unaffected by dynamic forces, but can move through the world
– Will exert a force upon other bodies if a collision occurs when they are moved
• You can visualise a kinematic object as a dynamic rigid body with infinite mass – As in, it is not affected by other bodies

Kinematic Rigid Bodies as Players?
• Usually a Dynamic Body is unsuitable to represent the player, as items contacting it can affect it
• Because Kinematic Bodies can be moved with user-defined forces, people typically assume they are a good choice for a player
• However, because Kinematic Bodies are unaffected by other bodies, they are a horrible choice!
– Moving a Kinematic player against a Static Body wall will cause the player to walk right through it!
– If the player stands on a moving platform we want the player to be moved • This doesn't happen with a kinematic body!
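The spring-damper model above, F = −k*x − b*v, is a one-liner in code (1D, illustrative names): the spring term pulls back toward rest, and the damping term opposes relative motion.

```cpp
// Spring-damper force: the -k*x term restores toward the rest length,
// the -b*v term bleeds off energy so the spring settles.
float springDamperForce(float k, float x, float b, float v) {
    return -k * x - b * v;
}
```

This force would then be fed into the same integration step used for any other particle force.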
Fluid Dynamics
• The two common ways of simulating fluid dynamics are Grids and Particles
• In the grid method, the space is divided up into cells, each with a velocity and density – Can also be a 3D grid
• Object movement in the field is simulated by applying a force to the object based on the nearest vector in the grid

Implementation – Grids
• To implement fluid dynamics with the grid method, the grid must contain information at each point:
– Velocity
– Density of fluid
– Fluid-specific info • Temperature, colour, humidity
• The key terms that need to be defined are: Advection, Diffusion, Incompressible, Mass Conserving
• Advection is the movement of quantities from one location to another based upon the velocities
• Mass Conserving means that no mass is lost or gained during advection – When advecting, a quantity must be subtracted • Pressure in a volume chan
• Diffusion is a force that causes particles to move from areas of high density to areas of low density
• Incompressible means that the fluid cannot be placed into a smaller volume – Water is incompressible – Smoke is compressible

Smoothed-Particle Hydrodynamics
• Smoothed-particle Hydrodynamics (SPH) is a common method of simulating fluids with particles
• Each particle has similar properties to the grid method: Density, Mass, Velocity, Temperature, etc.
• Each particle isn't constrained to a grid
• Mass is conserved, as each particle takes its mass with it as it moves
• A particle determines its advection / diffusion by using a kernel to sample nearby particles – Common kernels are Gaussian and Cubic Spline
• With the samples from the kernel it calculates the new properties of the particle and moves it
• Particles can be sorted into buckets for fast access, as sampling thousands of particles can be a slow process – Games have started using GPGPU to process SPH fluids!
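The diffusion idea from the grid method can be sketched in 1D (the rate constant and grid values are illustrative, and boundary cells are left untouched for brevity): each interior cell moves toward the average of its neighbours, so density flows from high to low.

```cpp
#include <cstddef>
#include <vector>

// One explicit diffusion step on a 1D density grid. Each interior
// cell moves toward the average of its two neighbours at rate k,
// which is how high density spreads into low-density regions.
std::vector<float> diffuse(const std::vector<float>& d, float k) {
    std::vector<float> out = d;
    for (std::size_t i = 1; i + 1 < d.size(); ++i)
        out[i] = d[i] + k * ((d[i - 1] + d[i + 1]) * 0.5f - d[i]);
    return out;
}
```

A full grid solver would apply the same idea per cell in 2D or 3D, alongside advection and a pressure solve.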
Visualisation • There are different methods for visualising fluids depending on the method used to simulate them: – One approach would be to render a simple quad particle at each point in a grid, or for each particle in SPH, and adjust its colour based on fluid density at that point – This method works great for simple simulations • •But other methods include… 2D Visualisation • In order to visualise fluids in 2D we can simply manipulate the texel colour within a texture – Easily simulates the grid method – For SPH we divide the space up into a 2D grid and calculate the density at each texel based on the particles within a cell – Colour is based on the particle properties, like temperature and density 3D Visualisation • In order to visualise fluids in 3D we divide the space up into a 3D grid – 3D grid-method is already divided – For SPH we need to divide the space up and calculate the density at each cell based on the particles within each cell 3D Visualisation • Particle Billboards – A simple technique using particle billboards at each cell, coloured based on fluid properties such as temperature, and transparency based off density: Ray-Traced Fluids • Ray-Tracing is the process of casting a ray from each pixel on the screen into a scene: – The ray intersects the scene and the pixel colour is chosen based on these intersections • This is a fairly slow process, but there are modifications that can speed it up for fluids: – Ray-Marching is a process of tracing with discrete sized rays for a number of iterations – If the fluid is represented as a volume, or a “Distance Field”, then we don’t have to do intensive ray intersections with a 3D scene but can instead test against known samples – The method can be complex to understand, but the results speak for themselves… Marching Cubes • The Marching Cubes algorithm is a technique of wrapping polygons around a volume to create a polygon mesh: – Iterating through a volume it samples 8 points for a cube’s corners, and based 
off which points are inside and outside the volume, builds triangles separating the “inside” and “outside” cells of the grid (there are 15 different combinations) – The same technique can be used in 2D – If the volume doesn’t have discrete “solid” and “non-solid” cells, but instead has density, you can treat cells with a density below a certain threshold as non-solid

Marching Cubes • The resolution of our volume dictates how many triangles are used to create the polygon mesh

Marching Cubes • Many games, engines, and tools are already using Marching Cubes, and not just for fluid…

Summary • Fluid can add interesting dynamics

3 types of spring connections need to be used: – Structural Springs: connect to the 4 adjacent non-diagonal particles to maintain the “sheet” shape – Shear Springs: connect to the 4 adjacent diagonal particles to prevent “shearing” – Bend Springs: connect to particles 2 across, and prevent the shape from folding over itself

Continuous Collision Detection • The problem with discrete collision is that our timestep does not represent just a single point in time, but instead the entire duration of the frame • In order to avoid tunnelling, we need some way to represent the changing position of our objects over the entire timestep, not just the start and end

Buffer Objects • The inputs and outputs of a typical Render Pipeline are called Buffer Objects • There are different types of buffers; usually they are simply arrays of data – Vertex Buffer Object, containing the points that make up geometry for a mesh – Index Buffer Object, containing topology information for the geometry, i.e.
triangles – Various image buffers, usually containing image pixel data, which can include data representing the screen being displayed or textures mapped onto geometry • Buffers can be used as input to various stages – Vertex and Index buffers are used as input to the pipeline – Image buffers can be accessed at different points through the pipeline • Some stages in a render pipeline may also output buffers – For example, the final output of the pipeline usually goes into an image buffer

The Vertex Shader Stage • Once the buffers have been set up and sent to the GPU, the first stage they usually enter when we make a render call is the Vertex Shader stage • This is a programmable stage that processes each vertex in a vertex buffer individually, separate from the others – Multiple vertices are processed at once through simple functions • Typically the task of the Vertex Shader is to transform vertices from object space to screen space for later stages in the pipeline

Primitive / Patch Assembly Stage • The next stage is a fixed stage, and it is responsible for arranging the vertex data into primitives for the later stages – Makes use of vertex and index buffers • This stage is usually controlled simply by specifying the primitive type when rendering – Triangles, lines, points, patches • Outputs to either an optional Tessellation stage or to the Geometry Shader stage

Tessellation Stages • These are an optional set of three stages – Tessellation Control Stage (Programmable) – Tessellation Primitive Generation Stage (Fixed) – Tessellation Evaluation Stage (Programmable) • The tessellator takes in patch primitives…

Geometry Shader Stage • An optional programmable stage – If not set then it passes primitives straight to the Rasterisation stage • Receives all the information needed for a primitive – For example, 3 vertices in the case of a triangle, 2 for a line • Can perform any last-minute processing on the primitive

Rasterisation & Interpolation Stage • This fixed stage receives
vector-based primitives and plots out the pixels covered by the shape • It then interpolates the data from each vertex across the pixels covered, passing the interpolated data at each pixel to the next stage in the pipeline

Fragment Shader Stage • This is the final programmable stage in a typical Render Pipeline, but not the last stage • Also called the Pixel Shader, as it receives the interpolated pixel data from the rasteriser • Typically its job is to output a desired colour at the specified pixel, based on the input data, and by sampling other buffers usually containing texture information – Can output more than one pixel to more than one output buffer! • However the data returned is not necessarily what appears on screen!

Raster Operations Stage • The final stage in the pipeline is a fixed stage • Receives data from the Fragment Shader • Using flags, it blends the pixel with any existing pixel at that location within the output buffer

World Matrix × Local Position = World Position
View Matrix × World Position = View Position

The GPU has no concept of a camera – There is no object moving around in virtual space that displays your scene for you • The GPU will display any geometry that is positioned within Clip Space – Typically an area in the range [-1,1] in all axes, centred at (0,0,0) – Any geometry not in this space won’t appear

We multiply the Projection, View and Model (aka Global or World) matrices together – This matrix becomes our Projection View Model transform

PVM = Projection × View × Model
Clip Space Position = PVM × Local Space Position
N = Clip Space Position / Clip Space Position.w
V = Viewport(width, height, 1)
Screen Position = (N × 0.5 + 0.5) × V

A quaternion has the mathematical form q = w + xi + yj + zk, where i² = j² = k² = ijk = −1 – i, j, and k are the imaginary dimensions – w, x, y and z in our case all relate to the rotations around those imaginary dimensions – We only have to deal with the scalar w and the vector [ x y z ] when we use quaternions in computer graphics

A Scene refers to an environment such as a game level

Frustum Culling is the term given to determining if an item is within view by testing it against the camera’s frustum – The frustum can be thought of as a pyramid shape with the top cut off – The edges of the frustum are treated as 3-D planes with normals facing inwards

Bounding Volumes are shapes that encase all the elements that they bound – For example, a Sphere large enough to contain all the vertices in a mesh

Scene Partitioning is the term given to dividing up a scene in some manner that allows us to easily cull swathes of data from even needing to be tested against a frustum

Scene Graph is the term for structuring a scene within some form of graph structure

Bounding Volume Hierarchies are tree graph structures where each node has some form of bounding volume shape – Sphere, Axis-Aligned Bounding Box, Orientated Bounding Box • Each node typically has two volumes – Local, which contains just the data that makes up the item – Global, which contains its Local bounds in addition to the Global bounds of all of its child nodes

Quad-Trees (2-D) and Oct-Trees (3-D) are graphs that divide the scene into quarters (or eighths in 3-D) • The scene is sub-divided and items are
placed within the nodes that physically contain them – Nodes can remain empty if no item is located in the space the node represents – A node containing multiple items can be sub-divided further • Scenes attempt to place a single item within a single node, but in some cases a large item must appear in multiple nodes as it is too big for just one

Binary Space Partitioning • A BSP tree is a graph where the branches represent a plane that divides the scene in two

Portal Rendering is another form of partitioning • In Portal rendering a scene is divided up into Cells and Portals – A Cell could be thought of as a room – A Portal could be thought of as a door into other rooms – Commonly a Cyclic Graph

All geometry on the GPU is represented by a buffer of vertices – A Vertex is a structure that defines all the information at a point in a primitive • Vertices can contain lots of information, but most importantly they usually hold the position of the point in 2D or 3D space • Vertices are used to represent the shape of a primitive

Primitives dictate the shape of the geometry represented by the vertex buffer • Points – Each vertex represents an individual point • Lines – Each pair of vertices represents a single line • Line Strip – The first two vertices represent a line, then every additional vertex uses the last to represent another line • Triangles (the most common in 3D meshes) – Every three vertices represent a different triangle • Triangle Strip – The first three vertices represent a triangle, then every additional vertex uses the last two to represent another triangle

Complex geometry, such as a character mesh, can contain thousands of triangles – Using vertices alone we would need triple the amount of vertices as there are triangles – Many of the vertices would represent the exact same data!
– Instead we can make use of an Index Buffer to help reduce the vertex data

Fragment Shader Stage • This is the final programmable stage • Also called the Pixel Shader, as it typically is used to output a pixel colour to the screen • Fragment Shaders are written much like Vertex Shaders • Their input comes from the Rasteriser Stage and they have access to… – Any data that was output from the last used programmable stage – Built-in math functions – Variables / Flow Control / Functions – Texture and Image Buffers bound to the device • Their output is usually one or more pixel colours

One of the most common uses of a Fragment Shader is to sample Texture buffers • A texture is simply a big image buffer of data bound to the GPU – 1-dimensional, 2-dimensional, and even 3-dimensional buffers are available!
– Data contained in a texture isn’t called a Pixel, it is called a Texel (Texture Element or Texture Pixel) – Hardware traditionally requires textures to have power-of-2 dimensions, i.e. 512×512, 1024×2048 • Sampling a texture follows a few different rules

Applying a texture to geometry is called texture mapping • To be able to map a texture to geometry we need to know which parts of the texture apply to which parts of the mesh – To do this we need texture coordinates • Texture coordinates are in the range 0.0 to 1.0 – The U represents left to right – The V represents bottom to top – The W represents depth, for 3-dimensional textures

Texture Addressing refers to how textures are sampled outside the standard [0,1] range – They are set in the application when creating a texture • There are a few options – Wrap – Mirror – Edge Clamp – Border Clamp

The rasterised image has to go somewhere • The pipeline outputs the rendered pixels into a buffer called a Frame Buffer – A type of Image Buffer, just like a Texture • The GPU uses a frame buffer to store the pixels that get displayed to the screen – Commonly called a Back Buffer • At the end of each frame this buffer is presented to the screen for the user to see • Doing this means we can render to the Back Buffer while the previous Back Buffer is being displayed to the user – When using an additional buffer this is called Double Buffering

When we render a scene directly into the Back Buffer this is called Forward Rendering • We aren’t just limited to rendering to the Back Buffer though • Rendering into any frame buffer is possible – These off-screen buffers are commonly referred to as Render Targets – We can even render to more than one at a time, referred to as Multiple Render Targets • These buffers could then be used as input textures when rendering to any other buffer – This can give rise to many MANY new visual techniques!
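As a small illustration of why rendering into an off-screen buffer is so useful, here is a CPU-side sketch in Python of one such technique: sampling a rendered buffer's texels and their neighbours with a 3×3 weighted kernel, the basis of blur and edge-detection style post effects. The image, kernel, and edge-clamp handling below are illustrative assumptions, not any particular engine's API.

```python
def apply_kernel(image, kernel):
    """Apply a 3x3 kernel filter to a 2D grayscale 'render target'.

    Each output texel is the weighted sum of the 3x3 neighbourhood
    around the matching input texel. Samples that fall outside the
    buffer are clamped to the edge, mirroring the 'Edge Clamp'
    texture addressing mode.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for ky in (-1, 0, 1):
                for kx in (-1, 0, 1):
                    # Clamp the sample coordinates to the buffer bounds
                    sy = min(max(y + ky, 0), h - 1)
                    sx = min(max(x + kx, 0), w - 1)
                    total += image[sy][sx] * kernel[ky + 1][kx + 1]
            out[y][x] = total
    return out

# A box-blur kernel: every weight is 1/9, so each output texel is the
# average of its 3x3 neighbourhood.
box_blur = [[1 / 9] * 3 for _ in range(3)]

# A tiny 3x3 'frame buffer' with a single bright texel in the centre.
frame = [
    [0.0, 0.0, 0.0],
    [0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0],
]

blurred = apply_kernel(frame, box_blur)
```

On the GPU the same idea runs in a Fragment Shader, with the previously rendered frame bound as a texture; changing the kernel weights (for example to a Sobel kernel) turns the same loop into edge detection.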
Currently when we render directly to the Back Buffer we don’t have access to any of the pixel fragments around the one currently being rendered • If instead we were to render the entire scene to a frame buffer, then render using that buffer as a texture stretched across the screen… – During the Fragment Shader we can sample the buffer at the matching pixel/texel location and see the original render – But we can also sample the pixels/texels around it! – If pixels have access to neighbouring pixels it can give rise to a great range of effects, such as Edge Detection • We can also render from other angles or locations – For example, render from the view of a security camera then apply the frame buffer as a texture across a computer screen in front of the player

The most common use of Render Targets is Post-Processing • Post-Processing usually consists of drawing the entire scene to an off-screen Frame Buffer – Once complete, a final render call is made, using the Back Buffer as the Render Target, drawing a single full-screen quad with the Render Target applied to it as a texture

When sampling an image we usually sample a single texel • We can of course sample more than one, and with Post-Processing we have access to all the “visible” pixels that were rendered to a Render Target • One way to sample multiple texels is with a kernel filter – A kernel filter works by using a sampling matrix – The sampling matrix has weighting values that are multiplied by sampled texels around the current texel, and the current pixel is the sum of those weighted texels

High Dynamic Range (HDR) represents lighting values outside the range usually displayed, but which become visible when the exposure is adjusted

The Problem of Forward Rendering • Forward Rendering is when we draw a mesh to the back-buffer, then the next mesh, and the next, each separate from the other • This means our complex lighting calculations and shaders get wasted on pixels that are later overdrawn • We also only have access to the pixel currently being rendered

A popular
method of rendering that exists in almost all major PC/Console engines is Deferred Rendering – Also called Deferred Shading • A scene is rendered to off-screen buffers without applying lighting/shading, then the lighting is applied as a Post-Processing step – This reduces the lighting calculations by not having to perform them for pixels that would get written over with overdraw

This stage is done as a forward pass – Each mesh is rendered into off-screen buffers rather than the Back Buffer, called the G-Buffers

The next pass in Deferred Rendering is the Light Pass – We render into a new render target, the Light Buffer – We use 2 of the render targets from the previous pass as textures; the Position Buffer and Normal Buffer • We then render all lights as geometry, with additive blending enabled • Within the Fragment Shader for the light geometry we calculate the light for the pixel • We don’t use the surface colour in the calculation, only the light colour • We then output the colour with additive blending • We end up with a buffer that contains all light for each pixel • Once all lights have been rendered we can move on to the Composite Pass

The main benefit of Deferred Rendering is the reduction in lighting complexity and ability to have many lights – To be able to light a mesh with many lights in Forward Rendering we would have to render the mesh multiple times, or have shaders available for all the different light counts – O(meshCount × lightCount) • With Deferred we can now have many lights as they simply render as geometry – Each mesh is rendered, then each light – O(meshCount + lightCount) • This means we can have environments with hundreds or thousands of lights!

Transparency is extremely hard to include!
– Typically transparent objects are rendered afterwards in a forward rendering manner, back to front – Modern techniques allow forms of “layers” or pixel linked-lists in deferred rendering for transparent objects, but these can be slow and require high-end hardware • Memory Footprint Issues – Not all systems like Multiple Render Targets – High Definition displays require large G-Buffers • Around 8MB for a single buffer using 4 bytes per pixel, closer to 32MB for a vec4-based buffer! • Video card memory can disappear fast • All meshes use the same material!

Animation data consists of a sequence of pre-drawn (or rendered) images called Key Frames • Playing an animation is simply a matter of playing the correct frames in sequence at the correct frame rate

It is good to now mention a rather simple shader technique that can be used to achieve great results – Animated texture coordinates, commonly called UV Animation • UV Animation is when texture coordinates are animated over time

Morph Targets – Often used for facial expressions – Similar to traditional animation with a series of key frames, except that we can interpolate between key frames rather than use one specific frame – Very precise control of the model, but data can be very large • Skinning – The system used for most modern game animation – Model is rigged to a skeleton of bones – The skeleton is animated with key frames and the mesh wraps around the skeleton based off weighted values – Less precise control of the model but data is much smaller

Morphing aka Vertex Animation • Morphing is the process of animating vertices by interpolating each vertex from one key frame to another vertex in another key frame • Each key frame contains an entire set of every vertex within the mesh at their position for that key frame – Can be memory intensive for a model like a character with many vertices and many animations • This form of animation was used in games like Quake, before Skinning was in common use • Morphing is useful in situations where a
skeleton would be complicated to achieve fine detailed animation • Can be combined with skinning to add an extra level of animation detail • Morphing can also be used on items such as cloth and flags – Flags can be given a far more detailed animation that has a greater level of realism than skeletal animation – Soft-body physics is increasingly used for these types of applications

Morphing requires a duplicate of the vertex data at each key frame – Positions, texture coordinates, normals, etc. can all be animated – Only elements which change are needed – We find the two key frames that fall on either side of our current time within the animation – Interpolate the vertex data between the key frames using either the CPU or GPU (preferred) • To process morphing on the GPU you will need to send across multiple vertex buffers

When rendering an object we aren’t limited to a single vertex buffer – Meshes can be composed of multiple vertex buffer streams – Each stream can have more or less data, whatever is relevant • We send the streams for the two current key frames then interpolate using a time interval between them • The Vertex program uses linear interpolation on each pair of vertices

Skeletal Animation is when we animate the vertices of a model based on weighted transforms – These transforms are commonly called Bones • These bones are placed throughout a model and the vertices around them are influenced by any movement made by them • The vertices that make up a mesh are known as the Skin • Key frames contain movement of the bones, not the vertices, so less information needs to be stored – Usually fewer than 100 bones, compared to thousands of vertices

Skeletons • A skeleton is a collection of transforms called bones • Typically these bones are arranged in a hierarchy – If one bone moves then the bones attached to it also move, but don’t require their own key frames to represent the movement • Not all bones need to have key frames for particular animations – An animation of
a head nodding doesn’t require the legs to have key frames – This results in far less memory being required for animations

Bones • Bone transforms are concatenated with their parent bone to calculate their actual position – For example, the nod animation would rotate the neck bone but not the head bone, but concatenating the matrices would move the head accordingly • Transforms are typically a Matrix – A 4×4 Matrix storing rotation, scale and translation all in one – But as animation is stored as key frames we need to be able to interpolate between key frames, and with a Matrix we can’t properly interpolate • Key frame transforms are usually stored with separate properties – Position, Rotation and Scale are each separate • Position and Scale can both be represented by 3 floats each – X, Y and Z, and X-Scale, Y-Scale and Z-Scale • Rotation is stored as a Quaternion so that it can be interpolated properly

Bones • Bone Key Frames can then …

Skinning • Now that we have updated the skeleton bones and have their transforms we still need to skin the skeleton • All Vertices have 2 additional properties – Bone Indices to specify which bones influence the vertex’s movement – Blend Weights to specify how much each bone influences it as a percentage • It is common for a Vertex to be influenced by 2-4 bones – The combined Blend Weights are typically values between 0.0 and 1.0, and must total 1.0 – A Vertex might be heavily influenced by two bones (both with weights of 0.4) but slightly influenced by two more (both at 0.1)

Skinning • Before transforming a Vertex by its mesh’s World Transform and the View / Projection Transforms we first transform it by the bones influencing it – We can do this on the GPU by sending an array of matrices representing the bones as a uniform array of mat4 and by giving each vertex in the vertex buffer indices and weights

Bind Pose • There is just one step that we must do before we send the bones to the GPU • We need the bones
that we send to the GPU to be offsets from their initial starting location – This initial set of transforms is called the Bind Pose • If they weren’t offsets then the skin would be transformed incorrectly, for some rather hilarious yet extremely broken results

Bind Pose • To calculate the correct bone offsets we need the inverse transform of the Bind Pose – The Inverse being the reverse of the transforms – The Inverse multiplied with the Bind Pose results in the Identity matrix, but the Inverse multiplied against a bone that has moved results in just the offset movement from the Bind Pose • We can use this inverse in two ways – Multiply the current bone transform by the Inverse Bind Pose to calculate the offset – Multiply the Vertex against the Inverse Bind Pose bones that affect it, based off the Blend Weights • The first way is often preferred as the mesh is still able to be correctly rendered without animation

Advantages of Skinning over Morphing • The same animation data can be played on multiple meshes • Animation data can be acquired using motion capture then played on a range of models • Animation data can be blended in different ways – In series: used for smooth transitions e.g.
run to walk – In parallel: used for playing multiple animations simultaneously, such as a run animation on the lower body and a combat animation on the upper

Animation Blending • A bonus of a skeleton hierarchy is that we can set up complex animation systems that allow us to blend animations together, either completely or in segments: – Waist-down from the “Run” animation combined with the waist-up from the “Shoot” animation can give us a “Running-Shoot” animation without an artist having to make a 3rd animation • We can also blend our own transforms on top of a bone: – Procedurally make a walking character look at an item by rotating its “head” bone to face the item

Animation Blending • We can also use skeletons on other models created with a different skeleton: – Bones are just arranged in an array, so as long as a skeleton has enough bones for the vertex bone indices it is usable – We could use a single set of character animations for all character models in a fighting game for example – We could also use a “dog” skeleton on a “horse” model for humorous effects

Each particle also has information about how it changes over time: – Velocity – Drag – Change in Rotation – Change in Transparency – Change in Size – Change in Colour – Lifespan

Emitter • Particles are spawned from an Emitter • Emitters can be any shape, with particles spawning from within and / or on it – Point, cone, box, sphere, line, and mesh are common emitters • Usually specifies starting position and velocity of particles • Particles spawn from the emitter at specified rates, with a containing system maintaining the collection of alive particles

A parent system updates the collection of particles each frame • Particle lifetime is reduced, and particles that die are no longer rendered • Emitters typically have a spawn rate and spawn new particles each frame as needed • Different emitters can be combined, each spawning particles with different properties, to create complex particle effects – Such as a
smoke emitter + flame emitter + shockwave emitter

In order to draw a particle, a quad of two triangles (i.e. 4 points) that always faces the camera is drawn – We render all of the visible particles • Making them face the camera is called Billboarding • For most effects we also enable alpha blending • We also typically disable writing to the depth buffer when rendering particles – We still use depth testing however – This is because we don’t know the order that our particles will render in, and a closer particle might render before a further away particle and obscure it

We can also lock an axis of the sprite, for example, locking an object’s up axis: – Create a temporary “forward” vector from the object to the camera – The right axis of the transform is the result of a cross product between the locked up vector and the temporary “forward” axis – The forward axis is the result of a cross product between the right axis and the locked up axis – We leave the up axis as the locked “up” axis

Once orientated, the particles usually need to be rendered with Alpha Blending enabled and Depth Write disabled while still using Depth Testing – Alpha Blending is the term for blending colours based off alpha channel values – Depth Write is the term for writing a pixel’s depth to the depth buffer, which can be optionally disabled – Depth Testing is the term for testing a pixel’s depth to determine if it is occluded by a closer pixel • Usually any pixel further away than the current pixel at the same location is culled • This can be disabled, and different comparison methods can be used

Alpha • Transparency is usually dictated by an image’s Alpha Channel • There are two types of Alpha – Alpha Test: boolean visible or not-visible based on an Alpha comparison – Alpha Blend: interpolated blending of the original colour and new colour, based off the Alpha value

Alpha Test • Pixels can either be fully transparent or opaque (as in, there is no semi-transparency) – Also called 1-bit Alpha • A fast
technique usually used to define hard edges at a pixel level – Common for grass and leaves in games – 8-bit style 2D games

Alpha Testing • The Alpha Test consists of two parts – The Alpha Test Value – The Alpha Test Function • The Alpha Test Function is a simple comparison function for determining if a pixel is culled or not – It has the following values: EQUAL, NOT_EQUAL, GREATER, LESS, GREATER_EQUAL, LESS_EQUAL, NEVER, ALWAYS • The Alpha Test Value is simply a number to compare the pixel’s Alpha value to – i.e. a pixel with an Alpha value of 0.7 compared against an Alpha Test Value of 0.5 with an Alpha Test Function of GREATER_EQUAL would pass, and thus be visible

Alpha Blending Options • When you wish to interpolate pixel colours based on transparency, which is common with particles, then you need to enable Alpha Blending • Alpha Blending specifies how to combine a new pixel colour, the Source, with one that already exists in the pixel it is rendering into, the Destination – Occurs after the Fragment Shader • There are two parts to Alpha Blending – Blend Equation – Blend Function

Alpha Blending Options • The Blend Equation specifies how to combine the two pixels – Options are Add, Subtract, Reverse Subtract, Min and Max – The default is Add • The Blend Function is mathematical parameters applied to the Source and Destination first – There are many parameters, such as Zero, One, Source Alpha, One Minus Source Alpha, and more • The blend usually works as:

Equation(Source Colour × Source Parameter, Dest Colour × Dest Parameter)

An optional programmable stage • Executed for each primitive in a mesh • Receives the vertex data for all vertices used in the primitive – 1 for a point, 2 for a line, 3 for a triangle • Can output more or fewer primitives than it receives – Can even change the type of the primitives! – And can output 0 primitives! – Does not automatically output the received primitive

Geometry Shader Input and Output • Despite there being multiple primitive types when rendering, the Geometry Shader only has 3 – Points – Lines – Triangles • It can also access adjacency information for primitives near the one being processed – Lines with adjacency and Triangles with adjacency • It also has a limited set of output primitives – Points – Line Strips – Triangle Strips • The Geometry Stage has one other advanced feature; it can output the processed vertices back to the application, without sending them to the Rasteriser to be drawn, through a method called Transform Feedback

Writing Geometry Shaders is similar to the other shader stages, except that it requires a bit of extra work – You must define the input primitive layout and the output primitive layout – You must also define the maximum number of output vertices:
layout(triangles) in;
layout(triangle_strip, max_vertices = 3) out;

Writing Geometry Shaders • Here is an example of a simple Vertex Shader and Geometry Shader – The Geometry Shader is receiving triangle primitives and outputting triangle strip primitives with a max of 3 vertices, so just 1 triangle • During the shader you must notify when a vertex has been fully defined – EmitVertex() notifies the shader that elements of the current vertex have been set

Shadows can be simple alpha “blobs” drawn under the character – Ray trace down from the centre of the character – Note the first solid object the ray hits – Calculate the intersection point – Draw a quad, textured with the shadow, at the intersection point • Such shadows don’t add much to realism but help a lot with depth
perception

Advantages: – Fast to implement and calculate at runtime • Disadvantages: – Shadow shape doesn’t match the object – Shadow doesn’t map to the object it’s cast onto

Shadow Volumes are areas occluded from light by an object – The shadow then appears on the surface of a receiver, but the area between the occluder and receiver is the volume in shadow • Commonly implemented using Stencil Buffers that identify areas where the shadow volume intersects receiving geometry

Any object that casts a shadow is called an Occluder • Any object that receives a shadow is called a Receiver • To determine what areas are in shadow we need to determine the silhouette of the occluders, extruded and projected onto the shadow receivers – This is the Shadow Volume

Creating the Shadow Volume can be difficult • Determining the silhouette involves: 1. Determine if a triangle is lit 2. If the triangle is lit, check the triangles that share its edges 3. If a triangle with a shared edge is lit and its neighbour isn’t, then that edge is part of the silhouette

Once we have created a Shadow Volume we need to create a Stencil Buffer • A stencil buffer is a type of render target – Acts as a “mask” rather than colour • In its simplest form a stencil buffer is used to specify areas that are in shadow or aren’t in shadow – 0 if it is not in shadow – 1+ if it is in shadow

Shadow Volume Process • First we need a depth buffer containing the receivers – i.e.
Shadow Volume Process
• First we need a depth buffer containing the receivers – i.e. render all receivers so that we have a depth buffer
• Next we render the front faces of the shadow volumes to the stencil buffer, disabling writing to the depth buffer
– Any faces NOT culled by the depth buffer add 1 to the stencil buffer at that pixel
• Next we render the back faces of the shadow volumes to the stencil buffer, still disabling writing to the depth buffer
– Any faces NOT culled by the depth buffer subtract 1 from the stencil buffer at that pixel
• What remains is a stencil buffer with 0s written wherever there is no shadow

Shadow Volume Advantages and Disadvantages
• Advantages:
– Automatic self-shadowing
– Broadly supported in most hardware
• Disadvantages:
– Requires a high fill-rate
– Silhouette determination can be costly
– Artefacts can appear near object edges
– Shadows have sharp edges

Shadow Mapping Theory
• First we render all the shadow-casting objects from the perspective of the light into our shadow map
– Storing distance to the light rather than colour
– It is best to give our shadow a maximum distance and scale the distance to fit within the range [0.0, 1.0]
• Next we render the scene as usual from the perspective of the camera, and every shadow-receiving pixel is tested against its corresponding pixel in the shadow map
– If the current pixel is further from the light than the pixel the light can see, then it is shadowed
• We can determine which pixel in the shadow map matches the current pixel by projecting the shadow map onto the scene
– This technique is called Projected Texture Mapping

[Figure: Projected Shadow – Scene From Camera, Scene From Light, Shadow Map, Shadowed Scene]

Projected Texture Mapping
• A technique which maps a texture across a surface by generating texture coordinates from a projection matrix
– It appears as if the texture is cast over the surface from a projector
• When we render a scene we project it onto the screen using a camera's View and Projection matrices
– These matrices flatten our 3D scene onto a 2D plane, within a [-1,1] range
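Returning to the Shadow Volume process, the front-face/back-face stencil counting can be sketched for a single pixel as follows; the depth values are assumed inputs (smaller = closer to the camera) and all names are illustrative:

```python
# Sketch of depth-pass stencil counting for one pixel, given the scene
# depth at that pixel and the depths of the shadow-volume faces covering it.

def stencil_value(scene_depth, front_face_depths, back_face_depths):
    """Front faces that pass the depth test add 1, back faces that pass
    subtract 1; a non-zero result means the pixel is in shadow."""
    stencil = 0
    for d in front_face_depths:
        if d < scene_depth:      # face NOT culled by the depth buffer
            stencil += 1
    for d in back_face_depths:
        if d < scene_depth:      # face NOT culled by the depth buffer
            stencil -= 1
    return stencil

def in_shadow(scene_depth, front_face_depths, back_face_depths):
    return stencil_value(scene_depth, front_face_depths, back_face_depths) != 0
```

A receiver between a volume's front face and back face sees only the front face pass the depth test (stencil 1, shadowed); a receiver in front of or behind the whole volume ends up back at 0.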
• We can use a similar technique to generate texture coordinates
– Create a View and Projection matrix at the location, and facing the direction, that our texture is projecting from
– Transform the scene vertices against the camera as usual, but also transform them separately against the new View and Projection matrix to create texture coordinates in the range [-1,1]
– Convert the result of the transform from [-1,1] to [0,1]
– Use this new result as texture coordinates!
• Projected textures have many uses
– Stained Glass
– Spell Effects
– Decals
• For Shadow Mapping we project the shadow across the scene using the same View and Projection as we used when rendering from the perspective of the light

Shadow Mapping Summary
• Advantages:
– Shadow Mapping is an image-space technique that works for any objects rendered on the GPU
– Avoids high fill-rate
– Handles curved surfaces
• Disadvantages:
– Shadow quality is usually dependent on the resolution of the shadow map
– The scene geometry must be rendered again for each light source
– GPU usage grows rapidly as more lights and objects are added to the scene
– Directional Lights require large textures to contain the world, or can only shadow small areas
– Point Lights require the use of Cube Maps and rendering to 6 Shadow Maps

Alternative Shadow Mapping Techniques
• Advances in how shadow maps are calculated have produced many new and improved methods
– Cascaded Shadow Maps
– Variance Shadow Maps
– Percentage Closer Filtering
– And many more…
• The quality of the shadow map is also controlled by the size and format of the render target used, and how much area the shadow map covers
– A large map could be used for an entire scene, or a small map used for just the player character's shadow

Soft Shadow Tweaks
• Both Shadow Volumes and Shadow Mapping create shadows with hard edges
– Real shadows have soft edges; the soft border is called the penumbra
• Soft shadows can be simulated by either jittering the light source and averaging all the frames, or blurring the shadow texture in the case of Shadow Mapping
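The texture-coordinate generation described above can be sketched as follows, combined with the shadow-map depth comparison from earlier; the matrix is assumed to be a row-major combined View-Projection for the light, and all names (and the small bias constant) are illustrative:

```python
# Sketch of projected texture mapping used for a shadow-map lookup.

def transform(m, p):
    """Multiply a 4x4 row-major matrix by the point (x, y, z, 1)."""
    x, y, z = p
    v = (x, y, z, 1.0)
    out = [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]
    # Perspective divide brings the result into the [-1, 1] NDC range.
    return [out[0] / out[3], out[1] / out[3], out[2] / out[3]]

def projected_uv(light_view_proj, world_pos):
    ndc = transform(light_view_proj, world_pos)
    # Convert [-1, 1] to [0, 1] so the result can be used as texture coords.
    u = ndc[0] * 0.5 + 0.5
    v = ndc[1] * 0.5 + 0.5
    depth = ndc[2] * 0.5 + 0.5   # distance from the light, scaled to [0, 1]
    return u, v, depth

def is_shadowed(shadow_map, light_view_proj, world_pos, bias=0.005):
    u, v, depth = projected_uv(light_view_proj, world_pos)
    w, h = len(shadow_map[0]), len(shadow_map)
    tx, ty = min(int(u * w), w - 1), min(int(v * h), h - 1)
    # Shadowed if this point is further from the light than what the light
    # could see at that texel (the bias avoids self-shadowing "acne").
    return depth > shadow_map[ty][tx] + bias
```

With the identity matrix as a stand-in View-Projection, the world origin maps to the centre of the texture at depth 0.5, and the comparison against the stored depth decides shadowing.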
Pseudo-Random Numbers
• These patterns are called Pseudo-Random as they are not truly random
– If we used the same seed value then the results would be the same as before
– If we know the seed ahead of time, and the way in which the random pattern is generated from the seed, then we could predict all of the results, and thus they aren't "random"
• Different seed values can give wildly differing results, or just slightly different results

Pseudo-Random Number Generators
• Pseudo-Random Number Generators (PRNGs) are methods of creating random values based off a seed value
• There are many different methods
– Some generate a long list of results from a single seed
– Some need an initial seed to generate a value, the next value is based off the previous result, and so on
– Some need a seed value each time they are called and return a single result

Perlin Noise
• Perlin Noise is a well-known PRNG that creates smooth "noise"
• Random 1-D, 2-D and 3-D patterns are formed where the seed values come from the location of the current sample
– Values are smoothly interpolated with neighbouring samples, creating a gradient

[Figure: 2-Dimensional Perlin Noise]

• Perlin Noise works by sampling "noise" and interpolating the results together
– In 1 dimension, sampling at X we would interpolate between the noise at the integer positions either side of X, using the fractional part of X as the interpolation scale
– As an example, if X was 3.7, we could sample noise at 3.0 and 4.0, then use 0.7 as the interpolation scale between the noise of 3.0 and 4.0
• Typically Perlin Noise is generated multiple times, with the sample location X scaled by a Frequency, and the returned result scaled by an Amplitude
– Frequency controls how quickly the gradient changes
– Amplitude controls the strength of the gradient
• The following are different 2-D Perlin Noise results with various Frequency and Amplitude:
[Figure: 2-D Perlin Noise results with various Frequency and Amplitude]
h = noise(x × f) × a

Procedural Generation Advantages
• More content!
• Can increase the replay value of games
• Less time spent on content development
• Unique-looking game objects
• Reduces the size of game content – great for downloadable and streaming online games

Procedural Generation Disadvantages
• Game design and game-play issues:
– Randomness is unpredictable and not always interesting
• Unrealistic-looking environments
• Repetitive game objects:
– Bad randomness can be bland and repetitive
• Generation time
• The tech can be very complicated and require significant up-front investment
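The seeded-noise ideas above – same seed, same results; per-octave h = noise(x × f) × a – can be sketched with a 1-D value noise (a simpler relative of Perlin Noise that interpolates random values at integer positions rather than gradients); the hash constants below are arbitrary illustrative choices:

```python
import math

# Sketch of seeded, octave-based 1-D value noise.

def noise_at(i, seed):
    """Deterministic pseudo-random value in [0, 1] for integer sample i.
    The same (i, seed) pair always produces the same value."""
    n = (i * 374761393 + seed * 668265263) & 0xFFFFFFFF
    n = ((n ^ (n >> 13)) * 1274126177) & 0xFFFFFFFF
    return (n ^ (n >> 16)) / 0xFFFFFFFF

def smooth_noise(x, seed):
    """Sample at x = 3.7 by interpolating the noise at 3 and 4 with t = 0.7,
    smoothed so the curve is flat at integer positions."""
    i = math.floor(x)
    t = x - i
    t = t * t * (3 - 2 * t)              # smoothstep interpolation weight
    a, b = noise_at(i, seed), noise_at(i + 1, seed)
    return a + (b - a) * t

def octave_noise(x, seed, octaves=4, frequency=1.0, amplitude=1.0):
    """Sum h = noise(x * f) * a over several octaves; each octave doubles
    the frequency (faster gradient changes) and halves the amplitude
    (weaker contribution)."""
    total = 0.0
    for _ in range(octaves):
        total += smooth_noise(x * frequency, seed) * amplitude
        frequency *= 2.0
        amplitude *= 0.5
    return total
```

Because everything derives from the seed, re-running with the same seed reproduces the terrain exactly – which is why procedural content can be regenerated instead of stored.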