


Ping-pong technique on GPU

Hello there. Here is a new tutorial, this time about ping-pong on the GPU. I’ve been wanting to write about it for some time, and it’s finally off my todo list. Let’s get down to business. The ping-pong technique is normally used with a shader that needs its own result as the input for its next iteration. It is needed on the GPU because, for now, a shader cannot read from the buffer it is currently writing to, so we need a second buffer of the same size to hold the current result for the next step/iteration. Was that too hard?

So imagine we have 2 image buffers, Image1 and Image2. Usually, to change the data in Image1, you would simply read it and write directly back to the same position. In a fragment shader we can’t simply do that. You may ask now, what’s the solution? Ping-pong it!

Here is what I’m talking about:

//
// Initialization
//
const int W = 100;
const int H = 100;
int ImageArray[2][W*H] = {};  // Two equally sized buffers
int CurrActiveBuffer = 0; // Current active buffer index
 
 
//
// Mainloop
//
int src = CurrActiveBuffer; // Current active buffer (Input)
int dest = 1-CurrActiveBuffer;  // Back buffer (Output)
 
for( int j=0; j<H; j++ )
{
	for( int i=0; i<W; i++ )
	{
		// ERROR! Read and write the same buffer. Not possible in gpu shader
		//Image1[i+j*W] = Image1[i+j*W] * 2;  // Mul by 2
 
		// Ping-pong version: read from src, write to dest
		ImageArray[dest][i+j*W] = ImageArray[src][i+j*W] * 2;  // Mul by 2
	}
}
 
// Swap back and front buffers (read becomes write and vice-versa)
CurrActiveBuffer = 1-CurrActiveBuffer;    // CurrActiveBuffer ? 0 : 1;

As you can see, we start with buffer 0. That’s where we read our data from, and we write the result to buffer 1. Once the operation is done, we swap the buffers, and so on. This way the last iteration’s output becomes the input for the next iteration.

Now let’s do this the OpenGL way. I will be using a framebuffer object (FBO) and 2 textures here. The framebuffer holds both textures as its 2 color attachments. So let’s get coding:
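The multiply-by-2 example only reads the cell it writes, so it hides why the second buffer matters. Here is a minimal CPU sketch, assuming a simple 3-tap blur as the "shader": each output cell reads its neighbours, which an in-place update would already have overwritten. The names `blurStep` and `runPingPong` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One "pass": reads only from src, writes only to dst.
// Each output cell is the average of itself and its two neighbours
// (neighbour indices are clamped at the borders).
void blurStep( const std::vector<float>& src, std::vector<float>& dst )
{
    assert( src.size() == dst.size() );
    const std::size_t n = src.size();
    for( std::size_t i = 0; i < n; i++ )
    {
        const std::size_t left  = ( i == 0 ) ? 0 : i - 1;
        const std::size_t right = ( i + 1 == n ) ? i : i + 1;
        dst[i] = ( src[left] + src[i] + src[right] ) / 3.0f;
    }
}

// Runs `iterations` passes starting from buffers[0], swapping the roles of
// the two buffers after every pass.
// Returns the index of the buffer that holds the latest result.
int runPingPong( std::vector<float> buffers[2], int iterations )
{
    int curr = 0;  // Current active buffer (input)
    for( int it = 0; it < iterations; it++ )
    {
        blurStep( buffers[curr], buffers[1 - curr] );
        curr = 1 - curr;  // Read becomes write and vice-versa
    }
    return curr;
}
```

Call `runPingPong( buffers, n )` with two equally sized vectors and read the result from the buffer whose index it returns.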

//
// Initialization
//
int W = 100;
int H = 100;
GLuint FboID;
GLuint TexID[2];
int CurrActiveBuffer = 0;  // Current active buffer index
 
// Create the framebuffer
glGenFramebuffersEXT( 1, &FboID );
glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, FboID );
 
// Create 2 textures for input/output.
glGenTextures( 2, TexID );
 
for( int i=0; i<2; i++ )
{
	glBindTexture( GL_TEXTURE_2D, TexID[i] );
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR );
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR );
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE );
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE );
	glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA, W, H, 0, GL_RGBA, GL_FLOAT, NULL );
	if( _hasMipmapping )  glGenerateMipmapEXT( GL_TEXTURE_2D );
}
 
// Now attach textures to FBO. CurrActiveBuffer is 0 here, so TexID[0]
// ends up on color attachment 0 and TexID[1] on color attachment 1
int src = CurrActiveBuffer;
int dest = 1-CurrActiveBuffer;
glFramebufferTexture2DEXT( GL_FRAMEBUFFER_EXT, 
                           GL_COLOR_ATTACHMENT0_EXT, 
                           GL_TEXTURE_2D, TexID[src], 0 );
glFramebufferTexture2DEXT( GL_FRAMEBUFFER_EXT, 
                           GL_COLOR_ATTACHMENT1_EXT, 
                           GL_TEXTURE_2D, TexID[dest], 0 );
 
 
 
//
// Mainloop
//
int src = CurrActiveBuffer;
int dest = 1-CurrActiveBuffer;
 
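// Bind the FBO that holds both textures
glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, FboID );
 
// Render into the back buffer (TexID[i] sits on color attachment i)
glDrawBuffer( GL_COLOR_ATTACHMENT0_EXT + dest );
 
// Read from the current active buffer on texture unit 0
glBindTexture( GL_TEXTURE_2D, TexID[src] );
ShaderProgram.SetTextureUniform( 0 );  // Application helper: sampler reads unit 0
RenderScene();                         // Application helper: draws with the shader
 
// Back to the default framebuffer
glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, 0 );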
 
 
// Swap back and front buffers (read becomes write and vice-versa)
CurrActiveBuffer = 1-CurrActiveBuffer;    // CurrActiveBuffer ? 0 : 1;

Why would you want to ping-pong? Well, imagine you are doing a water effect on the GPU. You need to access the data from your previous buffer, right? On the CPU that would be trivial, as you know, but if you want to push its limits by using the GPU, this is the next step for you.
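To make the water case concrete, here is a CPU sketch of one pass of the well-known height-field ripple scheme. This post does not describe a specific water algorithm, so treat the scheme, the `Cell` layout and the name `rippleStep` as assumptions made for illustration. Every cell reads its four neighbours from the previous result, which is exactly the access pattern that forces the ping-pong.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One texel of the simulated "texture". On the GPU these two values could
// live in two channels of a float texture (an assumption of this sketch).
struct Cell
{
    float height;      // Water height at the current step
    float prevHeight;  // Water height one step earlier
};

// One simulation pass over a w*h grid: reads only from src, writes only to dst.
// Border cells are pinned to zero so neighbour lookups never leave the grid.
void rippleStep( const std::vector<Cell>& src, std::vector<Cell>& dst,
                 int w, int h, float damping )
{
    assert( src.size() == static_cast<std::size_t>( w * h ) );
    assert( dst.size() == src.size() );

    for( int y = 0; y < h; y++ )
    {
        for( int x = 0; x < w; x++ )
        {
            const int i = x + y * w;
            if( x == 0 || y == 0 || x == w - 1 || y == h - 1 )
            {
                dst[i] = Cell{ 0.0f, 0.0f };
                continue;
            }

            // Sum of the four neighbours, taken from the previous result
            const float sum = src[i - 1].height + src[i + 1].height +
                              src[i - w].height + src[i + w].height;
            dst[i].height     = ( sum * 0.5f - src[i].prevHeight ) * damping;
            dst[i].prevHeight = src[i].height;
        }
    }
}
```

In a main loop you would call `rippleStep( buf[src], buf[dest], W, H, damping )` and then swap `src` and `dest`, just like in the listings above. A damping value slightly below 1 makes the ripples fade out over time.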

Have fun.







Evaluate a Cubic Bézier on GPU

I’ve made an application as an example for this thread on how to compute/evaluate a cubic Bézier curve using a geometry shader. The formula is pretty straightforward, as described in this Wikipedia article (look for Cubic Bézier Curve). I will not go over the Bézier math or theory. I assume you have some knowledge of shader programming (GLSL in this case); some math background would help, but it is not really needed. All that said, let’s get to work.

In this case we will need 4 points: 2 anchor points (the curve’s end points) and 2 control points. The control points generally don’t touch the curve; they work more as directional information for the curve itself.

As we need to send this data to the shader, I have decided to use LINES as the input primitive: 2 points define a line, so it’s perfect for the 2 anchor points. As for the control points, 2 different texture coordinate sets (glMultiTexCoord3f) attached to the line’s vertex data will do. With geometry shaders, besides setting the input primitive type we also need to set the output type. LINE_STRIP is fine, as it works perfectly for what we’re doing. That’s all on the application side.

On the vertex shader side it’s pretty simple:

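[VERTEX SHADER]
varying vec3 ControlPoint1;
varying vec3 ControlPoint2;
 
void main( void )
{
    gl_FrontColor = gl_Color;
    // The control points arrive as texture coordinates; pass them on to the geometry shader
    ControlPoint1 = gl_MultiTexCoord0.xyz;
    ControlPoint2 = gl_MultiTexCoord1.xyz;
 
    // The anchor point goes through untransformed; the geometry shader applies the matrix
    gl_Position = gl_Vertex;
}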

What the code is doing is sending the data further down the pipeline to the geometry shader, where all the magic happens. At this stage, having both anchor and control points, we can define the curve with a given detail. Think of detail as the number of step-points along the curve, which makes it look smoother or flatter (tessellation, subdivision, smoothing, etc).

Both control points are sent by the application for the geometry shader. Still, as the vertex shader comes before the geometry shader, we need to pass them along in the vertex shader, otherwise the GS won’t be able to “see” them (I know, hurray for Cg). We’re now almost done. Using the function from the above link(s), shown below, we compute the curve with a given detail in the geometry shader, generating new vertices along the curve purely on the GPU side. I think it is pretty straightforward and the code should be self-explanatory.

[GEOMETRY SHADER]
uniform int g_Detail;
varying in vec3 ControlPoint1[];
varying in vec3 ControlPoint2[];
 
// Found in nvidia sdk
// Evaluates the cubic Bezier curve at t in [0,1].
// v[0] and v[3] are the anchor points, v[1] and v[2] the control points.
vec3 evaluateBezierPosition( vec3 v[4], float t )
{
    float OneMinusT = 1.0 - t;
    float b0 = OneMinusT*OneMinusT*OneMinusT;
    float b1 = 3.0*t*OneMinusT*OneMinusT;
    float b2 = 3.0*t*t*OneMinusT;
    float b3 = t*t*t;
    return b0*v[0] + b1*v[1] + b2*v[2] + b3*v[3];
}
 
void main()
{
    vec3 pos[4];
    pos[0] = gl_PositionIn[0].xyz;
    pos[1] = ControlPoint1[0];
    pos[2] = ControlPoint2[0];
    pos[3] = gl_PositionIn[1].xyz;
    // g_Detail is the number of emitted vertices and must be at least 2
    float OneOverDetail = 1.0 / float(g_Detail - 1);
    for( int i=0; i<g_Detail; i++ )
    {
        float t = float(i) * OneOverDetail;
        vec3 p = evaluateBezierPosition( pos, t );
        gl_FrontColor = gl_FrontColorIn[0]; 
        gl_Position = gl_ModelViewProjectionMatrix * vec4( p.xyz, 1.0 );
        EmitVertex();
    }
 
    EndPrimitive();
}
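
If you want to check the shader output against known values, a CPU version of the same evaluation is handy. The following is a minimal sketch that mirrors the geometry shader loop; `Vec3`, `evaluateBezier` and `tessellateBezier` are names made up for this illustration and are not part of the downloadable example.

```cpp
#include <array>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Cubic Bezier at t in [0,1]: v[0] and v[3] are the anchor points,
// v[1] and v[2] the control points (same weights as the shader).
Vec3 evaluateBezier( const std::array<Vec3, 4>& v, float t )
{
    const float u  = 1.0f - t;
    const float b0 = u * u * u;
    const float b1 = 3.0f * t * u * u;
    const float b2 = 3.0f * t * t * u;
    const float b3 = t * t * t;
    return Vec3{ b0 * v[0].x + b1 * v[1].x + b2 * v[2].x + b3 * v[3].x,
                 b0 * v[0].y + b1 * v[1].y + b2 * v[2].y + b3 * v[3].y,
                 b0 * v[0].z + b1 * v[1].z + b2 * v[2].z + b3 * v[3].z };
}

// Returns `detail` points spread evenly in t, first and last on the anchors.
std::vector<Vec3> tessellateBezier( const std::array<Vec3, 4>& v, int detail )
{
    assert( detail >= 2 );  // With fewer than 2 points the step is undefined
    std::vector<Vec3> points;
    points.reserve( detail );
    const float oneOverDetail = 1.0f / float( detail - 1 );
    for( int i = 0; i < detail; i++ )
    {
        points.push_back( evaluateBezier( v, float( i ) * oneOverDetail ) );
    }
    return points;
}
```

With a detail of 3 the middle point lands at t = 0.5, where the weights are 1/8, 3/8, 3/8 and 1/8, which makes it easy to verify by hand.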

What’s next? That is up to you. You’re not going to leave me with all the work, are you?

Download the example + source.
You will also need to install Vitamin 0.5.6 as the project is built with it.





Float

I have recently released, at the inércia Demoparty, a gift I made for Sue. It started as a Christmas gift that would express myself and, of course, keep me from following everybody else to the local shop and spending my money on something made by others.

There is no executable this time, as I don’t know if I can distribute the music. Still, nobody can stop me (for the moment) from giving you a video. So here it is, Float.





Grazing Jellies

Grazing Jellies is an augmented reality project I had the pleasure to work on. I teamed up with Neil Mendonza and Hudson-Powell, commissioned by the Abandon Normal Devices festival. Grazing Jellies takes place in a forest, a realtime portal into a colorful dream-like world of jelly creatures. The creatures were created to react to the movement of the surroundings/people and can also be called by making noise. When nothing is going on, they wander around the world and hunt for food. My work on this project was mostly about the jellies’ generation/animation/rendering.

At first, there was the idea of using metaballs for the creatures, as we wanted to give these jelly surfaces some wobbling/round forms. After some testing we decided not to, as I did not see any advantage, especially in performance, so we went with a more “traditional” method. The creatures were generated in 2 steps: the body and the head.

The body: Create a skeleton line and generate a deforming cylindrical body from the line. After some time and tweaking, the body looked right. After that I just had to start playing with values to get the animation going. Some trig + the creature’s motion parameters worked just fine. We got it right, but there was still a problem: the creatures’ heads.

The head: Well, I tried different methods, like interpolating the body’s end into a spherical shape with easing on, but it did not work out. The head textures just didn’t look good, “pinched” as Jody said several times. We ended up creating the head as a second step, building a hemisphere and then “attaching” it to the body’s end. The good news is that later on it came in handy, as it made it a lot easier to map the head’s texture. Some things just come in handy sometimes. I had to tweak a bit to get the head and body animations to get along, but in the end it turned out to look pretty good.

As for the lighting side, a kind of “ambient light” + phong lighting + cubemap reflections made it to the final version.

This is an awesome project and I really enjoyed working with the team. As said before, this should not be a closed project; it was designed to live and change so it can fit other environments, so expect to see more from the *mighty* Grazing Jellies.

Some videos on the development blog.

Full article on the festival @ Wired.





Mass cube rendering

I have recently done some experimental work on rendering massive amounts of cubes. I was moved by this video from Smash and wanted to know how far I would be able to go on my NVidia GT240M (rendering side). My first choice was geometry shaders.
I quickly wrote an app that sent a list of points in space to the GPU, where a geometry shader would generate a cube mesh for each one of the points. I tested it on 100,000 cubes and the framerate was bad (10fps or so). Now it was time for optimization.

The next step was to optimize the cube generation by going from a 24-vertex cube down to a 14-vertex triangle strip. Things got better, but nothing close to my expectations. I was not satisfied; I mean, I had a lot of cubes on screen (100K, which was not that much) and that was it, nothing else. We’re talking about 20fps or so for 100,000 cubes (around 1.2 million triangles per frame). Later on I added vertex normals to the geometry shader and started to work on some lighting/shadowing, but I ended up going back to the rendering side of the job. Meanwhile, I was speaking with a friend of mine about this idea and we were discussing ways to compute lighting, but I couldn’t stop thinking about my real problem. Then it came to me.
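
For reference, here is a sketch of one commonly used 14-vertex ordering that covers a whole cube with a single triangle strip. I am not claiming this is the exact order my shader emitted; the table `kCubeStrip` and the helper `countTrianglesPerFace` are written just for this illustration. A strip of 14 vertices yields 12 triangles, 2 per face, and 12 triangles times 100,000 cubes gives the 1.2 million triangles per frame mentioned above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Vec3 { float x, y, z; };

// Cube corners in triangle-strip order (cube spans -1..1 on every axis).
// Every 3 consecutive vertices form one triangle.
const std::array<Vec3, 14> kCubeStrip = {{
    { -1.0f,  1.0f,  1.0f }, {  1.0f,  1.0f,  1.0f },
    { -1.0f, -1.0f,  1.0f }, {  1.0f, -1.0f,  1.0f },
    {  1.0f, -1.0f, -1.0f }, {  1.0f,  1.0f,  1.0f },
    {  1.0f,  1.0f, -1.0f }, { -1.0f,  1.0f,  1.0f },
    { -1.0f,  1.0f, -1.0f }, { -1.0f, -1.0f,  1.0f },
    { -1.0f, -1.0f, -1.0f }, {  1.0f, -1.0f, -1.0f },
    { -1.0f,  1.0f, -1.0f }, {  1.0f,  1.0f, -1.0f }
}};

// Sanity check: counts how many strip triangles lie on each cube face.
// Result order: -X, +X, -Y, +Y, -Z, +Z.
std::array<int, 6> countTrianglesPerFace()
{
    std::array<int, 6> counts = {};
    for( std::size_t i = 0; i + 2 < kCubeStrip.size(); i++ )
    {
        const Vec3& a = kCubeStrip[i];
        const Vec3& b = kCubeStrip[i + 1];
        const Vec3& c = kCubeStrip[i + 2];
        // A triangle lies on a face when all 3 vertices share one coordinate
        if( a.x == b.x && b.x == c.x )      counts[ a.x > 0.0f ? 1 : 0 ]++;
        else if( a.y == b.y && b.y == c.y ) counts[ a.y > 0.0f ? 3 : 2 ]++;
        else if( a.z == b.z && b.z == c.z ) counts[ a.z > 0.0f ? 5 : 4 ]++;
    }
    return counts;
}
```

In a geometry shader you would offset and scale each of these corners by the cube’s position and size, emit them in this order and finish with a single EndPrimitive().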

Previously, I had done some experiments with OpenGL hardware instancing, but never got to do much with it. What better time than now? So I grabbed the project and took it for a spin. After a few hours I had the same amount of cubes on screen with a much, much better framerate. I quickly implemented some eye-candy (coloring, texturing, vertex lighting), did some tweaking here and there and, as I was listening to Mr. Peter Broderick (hi, I love you man), added some audio analysis to the feature list.
Last but not least, a kind of “Brownian Motion” was used to generate points in space, and I increased the cube count to 512*512 and watched it flow (at 20fps).
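
The point generation can be sketched as a simple random walk, where every point is the previous one plus a small random step. This is only an assumption about what a “kind of Brownian Motion” could look like on the CPU side; the function `brownianPoints`, the Gaussian steps and the seed parameter are choices made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct Vec3 { float x, y, z; };

// Generates `count` points as a 3D random walk that starts at the origin.
// Each step is drawn from a Gaussian with standard deviation `stepSize`.
std::vector<Vec3> brownianPoints( std::size_t count, float stepSize, unsigned seed )
{
    assert( stepSize > 0.0f );
    std::mt19937 rng( seed );
    std::normal_distribution<float> step( 0.0f, stepSize );

    std::vector<Vec3> points;
    points.reserve( count );

    Vec3 p = { 0.0f, 0.0f, 0.0f };
    for( std::size_t i = 0; i < count; i++ )
    {
        points.push_back( p );
        p.x += step( rng );
        p.y += step( rng );
        p.z += step( rng );
    }
    return points;
}
```

Calling `brownianPoints( 512 * 512, 0.1f, 42u )` gives one position per cube, which could then be uploaded as per-instance data.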

In conclusion, hardware instancing was much easier to implement and performance seems much better at first sight. Above is a video of 262,144 audio-reactive cubes with GPU animation and basic lighting at around 20fps. For my video card I think that is very good. On a side note, I have not given up on geometry shaders. I am not sure what my next step on the subject will be (back to geometry shaders?) but for now this is it. Hardware instancing kicked geometry shaders in the ass.