MSAOpenCL for Java and Processing

Recently I have been doing some work with OpenCL, and I found myself digging through old code and rewriting most of it. At first it was interesting and even fun, but not anymore, so I thought it was time to make things easier for myself and make the code reusable by writing a wrapper around the boring, time-consuming parts. This is a wrapper on top of OpenCL for Java and/or Processing. A well-known guy named Mehmet “Memo” Akten has written such a wrapper in C++ for the openFrameworks and Cinder libraries, and I have ported it to Java.

There are still some problems with reading from and writing to GL textures: for some reason it crashes on me. I have been working with JavaCL’s author on this, so hopefully it will be working soon.

:: Download

Download it, install it and try it.
If you find any problems or if you have suggestions, let me know.

Have fun.





Ping-pong technique on GPU

Hello there. Here is a new tutorial, this time about ping-pong on the GPU. I’ve been wanting to write about it for some time, and finally it’s off my todo list. Let’s get down to business. The ping-pong technique is normally used with a shader that needs its own result as a source parameter for its next iteration. It is common on the GPU because, for now, a program cannot write its result to the same buffer it is reading from, so we need a second buffer of the same size to hold the current result for the next step/iteration. Was that too hard?

So imagine we have 2 image buffers, Image1 and Image2. On the CPU, to change the data in Image1 you would simply read it and write directly back to the same position. In a fragment shader we can’t simply do that. You may ask now, what’s the solution? Ping-pong it!

Here is what I’m talking about:

//
// Initialization
//
int W = 100;
int H = 100;
int[][] ImageArray = new int[2][W*H]; // Two equally sized buffers
int CurrActiveBuffer = 0; // Current active buffer index
 
 
//
// Mainloop
//
int src = CurrActiveBuffer; // Current active buffer (Input)
int dest = 1-CurrActiveBuffer;  // Back buffer (Output)
for( int j=0; j<H; j++ )
{
	for( int i=0; i<W; i++ )
	{
		// ERROR! Write to same buffer. Not possible in gpu shader
		//ImageArray[src][i+j*W] = ImageArray[src][i+j*W] * 2;  // Mul by 2
 
		// Ping-pong version: read from src, write to dest
		ImageArray[dest][i+j*W] = ImageArray[src][i+j*W] * 2;  // Mul by 2
	}
}
 
// Swap back and front buffers (read becomes write and vice-versa)
CurrActiveBuffer = 1-CurrActiveBuffer;    // CurrActiveBuffer ? 0 : 1;

As you can see, we start with buffer 0. That’s where we get our data from (read), and we write the result to buffer 1. Once the operation is done we swap the buffers, and so on. This way we can use the last iteration’s data as input for the next iteration.
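To see the swap work over several iterations, here is a minimal CPU-side sketch in Java. The class and method names (PingPongDemo, run) are made up for illustration; it only mirrors the "multiply by 2" pass above and is not code from any library.

```java
import java.util.Arrays;

class PingPongDemo {
    // Runs `iterations` passes of "multiply every element by 2",
    // reading from one buffer and writing to the other on each pass.
    static int[] run(int[] initial, int iterations) {
        int n = initial.length;
        int[][] buffers = new int[2][n];
        System.arraycopy(initial, 0, buffers[0], 0, n);
        int active = 0; // index of the buffer holding the latest result

        for (int it = 0; it < iterations; it++) {
            int src = active;      // input (read only)
            int dest = 1 - active; // output (write only)
            for (int i = 0; i < n; i++) {
                buffers[dest][i] = buffers[src][i] * 2;
            }
            active = dest;         // swap: the output becomes the next input
        }
        return buffers[active];
    }

    public static void main(String[] args) {
        // Three passes double each value three times.
        System.out.println(Arrays.toString(run(new int[] {1, 2, 3}, 3))); // [8, 16, 24]
    }
}
```

After the last pass, the newest data always lives in the buffer indexed by the active index, which is why the function returns that one.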

Now let’s do this the OpenGL way. I will be using a framebuffer object (FBO) and 2 textures here. The framebuffer holds both textures as 2 color attachments. So let’s get coding:

//
// Initialization
//
int W = 100;
int H = 100;
GLuint FboID;
GLuint TexID[2];
int CurrActiveBuffer = 0;  // Current active buffer index
 
// Create the framebuffer
glGenFramebuffersEXT( 1, &FboID );
glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, FboID );
 
// Create 2 textures for input/output.
glGenTextures( 2, TexID );
 
for( int i=0; i<2; i++ )
{
	glBindTexture( GL_TEXTURE_2D, TexID[i] );
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR );
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR );
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE );
	glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE );
	glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA, W, H, 0, GL_RGBA, GL_FLOAT, NULL );
	if( _hasMipmapping )  glGenerateMipmapEXT( GL_TEXTURE_2D );
}
 
// Now attach textures to FBO
int src = CurrActiveBuffer;
int dest = 1-CurrActiveBuffer;
glFramebufferTexture2DEXT( GL_FRAMEBUFFER_EXT, 
                           GL_COLOR_ATTACHMENT0_EXT, 
                           GL_TEXTURE_2D, TexID[src], 0 );
glFramebufferTexture2DEXT( GL_FRAMEBUFFER_EXT, 
                           GL_COLOR_ATTACHMENT1_EXT, 
                           GL_TEXTURE_2D, TexID[dest], 0 );
 
 
 
//
// Mainloop
//
int src = CurrActiveBuffer;
int dest = 1-CurrActiveBuffer;
 
FBO.Bind();
 
// Render into the color attachment that holds the back buffer
glDrawBuffer( GL_COLOR_ATTACHMENT0_EXT + dest );
 
glBindTexture( GL_TEXTURE_2D, TexID[src] );
ShaderProgram.SetTextureUniform( 0 );
RenderScene();
 
FBO.Unbind();
 
 
// Swap back and front buffers (read becomes write and vice-versa)
CurrActiveBuffer = 1-CurrActiveBuffer;    // CurrActiveBuffer ? 0 : 1;

Why would you want to ping-pong? Well, imagine you are doing a water effect on the GPU. You need to access data from your previous buffer, right? On the CPU that would be trivial, as you know, but if you want to push its limits by using the GPU, this is the next step for you.
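As a rough illustration of the water case, here is a CPU sketch in Java of the classic two-buffer ripple (my pick for the example, not necessarily the effect you have in mind). The names WaterRipple, disturb, step and get are hypothetical. Note that on the CPU the back buffer can be read and overwritten at the same index; a shader version would have to carry that older state some other way.

```java
class WaterRipple {
    final int w, h;
    final float damping;
    final float[][] height; // two height fields: current and previous
    int active = 0;         // index of the current height field

    WaterRipple(int w, int h, float damping) {
        this.w = w;
        this.h = h;
        this.damping = damping;
        this.height = new float[2][w * h];
    }

    // Pokes the surface at (x, y).
    void disturb(int x, int y, float amount) {
        height[active][x + y * w] = amount;
    }

    // One simulation step: the new height comes from the current
    // neighbours and the value this cell had one step earlier.
    void step() {
        float[] cur = height[active];
        float[] prev = height[1 - active]; // overwritten with the new state
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                int i = x + y * w;
                float sum = cur[i - 1] + cur[i + 1] + cur[i - w] + cur[i + w];
                prev[i] = (sum * 0.5f - prev[i]) * damping;
            }
        }
        active = 1 - active; // swap buffers
    }

    float get(int x, int y) {
        return height[active][x + y * w];
    }

    public static void main(String[] args) {
        WaterRipple water = new WaterRipple(5, 5, 1.0f);
        water.disturb(2, 2, 8.0f);
        water.step();
        System.out.println(water.get(1, 2)); // the ripple has reached a neighbour
    }
}
```

Border cells are left untouched so the neighbour lookups never leave the grid.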

Have fun.







Evaluate a Cubic Bézier on GPU

I’ve made an application as an example for this thread on how to compute/evaluate a cubic Bézier curve using a geometry shader. The formula is pretty straightforward, as described in this Wikipedia article (look for Cubic Bézier Curve). I will not go over the Bézier math or theory. I assume you have some knowledge of shader programming (GLSL in this case); some math background would help, though it is not really needed. All that said, let’s get to work.

In this case we will need 4 points: 2 anchor points (the curve’s end points) and 2 control points. The control points generally don’t touch the curve; they work more as directional information for the curve itself.

As we need to send this data to the shader, I have decided to use LINES as the input primitive: 2 points define a line, so it’s perfect for the 2 anchor points. As for the control points, 2 different texture units (glMultiTexCoord3f) attached to the line’s vertex data will do. With geometry shaders, besides setting the input primitive type we also need to set the output type. LINE_STRIP is fine, as it works perfectly for what we’re doing. That’s all on the application side.

On the vertex shader side it’s pretty simple:

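[VERTEX SHADER]
// Control points, passed on to the geometry shader
varying vec3 ControlPoint1;
varying vec3 ControlPoint2;
 
void main( void )
{
    gl_FrontColor = gl_Color;
    ControlPoint1 = gl_MultiTexCoord0.xyz;
    ControlPoint2 = gl_MultiTexCoord1.xyz;
 
    // Pass the position through untransformed; the GS applies the matrix
    gl_Position = gl_Vertex;
}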

What the code is doing is sending the data further down the pipeline to the geometry shader, where all the magic happens. At this stage, having both anchor and control points, we can define the curve with a given detail. Think of detail as the number of step-points along the curve: more points make it look smoother, fewer make it look flatter (tessellation, subdivision, smoothing, etc).

Both control points are sent by the application for the geometry shader. Still, since the vertex shader comes before the geometry shader, we need to pass them through the vertex shader, otherwise the GS won’t be able to “see” them (I know, hurray for Cg). We’re now almost done. Using the function from the above link(s), as shown below, we compute the curve with a given detail in the geometry shader by generating new vertices along the curve purely on the GPU side. I think it is pretty straightforward, and the code should be self-explanatory.

[GEOMETRY SHADER]
uniform int g_Detail;
varying in vec3 ControlPoint1[];
varying in vec3 ControlPoint2[];
 
// Found in nvidia sdk
vec3 evaluateBezierPosition( vec3 v[4], float t )
{
    // Cubic Bernstein weights for parameter t in [0, 1]
    float OneMinusT = 1.0 - t;
    float b0 = OneMinusT*OneMinusT*OneMinusT;
    float b1 = 3.0*t*OneMinusT*OneMinusT;
    float b2 = 3.0*t*t*OneMinusT;
    float b3 = t*t*t;
    return b0*v[0] + b1*v[1] + b2*v[2] + b3*v[3];
}
 
void main()
{
    vec3 pos[4];
    pos[0] = gl_PositionIn[0].xyz;
    pos[1] = ControlPoint1[0];
    pos[2] = ControlPoint2[0];
    pos[3] = gl_PositionIn[1].xyz;
    float OneOverDetail = 1.0 / float(g_Detail-1);  // Assumes g_Detail >= 2
    for( int i=0; i<g_Detail; i++ )
    {
        float t = float(i) * OneOverDetail;
        vec3 p = evaluateBezierPosition( pos, t );
        gl_FrontColor = gl_FrontColorIn[0]; 
        gl_Position = gl_ModelViewProjectionMatrix * vec4( p.xyz, 1.0 );
        EmitVertex();
    }
 
    EndPrimitive();
}
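If you want to check what the shader emits, a CPU-side reference can help. Below is a small Java sketch that mirrors the same Bernstein weights and the same stepping of t; the class and method names (CubicBezier, evaluate, tessellate) are only for illustration and are not part of the example project.

```java
class CubicBezier {
    // Evaluates a cubic Bézier at t in [0, 1].
    // p holds 4 points: anchor, control 1, control 2, anchor.
    static double[] evaluate(double[][] p, double t) {
        double u = 1.0 - t;
        double b0 = u * u * u;
        double b1 = 3.0 * t * u * u;
        double b2 = 3.0 * t * t * u;
        double b3 = t * t * t;
        double[] out = new double[p[0].length];
        for (int k = 0; k < out.length; k++) {
            out[k] = b0 * p[0][k] + b1 * p[1][k] + b2 * p[2][k] + b3 * p[3][k];
        }
        return out;
    }

    // Generates `detail` points along the curve, end points included,
    // the same way the geometry shader loop steps t from 0 to 1.
    static double[][] tessellate(double[][] p, int detail) {
        if (detail < 2) {
            throw new IllegalArgumentException("detail must be at least 2");
        }
        double[][] pts = new double[detail][];
        for (int i = 0; i < detail; i++) {
            double t = (double) i / (detail - 1);
            pts[i] = evaluate(p, t);
        }
        return pts;
    }

    public static void main(String[] args) {
        double[][] p = { {0, 0, 0}, {0, 1, 0}, {1, 1, 0}, {1, 0, 0} };
        double[] mid = evaluate(p, 0.5);
        System.out.println(mid[0] + ", " + mid[1]); // 0.5, 0.75
    }
}
```

The first and last generated points land exactly on the two anchors, which is a quick sanity check for the shader output too.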

What’s next? That is up to you. You’re not going to leave me with all the work, are you?

Download the example + source.
You will also need to install Vitamin 0.5.6 as the project is built with it.





JavaCL, the new OpenCL4Java

I have noticed that OpenCL4Java is at version 1.4beta by now and that my examples were crashing when running on a GPU device. Today I took the time to do something about it: I downloaded the new version and updated the examples to run with 1.4beta. Everything seems to work just fine now; if you find otherwise, please do let me know.

Download: http://victamin.googlecode.com/files/JavaCL_1_4b.zip

Have fun!





OpenCL 4 Java & Processing

I finally took the time to play with OpenCL. I was motivated by the particle example from Rui Madeira. When I spoke with him, he gave me a few links to more examples, like Memo Akten’s 1,000,000 particles running with mouse interaction on the GPU, NVIDIA’s very first OpenCL application example, etc. I was intrigued! So I took the day to play with OpenCL4Java and ported Rui’s example to Java, running in Processing’s IDE. I’ve tested it with an Intel quad core, and I found that Rui’s sample crashes with a GTX280 video card. I didn’t give it much thought, but it might be because of the for-loop in the program’s particle-particle iteration. On the other hand, Memo’s example ran easily on the GPU. I should have made an example of my own, but it was easier to just port their examples and get things running. That was the main goal.

I have only tested this under Windows XP 64bit with the ATI Stream SDK and Nvidia drivers. If you find any problems, please report them.
One other thing: Memo’s example isn’t the real thing. It’s simply the CL program. So you won’t be able to get all the fuzzy million particles around. Not yet.

Library/Examples available here: http://victamin.googlecode.com/files/OpenCL4Java_1b.zip

Installation steps:
1. Copy the library to processing’s libraries folder
2. Install the ATI Stream SDK (I have packed the OpenCL.dll file, but you might still have to install the whole pack). If you’re using Nvidia, install the OpenCL drivers and toolkit
3. Open the example and run it.

Enjoy and have fun!