Compute Shader GLSL Variables
uvec3 gl_NumWorkGroups global work group size we gave to glDispatchCompute() uvec3 gl_WorkGroupSize local work group size we defined with layout uvec3 gl_WorkGroupID position of current invocation in global work group uvec3 gl_LocalInvocationID position of current invocation in local work group uvec3 gl_GlobalInvocationID unique index of current invocation in global work group uint gl_LocalInvocationIndex 1d index representation of gl_LocalInvocationID
Execution:
执行渲染是:一个texture到full-screen quad,当然是要用个矩形绘制填充NDC
Creating Texture/Image创建纹理:
创建32位图,最后一句话 OpenGL treats "image units" slightly differently to textures, so we call a glBindImageTexture() function to make this link. Note that we can set this to "write only".
这个贴图单元与普通得textures稍微不一样,用最后一句函数可以让 图片写入。
// dimensions of the image int tex_w = 512, tex_h = 512; GLuint tex_output; glGenTextures(1, &tex_output); glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, tex_output); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR); glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, tex_w, tex_h, 0, GL_RGBA, GL_FLOAT, NULL); glBindImageTexture(0, tex_output, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_RGBA32F);
Determining the Work Group Size线程开辟
glDispatchCompute()函数可以决定 我们在compute shader invocations定义计算量。首先得到最大size of the total work group
通过以下函数得到在x,y,z上 total work group 范围:
int work_grp_cnt[3]; glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_COUNT, 0, &work_grp_cnt[0]); glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_COUNT, 1, &work_grp_cnt[1]); glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_COUNT, 2, &work_grp_cnt[2]); printf("max global (total) work group counts x:%i y:%i z:%i ", work_grp_cnt[0], work_grp_cnt[1], work_grp_cnt[2]);
得到最大支持 local work group 大小(sub-division of the total number of jobs总任务局部细分),这个是着色器内部定义的,用关键字layout。
int work_grp_size[3]; glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_SIZE, 0, &work_grp_size[0]); glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_SIZE, 1, &work_grp_size[1]); glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_SIZE, 2, &work_grp_size[2]); printf("max local (in one shader) work group sizes x:%i y:%i z:%i ", work_grp_size[0], work_grp_size[1], work_grp_size[2]);
更进一步:在compute shader的local work group 最大工作单元(work group units)是 如果我们如果在one local work group 执行32X32的工作任务,意味着不能超过这个32*32 = 1024这个值。
local work group size是根据设备来的。用合理的限制,让用户合理的调整 local work group 的大小可以获取更好的性能。
我的本机输出:
max global (total) work group size x:2147483647 y:65535 z:65535 max local (in one shader) work group sizes x:1024 y:1024 z:64 max computer shader invocations 1024
从简单的设置开始:
------把全局工作组大小(Global work group size) 设置与贴图一样大512*512
------局部工作组大小(Local work group size) 设置成1个像素 1*1
------设置Z大小为1
编写最基础的Compute shader
#version 450 layout(local_size_x = 1, local_size_y = 1) in; layout(rgba32f, binding = 0) uniform image2D img_output;
第一行的layout是定义local work group 大小。这个是后台决定,如果要调整 local work group 更大点的话,我们不用改材质。
这里我们决定用1个像素用1*1,如果工作组要支持 1d or 3d 需要改结构。
第二行的layout是图片设置,注意不是uniform sampler XXX,而是 uniform image2D XXX
现在开始设置一个黑色shader:
void main() { // base pixel colour for image vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0); // get index in global work group i.e x,y position ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy); // // interesting stuff happens here later
// // output to a specific pixel in the image imageStore(img_output, pixel_coords, pixel); }
第二行坐标是用:GLSL内建变量 gl_GlobalInvocationID.xy定位调用在工作组的坐标。
编译材质用这个类型:GL_COMPUTE_SHADER:
GLuint ray_shader = glCreateShader(GL_COMPUTE_SHADER); glShaderSource(ray_shader, 1, &the_ray_shader_string, NULL); glCompileShader(ray_shader); // check for compilation errors as per normal here GLuint ray_program = glCreateProgram(); glAttachShader(ray_program, ray_shader); glLinkProgram(ray_program); // check for linking errors and validate program as per normal here
我们当然还要渲染图片到最后的quad上,这样就可以读取我们这个 image2d贴图
Dispatch the shaders 渲染循环中执行材质:
第一步先tm的渲染compute shader texture,把z轴设置为1:
glUseProgram(ray_program); glDispatchCompute((GLuint)tex_w, (GLuint)tex_h, 1);
// make sure writing to image has finished before read glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);
To make sure that the compute shaders have completely finished writing to the image before we start sampling, we put in a memory barrier with glMemoryBarrier() and the image access bit 。
为了保证图片完成之后采样,所以用了个glMemoryBarrier(),也可以You can instead use GL_ALL_BARRIER_BITS to be on the safe side for all types of writing
第二步,正常绘制到quad上:
// normal drawing pass glClear(GL_COLOR_BUFFER_BIT); glUseProgram(quad_program); glBindVertexArray(quad_vao); glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, tex_output); glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
我把贴图绘制设置成绿色:
NEXT:Raytracing
一开始是将scene直接hard-code模式写入我们compute shader。
Ray定义:origin和direction
我们想分布ray's origin 到我们像素,作者定义为NDC:-5.0到5.0 X和Y都在这个区间
float max_x = 5.0; float max_y = 5.0; ivec2 dims = imageSize(img_output); // fetch image dimensions float x = (float(pixel_coords.x * 2 - dims.x) / dims.x); float y = (float(pixel_coords.y * 2 - dims.y) / dims.y); vec3 ray_o = vec3(x * max_x, y * max_y, 0.0); vec3 ray_d = vec3(0.0, 0.0, -1.0); // ortho
在这里我不映射 -5到5 ,直接就按照一个像素 1 的大小。 整个图片中心为(0,0)点。然后每个像素的中心为光线中心。
所以应该应用:
#version 450 core layout(local_size_x = 1, local_size_y = 1) in; layout(rgba32f, binding = 0) uniform image2D img_output; void main() { const float pixel_size = 1.0f; ivec2 texsize = imageSize(img_output); // get current texture size , 500 * 500 // base pixel colour for image vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0); // get index in global work group i.e x,y position ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy); pixel.r = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x-1.0)); pixel.g = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y-1.0)); // output to a specific pixel in the image imageStore(img_output, pixel_coords, pixel); }
此时范围在xy属于-250 到 249。 如果在这个基础上做多重抖动采样可以刚好 满足 -250 到 250:
抖动采样伪代码:
// for samples for (int i = 0; i < vp.num_samples; i++) { sample_point = vp.sampler_ptr->sample_unit_square(); pixel_pos[0] = vp.size * (c - 0.5*vp.hres + sample_point[0]); pixel_pos[1] = vp.size * (r - 0.5*vp.vres + sample_point[1]); ray.o = RT_VEC_3D({ pixel_pos[0],pixel_pos[1],RT_SCALAR(zw) }); pixel_color += tracer_ptr->trace_ray(ray); } pixel_color /= vp.num_samples;
对比houdini中的测试:
Raytracing a sphere:
#version 450 core layout(local_size_x = 1, local_size_y = 1) in; layout(rgba32f, binding = 0) uniform image2D img_output; const float k=0.000001; struct Ray{ vec3 o; vec3 d; }; struct Sphere{ vec3 c; float r; }; bool intersectSphere(Ray ray, Sphere sphere){ float t; float tmin = 0.0f; float A = dot(ray.d,ray.d); float B = 2* dot(ray.o-sphere.c,ray.d); float C = dot(ray.o-sphere.c,ray.o-sphere.c) - sphere.r * sphere.r; float disc = B * B - 4.0f * A * C; if(disc <0.0 ){ return false; } else{ float e = sqrt(disc); float denom = 2.0 * A; t = (-B -e) / denom; // smaller root if(t > k){ tmin = t; return true; } t = (-B + e) / denom; // large root if(t > k){ tmin = t; return true; } } return false; } void main() { const float pixel_size = 1.0f; ivec2 texsize = imageSize(img_output); // get current texture size , 500 * 500 // base pixel colour for image vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0); // get index in global work group i.e x,y position ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy); // // interesting stuff happens here later // float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x-1.0f)); float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y-1.0f)); // Ray ray; // ortho ray.o = vec3(x,y,0.0f); ray.d = vec3(0.0f, 0.0f, -1.0f); Sphere sphere1; sphere1.c = vec3(150,0.0,-200); sphere1.r = 100.0f; if(intersectSphere(ray,sphere1)){ pixel.r = 1.0f; } // output to a specific pixel in the image imageStore(img_output, pixel_coords, pixel); }
#define GLEW_STATIC // GLEW #include <GL/glew.h> #include <cstdlib> #undef GLFW_DLL // GLFW #include <GLFW/glfw3.h> #include <iostream> #include "ALG_GLFWCamera.h" #include "ALG_FrameWindow.h" #include "ALG_DrawPostPocessingQuad.h" #include "ALG_OGLHelper.h" #include <cmath> #include <glm/glm.hpp> #include <glm/gtc/matrix_transform.hpp> #include <glm/gtc/type_ptr.hpp> #include "ALG_FPS.h" #include "ALG_LoadComputeShader.h" using namespace AlgebraMaster; const unsigned int SRC_WIDTH = 500; const unsigned int SRC_HEIGHT = 500; void init(); void display(); // camera static GLFWCamera *camera; static float lastX = float(SRC_WIDTH) / 2.0f; static float lastY = float(SRC_HEIGHT) / 2.0f; static bool firstMouse = true; static bool firstMiddowMouse = true; // timing static float deltaTime = 0.0f; // time between current frame and last frame static float lastFrame = 0.0f; static FrameWindow *frameWindow ; static DrawPostProcessingQuad<GL_RGBA32F> quad; static LoadComputeShade computeShade; GLuint rayTexture; void init(){ camera = new GLFWCamera; if(!CheckExtension("GL_ARB_shading_language_include")){ cout << "---------------ERROR:: SHADER DO NOT SUPPORT INCLUDE SHADER---------------------------------- "; } AddCommonShaderFile("shaders/common/material_interface.glsl"); AddCommonShaderFile("shaders/common/light_interface.glsl"); AddCommonShaderFile("shaders/common/shadow.glsl"); AddCommonShaderFile("shaders/common/utils.glsl"); AddCommonShaderFile("shaders/common/postprocess.glsl"); // Dump OGL infos int work_grp_size[3], work_grp_inv; // maximum global work group (total work in a dispatch) glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_COUNT, 0, &work_grp_size[0] ); glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_COUNT, 1, &work_grp_size[1] ); glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_COUNT, 2, &work_grp_size[2] ); printf( "max global (total) work group size x:%i y:%i z:%i ", work_grp_size[0], work_grp_size[1], work_grp_size[2] ); // maximum local work group (one shader's slice) glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_SIZE, 0, &work_grp_size[0] ); glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_SIZE, 1, &work_grp_size[1] ); glGetIntegeri_v( GL_MAX_COMPUTE_WORK_GROUP_SIZE, 2, &work_grp_size[2] ); printf( "max local (in one shader) work group sizes x:%i y:%i z:%i ", work_grp_size[0], work_grp_size[1], work_grp_size[2] ); // maximum compute shader invocations (x * y * z) glGetIntegerv( GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, &work_grp_inv ); printf( "max computer shader invocations %i ", work_grp_inv ); // create texture glCreateTextures(GL_TEXTURE_2D,1,&rayTexture); glBindTexture(GL_TEXTURE_2D,rayTexture); glTextureParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE); glTextureParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE); glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR ); glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR ); // same internal format as compute shader input glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA32F, SRC_WIDTH, SRC_HEIGHT, 0, GL_RGBA, GL_FLOAT, NULL ); // bind to image unit so can write to specific pixels from the shader glBindImageTexture( 0, rayTexture, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_RGBA32F ); // load compute shader computeShade.load("shaders/raytracing/constant_coulour.glsl"); // initialize our quad geo quad.initialize(); quad.setupWidthHeight(SRC_WIDTH,SRC_HEIGHT); } // ----------- Render Loop ---------- void display(){ int display_w, display_h; glfwGetFramebufferSize(frameWindow->getWindow(), &display_w, &display_h); computeShade.use(); glDispatchCompute( (GLuint)SRC_WIDTH, (GLuint)SRC_HEIGHT, 1 ); // keep z as 1 // prevent sampling befor all writes to image are done glMemoryBarrier( GL_SHADER_IMAGE_ACCESS_BARRIER_BIT ); ClearAllBufferColor(); glBindTexture(GL_TEXTURE_2D,rayTexture); quad.draw(); } void processInput(GLFWwindow *window); void framebuffer_size_callback(GLFWwindow* window, int width, int height); // framezize void mouse_callback(GLFWwindow* window, double xpos, double ypos); // Maya Alt+LeftMouse void scroll_callback(GLFWwindow *window, double xoffset, double yoffset); int main() { glfwInit(); frameWindow = new FrameWindow(SRC_WIDTH,SRC_HEIGHT); glfwSetFramebufferSizeCallback(frameWindow->getWindow(), framebuffer_size_callback); glfwSetCursorPosCallback(frameWindow->getWindow(),mouse_callback); glfwSetScrollCallback(frameWindow->getWindow(), scroll_callback); glewInit(); init(); //double lastTime = glfwGetTime(); //double targetFps = 60.0f; FPSLimit fpsLimit; fpsLimit.targetFps = 60.02f; FPSGet fpsGet; while(!glfwWindowShouldClose(frameWindow->getWindow())){ processInput(frameWindow->getWindow()); fpsLimit.limitFPS(); fpsGet.updateFPS(frameWindow->getWindow()); // Rendering objs Loop display(); glfwSwapBuffers(frameWindow->getWindow()); glfwPollEvents(); } delete camera; delete frameWindow; return 0; } void framebuffer_size_callback(GLFWwindow* window, int width, int height) { // make sure the viewport matches the new window dimensions; note that width and // height will be significantly larger than specified on retina displays. glViewport(0, 0, width, height); } void processInput(GLFWwindow *window) { if (glfwGetKey(window, GLFW_KEY_ESCAPE) == GLFW_PRESS) glfwSetWindowShouldClose(window, true); if (glfwGetKey(window, GLFW_KEY_W) == GLFW_PRESS) camera->processKeyboardMove(deltaTime,GLFWCamera::FORWARD); if (glfwGetKey(window, GLFW_KEY_S) == GLFW_PRESS) camera->processKeyboardMove(deltaTime,GLFWCamera::BACKWARD); if (glfwGetKey(window, GLFW_KEY_A) == GLFW_PRESS) camera->processKeyboardMove(deltaTime,GLFWCamera::LEFT); if (glfwGetKey(window, GLFW_KEY_D) == GLFW_PRESS) camera->processKeyboardMove(deltaTime,GLFWCamera::RIGHT); } // ROTATE VIEW DIR void mouse_callback(GLFWwindow* window, double xpos, double ypos){ int middow_mouse_state = glfwGetMouseButton(window,GLFW_MOUSE_BUTTON_MIDDLE); int mouse_state = glfwGetMouseButton(window,GLFW_MOUSE_BUTTON_LEFT); int key_state = glfwGetKey(window,GLFW_KEY_LEFT_ALT); // set up the camera view if( mouse_state == GLFW_PRESS && key_state== GLFW_PRESS) { if (firstMouse){ lastX = xpos; lastY = ypos; firstMouse = false; } float xoffset = xpos - lastX; float yoffset = lastY - ypos; // reversed since y-coordinates go from bottom to top lastX = xpos; lastY = ypos; camera->processMouseMove(xoffset,yoffset); } if (key_state == GLFW_RELEASE || mouse_state == GLFW_RELEASE){ firstMouse = true; } // Move Camera Position if( middow_mouse_state == GLFW_PRESS) { if (firstMiddowMouse){ lastX = xpos; lastY = ypos; firstMiddowMouse = false; } float xoffset = xpos - lastX; float yoffset = lastY - ypos; // reversed since y-coordinates go from bottom to top lastX = xpos; lastY = ypos; camera->pos.x += xoffset*0.01f; camera->pos.y += yoffset*0.01f; } if ( middow_mouse_state == GLFW_RELEASE){ firstMiddowMouse = true; } } void scroll_callback(GLFWwindow *window, double xoffset, double yoffset){ camera->processFov(yoffset); }
#ifndef ALG_COMPUTESHADERLOADER_H #define ALG_COMPUTESHADERLOADER_H #include <string> #define GLEW_STATIC // GLEW #include <GL/glew.h> namespace AlgebraMaster { using namespace std; struct LoadComputeShade{ void fromSrc(const string &vertCode); void load(const string &filePath); void compileShadeProgram(); LoadComputeShade(); ~LoadComputeShade(); GLuint rayShader; GLuint rayProgram; inline void use(){glUseProgram(rayProgram);} }; } #endif // ALG_COMPUTESHADERLOADER_H
#include "ALG_LoadComputeShader.h" #include <fstream> #include <sstream> #include <iostream> #include "ALG_CheckError.h" namespace AlgebraMaster{ LoadComputeShade::LoadComputeShade(){} LoadComputeShade::~LoadComputeShade(){} void LoadComputeShade::load(const string&filePath){ ifstream stream; stringstream ss; stream.exceptions(ifstream::badbit); try { stream.open(filePath); // open file ss << stream.rdbuf(); // get strings from file } catch (ifstream::failure e) { cout << "ERROR::OPEN FILE:" << filePath << endl; } // close file handle stream.close(); // get str() from stringstream string shaderCode = ss.str(); fromSrc(shaderCode); } void LoadComputeShade::fromSrc(const string &vertCode){ rayShader = glCreateShader(GL_COMPUTE_SHADER); const char * src = vertCode.c_str(); // add common shader glShaderSource(rayShader, 1, &(src), NULL ); // ----add common shader include path------ const char* const searchPath = "shaders/common/"; glCompileShaderIncludeARB(rayShader, 1, &searchPath, nullptr); // compile shader glCompileShader(rayShader ); CheckShader(rayShader); // compile our program compileShadeProgram(); } void LoadComputeShade::compileShadeProgram(){ rayProgram = glCreateProgram(); glAttachShader(rayProgram, rayShader ); glLinkProgram(rayProgram ); CheckShadeProgramLink(rayProgram); // Delete the allShaders as they're linked into our program now and no longer necessery glDeleteShader(rayShader ); } }// end of namespace
ray dir light with plane:
#version 450 core layout(local_size_x = 1, local_size_y = 1) in; layout(rgba32f, binding = 0) uniform image2D img_output; const float k=0.000001; struct Ray{ vec3 o; vec3 d; }; struct Sphere{ vec3 c; float r; }; struct Plane{ vec3 P; vec3 N; }; struct RayHit{ vec3 hitP; vec3 hitN; bool state; float t; // o + td }; RayHit intersectSphere(Ray ray, Sphere sphere){ float t; float tmin = 111111110.0f; vec3 temp = ray.o - sphere.c; float A = dot(ray.d,ray.d); float B = 2 * dot(temp, ray.d); float C = dot(ray.o-sphere.c,ray.o-sphere.c) - sphere.r * sphere.r; float disc = B * B - 4.0f * A * C; RayHit hit; hit.state = false; // default = false if(disc <0.0 ) hit.state = false; else{ float e = sqrt(disc); float denom = 2.0 * A; t = (-B -e) / denom; // smaller root if(t > k){ tmin = t; hit.state = true; hit.t = t; hit.hitP = ray.o + t*ray.d; hit.hitN = normalize(hit.hitP - sphere.c); return hit; } t = (-B + e) / denom; // large root if(t > k){ tmin = t; hit.state = true; hit.t = t; hit.hitP = ray.o + t*ray.d; hit.hitN = normalize(hit.hitP - sphere.c); return hit; } } return hit; } RayHit intersectPlane(Ray ray, Plane plane) { float t = dot( plane.P - ray.o, plane.N) / dot(ray.d , plane.N); RayHit hit; hit.state = false; if(t>k){ hit.state = true; hit.hitP = ray.o + t * ray.d; hit.hitN = plane.N; return hit; }else{ hit.state =false; return hit; } } void main() { const float pixel_size = 1.0f; ivec2 texsize = imageSize(img_output); // get current texture size , 500 * 500 // base pixel colour for image vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0); // get index in global work group i.e x,y position ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy); // // interesting stuff happens here later // float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x-1.0f)); float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y-1.0f)); // Ray ray; // ortho ray.o = vec3(x,y,0.0f); ray.d = vec3(0.0f, 0.0f, -1.0f); // scene objs Sphere sphere1; sphere1.c = vec3(0,0.0,-200); sphere1.r = 100.0f; Plane plane= {vec3(0,0,0), vec3(0,1,0)}; vec3 lgtdir = normalize(vec3(1,1,0)); RayHit hit; hit = intersectPlane(ray,plane); if(hit.state){ float radiance = dot(lgtdir,hit.hitN); pixel = vec4(vec3(radiance),1.0f); } hit = intersectSphere(ray,sphere1); if(hit.state){ float radiance = dot(lgtdir,hit.hitN); pixel = vec4(vec3(radiance),1.0f); } // output to a specific pixel in the image imageStore(img_output, pixel_coords, pixel); }
Ray hard-code scene:
这里改成了-1到-1 范围,可以让场景范围适当改小:
#version 450 core layout(local_size_x = 1, local_size_y = 1) in; layout(rgba32f, binding = 0) uniform image2D img_output; struct Ray{ vec3 o; vec3 d; }; struct Sphere{ vec3 c; float r; }; struct Plane{ vec3 P; // cross this plane vec3 N; }; struct RayHit{ vec3 hitP; vec3 hitN; float t;// o + td }; float random (vec2 st) { return fract(sin(dot(st.xy, vec2(12.9898,78.233)))* 43758.5453123); } vec2 random2(vec2 st){ float x = random(st); float y = random(st*4358.954 + vec2(213.031,823.113)); return vec2(x,y); } struct Scene{ Sphere sphere; Plane plane; }; bool raySphere(inout Ray ray, inout Sphere sphere, float tmin, float tmax, inout RayHit hitRecord){ float t; vec3 temp = ray.o - sphere.c; float A = dot(ray.d, ray.d); float B = 2 * dot(temp, ray.d); float C = dot(ray.o-sphere.c, ray.o-sphere.c) - sphere.r * sphere.r; float disc = B * B - 4.0f * A * C; if (disc <0.0) return false; else { float e = sqrt(disc); float denom = 2.0 * A; t = (-B -e) / denom;// smaller root if (t > tmin && t< tmax){ hitRecord.t = t; hitRecord.hitP = ray.o + t*ray.d; hitRecord.hitN = normalize(hitRecord.hitP - sphere.c); return true; } t = (-B + e) / denom;// large root if (t > tmin && t<tmax){ hitRecord.t = t; hitRecord.hitP = ray.o + t*ray.d; hitRecord.hitN = normalize(hitRecord.hitP - sphere.c); return true; } } return false; } bool rayPlane(inout Ray ray, inout Plane plane, float tmin, float tmax, inout RayHit hitRecord){ vec3 n = normalize(plane.N); double denom = dot(ray.d,n); double t = dot((plane.P - ray.o) , n) / denom; if(t > tmin && t< tmax){ hitRecord.t = float(t); hitRecord.hitP = ray.o + float(t)*ray.d; hitRecord.hitN = plane.N; return true; } return false; } bool worldHit(inout Ray ray, inout Scene scene , float tmin , float tmax, inout RayHit rec){ bool hit_anyting = false; float close_so_far = tmax; RayHit temp_rec; if(rayPlane(ray,scene.plane,tmin,close_so_far, temp_rec) ){ hit_anyting = true; close_so_far = temp_rec.t; rec = temp_rec; } if(raySphere(ray,scene.sphere,tmin,close_so_far,temp_rec)){ hit_anyting = true; close_so_far = temp_rec.t; rec = temp_rec; } return hit_anyting; } vec3 color(inout Ray ray, inout Scene scene){ float tmax = 10000000; // default very far scene RayHit record; // Hit record if(worldHit(ray,scene,0.0, tmax,record )){ return record.hitN; } return vec3(0.0); } int samples = 1; void main() { const float pixel_size = 1.0f; ivec2 texsize = imageSize(img_output);// get current texture size , 500 * 500 // base pixel colour for image vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0); // get index in global work group i.e x,y position ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy); // // interesting stuff happens here later // // scene objs Sphere sphere1; sphere1.c = vec3(0, 0, -1.0); sphere1.r = 0.5; Plane plane1; plane1.P = vec3(0,-0.49,0); plane1.N = vec3(0,1,0); // Keep 1:1 aspect , that is a square image float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x-1.0f)); float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y-1.0f)); x /= float(texsize.x/2); y /= float(texsize.y/2); // define the ray Ray ray; ray.o = vec3(0, 0, 10.0f); // o as the camera pos ray.d = vec3(x,y,0) - ray.o; // d is viewplane.pos - camera.pos ray.d = normalize(ray.d); // Scene scene; scene.plane = plane1; scene.sphere = sphere1; pixel.rgb = color(ray, scene); // output to a specific pixel in the image imageStore(img_output, pixel_coords, pixel); }
抗锯齿采样:用的最垃圾随机采样
构建一个宽度为主的画面,比如Raytracing from one weekend是2:1的画面比例。
#version 450 core layout(local_size_x = 1, local_size_y = 1) in; layout(rgba32f, binding = 0) uniform image2D img_output; struct Ray{ vec3 o; vec3 d; }; struct Sphere{ vec3 c; float r; }; struct Plane{ vec3 P; // cross this plane vec3 N; }; struct RayHit{ vec3 hitP; vec3 hitN; float t;// o + td }; struct Scene{ Sphere sphere; Plane plane; }; float rand(float n){return fract(sin(n) * 43758.5453123);} bool raySphere(inout Ray ray, inout Sphere sphere, float tmin, float tmax, inout RayHit hitRecord){ float t; vec3 temp = ray.o - sphere.c; float A = dot(ray.d, ray.d); float B = 2 * dot(temp, ray.d); float C = dot(ray.o-sphere.c, ray.o-sphere.c) - sphere.r * sphere.r; float disc = B * B - 4.0f * A * C; if (disc <0.0) return false; else { float e = sqrt(disc); float denom = 2.0 * A; t = (-B -e) / denom;// smaller root if (t > tmin && t< tmax){ hitRecord.t = t; hitRecord.hitP = ray.o + t*ray.d; hitRecord.hitN = normalize(hitRecord.hitP - sphere.c); return true; } t = (-B + e) / denom;// large root if (t > tmin && t<tmax){ hitRecord.t = t; hitRecord.hitP = ray.o + t*ray.d; hitRecord.hitN = normalize(hitRecord.hitP - sphere.c); return true; } } return false; } bool rayPlane(inout Ray ray, inout Plane plane, float tmin, float tmax, inout RayHit hitRecord){ vec3 n = normalize(plane.N); double denom = dot(ray.d,n); double t = dot((plane.P - ray.o) , n) / denom; if(t > tmin && t< tmax){ hitRecord.t = float(t); hitRecord.hitP = ray.o + float(t)*ray.d; hitRecord.hitN = plane.N; return true; } return false; } bool worldHit(inout Ray ray, inout Scene scene , float tmin , float tmax, inout RayHit rec){ bool hit_anyting = false; float close_so_far = tmax; RayHit temp_rec; if(rayPlane(ray,scene.plane,tmin,close_so_far, temp_rec) ){ hit_anyting = true; close_so_far = temp_rec.t; rec = temp_rec; } if(raySphere(ray,scene.sphere,tmin,close_so_far,temp_rec)){ hit_anyting = true; close_so_far = temp_rec.t; rec = temp_rec; } return hit_anyting; } vec3 color(inout Ray ray, inout Scene scene){ float tmax = 10000000; // default very far scene RayHit record; // Hit record if(worldHit(ray,scene,0.0, tmax,record )){ return record.hitN; } else { vec3 unit_direction = ray.d; float t = 0.5 *(unit_direction.y + 1.0); return (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0); } } float rand(vec2 n) { return fract(sin(dot(n, vec2(12.9898, 4.1414))) * 43758.5453); } float PHI = 1.61803398874989484820459 * 00000.1; // Golden Ratio float PI = 3.14159265358979323846264 * 00000.1; // PI float SQ2 = 1.41421356237309504880169 * 10000.0; // Square Root of Two float gold_noise(in vec2 coordinate, in float seed){ return fract(tan(distance(coordinate*(seed+PHI), vec2(PHI, PI)))*SQ2); } int MIN = -2147483648; int MAX = 2147483647; // A single iteration of Bob Jenkins' One-At-A-Time hashing algorithm. uint hash( uint x ) { x += ( x << 10u ); x ^= ( x >> 6u ); x += ( x << 3u ); x ^= ( x >> 11u ); x += ( x << 15u ); return x; } // Compound versions of the hashing algorithm I whipped together. uint hash( uvec2 v ) { return hash( v.x ^ hash(v.y) ); } uint hash( uvec3 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ); } uint hash( uvec4 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ^ hash(v.w) ); } // Construct a float with half-open range [0:1] using low 23 bits. // All zeroes yields 0.0, all ones yields the next smallest representable value below 1.0. float floatConstruct( uint m ) { const uint ieeeMantissa = 0x007FFFFFu; // binary32 mantissa bitmask const uint ieeeOne = 0x3F800000u; // 1.0 in IEEE binary32 m &= ieeeMantissa; // Keep only mantissa bits (fractional part) m |= ieeeOne; // Add fractional part to 1.0 float f = uintBitsToFloat( m ); // Range [1:2] return f - 1.0; // Range [0:1] } // Pseudo-random value in half-open range [0:1]. float random( float x ) { return floatConstruct(hash(floatBitsToUint(x))); } float random( vec2 v ) { return floatConstruct(hash(floatBitsToUint(v))); } float random( vec3 v ) { return floatConstruct(hash(floatBitsToUint(v))); } float random( vec4 v ) { return floatConstruct(hash(floatBitsToUint(v))); } void main() { const float pixel_size = 1.0f; ivec2 texsize = imageSize(img_output);// get current texture size , 500 * 500 // base pixel colour for image vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0); // get index in global work group i.e x,y position ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy); // // interesting stuff happens here later // // scene objs Sphere sphere1; sphere1.c = vec3(0, 0, -1.0); sphere1.r = 0.5; Plane plane1; plane1.P = vec3(0,-0.49,0); plane1.N = vec3(0,1,0); Scene scene; scene.plane = plane1; scene.sphere = sphere1; int samples = 10; for(int i=0;i<samples;i++){ // Keep 1:1 aspect , that is a square image float ox = 0.0f; float oy = 0.0f; if(false){ ox = gold_noise(pixel_coords,0+i); oy = gold_noise(pixel_coords,1+i); ox = clamp(ox,0,1); oy = clamp(oy,0,1); } else{ // ---------------- GEN unit-square sampers -------------- float seedx = (float(i) + 1.0f) * 231; float seedy= (float(i) + 5000.0f) * 2311; ox = random(seedx); oy = random(seedy); } float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x -1.0f ) + ox ); float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y -1.0f ) + oy ); float aspect = float(texsize.y) / float(texsize.x); x /= float(texsize.x)/2.0f; y /= float(texsize.y)/2.0f; x /= aspect; // define the ray Ray ray; ray.o = vec3(0, 0, 10.0f); // o as the camera pos ray.d = vec3(x,y,0) - ray.o; // d is viewplane.pos - camera.pos ray.d = normalize(ray.d); // pixel.rgb += color(ray, scene); } pixel.rgb /= float(samples); // output to a specific pixel in the image imageStore(img_output, pixel_coords, pixel); }
环境遮挡:
把球位移到法线中心,也就是球中心为: hitP + hitN ,那么:
ray.o = hitP;
ray.d = hitP+hitN + unit_sample_sphere() - ray.o;
然后用递归循环:
递归结束条件就是光线一直追踪不到的时候,然后返回背景色,也就是他的else{}代码块。原文是将全球放到hitP上方,然后做一直递归循环直到收敛。
由于GLSL不能写递归,所以只能一次获取一个sampler来获取环境。当然噪点满屏
#version 450 core layout(local_size_x = 1, local_size_y = 1) in; layout(rgba32f, binding = 0) uniform image2D img_output; float PI = 3.1415926; struct Ray{ vec3 o; vec3 d; }; struct Sphere{ vec3 c; float r; vec3 Cd; }; struct Plane{ vec3 P; // cross this plane vec3 N; vec3 Cd; }; struct RayHit{ vec3 hitP; vec3 hitN; float t;// o + td vec3 hitCd; }; struct Scene{ Sphere spheres[3]; Plane plane; }; float rand(float n){return fract(sin(n) * 43758.5453123);} bool raySphere(inout Ray ray, inout Sphere sphere, float tmin, float tmax, inout RayHit hitRecord){ float t; vec3 temp = ray.o - sphere.c; float A = dot(ray.d, ray.d); float B = 2 * dot(temp, ray.d); float C = dot(ray.o-sphere.c, ray.o-sphere.c) - sphere.r * sphere.r; float disc = B * B - 4.0f * A * C; if (disc <0.0) return false; else { float e = sqrt(disc); float denom = 2.0 * A; t = (-B -e) / denom;// smaller root if (t > tmin && t< tmax){ hitRecord.t = t; hitRecord.hitP = ray.o + t*ray.d; hitRecord.hitN = normalize(hitRecord.hitP - sphere.c); hitRecord.hitCd = sphere.Cd; return true; } t = (-B + e) / denom;// large root if (t > tmin && t<tmax){ hitRecord.t = t; hitRecord.hitP = ray.o + t*ray.d; hitRecord.hitN = normalize(hitRecord.hitP - sphere.c); hitRecord.hitCd = sphere.Cd; return true; } } return false; } bool rayPlane(inout Ray ray, inout Plane plane, float tmin, float tmax, inout RayHit hitRecord){ vec3 n = normalize(plane.N); double denom = dot(ray.d,n); double t = dot((plane.P - ray.o) , n) / denom; if(t > tmin && t< tmax){ hitRecord.t = float(t); hitRecord.hitP = ray.o + float(t)*ray.d; hitRecord.hitN = plane.N; hitRecord.hitCd = plane.Cd; return true; } return false; } bool worldHit(inout Ray ray, inout Scene scene , float tmin , float tmax, inout RayHit rec){ bool hit_anyting = false; float close_so_far = tmax; // raytracing plane RayHit temp_rec; if(rayPlane(ray,scene.plane,tmin,close_so_far, temp_rec) ){ hit_anyting = true; close_so_far = temp_rec.t; rec = temp_rec; } // raytracing spheres for(int i=0;i<3;i++){ if(raySphere(ray,scene.spheres[i],tmin,close_so_far,temp_rec)){ hit_anyting = true; close_so_far = temp_rec.t; rec = temp_rec; } } return hit_anyting; } uint rand(inout uint state) { uint x = state; x ^= x << 13; x ^= x >> 17; x ^= x << 15; state = x; return x; } // ------------------------------------------------------------------ float random_float_01(inout uint state) { return (rand(state) & uint(0xFFFFFF)) / 16777216.0f; } uint g_state =123; // global state for random vec3 random_in_unit_sphere(inout uint state) { float d, x, y, z; do { x = random_float_01(state) * 2.0 - 1.0; state +=1 ; y = random_float_01(state) * 2.0 - 1.0; state +=3 ; z = random_float_01(state) * 2.0 - 1.0; d = x*x + y*y + z*z; } while(d > 1.0); return vec3(x, y, z); } vec3 color(inout Ray ray, inout Scene scene){ vec3 unit_direction = ray.d; float t = 0.5 *(unit_direction.y + 1.0); vec3 background = (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0); float tmax = 123213214; // default very far scene RayHit record; // Hit record if( worldHit(ray,scene,0.0, tmax, record )){ vec3 target = record.hitP + record.hitN + random_in_unit_sphere(g_state); Ray tempRay; tempRay.o = record.hitP; tempRay.d = normalize(target - record.hitP); RayHit newRec; if(worldHit(tempRay,scene,0.0, tmax, newRec ) ) { // if in shadow hit return vec3(0); } else{ return background; // else return back ground color } //return record.hitCd; } else { return background; } } void main() { const float pixel_size = 1.0f; ivec2 texsize = imageSize(img_output);// get current texture size , 500 * 500 // base pixel colour for image vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0); // get index in global work group i.e x,y position ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy); // // interesting stuff happens here later // // scene objs Sphere sphere1; sphere1.c = vec3(0, 0, -1.0); sphere1.r = 0.5; sphere1.Cd = vec3(1,0,0); Sphere sphere2; sphere2.c = vec3(-1.1, 0, -1.0); sphere2.r = 0.5; sphere2.Cd = vec3(0,1,0); Sphere sphere3; sphere3.c = vec3(1.1, 0, -1.0); sphere3.r = 0.5; sphere3.Cd = vec3(0,0,1); Plane plane1; plane1.P = vec3(0,-0.49,0); plane1.N = vec3(0,1,0); plane1.Cd = vec3(1,1,1); Scene scene; scene.plane = plane1; scene.spheres[0] = sphere1; scene.spheres[1] = sphere2; scene.spheres[2] = sphere3; int samples = 8; vec3 acccolor = vec3(0); for(int i=0;i<samples;i++){ // Keep 1:1 aspect , that is a square image float ox = 0.0f; float oy = 0.0f; uint seedx = int( pixel_coords.x + i*132) * 231; uint seedy = int( pixel_coords.y + i*126) * 2311; ox = random_float_01(seedx); oy = random_float_01(seedy); float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x -1.0f ) + ox ); float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y -1.0f ) + oy ); float aspect = float(texsize.y) / float(texsize.x); x /= float(texsize.x)/2.0f; y /= float(texsize.y)/2.0f; x /= aspect; // define the ray Ray ray; ray.o = vec3(0, 0, 10.0f); // o as the camera pos ray.d = vec3(x,y,0) - ray.o; // d is viewplane.pos - camera.pos ray.d = normalize(ray.d); // acccolor += color(ray, scene); } acccolor /= float(samples); //acccolor = random_in_unit_sphere(g_state); // output to a specific pixel in the image imageStore(img_output, pixel_coords, vec4(acccolor,1.0f)); }
代码的gstate 就是个全局变量对于每个像素来说都是属于一个像素内变量,所以可以使用它来追踪随机函数的值变化。
实际上(Fixing Shadow Acne)噪点的出现,经过我测试,由于raybias的原因,对代码稍微做修改(下面改了camera的origin):
vec3 color(inout Ray ray, inout Scene scene){ vec3 unit_direction = ray.d; float t = 0.5 *(unit_direction.y + 1.0); vec3 background = (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0); float bias = 0.0001f; float tmax = 123213214; // default very far scene RayHit record; // Hit record if( worldHit(ray,scene,0.0, tmax, record )){ vec3 target = record.hitP + record.hitN + unit_sphere_sample(g_state); Ray tempRay; tempRay.o = record.hitP + record.hitN * bias; tempRay.d = normalize(target - record.hitP); RayHit newRec; if(worldHit(tempRay,scene,0.0, tmax, newRec ) ) { // if in shadow hit return vec3(0.2); } else{ return vec3(1.0); // else return back ground color } //return record.hitCd; } else { return background; } }
当然(Fixing Shadow Acne)也可以直接把tmin = 0.000001f;
RayHit newRec; if(worldHit(tempRay,scene,0.000001f, tmax, newRec ) ) { // if in shadow hit return vec3(0.2); } else{ return vec3(1.0); // else return back ground color }
接下来用cos-weighted半球分布完成,这里面都是随机采样方法.
vec2 unit_square_sample( float i){ //uint sx = i+152; //uint sy = i*100+317; //return vec2(rand(sx),rand(sy)); float x = random(i + 10); float y = random(i + 10000); return vec2(x,y); } vec3 unit_sphere_sample(inout uint state) { float d, x, y, z; do { x = rand(state) * 2.0 - 1.0; state +=1 ; y = rand(state) * 2.0 - 1.0; state +=3 ; z = rand(state) * 2.0 - 1.0; d = x*x + y*y + z*z; } while(d > 1.0); return vec3(x, y, z); } // Xi should use unit_square_sample() get vec3 cos_weighted_hemisphere(vec2 Xi,float e){ float cos_phi = cos(2.0f * PI * Xi.x); float sin_phi = sin(2.0f * PI * Xi.x); float cos_theta = pow((1.0f - Xi.y) , 1.0f / (e+1.0)); float sin_theta = sqrt(1.0f - cos_theta * cos_theta); // Generation Z based sphere float pu = sin_theta * cos_phi; float pv = sin_theta * sin_phi; float pw = cos_theta; vec3 H = vec3(pu,pv,pw); return H; } // align the input vector to Normal vec3 sample_on_normal(vec3 inputSample,vec3 N){ // from tangent-space vector to world-space sample vector //vec3 up = abs(N.z) < 0.999 ? vec3(0.0, 0.0, 1.0) : vec3(1.0, 0.0, 0.0); vec3 up = vec3(0.000001f,1.0,0); up = normalize(up); vec3 tangent = normalize(cross(up, N)); vec3 bitangent = cross(N, tangent); bitangent = normalize(bitangent); vec3 sampleVec = tangent * inputSample.x + bitangent * inputSample.y + N * inputSample.z; sampleVec = normalize(sampleVec); return sampleVec; }
uint randbit(inout uint state) { uint x = state; x ^= x << 13; x ^= x >> 17; x ^= x << 15; state = x; return x; } // ------------------------------------------------------------------ // return 0-1 float rand(inout uint state) { return float(randbit(state) & uint(0xFFFFFF)) / 16777216.0f; } // ----------------------------- HDK liked Noise ------------------------------------------------------- // A single iteration of Bob Jenkins' One-At-A-Time hashing algorithm. uint hash( uint x ) { x += ( x << 10u ); x ^= ( x >> 6u ); x += ( x << 3u ); x ^= ( x >> 11u ); x += ( x << 15u ); return x; } // Compound versions of the hashing algorithm I whipped together. uint hash( uvec2 v ) { return hash( v.x ^ hash(v.y) ); } uint hash( uvec3 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ); } uint hash( uvec4 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ^ hash(v.w) ); } // Construct a float with half-open range [0:1] using low 23 bits. // All zeroes yields 0.0, all ones yields the next smallest representable value below 1.0. float floatConstruct( uint m ) { const uint ieeeMantissa = 0x007FFFFFu; // binary32 mantissa bitmask const uint ieeeOne = 0x3F800000u; // 1.0 in IEEE binary32 m &= ieeeMantissa; // Keep only mantissa bits (fractional part) m |= ieeeOne; // Add fractional part to 1.0 float f = uintBitsToFloat( m ); // Range [1:2] return f - 1.0; // Range [0:1] } // Pseudo-random value in half-open range [0:1]. float random( float x ) { return floatConstruct(hash(floatBitsToUint(x))); } float random( vec2 v ) { return floatConstruct(hash(floatBitsToUint(v))); } float random( vec3 v ) { return floatConstruct(hash(floatBitsToUint(v))); } float random( vec4 v ) { return floatConstruct(hash(floatBitsToUint(v))); } // ----------------------------- HDK liked Noise ------------------------------------------------------- float PHI = 1.61803398874989484820459; // Φ = Golden Ratio float gold_noise(in vec2 xy, in float seed){ return fract(tan(distance(xy*PHI, xy)*seed)*xy.x); }
#version 450 core #extension GL_ARB_shading_language_include : require layout(local_size_x = 1, local_size_y = 1) in; layout(rgba32f, binding = 0) uniform image2D img_output; #include "/shaders/raytracing/common/constant.glsl" // first include constant #include "/shaders/raytracing/common/rand.glsl" // this have rand #include "/shaders/raytracing/common/samples.glsl" // this will use the constant mem, rand() struct Ray{ vec3 o; vec3 d; }; struct Sphere{ vec3 c; float r; vec3 Cd; }; struct Plane{ vec3 P; // cross this plane vec3 N; vec3 Cd; }; struct RayHit{ vec3 hitP; vec3 hitN; float t;// o + td vec3 hitCd; }; struct Scene{ Sphere spheres[3]; Plane plane; }; bool raySphere(inout Ray ray, inout Sphere sphere, float tmin, float tmax, inout RayHit hitRecord){ vec3 oc = ray.o- sphere.c; float a = dot(ray.d,ray.d); float half_b = dot(oc, ray.d); float c = dot(oc,oc) - sphere.r * sphere.r; float discriminant = half_b*half_b - a*c; if (discriminant > 0) { float root = sqrt(discriminant); float temp = (-half_b - root)/a; if (temp < tmax && temp > tmin) { hitRecord.t = float(temp); hitRecord.hitP = ray.o + hitRecord.t * ray.d; hitRecord.hitN = (hitRecord.hitP - sphere.c) / sphere.r; hitRecord.hitCd = sphere.Cd; return true; } temp = (-half_b + root) / a; if (temp < tmax && temp > tmin) { hitRecord.t = float(temp); hitRecord.hitP = ray.o + hitRecord.t * ray.d; hitRecord.hitN = (hitRecord.hitP - sphere.c) / sphere.r; hitRecord.hitCd = sphere.Cd; return true; } } return false; } bool rayPlane(inout Ray ray, inout Plane plane, float tmin, float tmax, inout RayHit hitRecord){ vec3 n = normalize(plane.N); double denom = dot(ray.d,n); double t = dot((plane.P - ray.o) , n) / denom; if(t > tmin && t< tmax){ hitRecord.t = float(t); hitRecord.hitP = ray.o + float(t)*ray.d; hitRecord.hitN = plane.N; hitRecord.hitCd = plane.Cd; return true; } return false; } bool worldHit(inout Ray ray, inout Scene scene , float tmin , float tmax, inout RayHit rec){ bool hit_anyting = false; float close_so_far = tmax; // raytracing plane RayHit temp_rec; if(rayPlane(ray,scene.plane,tmin,close_so_far, temp_rec) ){ hit_anyting = true; close_so_far = temp_rec.t; rec = temp_rec; } // raytracing spheres for(int i=0;i<3;i++){ if(raySphere(ray,scene.spheres[i],tmin,close_so_far,temp_rec)){ hit_anyting = true; close_so_far = temp_rec.t; rec = temp_rec; } } return hit_anyting; } uint g_state = 0; // global state for random const int cos_samples = 64; vec3 color(inout Ray ray, inout Scene scene){ vec3 unit_direction = ray.d; float t = 0.5 *(unit_direction.y + 1.0); vec3 background = (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0); float bias = .00001f; float tmax = 34132145; // default very far scene RayHit record; // Hit record if( worldHit(ray,scene,0.0, tmax, record )){ vec3 cos_weighted_L = vec3(0); for(int i=0;i<cos_samples;i++){ g_state += i; float x = rand(g_state); g_state += i; float y = rand(g_state); vec2 Xi = vec2(x,y); float cos_weight = 0.001f; vec3 z_hemisphere = cos_weighted_hemisphere(Xi, cos_weight); vec3 hemisphere = z_hemisphere; vec3 dir = sample_on_normal(hemisphere, record.hitN); // next ray tracing Ray tempRay; tempRay.o = record.hitP + record.hitN * bias ; tempRay.d = normalize(dir ); RayHit newRec; if(worldHit(tempRay,scene,0.0, tmax, newRec ) ) { // if in shadow hit cos_weighted_L += vec3(0); } else{ cos_weighted_L += background; // else return back ground color } } cos_weighted_L /= float(cos_samples); cos_weighted_L *= record.hitCd; // add sphere default color return cos_weighted_L; } else { return background; } } void main() { const float pixel_size = 1.0f; ivec2 texsize = imageSize(img_output);// get current texture size , 500 * 500 // base pixel colour for image vec4 pixel = vec4(0.0, 0.0, 0.0, 1.0); // get index in global work group i.e x,y position ivec2 pixel_coords = ivec2(gl_GlobalInvocationID.xy); // // interesting stuff happens here later // g_state = gl_GlobalInvocationID.x * 1973 + gl_GlobalInvocationID.y * 9277 + 2699 | 1; // scene objs Sphere sphere1; sphere1.c = vec3(0, 0, -1.0); sphere1.r = 0.5; sphere1.Cd = vec3(1,0,0); Sphere sphere2; sphere2.c = vec3(-1.1, 0, -1.0); sphere2.r = 0.5; sphere2.Cd = vec3(0,1,0); Sphere sphere3; sphere3.c = vec3(1.1, 0, -1.0); sphere3.r = 0.5; sphere3.Cd = vec3(0,0,1); Plane plane1; plane1.P = vec3(0,-0.49,0); plane1.N = vec3(0,1,0); plane1.Cd = vec3(1,1,1); Scene scene; scene.plane = plane1; scene.spheres[0] = sphere1; scene.spheres[1] = sphere2; scene.spheres[2] = sphere3; //genCosWeigtedSamples(); int AA = 6; vec3 acccolor = vec3(0); for(int i=0;i<AA;i++){ // Keep 1:1 aspect , that is a square image float ox = 0.0f; float oy = 0.0f; ox = rand(g_state); oy = rand(g_state); float x = pixel_size * (float(pixel_coords.x) - 0.5f* float(texsize.x -1.0f ) + ox ); float y = pixel_size * (float(pixel_coords.y) - 0.5f* float(texsize.y -1.0f ) + oy ); float aspect = float(texsize.y) / float(texsize.x); x /= float(texsize.x)/2.0f; y /= float(texsize.y)/2.0f; x /= aspect; // define the ray Ray ray; ray.o = vec3(0, 0, 5.0f); // o as the camera pos ray.d = vec3(x,y,0) - ray.o; // d is viewplane.pos - camera.pos ray.d = normalize(ray.d); // acccolor += color(ray, scene); } acccolor /= float(AA); //acccolor = vec3(random(pixel_coords),random(pixel_coords+0.0001f),random(pixel_coords+0.001f) ); acccolor = sqrt(acccolor); // output to a specific pixel in the image imageStore(img_output, pixel_coords, vec4(acccolor,1.0f)); }
当然如上面(Fixing Shadow Acne)要解决raybias问题也可以直接设置tmin
vec3 color(inout Ray ray, inout Scene scene){ vec3 unit_direction = ray.d; float t = 0.5 *(unit_direction.y + 1.0); vec3 background = (1.0-t)*vec3(1.0,1.0,1.0) + t*vec3(0.5,0.7,1.0); //float bias = .000001f; float tmin = 0.00001f; float tmax = 34132145; // default very far scene RayHit record; // Hit record if( worldHit(ray,scene,tmin, tmax, record )){ vec3 cos_weighted_L = vec3(0); for(int i=0;i<cos_samples;i++){ g_state += i; float x = rand(g_state); g_state += i; float y = rand(g_state); vec2 Xi = vec2(x,y); float cos_weight = 0.001f; vec3 z_hemisphere = cos_weighted_hemisphere(Xi, cos_weight); vec3 hemisphere = z_hemisphere; vec3 dir = sample_on_normal(hemisphere, record.hitN); // next ray tracing Ray tempRay; tempRay.o = record.hitP ; tempRay.d = normalize(dir ); RayHit newRec; if(worldHit(tempRay,scene,tmin, tmax, newRec ) ) { // if in shadow hit cos_weighted_L += vec3(0); } else{ cos_weighted_L += background; // else return back ground color } } cos_weighted_L /= float(cos_samples); cos_weighted_L *= record.hitCd; // add sphere default color return cos_weighted_L; } else { return background; } }
REF:http://antongerdelan.net/opengl/compute.html
Raytracing from ground up
https://relativity.net.au/gaming/glsl/Functions.html
Raytracing in one weekend
https://gist.github.com/patriciogonzalezvivo/670c22f3966e662d2f83
rand:https://www.itranslater.com/qa/details/2126153185973765120,史上最好的rand,和houdini一样
https://github.com/diharaw/GPUPathTracer progressive rendering